PeerJ Comput Sci. 2024 Jun 28;10:e2092. doi: 10.7717/peerj-cs.2092. eCollection 2024.

FLMatchQA: a recursive neural network-based question answering with customized federated learning model


Saranya M et al. PeerJ Comput Sci. 2024.

Abstract

More sophisticated data access is possible with artificial intelligence (AI) techniques such as question answering (QA), but regulations and privacy concerns have limited their use. Federated learning (FL) addresses these problems, making QA a viable AI application under privacy constraints. This research examines hierarchical FL systems, along with an optimal method for developing client-specific adapters. The User Modified Hierarchical Federated Learning Model (UMHFLM) selects local models for users' tasks. The article proposes a recurrent neural network (RNN) as a neural network (NN) technique for automatically learning and classifying natural-language questions into the appropriate templates. Local and global models are trained together: the global model influences the local models, which are in turn combined for personalization. The method is applied in natural language processing pipelines for phrase matching, employing template exact match, segmentation, and answer-type detection. The model was trained and evaluated on SQuAD2.0, a DL-based QA benchmark, to learn complex SPARQL test questions and their accompanying SPARQL queries over the DBpedia dataset. Evaluated on the SQuAD2.0 datasets, the model identifies 38 distinct templates. Considering the top two most likely templates, the RNN model achieves template-classification accuracies of 92.8% and 61.8% on the SQuAD2.0 and QALD-7 datasets, respectively. A study of data scarcity among participants found that FL Match significantly outperformed BERT: a MAP margin of 2.60% exists between BERT and FL Match at a 100% data ratio, and an MRR margin of 7.23% at a 20% data ratio.
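The local/global training loop the abstract describes (a global model influencing local models, which are then combined) can be sketched with plain federated averaging on a toy linear-regression task. This is a minimal illustration, not the paper's exact aggregation rule; the function names and the least-squares objective are assumptions for the sketch.

```python
import numpy as np

def local_update(global_weights, client_data, lr=0.1, epochs=1):
    """Hypothetical client step: start from the global model and take
    gradient steps on the client's private (X, y) least-squares data."""
    w = global_weights.copy()
    X, y = client_data
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """One FL round: each client trains locally on private data, then
    the server averages the local models into a new global model."""
    local_models = [local_update(global_weights, c) for c in clients]
    return np.mean(local_models, axis=0)

# Four clients, each holding a private shard drawn from the same task.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(100):
    w = federated_round(w, clients)
# After enough rounds, the averaged global model recovers true_w.
```

Personalization, as described in the abstract, would then fine-tune each client's copy of `w` locally instead of using the averaged model directly.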

Keywords: Accuracy; Artificial intelligence; Data science; Exact match; F1 score; Federated learning; Machine learning; Natural language processing; Neural network; Question answering.


Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. UMHFLM’s conceptual illustration.
Instead of fitting a single global adapter to all heterogeeneous client data distributions, hypernetworks build adapter parameters for each client from information about that client’s data distribution.
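The per-client adapter idea in Figure 1 can be illustrated with a toy hypernetwork: a shared map from a client-descriptor embedding (a stand-in for the client's data-distribution information) to the parameters of a small adapter layer. All names and dimensions here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy hypernetwork: a single linear map from a client-descriptor
# embedding to the flattened weights of a small adapter layer.
EMB_DIM, ADAPTER_IN, ADAPTER_OUT = 8, 16, 4
H = rng.normal(scale=0.1, size=(EMB_DIM, ADAPTER_IN * ADAPTER_OUT))

def make_adapter(client_embedding):
    """Generate client-specific adapter weights from the shared
    hypernetwork H, rather than one global adapter for everyone."""
    flat = client_embedding @ H
    return flat.reshape(ADAPTER_IN, ADAPTER_OUT)

# Two clients with different data-distribution embeddings get
# different adapters, while the hypernetwork H stays shared.
client_a = rng.normal(size=EMB_DIM)
client_b = rng.normal(size=EMB_DIM)
adapter_a = make_adapter(client_a)
adapter_b = make_adapter(client_b)
```

In training, only `H` would be learned federatively; each client's adapter is regenerated on demand from its own descriptor.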
Figure 2. A comprehensive FL model aligned with the data flow.
A federated learning matching system for QA, “FL Match,” addresses FL in heterogeneous settings by quantifying question-answer relevance over dispersed QA datasets while preserving confidentiality.
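The question-answer relevance quantification that Figure 2 attributes to FL Match can be sketched with a simple bag-of-words cosine-similarity scorer. This is a hypothetical stand-in for the model's learned encoders, kept deliberately minimal.

```python
import math
from collections import Counter

def cosine_relevance(question, answer):
    """Toy relevance score: cosine similarity between bag-of-words
    vectors. FL Match would use learned neural encoders instead."""
    q = Counter(question.lower().split())
    a = Counter(answer.lower().split())
    num = sum(q[t] * a[t] for t in set(q) & set(a))
    den = (math.sqrt(sum(v * v for v in q.values()))
           * math.sqrt(sum(v * v for v in a.values())))
    return num / den if den else 0.0

# A topically related answer scores higher than an unrelated one.
score_good = cosine_relevance("who founded dbpedia",
                              "dbpedia was founded by researchers")
score_bad = cosine_relevance("who founded dbpedia",
                             "the weather is sunny today")
```

In the federated setting, each client would compute such scores on its private QA pairs, sharing only model updates, never the text itself.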
Figure 3. SQuAD2.0 dataset generation workflow.
The clients are distinct question-answering participants who train their models on privately held data.
Figure 4. SQuAD2.0 task based on a sequence-to-sequence model.
These models are used for tasks that generate new sentences from a given input, such as summarization, translation, and generative question answering.
Figure 5. Variations in ground-truth answer length (up to 15) for CCHNS.
Figure 6. Performance comparisons for various question heads.
Figure 7. Performance comparisons for non-interrogative questions.
Figure 8. Result analysis: F1 and EM (a).
Figure 9. Result analysis: losses and global norm (a).
