PLoS One. 2019 Oct 18;14(10):e0211844. doi: 10.1371/journal.pone.0211844. eCollection 2019.

Predicting diabetes second-line therapy initiation in the Australian population via time span-guided neural attention network

Samuele Fiorini et al. PLoS One.

Abstract

Introduction: The first-line treatment for people with diabetes mellitus is metformin. However, over the course of the disease metformin may fail to achieve appropriate glycemic control, and a second-line therapy may become necessary. In this paper we introduce Tangle, a time span-guided neural attention model that can accurately and promptly predict the upcoming need for a second-line diabetes therapy from administrative data in the Australian adult population. The method is suitable for designing automatic therapy review recommendations for patients and their providers without the need to collect clinical measures.

Data: We analyzed seven years (2008-2014) of de-identified records from the publicly available 10% linked sample of Australia's Medicare Benefits Schedule (MBS) and Pharmaceutical Benefits Scheme (PBS) electronic databases.

Methods: By design, Tangle inherits the representational power of pre-trained word embeddings, such as GloVe, to encode sequences of claims with their associated MBS codes. Moreover, the proposed attention mechanism natively exploits the information hidden in the time span between two successive claims (measured in days). We compared the proposed method against state-of-the-art sequence classification methods.

Results: Tangle outperforms state-of-the-art recurrent neural networks, including attention-based models. In particular, when the proposed time span-guided attention strategy is coupled with pre-trained embedding methods, the model performance reaches an Area Under the ROC Curve of 90%, an improvement of almost 10 percentage points over an attentionless recurrent architecture.

Implementation: Tangle is implemented in Python using Keras and is hosted on GitHub at https://github.com/samuelefiorini/tangle.
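The abstract does not give the exact attention equations, but the core idea of letting inter-claim time spans guide content-based attention can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's formulation: the scoring vector `w`, the logarithmic time-span penalty, and the way λ enters the score are all assumptions made for this sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def time_span_attention(H, tau, w, lam):
    """Attend over hidden states H (T x d) of a recurrent encoder,
    guided by the time spans tau (days between successive claims).

    lam controls how strongly the time spans steer the attention:
    with lam = 0 this reduces to plain content-based attention.
    """
    content = H @ w                    # one content score per time step
    guidance = -lam * np.log1p(tau)    # hypothetical time-span penalty
    alpha = softmax(content + guidance)
    context = alpha @ H                # attention-weighted summary vector
    return context, alpha

# Toy example: 6 claims encoded as 4-dimensional hidden states.
rng = np.random.default_rng(0)
H = rng.standard_normal((6, 4))
tau = np.array([0, 3, 30, 180, 7, 1])  # days since the previous claim
context, alpha = time_span_attention(H, tau, w=rng.standard_normal(4), lam=0.5)
```

With this form, claims separated by long gaps receive lower attention weight unless their content score compensates, which matches the caption's intuition that λ ≠ 1 lets the time spans focus the model on the most relevant claims.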

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. LSTM for sequence classification.
A visual representation of a simple bidirectional LSTM for sequence classification. This architecture is used in this work for comparison and is referred to as the baseline. We adopted LSTM recurrent cells to exploit their ability to learn long-term relationships in the sequences. However, similar architectures can be devised with vanilla RNNs, Gated Recurrent Units (GRUs) [17], or other temporal architectures.
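The baseline's forward pass can be sketched without a deep-learning framework. The caption notes that vanilla RNN cells are an acceptable substitute for LSTMs, so the sketch below uses a bidirectional vanilla tanh RNN with a logistic output unit; all dimensions and weights are arbitrary placeholders, not the paper's Keras configuration.

```python
import numpy as np

rng = np.random.default_rng(42)
T, d_in, d_h = 8, 5, 4  # sequence length, input and hidden sizes (arbitrary)

def rnn_states(X, Wx, Wh):
    """Return all hidden states of a vanilla tanh RNN run over X (T x d_in)."""
    h, states = np.zeros(d_h), []
    for x in X:
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    return np.stack(states)

X = rng.standard_normal((T, d_in))
Wx_f, Wh_f = rng.standard_normal((d_h, d_in)), 0.1 * rng.standard_normal((d_h, d_h))
Wx_b, Wh_b = rng.standard_normal((d_h, d_in)), 0.1 * rng.standard_normal((d_h, d_h))

# Run forward over the sequence and over its reversal, then concatenate
# the two state sequences feature-wise: this is the "bidirectional" part.
H = np.concatenate([rnn_states(X, Wx_f, Wh_f),
                    rnn_states(X[::-1], Wx_b, Wh_b)[::-1]], axis=1)

# Classify from the final combined state with a logistic output unit.
w_out = rng.standard_normal(2 * d_h)
prob = 1.0 / (1.0 + np.exp(-(H[-1] @ w_out)))
```

In the real model the recurrent cells are LSTMs trained by backpropagation; this sketch only shows the data flow of the bidirectional encoder and the final classification step.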
Fig 2. Neural attention model.
A visual representation of the attention mechanism for sequence classification. When λ = 1 this corresponds to a standard bidirectional attention model, whereas when λ ≠ 1 the time span sequence τ1, …, τT can guide the model to focus on the most relevant elements of the sequence. We call Tangle the case in which the value of λ is jointly learned during training. A blue dashed line highlights the timestamp attention-guiding mechanism.
Fig 3. MBS item embedding.
A schematic representation of our word embedding strategy for obtaining meaningful representations of MBS items. Here, we consider a 10-word textual representation of MBS item no. 66551. Each word is associated with its corresponding word embedding, shown here as a 5-dimensional vector for readability. The final representation of the item is obtained by averaging.
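The averaging step in Fig 3 amounts to a mean over the per-word vectors of an item's textual description. In the sketch below both the 10-word description and the 5-dimensional embedding lookup are placeholders (the paper uses the official MBS item text and pre-trained GloVe vectors).

```python
import numpy as np

# Hypothetical 10-word textual description of an MBS item (placeholder
# wording, not the official text of item no. 66551).
description = ("quantitation of glycated haemoglobin performed in "
               "the management of diabetes").split()

# Stand-in embedding lookup: random 5-dimensional vectors keep the example
# self-contained; in the paper these would be pre-trained GloVe embeddings.
rng = np.random.default_rng(1)
embedding = {word: rng.standard_normal(5) for word in description}

# The item representation is simply the mean of its word vectors.
item_vector = np.mean([embedding[w] for w in description], axis=0)
```

Averaging keeps the representation dimension fixed regardless of description length, which is what lets every MBS item map to a vector of the same size.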
Fig 4. Average ROC curves.
ROC curves obtained by averaging over the 10 Monte Carlo cross-validation iterations. For ease of readability, only the curves of the best- and worst-performing methods (Tangle and 1-LR 1-BOW, respectively) are shown. The shaded area corresponds to ±3σ, where σ is the standard deviation.
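Averaging ROC curves across cross-validation iterations is commonly done by interpolating each iteration's TPR onto a shared FPR grid and then taking the pointwise mean and standard deviation. The sketch below uses simulated labels and scores, not the paper's data, and a hand-rolled `roc_curve` helper to stay self-contained.

```python
import numpy as np

def roc_curve(y_true, scores):
    """FPR/TPR sweeping the decision threshold from high to low scores."""
    order = np.argsort(-scores)
    y = y_true[order]
    tps = np.cumsum(y)          # true positives at each threshold
    fps = np.cumsum(1 - y)      # false positives at each threshold
    return fps / fps[-1], tps / tps[-1]

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 101)            # common FPR grid
tprs = []
for _ in range(10):                       # 10 Monte Carlo iterations
    y = rng.integers(0, 2, 200)
    s = y + rng.normal(0, 0.8, 200)       # informative synthetic scores
    fpr, tpr = roc_curve(y, s)
    tprs.append(np.interp(grid, fpr, tpr))

mean_tpr = np.mean(tprs, axis=0)
std_tpr = np.std(tprs, axis=0)            # the shaded band is mean ± 3*std
```

Plotting `mean_tpr` against `grid` with a band of ±3 × `std_tpr` reproduces the style of Fig 4.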
Fig 5. t-SNE embedding.
3D scatter plot of 500 randomly extracted samples projected onto a low-dimensional embedding, estimated by t-SNE [30] from the sample representation learned by Tangle. The two classes, shown as green circles and red triangles, appear as slightly overlapping clusters.
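A projection of this kind can be reproduced with scikit-learn's `TSNE`. The learned sample representations below are simulated as two noisy Gaussian clusters standing in for Tangle's internal features; the dimensions and sample count are arbitrary.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Simulate 100 "learned" 32-dimensional sample representations drawn from
# two slightly overlapping classes (stand-ins for Tangle's features).
n_per_class, d = 50, 32
class0 = rng.normal(0.0, 1.0, (n_per_class, d))
class1 = rng.normal(1.0, 1.0, (n_per_class, d))
X = np.vstack([class0, class1])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# Project to 3D for scatter-plotting, as in Fig 5.
X3d = TSNE(n_components=3, random_state=0, perplexity=30).fit_transform(X)
```

Coloring the 3D points by `labels` then shows the two classes as overlapping clusters, mirroring the figure.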
Fig 6. Attention contribution.
Manhattan plot of the attention contribution ω estimated by Tangle on the test set. The model correctly focuses its attention on the most recent claims, which have nonzero contributions. The plot also shows the different representations learned for the two classes.

References

    1. Australian Government—Australian Institute of Health and Welfare. Diabetes snapshot; 2018. https://www.aihw.gov.au/reports/diabetes/diabetes-compendium/contents/de....
    2. Diabetes Australia. Living with diabetes. https://www.diabetesaustralia.com.au/managing-type-2.
    3. Gottlieb A, Yanover C, Cahan A, Goldschmidt Y. Estimating the effects of second-line therapy for type 2 diabetes mellitus: retrospective cohort study. BMJ Open Diabetes Research and Care. 2017;5(1):e000435. doi: 10.1136/bmjdrc-2017-000435.
    4. Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine learning and data mining methods in diabetes research. Computational and Structural Biotechnology Journal. 2017;15:104–116. doi: 10.1016/j.csbj.2016.12.005.
    5. Xing Z, Pei J, Keogh E. A brief survey on sequence classification. ACM SIGKDD Explorations Newsletter. 2010;12(1):40–48. doi: 10.1145/1882471.1882478.
