Biosci Microbiota Food Health. 2014;33(2):65-78.
doi: 10.12938/bmfh.33.65. Epub 2014 Apr 29.

Applying Data Mining to Classify Age by Intestinal Microbiota in 92 Healthy Men Using a Combination of Several Restriction Enzymes for T-RFLP Experiments

Toshio Kobayashi et al. Biosci Microbiota Food Health. 2014.

Abstract

The composition of the intestinal microbiota was measured in 92 Japanese men after they had consumed identical meals for 3 days, and terminal restriction fragment length polymorphism (T-RFLP) was used to analyze their feces. The obtained operational taxonomic units (OTUs) and the subjects' ages were classified using data mining (DM) software, both with age treated as continuous data and with 5 age partitions divided at 5-year intervals between the ages of 30 and 50. The DM provided decision trees in which the selected OTUs were closely related to the ages of the subjects. DM was also used to compare the OTUs from the T-RFLP data obtained with seven restriction enzymes (two enzymes, 516f-BslI and 516f-HaeIII; two enzymes, 27f-MspI and 27f-AluI; and three enzymes, 35f-HhaI, 35f-MspI and 35f-AluI) and their various combinations. The OTUs obtained from the five enzyme-digested partitions were analyzed to classify their age clusters. For use in future DM processing, we discussed which enzymes were effective for accurate classification. We selected two OTUs (HA624 and HA995) that were useful for classifying the subjects' ages. Based on the 16S rRNA sequences of the OTUs, Ruminococcus obeum clones 1-4 were present in 18 of 36 bacterial candidates in the older age group-related OTU (HA624). On the other hand, Ruminococcus obeum clones 1-33 were present in 65 of 269 candidates in the younger age group-related OTU (HA995).

Keywords: human intestinal microbiota; Ruminococcus obeum; classification of age; data mining analysis; decision tree; operational taxonomic unit; terminal restriction fragment length polymorphism.

Figures

Fig. 1.
Cumulative frequency of the ages of the 92 subjects. Each dot represents a subject.
Fig. 2.
Dt obtained by DM with unpartitioned age. Node-0: starting point of Dt construction. Node-1 to Node-14: subject groups divided by DM processing and classification. n: number of subjects. Average: average age at each node. σ: standard deviation of age at each node. Node-0 was divided into Node-1 and Node-2 by HA624 at 9.32 by optimizing the Gini coefficient, and similar steps were repeated to construct the Dt.
Fig. 3.
Dt obtained by DM with 2-NP partitioned at 40/41. The 92 subjects were divided into Node-1 and Node-2 by HA323, with a Gini coefficient cutoff value of 2.86. Subsequent divisions were made in an analogous way. The number of subjects at each node is shown. The arrow at Node-19 indicates that a subject was falsely classified; this was subject #21, who was 55 years of age. His OTUs corresponded to those of someone younger than 40. There are 11 terminal nodes: four nodes for the younger group, which are shaded (age range 21-40), and seven nodes for the older group, which are not shaded. Four of these nodes contain only one subject.
Fig. 4.
Scatter diagram of major OTUs for age. Data features for the 4 major OTUs related to age, which are shown at the first Dt step in Table 3.
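
The captions for Figs. 2 and 3 describe nodes being divided at a cutoff value of a single OTU chosen by optimizing the Gini coefficient. The DM software's exact procedure is not given here, so the following is only a minimal sketch of one common way such a cutoff can be chosen, assuming a CART-style weighted Gini impurity and a two-class age label (for example, 40 or younger versus 41 or older). The function names and the OTU values are hypothetical and are not taken from the study.

```python
from typing import List, Optional, Tuple


def gini_impurity(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels (0 = younger, 1 = older)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_older = sum(labels) / n
    p_younger = 1.0 - p_older
    return 1.0 - p_older * p_older - p_younger * p_younger


def best_threshold(
    values: List[float], labels: List[int]
) -> Tuple[Optional[float], float]:
    """Return (cutoff, weighted impurity) for the best single split.

    Candidate cutoffs are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, the cutoff is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut: Optional[float] = None
    best_score = gini_impurity(labels)  # impurity of the unsplit node
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a cutoff
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / n
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


# Hypothetical OTU values for six subjects, invented for illustration only.
otu_values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
age_labels = [0, 0, 0, 1, 1, 1]  # 0 = 40 or younger, 1 = 41 or older
cutoff, impurity = best_threshold(otu_values, age_labels)
print(cutoff, impurity)
```

Applying the same search to each resulting subgroup in turn would give a tree of the kind shown in the figures, although a real DM package is likely to add stopping rules and pruning that this sketch omits.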
