Mixed-membership models of scientific publications
- PMID: 15020766
- PMCID: PMC387299
- DOI: 10.1073/pnas.0307760101
Abstract
PNAS is one of the world's most cited multidisciplinary scientific journals. The PNAS official classification structure of subjects is reflected in topic labels submitted by the authors of articles, largely related to traditionally established disciplines. These include broad field classifications into physical sciences, biological sciences, and social sciences, and further subtopic classifications within the fields. Focusing on biological sciences, we explore an internal soft-classification structure of articles based only on semantic decompositions of abstracts and bibliographies and compare it with the formal discipline classifications. Our model assumes that there is a fixed number of internal categories, each characterized by multinomial distributions over words (in abstracts) and references (in bibliographies). Soft classification for each article is based on proportions of the article's content coming from each category. We discuss the appropriateness of the model for the PNAS database as well as other features of the data relevant to soft classification.
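As a rough illustration of the generative idea in the abstract, the sketch below draws one synthetic article from a fixed set of categories, each with its own multinomial distribution over words and over references. It is a minimal sketch under stated assumptions, not the authors' implementation: the Dirichlet prior on the membership proportions, the toy distributions, and the names `sample_dirichlet` and `generate_article` are all hypothetical and do not come from the abstract.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet via normalized gamma draws.

    The Dirichlet prior is an assumption of this sketch; the abstract only
    says that each article has proportions over categories.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_article(word_dists, ref_dists, alpha, n_words, n_refs, rng):
    """Generate one article: membership proportions, word ids, reference ids.

    word_dists[k] and ref_dists[k] are the multinomial distributions of
    category k over the word vocabulary and over the pool of references.
    """
    n_categories = len(word_dists)
    membership = sample_dirichlet(alpha, rng)  # soft classification of the article

    def draw(dists, n_items):
        items = []
        for _ in range(n_items):
            # Pick a category according to the article's proportions,
            # then pick an item from that category's multinomial.
            cat = rng.choices(range(n_categories), weights=membership)[0]
            item = rng.choices(range(len(dists[cat])), weights=dists[cat])[0]
            items.append(item)
        return items

    return membership, draw(word_dists, n_words), draw(ref_dists, n_refs)


# Toy setup (hypothetical numbers): 2 categories, 4 words, 3 references.
word_dists = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
ref_dists = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
rng = random.Random(0)
membership, words, refs = generate_article(
    word_dists, ref_dists, alpha=[0.5, 0.5], n_words=20, n_refs=5, rng=rng
)
```

Here `membership` plays the role of the article's soft classification: an article with proportions near (0.5, 0.5) mixes vocabulary and citations from both categories, while one near (1, 0) looks like a single-category article.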