Nicotine Tob Res. 2019 Jan 4;21(2):205-211.
doi: 10.1093/ntr/nty014.

Inferring Smoking Status from User Generated Content in an Online Cessation Community

Michael S Amato et al. Nicotine Tob Res.

Abstract

Introduction: User generated content (UGC) is a valuable but underutilized source of information about individuals who participate in online cessation interventions. This study represents a first effort to passively detect smoking status among members of an online cessation program using UGC.

Methods: Secondary data analysis was performed on data from 826 participants in a web-based smoking cessation randomized trial that included an online community. Domain experts from the online community reviewed each post and comment written by participants and attempted to infer the author's smoking status at the time it was written. Inferences from UGC were validated by comparison with self-reported 30-day point prevalence abstinence (PPA). Following validation, the impact of this method was evaluated across all individuals and time points in the study period.

Results: Of the 826 participants in the analytic sample, 719 had written at least one post from which content inference was possible. Among participants for whom unambiguous smoking status was inferred during the 30 days preceding their 3-month follow-up survey, concordance with self-report was almost perfect (kappa = 0.94). Posts indicating abstinence tended to be written shortly after enrollment (median = 14 days).
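
The concordance statistic reported above can be illustrated with a short sketch. The abstract says only "kappa", so treating it as Cohen's kappa for two label sources (content inference and self-report) is an assumption here. The function name `cohens_kappa` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(source_a, source_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(source_a)

    # Observed agreement: share of participants given the same label by both sources.
    p_observed = sum(a == b for a, b in zip(source_a, source_b)) / n

    # Chance agreement: computed from each source's marginal label frequencies.
    freq_a = Counter(source_a)
    freq_b = Counter(source_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_expected == 1.0:
        # Degenerate case: both sources use a single identical label throughout.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels (NOT study data): one disagreement among eight participants.
inferred = ["abstinent"] * 4 + ["smoking"] * 4
self_report = ["abstinent"] * 3 + ["smoking"] * 5

kappa = cohens_kappa(inferred, self_report)
print(f"kappa = {kappa:.2f}")  # kappa = 0.75
```

In this toy case the two sources agree on 7 of 8 participants (0.875), chance agreement is 0.5, and kappa is therefore 0.75. The value reported in the study, 0.94, indicates much closer agreement than this illustration.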

Conclusions: Passive inference of smoking status from UGC in online cessation communities is possible and highly reliable for smokers who actively produce content. These results lay the groundwork for further development of observational research tools and intervention innovations.

Implications: A proof-of-concept methodology for inferring smoking status from user generated content in online cessation communities is presented and validated. Content inference of smoking status makes a key cessation variable available for use in observational designs. This method provides a powerful tool for researchers interested in online cessation interventions and establishes a foundation for larger scale application via machine learning.

Figures

Figure 1. Content-inferred smoking status, for participants who self-reported as ABSTINENT at 3-month follow-up.
Figure 2. Content-inferred smoking status, for participants who self-reported as SMOKING at 3-month follow-up.
