The myth of reproducibility: A review of event tracking evaluations on Twitter

Nicholas Mamo¹, Joel Azzopardi¹, Colin Layfield²

Affiliations

¹ Department of Artificial Intelligence, Faculty of Information and Communication Technology, University of Malta, Msida, Malta.
² Department of Computer Information Systems, Faculty of Information and Communication Technology, University of Malta, Msida, Malta.

PMID: 37091456
PMCID: PMC10113524
DOI: 10.3389/fdata.2023.1067335

Review

Nicholas Mamo et al. Front Big Data. 2023 Apr 5:6:1067335. doi: 10.3389/fdata.2023.1067335. eCollection 2023.

Abstract

Event tracking literature based on Twitter does not have a state-of-the-art. What it does have is a plethora of manual evaluation methodologies and inventive automatic alternatives: incomparable and irreproducible studies incongruous with the idea of a state-of-the-art. Many researchers blame Twitter's data sharing policy for the lack of common datasets and a universal ground truth, and hence for the lack of reproducibility, but many other issues stem from the conscious decisions of those same researchers. In this paper, we present the most comprehensive review yet on event tracking literature's evaluations on Twitter. We explore the challenges of manual experiments, the insufficiencies of automatic analyses and the misguided notions on reproducibility. Crucially, we discredit the widely-held belief that reusing tweet datasets could induce reproducibility. We reveal how tweet datasets self-sanitize over time; how spam and noise become unavailable at much higher rates than legitimate content, rendering downloaded datasets incomparable with the original. Nevertheless, we argue that Twitter's policy can be a hindrance without being an insurmountable barrier, and propose how the research community can make its evaluations more reproducible. A state-of-the-art remains attainable for event tracking research.

Keywords: Topic Detection and Tracking; Twitter; evaluation methodologies; event modeling and mining; event tracking; reproducibility.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
