Entropy (Basel). 2020 Apr 14;22(4):441. doi: 10.3390/e22040441.

Entropy-Based Approach for the Detection of Changes in Arabic Newspapers' Content

Olga Bernikova et al. Entropy (Basel). 2020.

Abstract

A new method is proposed for recognizing meaningful changes in the social state from transformations of the linguistic content of Arabic newspapers, where detected alterations of the linguistic material serve as an indicator. The approach operates in an "online" fashion and uses pre-trained vector representations of Arabic words. After a pre-processing stage, the words in each issue's text are replaced by vectors obtained with a word embedding methodology. A consistent linguistic template is characterized by the similarity of the embedded vectors, so a change in the distributions of the issue-based samples indicates a difference in the underlying newspaper template. The concept is implemented as a two-step procedure. In the first step, the similarity distribution of the current issue is compared with the union of the distributions of several of its predecessors; a repeated under-sampling approach combined with a two-sample test stabilizes the sampling and returns a collection of p-values. In the second step, the entropy of these sets is calculated sequentially, and the change points of the resulting time series indicate changes in the newspaper content. Numerical experiments on consecutive issues of several Arabic newspapers published during the Arab Spring period demonstrate the high reliability of the method.
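The two-step procedure can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the two-sample test or the entropy estimator, so this sketch assumes a Kolmogorov-Smirnov test with an asymptotic p-value and a histogram-based Shannon entropy. The function names, the window of three predecessors, the sample sizes, and the synthetic Gaussian "similarity" values are all hypothetical. Change-point detection on the resulting series is left out.

```python
import math
import random


def ks_pvalue(x, y):
    """Two-sample Kolmogorov-Smirnov test with an asymptotic p-value.

    The choice of the KS test is an assumption of this sketch.
    """
    x, y = sorted(x), sorted(y)
    n, m = len(x), len(y)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the empirical CDFs.
    while i < n and j < m:
        v = min(x[i], y[j])
        while i < n and x[i] <= v:
            i += 1
        while j < m and y[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the distributions are practically indistinguishable
    # Kolmogorov distribution tail, truncated alternating series.
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def pvalue_entropy(pvalues, bins=10):
    """Shannon entropy (in nats) of the p-values binned over [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        counts[min(int(p * bins), bins - 1)] += 1
    total = len(pvalues)
    return sum(-(c / total) * math.log(c / total) for c in counts if c)


def issue_entropy(current, predecessors, rng, rounds=50, size=100):
    """Step 1: repeated under-sampling plus a two-sample test; step 2: entropy."""
    pool = [s for issue in predecessors for s in issue]  # union of predecessors
    k = min(size, len(current), len(pool))
    pvals = [ks_pvalue(rng.sample(current, k), rng.sample(pool, k)) for _ in range(rounds)]
    return pvalue_entropy(pvals)


# Synthetic stand-in for per-issue similarity samples (hypothetical data):
# four issues follow one template, the last two follow a shifted one.
rng = random.Random(0)
issues = [[rng.gauss(0.5, 0.1) for _ in range(300)] for _ in range(4)]
issues += [[rng.gauss(0.7, 0.1) for _ in range(300)] for _ in range(2)]

window = 3  # number of predecessors merged into the reference pool (assumed)
series = [issue_entropy(issues[t], issues[t - window:t], rng) for t in range(window, len(issues))]
print(series)
```

In this sketch, an issue that matches its predecessors yields p-values spread across [0, 1] and therefore a comparatively high entropy. An issue drawn from a shifted distribution yields p-values concentrated near zero, and the entropy collapses. A drop in the series therefore marks a candidate change in content.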

Keywords: anomaly detection; publishing model modeling; word embedding.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. The overall entropy graph of the “Al-Ahraam” newspaper in the first time frame.
Figure 2. Examples of the similarity distributions.
Figure 3. The overall entropy graph of the “Al-Ahraam” newspaper in the second time frame.
Figure 4. The overall entropy graph of the “Akhbaar Al-Khaleej” newspaper.
Figure 5. The overall entropy graph of the “Al-Ghad” newspaper.
