PLoS One. 2013 Jun 6;8(6):e64846.
doi: 10.1371/journal.pone.0064846. Print 2013.

High quality topic extraction from business news explains abnormal financial market volatility

Ryohei Hisano et al. PLoS One. 2013.

Abstract

Understanding the mutual relationships between information flows and social activity in society today is one of the cornerstones of the social sciences. In financial economics, the key issue in this regard is understanding and quantifying how news of all possible types (geopolitical, environmental, social, financial, economic, etc.) affects trading and the pricing of firms in organized stock markets. In this article, we seek to address this issue by performing an analysis of more than 24 million news records provided by Thomson Reuters and of their relationship with trading activity for 206 major stocks in the S&P US stock index. We show that the whole landscape of news that affects stock price movements can be automatically summarized via simple regularized regressions between trading activity and news information pieces decomposed, with the help of simple topic modeling techniques, into their "thematic" features. Using these methods, we are able to estimate and quantify the impacts of news on trading. We introduce network-based visualization techniques to represent the whole landscape of news information associated with a basket of stocks. The examination of the words that are representative of the topic distributions confirms that our method is able to extract the significant pieces of information influencing the stock market. Our results show that one of the most puzzling stylized facts in financial economies, namely that at certain times trading volumes appear to be "abnormally large," can be partially explained by the flow of news. In this sense, our results prove that there is no "excess trading" when restricting to times when news is genuinely novel and provides relevant financial information.
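The regularized regression of trading activity on topic-level news volume can be illustrated with a small sketch. The abstract does not specify the constraint or the solver, so this example assumes a LASSO with non-negative weights fitted by coordinate descent. The function name `nonneg_lasso`, the penalty `lam`, and the toy data are all hypothetical, and the per-topic news volumes are taken as already computed by a topic model.

```python
def nonneg_lasso(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * sum(w) subject to w >= 0.

    X: list of rows, one per day; each row holds the news volume per topic.
    y: list of de-trended trading volumes, one per day.
    The non-negativity constraint is an assumption of this sketch.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    # Mean squared column values, used as the coordinate-wise curvature.
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # a topic with no news volume keeps weight 0
            # Correlation of topic j with the residual, excluding its own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            # Soft-threshold, then clip at zero to enforce w_j >= 0.
            new_w = max(0.0, (rho - lam) / z[j])
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Toy data: three topics, six days, one active topic per day.
topic_volume = [[1, 0, 0], [0, 1, 0], [0, 0, 1],
                [1, 0, 0], [0, 1, 0], [0, 0, 1]]
trading_volume = [2.0, 0.05, 0.5, 2.0, 0.05, 0.5]
weights = nonneg_lasso(topic_volume, trading_volume, lam=0.05)
```

In this toy setting the second topic barely co-moves with trading volume, so the penalty drives its weight to zero while the other two topics keep positive, slightly shrunk weights. This is the sense in which a sparse regression selects a small number of relevant topics.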

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Comparison between the time evolution of trading volume and aggregate news volume for Toyota.
Black continuous line plots trading volume and red dashed line plots aggregate news volume. The inset plots the trading volume as a function of the concomitant news volume.
Figure 2. Flowchart summarizing the procedure followed in our analyses.
The number in parentheses indexes the step. Step (1) selects the news records associated with a given term, here the name of a company, such as Toyota. Step (2-a) applies the Latent Dirichlet Allocation (LDA) that decomposes any document as a mixture of different topics. Step (3-a) implements a constrained LASSO regression. The percentage shown in step (3-b) denotes the estimated impact of each topic. The percentage shown in step (4) is the “fraction of (trading volume) peaks explained” (FPE) by news, which is our metric to assess the quality of our methodology (see text).
Figure 3. Selected topics learned by LDA for Toyota.
Selected topics learned by LDA and the associated news volume estimated using equation (1) for the term “Toyota.” The top three words for these topics were: (A) Toyota, recall, safety; (B) financial, crisis, economy; (C) Japan, production, earthquake; (D) team, F1, race.
Figure 4. Pictorial illustration of “peak days” of normalized trading volume.
The black line shows the de-trended trading volume of Toyota stock for the period from January 2003 to June 2011. The red dots indicate the “peak days” selected by the method described in the text. There are 119 “peak days” for the entire period from January 2003 to June 2011.
Figure 5. Comparison between estimated and actual trading volume.
Estimated (red dashed line) and actual (black continuous line) trading volume for the four companies: (A) Toyota, (B) Yahoo, (C) Best Buy, and (D) BP. The number K of sufficient selected topics is 9 for Toyota, 4 for Yahoo, 3 for Best Buy, and 5 for BP.
Figure 6. Result of stress testing.
(A) Comparison between the estimated and actual trading volume when using topics from BP to explain Yahoo trading volume. (B) Comparison when using topics from Yahoo to explain Best Buy trading volume. Notice the much reduced quality of the regressions compared with those presented in Fig. 5, illustrated by their FPEs, which are exactly 0 in both cases.
Figure 7. Relationship between FPE and number of news records.
The “fraction of peaks explained” (FPE) as a function of the number of news records for the 206 stocks in the S&P 500 for which there were more than 5,000 news records during the period from January 2003 to June 2011. Black diamonds show the FPE values obtained using the 715 topics extracted by our procedure. Blue circles show the FPE values when the number of topic distributions is restricted to 637 after manual reading. The data point for Toyota, which as a foreign company is not a component of the S&P 500, has been added and is shown as the red triangle and circle.
Figure 8. Network extracted for Microsoft and Yahoo.
Nodes are topics and links between two topics quantify the degree of similarity associated with their word distributions.
Figure 9. Network of topics extracted for the 206 US companies.
The links between two topics quantify the degree of similarity associated with their word distributions, as explained in the text. The six red arrows depict the zones that are magnified in Fig. 10.
Figure 10. Magnifications of Figure 9.
Six magnifications of the “islands” indicated by the arrows in the network of topics shown in Fig. 9, with links between two topics quantifying the degree of similarity associated with their word distributions. Each node is accompanied by the name of the company and its top three most frequent words, as quantified by the topic distribution. The size of a node is set to be proportional to the “fraction of volume explained” (FVE) by that topic and the thickness of a link is equal to 1 minus the JSD metric for the two linked topics. Panel (a) shows the network associated with retail sales of clothing companies; panel (b) that associated with drugs and patents; panel (c) that associated with products in the telecommunications business; panel (d) that associated with tobacco lawsuits; panel (e) that associated with the national defense budget; panel (f) that associated with the potential Comcast-Disney merger in 2004.
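The link thickness described for the topic networks, 1 minus the JSD metric between two topic word distributions, can be sketched as follows. The captions do not state which form of JSD is used, so this example assumes the Jensen-Shannon divergence with base-2 logarithms, which is bounded between 0 and 1. The function names and the toy distributions are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions.

    p and q are probability lists over the same vocabulary, in the same order.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def link_weight(p, q):
    """Link thickness between two topics: 1 minus the JSD (assumed form)."""
    return 1.0 - jsd(p, q)


# Toy word distributions over a four-word vocabulary.
topic_a = [0.5, 0.5, 0.0, 0.0]
topic_b = [0.0, 0.0, 0.5, 0.5]

same = link_weight(topic_a, topic_a)      # identical topics
disjoint = link_weight(topic_a, topic_b)  # no shared words
```

Identical distributions give the thickest possible link (weight 1), and topics with no words in common give weight 0, so similar topics from different companies cluster into the "islands" shown in the panels.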
