International Journal of Information Management Data Insights (Nov 2022)

Extraction and classification of risk-related sentences from securities reports

  • Motomasa Fujii,
  • Hiroki Sakaji,
  • Shigeru Masuyama,
  • Hajime Sasaki

Journal volume & issue
Vol. 2, no. 2
p. 100096

Abstract

With the drastically changing business environment, it is difficult even for experts to properly extract and classify risk statements from securities reports, which contain large volumes of unstructured information. Several methods have been proposed, but they have difficulty handling flexible risk expressions. This study presents an open-domain risk-analysis framework that combines the strengths of both humans and machines. The framework includes defining appropriate business risks and constructing supervised data based on those definitions. Risks were then extracted and classified from the securities reports of a representative group of Japanese companies. We confirmed the limitations of pattern matching and the usefulness of contextual analysis methods. We also confirmed the importance of constructing supervised data based on appropriate data-classification guidelines. This study thus presents a framework that quickly and effectively derives the risk structure of a given industry or company from vast and unstructured information.

Keywords