NATURAL LANGUAGE PROCESSING ON FINANCIAL STATEMENTS
- Sanjana Rajesh
- Apr 14, 2020
The code retrieves 10-K filings from the SEC website, then cleans and lemmatizes the filing text.
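As a rough illustration of the cleaning step, here is a minimal sketch that strips markup from a filing, lowercases it, tokenizes it, and maps tokens to lemmas. The function name `clean_filing` and the tiny `TOY_LEMMAS` table are hypothetical stand-ins, not taken from the notebook. A real pipeline would likely use a proper lemmatizer and would download the filings first, which is left out here.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # matches HTML/XML tags
WORD_RE = re.compile(r"[a-z]+")   # matches runs of lowercase letters

# Toy lemma table for illustration only (assumption, not the notebook's lemmatizer).
TOY_LEMMAS = {"losses": "loss", "risks": "risk", "declined": "decline"}


def clean_filing(raw, lemmas=TOY_LEMMAS):
    """Strip markup, lowercase, tokenize, and map each token to its lemma."""
    text = TAG_RE.sub(" ", raw)              # drop tags
    text = html.unescape(text)               # decode entities such as &amp;
    tokens = WORD_RE.findall(text.lower())   # keep alphabetic tokens only
    return [lemmas.get(tok, tok) for tok in tokens]


sample = "<p>Net losses declined &amp; litigation risks remain.</p>"
tokens = clean_filing(sample)
print(tokens)
```

Tokens that are missing from the lemma table pass through unchanged, so the sketch degrades gracefully on unseen words.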
Sentiment analysis is then run on the cleaned text using the Loughran and McDonald sentiment word lists. The categories covered are Negative, Positive, Uncertainty, Litigious, Constraining, Superfluous, and Modal.
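To make the word-list approach concrete, here is a minimal sketch that counts how many tokens in a document fall into each sentiment category. The `SENTIMENT_WORDS` dictionary holds a few hand-picked example words per category as a stand-in. It is not the actual Loughran and McDonald lists, and `sentiment_counts` is a hypothetical helper name.

```python
from collections import Counter

# Tiny stand-in word lists for illustration (assumption: the real lists are
# far larger and would be loaded from the Loughran and McDonald dictionary).
SENTIMENT_WORDS = {
    "negative": {"loss", "decline", "impairment"},
    "positive": {"gain", "improve", "strong"},
    "uncertainty": {"may", "uncertain", "approximate"},
    "litigious": {"litigation", "plaintiff", "lawsuit"},
}


def sentiment_counts(tokens, word_lists=SENTIMENT_WORDS):
    """Count, per sentiment, how many tokens appear in that sentiment's list."""
    counts = Counter(tokens)
    return {
        sentiment: sum(counts[word] for word in words)
        for sentiment, words in word_lists.items()
    }


tokens = "the loss may lead to litigation and a further loss".split()
scores = sentiment_counts(tokens)
print(scores)
```

Raw counts like these grow with document length, which is one reason to weight terms (for example with TF-IDF) before comparing filings.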
The code evaluates alpha factors using sentiment TF-IDF vectors generated from the sentiment word lists, and cosine similarity computed from those TF-IDF vectors.
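The sketch below shows one way a sentiment TF-IDF and its cosine similarity could be computed, using only the standard library. It makes several assumptions that do not come from the notebook: the vocabulary is restricted to a single toy sentiment list, term frequency is the raw count, the idf uses a smoothed form `log((1 + N) / (1 + df)) + 1`, and similarity is taken between consecutive filings. The names `tfidf_vectors` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs, vocab):
    """Return one TF-IDF vector per tokenized document, restricted to vocab."""
    vocab = sorted(vocab)                     # fixed column order
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    # Document frequency: number of documents containing each word.
    df = {w: sum(1 for c in counts if c[w] > 0) for w in vocab}
    # Smoothed idf (assumed variant), so unseen words do not divide by zero.
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1 for w in vocab}
    return [[c[w] * idf[w] for w in vocab] for c in counts]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy negative-sentiment list and toy filings (illustrative data only).
negative_words = {"loss", "decline", "impairment"}
filings = [
    "loss decline revenue loss".split(),
    "loss decline revenue growth".split(),
    "impairment charge revenue".split(),
]

vectors = tfidf_vectors(filings, negative_words)
similarities = [
    cosine_similarity(vectors[i], vectors[i + 1])
    for i in range(len(vectors) - 1)
]
print(similarities)
```

In this toy data the first two filings share their negative vocabulary and score close to 1, while the third uses a different negative word and scores 0 against the second. A series of such similarities per company could then serve as a candidate factor.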
Retrieve the Jupyter notebook below.