
Stock market forecasting using news headlines

Data set : Reuters and NASDAQ headlines, downloaded from https://github.com/philipperemy/Reuters-full-data-set. NASDAQ prices are under ./data/.

Overview : Preprocess headlines using gensim and nltk. Preprocessing involves removing stop words, stemming, and lemmatizing.

Create a bag of words from the headlines, each headline being a document, across all dates.
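
To show what the bag-of-words step produces, here is a standard-library sketch. The helper names `build_vocabulary` and `to_bow` and the two toy documents are made up for illustration. In gensim, `Dictionary` and its `doc2bow` method produce the same kind of (token id, count) pairs.

```python
from collections import Counter


def build_vocabulary(docs):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for doc in docs:
        for tok in doc:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_bow(doc, vocab):
    """Return a sorted list of (token_id, count) pairs for one document."""
    counts = Counter(vocab[tok] for tok in doc if tok in vocab)
    return sorted(counts.items())


# Each preprocessed headline is one document (toy data).
docs = [["stock", "rise", "stock"], ["oil", "price", "rise"]]
vocab = build_vocabulary(docs)
corpus = [to_bow(doc, vocab) for doc in docs]
print(vocab)
print(corpus)
```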

Compute TF-IDF weights from the bag-of-words data.
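
As a rough illustration of the weighting, the sketch below uses one common TF-IDF variant: raw term count times the natural log of (number of documents / document frequency). The function name `tfidf` and the toy corpus are assumptions, and gensim's `TfidfModel` uses its own defaults (including normalization), so its numbers will differ.

```python
import math


def tfidf(corpus):
    """Weight each (token_id, count) pair by count * ln(n_docs / doc_freq)."""
    n_docs = len(corpus)
    # Document frequency: in how many documents each token id appears.
    doc_freq = {}
    for doc in corpus:
        for token_id, _ in doc:
            doc_freq[token_id] = doc_freq.get(token_id, 0) + 1
    return [
        [(tid, count * math.log(n_docs / doc_freq[tid])) for tid, count in doc]
        for doc in corpus
    ]


# Toy bag-of-words corpus: token 1 appears in both documents.
corpus = [[(0, 2), (1, 1)], [(1, 1), (2, 1), (3, 1)]]
weights = tfidf(corpus)
print(weights)
```

A token that occurs in every document gets weight 0, which is why very common words contribute little to the topics.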

Train LDA on the TF-IDF-weighted corpus.

Instructions:

Repository contents :

data
NLP_Final_Report.pdf
README.md
cleandata.py
data.csv
ldavis15.html
ldavis20.html
topic_modeling.ipynb
