DeekshaD/topic_modeling_stock_data

Stock market forecasting using news headlines

Data set: Reuters headlines and NASDAQ prices. Headlines were downloaded from https://github.com/philipperemy/Reuters-full-data-set; NASDAQ prices are under ./data/.

Overview: Preprocess the headlines using gensim and nltk. Preprocessing involves removing stop words, stemming, and lemmatizing.
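As a rough illustration of the stop-word step only, here is a minimal sketch using just the Python standard library. The `STOP_WORDS` set and the `preprocess` helper are hypothetical stand-ins, not the notebook's code; the notebook relies on gensim and nltk, which also provide the stemming and lemmatizing that this sketch leaves out.

```python
import re

# Tiny illustrative stop list (an assumption); gensim and nltk ship much
# larger lists that the notebook would use instead.
STOP_WORDS = {"a", "an", "and", "as", "at", "for", "in", "of", "on", "the", "to"}


def preprocess(headline):
    """Lowercase, keep alphabetic tokens, drop stop words and very short tokens."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


tokens = preprocess("Shares of Apple rise as the iPhone sales beat forecasts")
print(tokens)
```

Stemming and lemmatizing would then be applied to each surviving token before the bag of words is built.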

Create a bag of words from the headlines across all dates, treating each headline as a document.
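To make the bag-of-words representation concrete, the sketch below builds one by hand with the standard library. The toy `docs` and the names `vocab` and `to_bow` are assumptions for illustration; the notebook would more likely use gensim's dictionary utilities, which produce the same kind of `(token_id, count)` pairs.

```python
from collections import Counter

# Toy, already-preprocessed headlines (hypothetical data); one list per document.
docs = [
    ["oil", "prices", "rise"],
    ["tech", "stocks", "rise", "rise"],
]

# Give every distinct token an integer id; sorting keeps the ids deterministic.
vocab = {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}


def to_bow(doc):
    """Return the document as a sorted list of (token_id, count) pairs."""
    counts = Counter(vocab[t] for t in doc)
    return sorted(counts.items())


corpus = [to_bow(d) for d in docs]
print(vocab)
print(corpus)
```

Word order is discarded; only how often each token occurs in a headline is kept.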

Compute TF-IDF weights from the bag-of-words data.
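The sketch below shows one common TF-IDF variant, term frequency times `log(N / df)`, on hypothetical toy data. It is an assumption that this matches the notebook's weighting: gensim's implementation uses its own defaults (for example a base-2 logarithm and vector normalization), so the exact numbers will differ.

```python
import math
from collections import Counter

# Toy, already-preprocessed headlines (hypothetical data).
docs = [
    ["oil", "prices", "rise"],
    ["tech", "stocks", "rise"],
    ["oil", "demand", "falls"],
]
n_docs = len(docs)

# Document frequency: the number of documents each token appears in.
df = Counter(tok for doc in docs for tok in set(doc))


def tfidf(doc):
    """Weight each token by (count / doc length) * log(N / document frequency)."""
    tf = Counter(doc)
    return {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}


weights = [tfidf(d) for d in docs]
print(weights[0])
```

Tokens that occur in many headlines (here "rise" and "oil") receive lower weights than tokens confined to a single headline (here "prices").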

Train LDA on the TF-IDF representation.

Instructions:

  1. Change the path in cell 2 to match your local setup
  2. Run all cells sequentially
