Classification of Genetic Mutations Based on Clinical Text to Predict Their Effects for Personalized Medicine
- Numpy - Version: 1.14.3
- Pandas - Version: 0.22.0
- Seaborn - Version: 0.8.1
- Matplotlib - Version: 2.2.2
- NLTK, Regex, Sklearn, and lightgbm modules
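To confirm that the installed packages match the versions listed above, a small check along these lines may help. This is only a sketch: the helper names `installed_version` and `version_report` are hypothetical, and it assumes each package exposes a `__version__` attribute under its usual import name.

```python
import importlib

# Versions as listed above. The import names are assumed to be the usual ones.
EXPECTED = {
    "numpy": "1.14.3",
    "pandas": "0.22.0",
    "seaborn": "0.8.1",
    "matplotlib": "2.2.2",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_report(expected=EXPECTED):
    """Map each module name to a tuple of (expected version, installed version)."""
    return {name: (want, installed_version(name)) for name, want in expected.items()}
```

Calling `version_report()` returns a dictionary that can be printed and compared by eye. Newer versions may work as well, but that has not been checked here.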
Download the repository and run the Genetic_Variations.py file.
Before running, change the dataset path in the script to match the location on your system.
The dataset is in the repository:
- Training Variants - Contains the description of the genetic mutations for training.
- Training Text - Contains the clinical evidence to classify the mutations.
- Test Variants - Contains the description of the genetic mutations for testing.
- Test Text - Contains the clinical evidence to classify the mutations.
- Sample Submission - Holds the classification results.
Dataset - Personalized Medicine - Redefining Cancer Treatment on Kaggle
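As a rough illustration of how a variants file and its matching text file could be combined, the sketch below joins them on a shared ID column with pandas. The file layout is an assumption, not something stated here: the variants file is assumed to be comma-separated with an `ID` column, and the text file is assumed to have one header line followed by rows of the form `ID||Text`. The function name `load_split`, the `DATA_DIR` variable, and the file names in the usage comment are hypothetical, and this is not necessarily how Genetic_Variations.py loads the data.

```python
from pathlib import Path

import pandas as pd

# Assumption: point this at the folder that holds the dataset on your system.
DATA_DIR = Path("data")


def load_split(variants_path, text_path):
    """Load one split (variants plus clinical text) and join the two on ID."""
    # Assumed layout: a comma-separated file with a header row that includes ID.
    variants = pd.read_csv(variants_path)
    # Assumed layout: one header line, then rows of the form ID||Text.
    # A multi-character separator is treated as a regex and needs the Python engine.
    text = pd.read_csv(
        text_path,
        sep=r"\|\|",
        engine="python",
        skiprows=1,
        names=["ID", "Text"],
    )
    # Keep every variant row, even if no text row matches its ID.
    return variants.merge(text, on="ID", how="left")


# Example usage (file names are assumptions):
# train = load_split(DATA_DIR / "training_variants", DATA_DIR / "training_text")
# test = load_split(DATA_DIR / "test_variants", DATA_DIR / "test_text")
```

A left join is used so that a variant with missing clinical text still appears in the result, with an empty `Text` value that can be handled explicitly later.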