This repository contains the submission for the CS779 Machine Translation Competition (English → Hindi / Bengali). It takes an engineering-focused approach to Neural Machine Translation, emphasizing iterative experimentation, empirical validation, and robustness to noisy real-world multilingual corpora. The work begins with GRU/LSTM Seq2Seq baselines and culminates in a 4-layer Transformer encoder–decoder architecture enhanced with pretrained embeddings.
| Metric | Score |
|---|---|
| BLEU | 0.175 |
| chrF++ | 0.417 |
| ROUGE | 0.44 |
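
To make the 0 to 1 scale of the BLEU score above easier to interpret, here is a minimal pure-Python sketch of corpus-level BLEU. It is an illustration only, not the competition's scoring script: the function names `ngram_counts` and `corpus_bleu` are hypothetical, and the sketch assumes whitespace tokenization, a single reference per sentence, and no smoothing, any of which may differ from the official evaluation.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-1 scale, one reference per hypothesis.

    Assumptions (not taken from the competition): whitespace
    tokenization and no smoothing.
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # hypothesis n-grams, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp_tok, n)
            ref_ngrams = ngram_counts(ref_tok, n)
            # Counter intersection clips each count by the reference count.
            matched[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            total[n - 1] += sum(hyp_ngrams.values())

    # Without smoothing, any order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

A call such as `corpus_bleu(model_outputs, reference_translations)` takes two equal-length lists of strings; both list names are placeholders. An established scorer is preferable for reported numbers, since tokenization and smoothing choices change the result noticeably.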
word2vec-project/
├── data/
│ └── dataset.txt # Training corpus (if applicable)
├── notebooks/
│ └── dp-word2vec-t4_final.ipynb # Main Word2Vec implementation notebook
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore file
└── README.md # This file