This repository contains my implementation of a GPT-style Transformer trained for causal language modeling on a public-domain text corpus ("Twenty Thousand Leagues Under the Sea"). The goal was to understand the inner workings of the Transformer architecture, sampling strategies, and inference optimizations such as KV caching.
- Custom GPT-style Transformer built in PyTorch
- Causal self-attention with rotary positional embeddings (RoPE); a minimal sketch follows this list
- Support for:
  - Temperature scaling
  - Top-k sampling (both sampling knobs are illustrated in a sketch below)
  - Key-Value (KV) caching for faster autoregressive generation (see the caching sketch below)
- Training on GPU using Slurm-compatible scripts
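The exact attention code in `my_gpt.py` is not reproduced in this README, but the RoPE idea looks roughly like the following minimal sketch. The tensor layout `(batch, seq, heads, head_dim)` and the function name `apply_rope` are assumptions for illustration, not the repo's actual API:

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary positional embeddings to a query or key tensor.

    x: (batch, seq_len, n_heads, head_dim); head_dim must be even.
    Each consecutive channel pair is rotated by a position-dependent angle.
    """
    batch, seq_len, n_heads, head_dim = x.shape
    # Per-pair rotation frequencies: theta_i = base^(-2i / head_dim)
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)      # (seq_len, head_dim / 2)
    cos = angles.cos()[None, :, None, :]           # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]            # even/odd channel pairs
    # 2-D rotation of each (x1, x2) pair by its position's angle
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)                     # interleave pairs back to head_dim
```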
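How `generate.py` combines the two sampling knobs internally is not shown here; this is a hedged sketch of the standard recipe (scale logits by temperature, keep the k largest, sample from the renormalized distribution). `sample_next_token` is an illustrative name, not a function from the repo:

```python
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 1.0, top_k: int = 40) -> int:
    """Sample a token id from a (vocab_size,) logits vector.

    temperature < 1 sharpens the distribution (more deterministic output);
    temperature > 1 flattens it (more diverse output). Top-k keeps only
    the k highest-scoring tokens before sampling.
    """
    # Temperature scaling: divide logits before the softmax
    logits = logits / max(temperature, 1e-8)
    # Top-k filtering: mask out everything below the k-th largest logit
    if top_k is not None and top_k < logits.size(-1):
        kth = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```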
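Likewise, here is a minimal single-head sketch of what `--use_kv_cache` speeds up: the cached key/value tensors grow by one step per generated token, so projections for past tokens are never recomputed. The class and module names are hypothetical, not the repo's:

```python
import torch
import torch.nn.functional as F

class CachedAttentionHead(torch.nn.Module):
    """Single attention head that appends each step's keys/values to a
    cache, so every decoding step attends over all past tokens without
    re-projecting them."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = torch.nn.Linear(d_model, d_model)
        self.k_proj = torch.nn.Linear(d_model, d_model)
        self.v_proj = torch.nn.Linear(d_model, d_model)
        self.cache_k = None
        self.cache_v = None

    def forward(self, x_new: torch.Tensor) -> torch.Tensor:
        # x_new: (batch, new_tokens, d_model); during decoding new_tokens == 1
        q = self.q_proj(x_new)
        k = self.k_proj(x_new)
        v = self.v_proj(x_new)
        # Append this step's K/V to the cache instead of recomputing history
        if self.cache_k is not None:
            k = torch.cat([self.cache_k, k], dim=1)
            v = torch.cat([self.cache_v, v], dim=1)
        self.cache_k, self.cache_v = k, v
        # Causal mask omitted: with one query per step, every cached key
        # is a past token, so all positions are legal to attend to
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        return torch.softmax(scores, dim=-1) @ v
```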
Train the model:

```bash
python my_gpt.py
```
Generate text with different sampling settings:

```bash
# Higher temperature and a wider top-k give more varied output
python generate.py --temperature 1.0 --top_k 40

# Lower temperature and a narrower top-k give more predictable output
python generate.py --temperature 0.1 --top_k 5

# The same settings with KV caching enabled for faster generation
python generate.py --use_kv_cache --temperature 1.0 --top_k 40
python generate.py --use_kv_cache --temperature 0.1 --top_k 5
```
By Zoha Khan