Transformer Attention • Masked Token Prediction • Financial NLP • Embedding Intelligence • Model Explainability
BERTokenScope is a transformer explainability and NLP intelligence platform built as an enhanced, production-style extension of the Harvard CS50AI Attention project.
The original CS50AI Attention assignment focuses on using BERT to predict masked words and generate static attention diagrams. BERTokenScope builds on that foundation to create a full portfolio-grade system for exploring how transformer models understand language, context, token relationships, and finance-specific text signals.
At its core, BERTokenScope answers one powerful question:
How does a transformer decide what words matter?
Instead of treating BERT like a black box, this project opens the model up.
It helps users inspect:
- which tokens receive the most attention
- how attention changes across layers and heads
- what BERT predicts for masked words
- how financial tone, risk, uncertainty, and executive language appear in text
- how sentence/document embeddings can be compared in semantic space
- how different transformer models behave on the same input
- how removing important tokens changes model outputs
This project combines CS50AI foundations, university-level machine learning and deep learning concepts, and my broader experience building full-stack, AI, cloud, and software engineering projects.
The result is not just a course assignment.
It is an AI interpretability platform.
Modern AI systems are powered by transformers.
Large language models, search systems, summarizers, chatbots, coding assistants, recommendation systems, and AI agents all rely on the same core idea:
Tokens should not be understood alone.
Tokens should be understood in context.
That is where attention comes in.
Attention allows a model to decide which words in a sequence matter most when understanding another word. For example, in a sentence like:
The company reduced guidance because demand weakened.
A transformer might pay strong attention between:
- reduced and guidance
- demand and weakened
- because and the explanation that follows
For financial text, this becomes especially useful.
Earnings calls, investor reports, filings, and analyst transcripts often contain subtle signals. A model may pick up risk language, uncertainty, confidence, caution, or forward-looking sentiment.
BERTokenScope was built to make those signals visible.
BERTokenScope is organized into six major dashboard sections. Each section focuses on a different part of transformer explainability, NLP analysis, or model intelligence.
The Masked Word Lab extends the original CS50AI Attention project's [MASK] prediction workflow into an interactive dashboard experience.
Users enter a sentence containing a masked token, select a Top K value, and run prediction. BERTokenScope then returns the most likely replacement tokens along with probabilities and reconstructed sentence outputs.
This section demonstrates how BERT-style masked language models use surrounding context to infer missing words.
- BERT-style masked-token prediction
- Top-k token probability ranking
- Reconstructed sentences for each predicted token
- Deterministic fallback behaviour for offline portfolio demos
- Clear bridge from CS50AI Attention to real NLP model exploration
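The ranking step can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function name `top_k_predictions` and the candidate logits are hypothetical, and a real BERT-style model scores its whole vocabulary rather than four hand-picked words.

```python
import math


def top_k_predictions(logits: dict[str, float], sentence: str, k: int = 3) -> list[dict]:
    """Rank candidate tokens by softmax probability and rebuild the sentence."""
    if "[MASK]" not in sentence:
        raise ValueError("sentence must contain a [MASK] token")
    # Softmax over the candidate logits (subtract the max for numerical stability).
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    return [
        {
            "token": tok,
            "score": round(p, 4),
            "sentence": sentence.replace("[MASK]", tok, 1),
        }
        for tok, p in ranked
    ]


# Hypothetical logits for illustration only.
example_logits = {"revenue": 2.0, "sales": 1.4, "earnings": 0.9, "weather": -3.0}
results = top_k_predictions(
    example_logits, "The company reported strong [MASK] growth this quarter.", k=3
)
for row in results:
    print(row)
```

Each result carries the token, its probability, and the reconstructed sentence, which mirrors the three outputs described above.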
The Attention Explorer visualizes how tokens attend to each other across transformer layers and heads.
Users can input a sentence, choose a layer and attention head, and inspect token-to-token attention patterns through heatmaps and rollout visualizations. This makes transformer internals easier to understand instead of treating the model like a black box.
- Layer-by-layer attention inspection
- Head-by-head transformer analysis
- Token-to-token attention heatmaps
- Attention rollout visualization
- Strongest token link extraction
- Head diagnostics such as entropy and focus score
This section answers the core question behind BERTokenScope:
What is the model paying attention to?
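Head diagnostics such as entropy can be computed directly from one row of an attention matrix. The sketch below is an assumption about how such diagnostics might look: Shannon entropy is standard, but the `focus_score` definition (1 minus normalised entropy) is chosen here for illustration and may differ from the project's own metric.

```python
import math


def head_entropy(attention_row: list[float]) -> float:
    """Shannon entropy (in nats) of one token's attention distribution."""
    return -sum(w * math.log(w) for w in attention_row if w > 0)


def focus_score(attention_row: list[float]) -> float:
    """Assumed definition: 1 minus entropy divided by its maximum, log(n)."""
    n = len(attention_row)
    if n < 2:
        return 1.0
    return 1.0 - head_entropy(attention_row) / math.log(n)


uniform = [0.25, 0.25, 0.25, 0.25]  # attention spread evenly over four tokens
peaked = [0.97, 0.01, 0.01, 0.01]   # attention concentrated on one token

print(round(focus_score(uniform), 3), round(focus_score(peaked), 3))
```

A head that spreads attention evenly scores near 0, while a head that locks onto one token scores near 1.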
The Explainability Lab helps interpret model behaviour through token attribution and counterfactual impact analysis.
Instead of only showing a model prediction, this section highlights which words contributed most to the output and how the prediction changes when important tokens are removed.
- Prediction label and confidence score
- Token-level attribution scores
- Important financial and contextual words
- Counterfactual token impact
- Prediction score changes after token removal
- Explainable AI workflow for transformer-style NLP systems
This makes the model easier to reason about.
Not just what did it predict?
But why did it predict that?
The Financial NLP Intelligence section applies NLP analysis to financial text, earnings-style language, and executive communication.
It analyzes sentiment, risk language, uncertainty, optimism, and financial signal strength from user-provided text. This turns BERTokenScope from a general NLP demo into a more domain-aware financial AI tool.
- Financial sentiment classification
- Risk language scoring
- Uncertainty scoring
- Optimism scoring
- Financial signal visualization
- Executive tone and business-language analysis
The Transcript Drift Analysis view expands the Financial NLP section by comparing language across reporting periods.
This is useful for analyzing how a company's tone changes between quarters, earnings calls, or financial updates. BERTokenScope can show whether sentiment weakened, risk language increased, or uncertainty became more prominent over time.
- Period-over-period financial tone comparison
- Sentiment drift
- Risk-language increase or decrease
- Uncertainty trend analysis
- Transcript chunk diagnostics
- Executive summary generation for financial language changes
Together, the Financial NLP views show how transformer-inspired NLP systems can support business intelligence and analyst workflows.
The Embedding Explorer uses semantic embeddings to compare documents, transcript excerpts, or company text samples.
Embeddings convert text into numerical vectors, allowing BERTokenScope to measure meaning-based similarity between documents. This supports semantic search, clustering, document comparison, and future retrieval-augmented analysis workflows.
- Semantic embedding map
- Document similarity matrix
- Closest document pair ranking
- Company or transcript similarity analysis
- Foundation for semantic search and retrieval workflows
This section shows how language can be transformed into vector space.
That is the same foundation behind modern search, recommendation, and RAG systems.
The Model Comparison section benchmarks multiple model families across runtime, confidence, and output behavior.
It compares masked-language models, financial-sentiment models, and embedding models. This helps evaluate tradeoffs between speed, confidence, model type, and task suitability.
- Model runtime comparison
- Model confidence comparison
- Masked-language model outputs
- Financial sentiment model outputs
- Embedding model outputs
- Latency and confidence benchmarking
- Practical model-selection workflow
This section reflects a real production concern:
The best model is not always the biggest model.
The best model is the one that fits the task, latency, cost, and reliability needs.
The Attention Explorer helps inspect the transformer attention layer by layer and head by head.
It allows users to study how tokens attend to one another across the model.
- Which tokens are receiving the strongest attention?
- Which token relationships dominate a specific layer?
- Which attention heads appear interpretable?
- Do certain heads focus on nearby tokens?
- Do certain heads focus on important domain-specific words?
- Explore how BERT attends to verbs and objects.
- Inspect how financial risk words connect to the surrounding context.
- Compare attention patterns between neutral and negative statements.
- Use attention as a teaching tool for transformer internals.
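Attention rollout combines per-layer attention into a single token-to-token map. The sketch below follows the commonly described recipe (mix each layer with the identity to account for residual connections, renormalise, then multiply from the first layer to the last). It assumes each layer's matrix has already been averaged over heads, and the 2x2 matrices are made-up values.

```python
Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def attention_rollout(layers: list[Matrix]) -> Matrix:
    """Combine head-averaged attention matrices from the first layer to the last."""
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Mix in the identity for the residual connection, then renormalise
        # so every row is still a probability distribution.
        mixed = [
            [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        mixed = [[v / sum(row) for v in row] for row in mixed]
        rollout = matmul(mixed, rollout)
    return rollout


# Hypothetical head-averaged attention for a two-token input, two layers.
layer1 = [[0.9, 0.1], [0.2, 0.8]]
layer2 = [[0.6, 0.4], [0.5, 0.5]]
rolled = attention_rollout([layer1, layer2])
print(rolled)
```

Each row of the result still sums to 1, so it can be drawn with the same heatmap code as a single layer.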
The Masked Word Lab uses BERT-style masked language modelling.
Users provide a sentence containing [MASK], and the model predicts the most likely replacement words.
The company reported strong [MASK] growth this quarter.
Possible predictions might include:
revenue
sales
earnings
profit
Masked language modelling helps show how BERT understands context.
The model is not just guessing a random word.
It uses surrounding tokens to infer what word best fits the sentence.
This is the same foundational idea behind many modern NLP systems.
BERTokenScope identifies strong token-to-token attention links.
Instead of only showing a heatmap, it extracts meaningful relationships between tokens.
risk → increased
revenue → declined
guidance → lowered
demand → weakened
margin → compressed
This makes attention easier to understand.
A heatmap is useful.
But a ranked list of token relationships is faster to interpret.
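Turning a heatmap into a ranked list is a small sorting problem. The sketch below is illustrative only: `strongest_links` is a hypothetical helper, and the attention values are invented so that each row sums to 1.

```python
def strongest_links(
    tokens: list[str],
    attention: list[list[float]],
    top_n: int = 3,
    skip_self: bool = True,
) -> list[tuple[str, str, float]]:
    """Return the top_n (source, target, weight) links, strongest first."""
    links = []
    for i, source in enumerate(tokens):
        for j, target in enumerate(tokens):
            if skip_self and i == j:
                continue  # a token attending to itself is rarely informative
            links.append((source, target, attention[i][j]))
    links.sort(key=lambda link: link[2], reverse=True)
    return links[:top_n]


# Hypothetical attention matrix: row i is how much token i attends to each token.
tokens = ["guidance", "lowered", "demand", "weakened"]
attention = [
    [0.10, 0.70, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
    [0.10, 0.10, 0.65, 0.15],
]

for source, target, weight in strongest_links(tokens, attention):
    print(f"{source} -> {target}: {weight:.2f}")
```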
BERTokenScope includes finance-aware text analysis features.
It can inspect financial language for:
- sentiment
- risk language
- uncertainty
- executive tone
- forward-looking statements
- positive business signals
- negative business signals
- cautious or defensive wording
Revenue increased, but management warned of margin pressure and weaker demand.
BERTokenScope can surface signals like:
- positive: revenue increased
- risk: margin pressure
- negative demand: weaker demand
- cautious tone: warned
This makes the project more than a general NLP demo.
It becomes useful for financial text intelligence.
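One simple way to surface signals like these is phrase matching against a small lexicon. The sketch below is an assumption, not the project's actual rules: `SIGNAL_LEXICON` and `extract_signals` are hypothetical names, and a production system would likely pair rules like these with a model such as FinBERT.

```python
import re

# Hypothetical phrase lexicon for illustration only.
SIGNAL_LEXICON = {
    "positive": ["revenue increased", "strong growth"],
    "risk": ["margin pressure", "headwinds"],
    "negative demand": ["weaker demand", "slower demand"],
    "cautious tone": ["warned", "cautious"],
}


def extract_signals(text: str) -> dict[str, list[str]]:
    """Return the lexicon phrases found in the text, grouped by signal label."""
    lowered = text.lower()
    found: dict[str, list[str]] = {}
    for label, phrases in SIGNAL_LEXICON.items():
        # Word boundaries stop "warned" from matching inside a longer word.
        hits = [p for p in phrases if re.search(r"\b" + re.escape(p) + r"\b", lowered)]
        if hits:
            found[label] = hits
    return found


signals = extract_signals(
    "Revenue increased, but management warned of margin pressure and weaker demand."
)
for label, phrases in signals.items():
    print(f"{label}: {', '.join(phrases)}")
```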
Financial communication changes over time.
A company might sound confident in one quarter and cautious in the next.
BERTokenScope includes transcript drift analysis ideas for comparing tone across periods.
Q1: "We expect strong growth across all segments."
Q2: "We remain cautious due to demand uncertainty."
The system can help compare:
- tone change
- risk language increase
- uncertainty increase
- sentiment drift
- executive confidence shift
This is especially useful for:
- earnings call analysis
- investor research
- financial NLP dashboards
- analyst workflow tools
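Drift between two periods can be sketched as the difference between two tone scores. The word lists and the scoring rule below are assumptions made for illustration; they are not taken from the project.

```python
import re

# Hypothetical word lists for illustration only.
CONFIDENT_TERMS = {"strong", "confident", "growth", "expect"}
CAUTIOUS_TERMS = {"cautious", "uncertainty", "uncertain", "slower", "risk"}


def tone_score(text: str) -> float:
    """Score in [-1, 1]: positive means confident, negative means cautious."""
    words = re.findall(r"[a-z']+", text.lower())
    confident = sum(w in CONFIDENT_TERMS for w in words)
    cautious = sum(w in CAUTIOUS_TERMS for w in words)
    total = confident + cautious
    return 0.0 if total == 0 else (confident - cautious) / total


def tone_drift(earlier: str, later: str) -> float:
    """Negative drift means the later period sounds more cautious."""
    return tone_score(later) - tone_score(earlier)


q1 = "We expect strong growth across all segments."
q2 = "We remain cautious due to demand uncertainty."
drift = tone_drift(q1, q2)
print(f"Q1 tone: {tone_score(q1):+.2f}, Q2 tone: {tone_score(q2):+.2f}, drift: {drift:+.2f}")
```

With these toy lists, the Q1 statement scores fully confident and the Q2 statement fully cautious, so the drift is strongly negative.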
The Explainability Lab provides scaffolding for understanding why a model output may have occurred.
It can support:
- token importance
- attention-based attribution
- strongest token links
- prediction rationale
- confidence scoring
- counterfactual analysis
The goal is not just to show the model's answer.
The goal is to explain the model's behaviour.
Counterfactual analysis asks:
What changes if we remove or modify an important token?
For example:
Original: The company reported weak demand.
Modified: The company reported demand.
If removing weak changes the sentiment or prediction score, then weak was likely important.
This helps make model behaviour more understandable.
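The token-removal idea can be sketched with a toy scorer. The weights in `NEGATIVE_WEIGHTS` are hypothetical stand-ins; a real system would re-run the model on each modified sentence instead of summing fixed weights.

```python
# Hypothetical negative-tone weights; a real system would query a model instead.
NEGATIVE_WEIGHTS = {"weak": 0.6, "declined": 0.5, "pressure": 0.4}


def negative_score(tokens: list[str]) -> float:
    """Toy negative-tone score in [0, 1] built from per-word weights."""
    total = sum(NEGATIVE_WEIGHTS.get(t.lower().strip(".,"), 0.0) for t in tokens)
    return min(1.0, total)


def counterfactual_impacts(text: str) -> list[tuple[str, float]]:
    """Remove each token in turn and report how far the score drops."""
    tokens = text.split()
    baseline = negative_score(tokens)
    impacts = []
    for i, token in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        impacts.append((token, round(baseline - negative_score(reduced), 4)))
    return sorted(impacts, key=lambda item: item[1], reverse=True)


impacts = counterfactual_impacts("The company reported weak demand.")
for token, delta in impacts:
    print(f"{token}: {delta:+.2f}")
```

Only the removal of weak moves the score here, which is the signal that the token mattered.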
BERTokenScope includes embedding exploration hooks.
Embeddings convert text into numerical vectors that represent meaning.
This allows text to be compared mathematically.
- Compare two financial statements.
- Cluster similar transcript excerpts.
- Map companies by semantic similarity.
- Identify related risk disclosures.
- Build retrieval or search features later.
Using embeddings, BERTokenScope can support company or document similarity analysis.
For example:
- Which companies discuss similar risks?
- Which transcript excerpts sound alike?
- Which filings are semantically close?
- Which documents cluster together?
This connects transformer NLP with real-world document intelligence.
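Document similarity usually comes down to cosine similarity between embedding vectors. The sketch below uses made-up three-dimensional vectors and hypothetical document names; real sentence embeddings typically have hundreds of dimensions.

```python
import math
from itertools import combinations


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_closest_pairs(embeddings: dict[str, list[float]]) -> list[tuple[str, str, float]]:
    """Score every document pair and sort from most to least similar."""
    pairs = [
        (first, second, cosine_similarity(embeddings[first], embeddings[second]))
        for first, second in combinations(embeddings, 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical embeddings for illustration only.
docs = {
    "filing_a": [0.9, 0.1, 0.0],
    "filing_b": [0.8, 0.2, 0.1],
    "call_c": [0.0, 0.2, 0.9],
}
ranked_pairs = rank_closest_pairs(docs)
for first, second, score in ranked_pairs:
    print(f"{first} vs {second}: {score:.3f}")
```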
BERTokenScope is designed to compare multiple transformer model families.
Potential compatible models include:
- BERT
- DistilBERT
- RoBERTa
- FinBERT
- sentence-transformer models
Comparison dimensions include:
- predicted tokens
- confidence scores
- latency
- output differences
- finance-specific relevance
- embedding similarity
- interpretability quality
This turns the project into a model experimentation platform.
BERTokenScope includes benchmarking ideas for comparing model behavior and performance.
It can compare:
- inference latency
- confidence distribution
- top-k prediction differences
- fallback vs live model behavior
- model family performance
This is important because production AI systems are not only judged by accuracy.
They are also judged by speed, reliability, cost, and stability.
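Latency comparisons can start from a small timing harness. The sketch below is an assumption about how such a benchmark might be written: `benchmark` and `fallback_predict` are hypothetical names, and the percentile uses a simple index into the sorted timings rather than interpolation.

```python
import statistics
import time


def benchmark(predict, text: str, runs: int = 20) -> dict[str, float]:
    """Time repeated calls to predict(text) and summarise latency in milliseconds."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(text)
        latencies_ms.append((time.perf_counter() - start) * 1000)
    latencies_ms.sort()
    return {
        "runs": runs,
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[min(runs - 1, int(0.95 * runs))],
        "max_ms": latencies_ms[-1],
    }


def fallback_predict(text: str) -> str:
    """Stand-in for a model call (hypothetical); always returns a fixed token."""
    return "revenue"


report = benchmark(fallback_predict, "The company reported strong [MASK] growth.", runs=10)
print(report)
```

Passing different callables to the same harness gives numbers that can be compared across model families.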
BERTokenScope includes a backend API layer for serving NLP analysis.
The FastAPI service supports a more production-ready architecture where the dashboard and API are separated.
- masked-token prediction
- financial text analysis
- health checks
- request validation
- structured JSON responses
- API-key protected routes
- versioned endpoints
- safe error envelopes
This makes the project feel more like a real AI service rather than just a notebook or script.
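Request validation and safe error envelopes can be illustrated without any web framework. The sketch below is framework-free on purpose and is not the project's FastAPI code: the envelope fields, the error codes, and the `top_k` limit of 20 are all assumptions.

```python
import uuid


def error_envelope(code: str, message: str, request_id: str) -> dict:
    """Wrap an error in a fixed shape so clients never see raw exceptions."""
    return {"error": {"code": code, "message": message, "request_id": request_id}}


def handle_mask_request(payload: dict, predict) -> tuple[int, dict]:
    """Validate a mask-prediction payload and return (status_code, body)."""
    request_id = str(uuid.uuid4())
    text = payload.get("text")
    top_k = payload.get("top_k", 5)
    if not isinstance(text, str) or "[MASK]" not in text:
        return 422, error_envelope("invalid_text", "text must contain a [MASK] token", request_id)
    if not isinstance(top_k, int) or not 1 <= top_k <= 20:
        return 422, error_envelope("invalid_top_k", "top_k must be an integer from 1 to 20", request_id)
    try:
        return 200, {"predictions": predict(text, top_k)}
    except Exception:
        # Do not leak internal details such as paths or stack traces.
        return 500, error_envelope("internal_error", "prediction failed", request_id)


def sample_predict(text: str, top_k: int) -> list[dict]:
    """Hypothetical predictor returning fixed scores."""
    return [{"token": "revenue", "score": 0.41}, {"token": "sales", "score": 0.23}][:top_k]


def broken_predict(text: str, top_k: int) -> list[dict]:
    """Hypothetical predictor that fails with an internal detail in the message."""
    raise RuntimeError("could not read /srv/models/secret.bin")


ok_status, ok_body = handle_mask_request({"text": "Strong [MASK] growth.", "top_k": 1}, sample_predict)
bad_status, bad_body = handle_mask_request({"text": "No mask here."}, sample_predict)
err_status, err_body = handle_mask_request({"text": "Strong [MASK] growth."}, broken_predict)
print(ok_status, bad_status, err_status)
```

The failing predictor's message never reaches the response body, which is the point of a safe envelope.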
BERTokenScope follows a modular architecture.
User
│
▼
Streamlit Dashboard
│
│ Interactive UI for demos, analysis, charts, and explainability
│
▼
FastAPI Service
│
│ Versioned API routes, validation, auth, health checks
│
▼
NLP Service Layer
│
├── Masked Token Prediction
├── Attention Extraction
├── Financial NLP Analysis
├── Embedding Generation
├── Model Comparison
└── Explainability Reports
│
▼
Model Adapter Layer
│
├── BERT
├── DistilBERT
├── RoBERTa
├── FinBERT
└── Fallback/Demo Mode
│
▼
Local Artifacts + Run Tracking
│
├── JSON outputs
├── SQLite metadata
├── Logs
└── Analysis history
A typical BERTokenScope workflow looks like this:
1. User enters a sentence or financial text
2. Text is cleaned and tokenized
3. Model or fallback service processes the input
4. BERTokenScope extracts predictions, attention links, and language signals
5. Results are converted into structured outputs
6. Streamlit displays charts, tables, explanations, and insights
7. Optional API artifacts are saved for run history and reproducibility
The company reported strong [MASK] growth this quarter.
Management lowered guidance due to weaker demand and continued margin pressure.
We remain confident in our long-term strategy, although near-term conditions remain uncertain.
Q1: We expect strong demand across all regions.
Q2: We are seeing slower demand and increased customer caution.
BERTokenScope can produce outputs such as:
Top Mask Predictions:
1. revenue
2. sales
3. earnings
4. profit
5. margin
Strongest Attention Links:
revenue → growth
guidance → lowered
demand → weaker
margin → pressure
Financial NLP Signals:
Sentiment: Cautious
Risk Level: Elevated
Uncertainty: Medium
Executive Tone: Defensive
Counterfactual Impact:
Removing "weaker" reduced the negative tone score by 34%.
Attention allows a model to decide how much each token should focus on every other token.
This is the heart of transformer-based language understanding.
Masked language modelling trains a model to predict missing words from context.
BERT was trained using this objective.
BERTokenScope uses this idea to show how context shapes prediction.
Instead of only seeing the final model output, BERTokenScope exposes token relationships.
This helps explain the model's internal behaviour.
Financial text has domain-specific language.
Words like guidance, margin, demand, headwinds, liquidity, and uncertainty carry important business meaning.
BERTokenScope adds finance-aware analysis to make transformer outputs more useful in real-world contexts.
Embeddings allow text to be represented as vectors.
This enables:
- semantic search
- clustering
- similarity scoring
- document comparison
- retrieval systems
Explainable AI focuses on making model behaviour understandable to humans.
BERTokenScope supports this through attention analysis, token attribution, counterfactuals, and structured reports.
BERTokenScope/
│
├── api/
│   ├── main.py
│   └── routes/
│
├── app/
│   └── streamlit_app.py
│
├── attention/
│   ├── extraction.py
│   ├── heatmaps.py
│   ├── rollout.py
│   └── token_links.py
│
├── ber_tokenscope/
│   ├── config.py
│   ├── schemas.py
│   ├── settings.py
│   └── model_adapters.py
│
├── embeddings/
│   ├── encode.py
│   ├── reduce.py
│   └── similarity.py
│
├── explainability/
│   ├── attribution.py
│   ├── counterfactuals.py
│   └── reports.py
│
├── financial_nlp/
│   ├── sentiment.py
│   ├── risk_signals.py
│   ├── uncertainty.py
│   └── transcript_drift.py
│
├── configs/
│   └── default.yaml
│
├── docs/
│   ├── architecture.md
│   ├── cs50ai-extension.md
│   ├── portfolio-deployment.md
│   └── streamlit-cloud.md
│
├── tests/
│   ├── test_api.py
│   ├── test_attention.py
│   ├── test_financial_nlp.py
│   └── test_fallbacks.py
│
├── assets/
│   ├── bertokenscope-banner.png
│   ├── dashboard-preview.gif
│   └── architecture-diagram.png
│
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── requirements-models.txt
├── pyproject.toml
├── README.md
└── LICENSE
git clone https://github.com/YOUR_USERNAME/BERTokenScope.git
cd BERTokenScope
python -m venv .venv
Activate it on Windows PowerShell:
.\.venv\Scripts\Activate.ps1
Activate it on macOS/Linux:
source .venv/bin/activate
pip install -r requirements.txt
streamlit run app/streamlit_app.py
Then open the local URL shown in your terminal.
Usually:
http://localhost:8501
Set an API key:
$env:BERTSCOPE_API_KEY="replace-with-a-long-random-secret"
Run the FastAPI server:
uvicorn api.main:app --reload
The API will usually be available at:
http://127.0.0.1:8000
Interactive API docs:
http://127.0.0.1:8000/docs
GET /health
Example response:
{
  "status": "ok",
  "service": "BERTokenScope"
}
POST /api/v1/mask/predict
Example request:
{
  "text": "The company reported strong [MASK] growth this quarter.",
  "top_k": 5
}
Example response:
{
  "predictions": [
    {
      "token": "revenue",
      "score": 0.41
    },
    {
      "token": "sales",
      "score": 0.23
    },
    {
      "token": "earnings",
      "score": 0.14
    }
  ]
}
POST /api/v1/finance/analyze
Example request:
{
  "text": "Management lowered guidance due to weaker demand and margin pressure."
}
Example response:
{
  "sentiment": "cautious",
  "risk_level": "elevated",
  "uncertainty": "medium",
  "signals": [
    "lowered guidance",
    "weaker demand",
    "margin pressure"
  ]
}
docker compose up --build
docker compose --profile gateway up --build
BERTokenScope is designed to be portfolio-friendly and reliable.
By default, the public dashboard can run in an offline-safe fallback mode.
This means the demo can still work without:
- GPU access
- live Hugging Face downloads
- large model cache files
- unstable cloud inference dependencies
For live transformer inference, install optional model dependencies:
pip install -r requirements-models.txt
Then enable model downloads:
BERTSCOPE_ALLOW_MODEL_DOWNLOADS=true
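The switch between live inference and the offline-safe fallback can be sketched as an environment check. Only the variable name `BERTSCOPE_ALLOW_MODEL_DOWNLOADS` comes from this README; how the flag is parsed, the fallback vocabulary, and the hash-based ordering are assumptions made for illustration.

```python
import hashlib
import os

# Hypothetical fallback vocabulary for illustration only.
FALLBACK_VOCAB = ["revenue", "sales", "earnings", "profit", "margin"]


def downloads_allowed(env=None) -> bool:
    """Treat only the literal value "true" (case-insensitive) as enabled."""
    env = os.environ if env is None else env
    return env.get("BERTSCOPE_ALLOW_MODEL_DOWNLOADS", "false").strip().lower() == "true"


def fallback_predict(text: str, top_k: int = 3) -> list[str]:
    """Deterministic stand-in: the same text always yields the same ordering."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    offset = digest[0] % len(FALLBACK_VOCAB)
    rotated = FALLBACK_VOCAB[offset:] + FALLBACK_VOCAB[:offset]
    return rotated[:top_k]


def predict(text: str, top_k: int = 3, env=None, live_model=None) -> list[str]:
    """Use the live model only when downloads are enabled and a model is supplied."""
    if downloads_allowed(env) and live_model is not None:
        return live_model(text, top_k)
    return fallback_predict(text, top_k)


print(predict("The company reported strong [MASK] growth this quarter.", env={}))
```

Because the fallback depends only on the input text, tests and public demos produce the same output on every run.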
BERTokenScope is prepared for a practical portfolio deployment strategy.
- Deploy the Streamlit dashboard on Streamlit Community Cloud.
- Keep the dashboard in offline/fallback mode for reliability.
- Use the GitHub repo to showcase the full FastAPI backend architecture.
- Deploy the FastAPI backend separately later if live transformer serving is needed.
Main file path: app/streamlit_app.py
Required secrets: none
Recommended mode: offline-safe demo
For more details, see:
docs/streamlit-cloud.md
docs/portfolio-deployment.md
Run the test suite:
pytest
Run tests with verbose output:
pytest -v
Run a specific test file:
pytest tests/test_attention.py
BERTokenScope includes several features that make it more realistic than a simple course script.
- FastAPI backend
- /api/v1 route versioning
- API-key authentication
- health checks
- structured responses
- safe error messages
- request validation
- idempotency key support scaffolding
- pagination scaffolding
- deterministic fallback behavior
- offline-safe demo mode
- model lazy-loading
- model warmup endpoint scaffolding
- testable components without model downloads
- request IDs
- structured JSON logs
- Prometheus-style metrics scaffolding
- run history
- local artifacts
- API key protection
- role-aware route protection scaffolding
- CORS configuration
- security headers
- request size limits
- rate limiting scaffolding
- safe error envelopes
- redaction hooks
- audit logs
- retention cleanup
- financial-use disclaimers
- local artifact control
- Dockerfile
- Docker Compose
- optional gateway profile
- CI checks
- linting
- formatting checks
- type-checking scaffolding
- security scan scaffolding
- release image workflow scaffolding
BERTokenScope demonstrates my ability to move from classroom AI to production-style AI engineering.
It shows:
- transformer understanding
- NLP interpretability
- financial text analytics
- dashboard development
- API-first design
- software architecture
- model-serving awareness
- testing and deployment readiness
This project sits at the intersection of:
Artificial Intelligence
+
Natural Language Processing
+
Financial Analytics
+
Explainable AI
+
Full-Stack ML Engineering
Here are strong resume-ready bullets for this project:
Engineered BERTokenScope, a transformer explainability platform extending Harvard CS50AI Attention into a production-style NLP system for BERT masked-token prediction, attention visualization, financial text analytics, and model comparison.
Built an interactive Streamlit and FastAPI-based NLP intelligence system with token-level attention exploration, finance-aware sentiment/risk analysis, embedding similarity hooks, deterministic fallback mode, Docker Compose support, and testable service components.
Designed a portfolio-ready transformer interpretability workflow using BERT-compatible models, attention head analysis, counterfactual token explanations, structured API responses, local run tracking, and offline-safe deployment patterns.
Potential future upgrades include:
- live hosted FastAPI backend
- full Hugging Face model serving
- FinBERT sentiment integration
- persistent PostgreSQL run history
- vector database support
- transcript upload pipeline
- PDF/filing parser
- earnings-call dashboard
- SHAP/LIME-style attribution
- WebSocket streaming for inference jobs
- Kubernetes deployment manifests
- AWS deployment using ECS or Lambda containers
- CI/CD deployment to cloud infrastructure
BERTokenScope is an educational and portfolio project.
Financial NLP outputs should not be treated as investment advice. Sentiment, risk, uncertainty, and tone analysis are model-assisted signals intended for research, learning, and demonstration purposes only.
This project directly extends the transformer-based NLP and attention concepts introduced in the CS50AI Attention project.
| CS50AI Attention | BERTokenScope |
|---|---|
| [MASK] Token Prediction | BERT Masked-Language Intelligence |
| Tokenization | Token-Level Context Exploration |
| Self-Attention Scores | Attention Maps and Token Relationship Analysis |
| 12 BERT Layers | Layer-by-Layer Transformer Inspection |
| 12 Attention Heads per Layer | Head-Level Interpretability |
| Static Attention Diagrams | Interactive Attention Visualization |
| Attention Head Analysis | Explainability Reports and Token Insights |
| Natural Language Sentences | Financial Text, Earnings Language, and Risk Signals |
| Hugging Face Transformers | Modular Model Adapter Layer |
| Single-Purpose Python Script | Streamlit Dashboard + FastAPI Backend |
| Manual Interpretation | Structured NLP Intelligence Workflow |
| Course Assignment | Production-Style Transformer Explainability Platform |
BERTokenScope demonstrates how foundational transformer and attention concepts can scale into a production-oriented NLP explainability system for analyzing language, context, financial tone, and token-level model behaviour.
Mitra Boga






