Using CISA Known Exploited Vulnerabilities for Data-Driven Security Operations
This repository contains the complete implementation of our research on reinforcement learning-based vulnerability prioritization using the CISA KEV catalog, NVD CVSS scores, and EPSS data.
Modern organizations face thousands of published vulnerabilities but have limited resources for remediation. This work demonstrates that reinforcement learning can learn optimal prioritization policies from real-world exploitation data, achieving:
- 98.4% classification accuracy (DQN)
- 3,587.50 average reward (+4,753% vs random)
- 10-minute training time (production-ready)
- Balanced prioritization (52% medium, 48% immediate)
| Method | Accuracy | Avg Reward | F1 (Macro) | Training Time |
|---|---|---|---|---|
| Random Baseline | N/A | -75.50 | N/A | N/A |
| XGBoost | 100.0% | N/A | 100.0% | < 1 min |
| DQN (Ours) | 98.4% | 3,587.50 | 65.7% | ~10 min |
| PPO | 46.9% | 2,822.00 | 21.3% | ~15 min |
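The DQN row pairs high accuracy with a much lower macro F1. A gap like this usually means that at least one rare class is being missed, because macro F1 weights every class equally while accuracy is dominated by the frequent classes. The sketch below shows the effect on toy labels. The label arrays are illustrative only and are not taken from this repository's results, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted samples scores 0.0
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(labels)


# Toy data (illustrative, not from the paper): class 2 is rare and always missed
labels = [0, 1, 2]
y_true = [0] * 50 + [1] * 48 + [2] * 2
y_pred = [0] * 50 + [1] * 48 + [1] * 2

print(f"accuracy={accuracy(y_true, y_pred):.3f}")          # accuracy=0.980
print(f"macro_f1={macro_f1(y_true, y_pred, labels):.3f}")  # macro_f1=0.660
```

Two misclassified samples out of 100 barely move accuracy, but the rare class contributes an F1 of zero to the macro average.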
# Clone repository
git clone https://github.com/GitSene/RL-KEV-Vulnerability-Prioritization.git
cd RL-KEV-Vulnerability-Prioritization
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download CISA KEV catalog
python data/download_kev.py
# Enrich with NVD CVSS and EPSS (requires API key)
python src/data_processing/kev_enrichment.py --api-key YOUR_NVD_API_KEY
# Train DQN agent
python scripts/train_dqn.py --episodes 200
# Train PPO agent
python scripts/train_ppo.py --episodes 200
# Train XGBoost baseline
python scripts/train_xgboost.py
# Run complete evaluation
python scripts/evaluate_all.py
├── data/ # Data download and storage
├── src/ # Source code
│ ├── data_processing/ # KEV enrichment pipeline
│ ├── environment/ # RL environment (Gymnasium)
│ ├── agents/ # DQN and PPO implementations
│ ├── baselines/ # XGBoost baseline
│ └── evaluation/ # Evaluation scripts
├── models/ # Trained model weights
├── notebooks/ # Jupyter notebooks for analysis
├── results/ # Output figures and tables
├── scripts/ # Executable training scripts
└── docs/ # Documentation
- CISA KEV Catalog: 1,464 confirmed exploited vulnerabilities
- NVD CVSS v3.1: Technical severity scores (0-10)
- EPSS: Exploitation probability predictions (0-1)
- Features: cvss_score, epss, epss_percentile, days_since_added, ransomware_flag
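As a rough illustration of how the five features listed above could be assembled for one vulnerability, here is a minimal sketch. It assumes a KEV entry shaped like the public KEV JSON feed (`dateAdded`, `knownRansomwareCampaignUse`) and that CVSS and EPSS values have already been fetched. The function name `build_state`, the unscaled feature values, and the sample entry are assumptions for illustration; the repository's enrichment pipeline may name or normalize things differently.

```python
from datetime import date


def build_state(kev_entry, cvss_score, epss, epss_percentile, today):
    """Return [cvss_score, epss, epss_percentile, days_since_added, ransomware_flag]."""
    added = date.fromisoformat(kev_entry["dateAdded"])
    days_since_added = (today - added).days
    # Assumption: the feed marks ransomware use with the string "Known"
    ransomware_flag = 1.0 if kev_entry.get("knownRansomwareCampaignUse") == "Known" else 0.0
    return [
        float(cvss_score),
        float(epss),
        float(epss_percentile),
        float(days_since_added),
        ransomware_flag,
    ]


# Hypothetical entry and scores, made up for this example
entry = {
    "cveID": "CVE-0000-0000",
    "dateAdded": "2024-01-01",
    "knownRansomwareCampaignUse": "Known",
}
state = build_state(entry, cvss_score=9.8, epss=0.5, epss_percentile=0.97,
                    today=date(2024, 1, 31))
print(state)  # [9.8, 0.5, 0.97, 30.0, 1.0]
```

Passing `today` explicitly keeps the feature reproducible, which matters if training runs are compared across days.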
- State: 5-dimensional continuous feature vector
- Actions: Monitor, Patch-30d, Patch-7d, Patch-Immediate
- Reward: Urgency alignment + SLA compliance penalties
- Environment: Gymnasium-compatible MDP
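The MDP described above can be sketched in a few lines. This version only mirrors the Gymnasium `reset`/`step` return signatures using the standard library; a real implementation would subclass `gymnasium.Env` and declare `observation_space` and `action_space`. The class name, the urgency labels, and the reward constants are all assumptions for illustration, not the reward function used in the paper.

```python
import random

ACTIONS = ("Monitor", "Patch-30d", "Patch-7d", "Patch-Immediate")


class VulnPrioritizationEnv:
    """One vulnerability per step; the episode ends when the queue is empty."""

    def __init__(self, states, urgency_labels):
        assert len(states) == len(urgency_labels)
        self.states = states                  # 5-dimensional feature vectors
        self.urgency_labels = urgency_labels  # hypothetical target action index per state
        self._order = []
        self._pos = 0

    def reset(self, seed=None):
        rng = random.Random(seed)
        self._order = list(range(len(self.states)))
        rng.shuffle(self._order)
        self._pos = 0
        return self.states[self._order[0]], {}

    def step(self, action):
        target = self.urgency_labels[self._order[self._pos]]
        if action == target:
            reward = 10.0                      # urgency alignment
        elif action < target:
            reward = -5.0 * (target - action)  # too slow: SLA-style penalty
        else:
            reward = -1.0 * (action - target)  # too aggressive: wasted effort
        self._pos += 1
        terminated = self._pos >= len(self._order)
        obs = [0.0] * 5 if terminated else self.states[self._order[self._pos]]
        return obs, reward, terminated, False, {}


# Toy queue (made-up values) and a trivial rule-based policy
env = VulnPrioritizationEnv(
    states=[[9.8, 0.9, 0.99, 30.0, 1.0], [4.3, 0.01, 0.2, 400.0, 0.0]],
    urgency_labels=[3, 0],
)
obs, info = env.reset(seed=0)
total, terminated = 0.0, False
while not terminated:
    action = 3 if obs[4] == 1.0 else 0  # Patch-Immediate if ransomware flag is set
    obs, reward, terminated, truncated, info = env.step(action)
    total += reward
print(total)  # 20.0
```

The asymmetric penalties encode one possible design choice: delaying a truly urgent patch costs more than patching a low-urgency item early.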
- DQN: Value-based learning with experience replay
- PPO: Policy-gradient learning with actor-critic
- XGBoost: Traditional ML baseline
- 98.4% accuracy with balanced prioritization strategy
- 100% recall on high-urgency vulnerabilities
- Action distribution: 52% medium, 48% immediate
- Convergence: ~175 episodes (~10 minutes)
- Feature importance, ransomware flag: 77.9%
- Feature importance, CVSS score: 19.9%
- Feature importance, days since added: 2.2%
- Feature importance, EPSS features: <1%
If you use this code or data in your research, please cite:
@article{HabibiNorouzlou2025rl,
title={Reinforcement Learning-Based Vulnerability Prioritization Using CISA Known Exploited Vulnerabilities},
author={Babek Habibi Norouzlou},
journal={IEEE Transactions on Information Forensics and Security},
year={2025},
note={Under Review}
}
This project is licensed under the MIT License; see the LICENSE file for details.
- CISA for maintaining the KEV catalog
- NIST for the National Vulnerability Database
- FIRST for the EPSS scoring system
- Author: Babek Habibi Norouzlou
- Email: bnorouzlou19519@ucumberlands.edu
- Institution: University of the Cumberlands
**If you find this work useful, please consider starring the repository!**