Web interface for downloading biological data from NCBI/ENA with real-time progress tracking, validation, and statistics.
- Search & Summary: View detailed metadata before downloading
- Accession Support: SRR, ERR, DRR (runs), SAMN, ERS (samples), SRP, ERP (studies)
- File Information: See file sizes, read types, and total data volume
- Real-time Progress: Live progress bars with speed and ETA
- Batch Downloads: Download multiple accessions simultaneously
- Task Queue: Manage concurrent downloads with configurable limits
- Cancel & Retry: Cancel running downloads, retry failed ones
- MD5 Verification: Automatic checksum verification
- Gzip Validation: Check compressed file integrity
- Validation Reports: Detailed results for each file
- Auto-retry: Re-download files that fail validation
- Download Overview: Total downloads, completed, failed
- Platform Distribution: Pie chart of sequencing platforms
- Timeline: Download activity over time
- Storage Usage: Disk space monitoring
- WebSocket: Live progress updates without polling
- Notifications: Status changes and completion alerts
┌─────────────────────────────────────────────────────────────┐
│                      Frontend (React)                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐ │
│  │ Download │  │  Tasks   │  │ History  │  │  Statistics  │ │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────┘ │
│       │              │              │              │        │
│       └──────────────┴──────────────┴──────────────┘        │
│                              │                              │
│                    WebSocket + REST API                     │
└──────────────────────────────┼──────────────────────────────┘
                               │
┌──────────────────────────────┼──────────────────────────────┐
│                      Backend (FastAPI)                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐ │
│  │ Metadata │  │ Download │  │Validation│  │   Task Mgr   │ │
│  │ Service  │  │ Service  │  │ Service  │  │              │ │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────┘ │
│                              │                              │
│                       SQLite Database                       │
└──────────────────────────────┼──────────────────────────────┘
                               │
                    ┌──────────┴──────────┐
                    │                     │
                 ENA API               NCBI API
- Python 3.8+
- Node.js 16+
- seq-fetch library (installed automatically)
# Enter development shell
nix-shell
# Start servers
./start-dev.sh

cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install local dependencies
pip install -e ../seq-fetch
pip install -e ../seq-fetch-cli
# Install backend dependencies
pip install -e .
# Run server
python -m app.main

The API will be available at http://localhost:8000
API documentation at http://localhost:8000/docs
cd frontend
# Install dependencies
npm install
# Start development server
npm start

The web interface will be available at http://localhost:3000

For a quick demo without the React frontend:
# Start backend
cd backend
source venv/bin/activate
python -m uvicorn app.main:app --port 8000 &
# Start static file server for demo page
cd ..
python3 -m http.server 3000

Access:
- Demo page: http://localhost:3000/demo.html
- API docs: http://localhost:8000/docs
# Build frontend
cd frontend
npm run build
# The build files will be in frontend/build/
# Configure backend to serve static files from this directory

Create a .env file in the backend directory:
# Server
HOST=0.0.0.0
PORT=8000
DEBUG=false
# Storage
STORAGE_DIR=~/.seq-fetch-web/downloads
DATABASE_URL=sqlite+aiosqlite:///./seq_fetch_web.db
# Download settings
MAX_CONCURRENT_DOWNLOADS=3
MAX_RETRIES=3
CHUNK_SIZE=8192
DOWNLOAD_TIMEOUT=300

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/metadata/summary/{accession}` | GET | Get data summary |
| `/api/v1/metadata/run/{accession}` | GET | Get run metadata |
| `/api/v1/metadata/sample/{accession}` | GET | Get sample metadata |
| `/api/v1/metadata/detect-type/{accession}` | GET | Detect accession type |
| `/api/v1/metadata/search` | GET | Search accessions |
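
The `detect-type` endpoint presumably maps an accession onto one of the categories listed under Accession Support. The sketch below shows how such a mapping could look on the client side, using only the prefixes named in this README. The function name `detect_accession_type` and the returned labels are illustrative assumptions, not the backend's actual implementation.

```python
import re
from typing import Optional

# Prefix -> type mapping taken from the accession list in this README.
# The labels ("run", "sample", "study") are illustrative assumptions.
_PREFIX_TYPES = {
    "SRR": "run", "ERR": "run", "DRR": "run",
    "SAMN": "sample", "ERS": "sample",
    "SRP": "study", "ERP": "study",
}

# An accession is treated as letters followed by digits, e.g. SRR10617884.
_ACCESSION_RE = re.compile(r"^([A-Z]+)(\d+)$")


def detect_accession_type(accession: str) -> Optional[str]:
    """Return "run", "sample", or "study", or None if the prefix is unknown."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return _PREFIX_TYPES.get(match.group(1))
```

A local check like this can reject obviously malformed input before any request is sent; the server remains the authority on whether an accession actually exists.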

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/download/start` | POST | Start download |
| `/api/v1/download/batch` | POST | Batch download |
| `/api/v1/download/status/{task_id}` | GET | Get task status |
| `/api/v1/download/cancel/{task_id}` | POST | Cancel download |
| `/api/v1/download/task/{task_id}` | DELETE | Delete task |

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/tasks/list` | GET | List all tasks |
| `/api/v1/tasks/{task_id}` | GET | Get task details |
| `/api/v1/tasks/stats/summary` | GET | Get task statistics |

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/validation/file` | POST | Validate single file |
| `/api/v1/validation/files` | POST | Validate multiple files |
| `/api/v1/validation/downloaded/{accession}` | GET | Validate downloaded files |
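
For readers who want to see what MD5 verification and gzip validation involve, here is a minimal standard-library sketch. It is not the Validation Service's code: the function names and the shape of the returned report are assumptions made for illustration.

```python
import gzip
import hashlib
import zlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 8192) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gzip_is_intact(path: Path, chunk_size: int = 8192) -> bool:
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so truncation and CRC errors are detected.
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False


def validate_download(path: Path, expected_md5: str) -> dict:
    """Build a small report (hypothetical shape) for one downloaded file."""
    return {
        "file": str(path),
        "md5_ok": md5_of_file(path) == expected_md5.lower(),
        "gzip_ok": gzip_is_intact(path),
    }
```

Reading the archive to the end matters: a truncated `.fastq.gz` often opens fine and only fails once decompression reaches the missing tail.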

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/history/list` | GET | List download history |
| `/api/v1/history/{accession}` | GET | Get history details |
| `/api/v1/history/{accession}` | DELETE | Delete history |

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/statistics/overview` | GET | Get overall statistics |
| `/api/v1/statistics/timeline` | GET | Get download timeline |
| `/api/v1/statistics/platforms` | GET | Get platform statistics |
| `/api/v1/statistics/storage` | GET | Get storage statistics |
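
As a rough idea of what storage statistics might contain, the following sketch computes disk usage for a download directory with the standard library. The function name and the dictionary keys are hypothetical; the actual response of `/api/v1/statistics/storage` may be shaped differently.

```python
import shutil
from pathlib import Path


def storage_summary(storage_dir: Path) -> dict:
    """Summarize disk usage for the download directory."""
    storage_dir = storage_dir.expanduser()
    # Total size in bytes of all files below the storage directory.
    downloaded = sum(
        entry.stat().st_size for entry in storage_dir.rglob("*") if entry.is_file()
    )
    # Capacity figures for the filesystem that holds the directory.
    usage = shutil.disk_usage(storage_dir)
    return {
        "downloaded_bytes": downloaded,
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
        "disk_used_percent": round(100 * usage.used / usage.total, 1),
    }
```

Calling `storage_summary(Path("~/.seq-fetch-web/downloads"))` would report on the default `STORAGE_DIR`, provided that directory exists.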

| Endpoint | Description |
|---|---|
| `/api/v1/ws` | WebSocket for real-time updates |
| `/api/v1/ws?task_id=xxx` | Subscribe to specific task |
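
A script can subscribe to the same WebSocket endpoints the frontend uses. The sketch below relies on the third-party `websockets` package and assumes the backend runs on `localhost:8000` and sends JSON text messages; the message format is not documented here, so the payload is simply printed. The helper names `ws_url` and `listen` are illustrative.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def ws_url(host: str = "localhost:8000", task_id: Optional[str] = None) -> str:
    """Build the WebSocket URL, optionally scoped to a single task."""
    url = f"ws://{host}/api/v1/ws"
    if task_id is not None:
        url += "?" + urlencode({"task_id": task_id})
    return url


async def listen(task_id: Optional[str] = None) -> None:
    """Print every update the server pushes until the connection closes."""
    # Imported here so ws_url() works even without the dependency installed.
    import websockets  # third-party: pip install websockets

    async with websockets.connect(ws_url(task_id=task_id)) as connection:
        async for raw in connection:
            # Assumption: each message is a JSON-encoded text frame.
            print(json.loads(raw))
```

With the backend running, `asyncio.run(listen())` streams updates for all tasks, and passing a task ID restricts the stream to that task.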
- Open the web interface
- Enter accession (e.g., `SRR10617884`)
- Click "Get Summary" to view data details
- Click "Download" to start download
- Monitor progress in real-time
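
The last step, monitoring progress, can also be done from a script by polling the status endpoint until the task finishes. The helper below is a sketch under assumptions: it expects the status payload to carry a `status` field, and the terminal values `completed`, `failed`, and `cancelled` are guesses rather than documented states. The fetch function is passed in, so the loop itself does not depend on any HTTP library.

```python
import time
from typing import Any, Callable, Dict

# Assumption: the status payload has a "status" field with values like these.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_task(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> Dict[str, Any]:
    """Call fetch_status() repeatedly until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        time.sleep(poll_interval)
```

With `requests`, the fetch function could be `lambda: requests.get(f'http://localhost:8000/api/v1/download/status/{task_id}').json()`. For long downloads the WebSocket endpoint avoids the polling overhead.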
import requests

# Start multiple downloads in one batch request
response = requests.post(
    'http://localhost:8000/api/v1/download/batch',
    json={
        'accessions': ['SRR10617884', 'SRR10617885', 'SRR10617886'],
        'file_type': 'fastq',
        'verify_md5': True,
        'verify_gzip': True
    }
)

import requests

# Placeholder: substitute the ID of the task you want to check
task_id = 'your-task-id'
response = requests.get(
    f'http://localhost:8000/api/v1/download/status/{task_id}'
)
print(response.json())

import requests

response = requests.get(
    'http://localhost:8000/api/v1/validation/downloaded/SRR10617884'
)
print(response.json())

- Search for accessions
- View data summary with file sizes
- Start download with options
- Real-time progress bars
- Speed and ETA display
- Cancel/retry controls
- Overview cards
- Platform distribution pie chart
- Download timeline bar chart
- Storage usage monitoring
- Check if backend is running
- Verify WebSocket endpoint is accessible
- Check browser console for errors
- Check network connectivity
- Verify accession is valid
- Check storage space
- Review error messages in task details
- Reduce `MAX_CONCURRENT_DOWNLOADS`
- Clear completed tasks regularly
- Monitor storage usage
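
To see why lowering `MAX_CONCURRENT_DOWNLOADS` eases the load, it helps to picture how a concurrency cap is commonly enforced. The sketch below uses an `asyncio.Semaphore`; it illustrates the general technique and is not taken from the backend's Task Manager. The function name `run_limited` is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_limited(
    jobs: List[Callable[[], Awaitable[str]]],
    max_concurrent: int = 3,
) -> List[str]:
    """Run all jobs, allowing at most max_concurrent to be in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        # Each job waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await job()

    # gather() preserves the order of the input jobs in its result list.
    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

Every job is scheduled immediately, but only `max_concurrent` of them hold a slot at any moment, so a smaller limit directly bounds the bandwidth and disk activity in use.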
# Backend tests
cd backend
pytest
# Frontend tests
cd frontend
npm test

# Backend
black app/
flake8 app/
# Frontend
npm run lint

MIT License
- ENA (European Nucleotide Archive) for data API
- NCBI SRA for data archival
- seq-fetch library for core functionality