Lantana is a honeypot-as-code platform. Deploy IPv4/IPv6 dual-stack honeypots from an Ansible inventory, capture attacker behavior into a typed datalake, and ship intelligence as Discord briefs, STIX bundles, and a Streamlit dashboard. Aligned with MITRE Engage — honeypots are treated as disposable infrastructure that gets rotated and reshaped as narratives evolve.
The platform covers the full lifecycle: controlled exposure → structured ingestion → enrichment → analysis → intelligence output. Designed for disposability, policy-driven deployment, and strict blast-radius containment.
Tip: Lantana camara is a plant that attracts insects with its colorful flowers, much like a honeypot attracts attackers with its deliberately vulnerable services. It's invasive, resilient, and thrives in hostile environments.
Screenshots: Streamlit dashboard (overview) and dashboard (geography).

    git clone https://github.com/lopes/lantana.git
    cd lantana

Provision a Debian 13 host (VM or bare metal). Terraform support for Proxmox is available under infra/terraform/environments/proxmox/.

    cd config/ansible
    cp -r inventories/op_single inventories/op_myop

Customize inventory.yml, main.yml, network.yml, narrative.yml, and reporting.yml under inventories/op_myop/group_vars/all/. See the setup guide for an annotated walkthrough of each file. Create the encrypted vault:

    ansible-vault create inventories/op_myop/group_vars/all/vault.yml

    ansible-playbook -i inventories/op_myop/inventory.yml playbooks/deploy_single.yml --ask-vault-pass
    ansible-playbook -i inventories/op_myop/inventory.yml playbooks/deploy_honeypots.yml --ask-vault-pass

    ansible-playbook -i inventories/op_myop/inventory.yml tests/validate-single-node.yml -vvv --ask-vault-pass

Lantana is designed to operate safely in hostile environments and assumes that sensor hosts will eventually be compromised. To ensure ethical, legal, and operational safety, the platform enforces these rules at both architectural and operational levels:
- No offensive use. Honeypots must never be used as offensive infrastructure. The honeywall zone enforces outbound restrictions by default — compromised hosts cannot scan, attack, or otherwise harm third parties. Egress allowances must be explicit and narrowly scoped.
- Assume disposability. No secrets, credentials, production access, or sensitive systems on sensor hosts. Any compromise is total. Rebuilds are routine, not exceptional.
- No entrapment. Honeypots must not target specific individuals or organizations without explicit legal authorization. Lantana is broad-spectrum research, not targeted intelligence collection.
- Respect privacy. Captured data must be handled, stored, and processed according to applicable policies, regulations, and ethical standards.
- Align with operational goals. Narratives, exposure profiles, and sensor configurations must answer specific questions. Sensor rotation and topology shifts are part of the lifecycle, not ad hoc events.
- No infrastructure disclosure. Real operator-identifying values (WAN IPs, hostnames, domains, ASNs, SSH host fingerprints) must never appear in any artifact leaving the operator's control — that includes Discord reports, STIX bundles, this repository, commits, talks, screenshots. Real values live only inside each operation's untracked or vault-encrypted inventory. Examples in tracked files use RFC 5737 / 3849 / 2606 / 5398 documentation ranges.
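
The documentation ranges named in the last rule can be checked mechanically before an artifact leaves the operator's control. The sketch below is illustrative only and is not part of Lantana: the function names (`is_documentation_ip`, `is_documentation_asn`, `find_non_documentation_ips`) are hypothetical, and the IPv4 regex is deliberately simple. It encodes the RFC 5737 and RFC 3849 address blocks and the RFC 5398 ASN ranges, then flags IPv4 literals in a piece of text that fall outside them.

```python
import ipaddress
import re

# Documentation-only address blocks (RFC 5737 for IPv4, RFC 3849 for IPv6).
DOC_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

# Documentation-only ASN ranges (RFC 5398), both ends inclusive.
DOC_ASN_RANGES = [(64496, 64511), (65536, 65551)]

# Simple IPv4 matcher; every candidate is validated by ipaddress afterwards.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_documentation_ip(value: str) -> bool:
    """True if the address lies inside one of the documentation blocks."""
    addr = ipaddress.ip_address(value)  # raises ValueError on invalid input
    return any(addr.version == net.version and addr in net for net in DOC_NETWORKS)


def is_documentation_asn(asn: int) -> bool:
    """True if the ASN is reserved for documentation use."""
    return any(low <= asn <= high for low, high in DOC_ASN_RANGES)


def find_non_documentation_ips(text: str) -> list[str]:
    """Return valid IPv4 literals in text that are not documentation addresses."""
    leaks = []
    for candidate in IPV4_RE.findall(text):
        try:
            if not is_documentation_ip(candidate):
                leaks.append(candidate)
        except ValueError:
            # Matched the regex but is not a valid address; ignore it.
            continue
    return leaks
```

A check like this would also flag private or loopback addresses and dotted version strings, so a real pre-publish hook would likely need an allowlist on top of it.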
lantana/
config/ansible/ # Ansible roles, playbooks, inventories, validation playbooks
infra/terraform/ # Terraform host provisioning (Proxmox)
pipeline/ # Python data pipeline (enrichment, OCSF, dashboard, reports, STIX)
scripts/ # Operational scripts (bootstrap, backup, probes, dashboard)
docs/ # Full documentation
| Document | Description |
|---|---|
| Architecture | Zoned model, deployment modes, network topology, tech stack |
| Setup Guide | First-deploy walkthrough: prepare server, clone an operation, vault, narrative, deploy honeypots |
| Pipeline | Data pipeline: bronze/silver/gold datalake, OCSF normalization, enrichment, reports, STIX |
| Integrations | Third-party threat-intel providers: endpoints, auth, rate limits, field extraction, live-probe workflow |
| Validation | Post-deploy verification: protocol smoke tests + day-by-day pipeline/report/dashboard checks |
| Risk Scoring | Composite + per-provider risk score formula, RIOT short-circuit, decomposition |
| Honeypots | Cowrie + Dionaea: per-honeypot config model, capability allowlist, persona drift notes |
| Troubleshooting | Common issues and fixes (Dionaea startup, nftables, Vector pipeline) |
| Glossary | Terminology and definitions |
Lantana intentionally avoids Kubernetes (honeypots are disposable, not HA), SIEM-first architectures (research honeypot data benefits from batch analytics over real-time alerting), and monolithic stacks like T-Pot (Lantana is composable — infrastructure, policy, sensors, and narratives evolve independently).
For the full rationale, see docs/architecture.md.
Tracked work for post-v1.0.0 — a mix of known gaps deliberately deferred from the v1 cut and planned improvements that would expand Lantana's analytical reach. Boxes get checked as items land.
- Dionaea download URL → pipeline. The `store` ihandler captures attacker-delivered binaries to `/var/lib/lantana/sensor/dionaea/binaries/`, but URL + hash metadata stays inside dionaea's incident bus and never reaches `dionaea.json`. Surfacing the URL into bronze → silver → brief → STIX needs a small custom ihandler that subscribes to `dionaea.download.complete.unique` and writes the URL + MD5 + connection metadata to a JSON stream Vector can tail.
- Dionaea MSSQL/MySQL command bodies → pipeline. Same upstream constraint. The bundled `log_json` ihandler emits only connection lifecycle + credentials for these services; surfacing command bodies requires a custom ihandler or switching to `log_sqlite` + a tail job.
- SHA-256 hash integration for dionaea binaries. Dionaea names captured files by MD5 (and re-emits SHA-512 via the `store` ihandler), but the IOC pipeline (`file_hash_sha256` column, STIX file indicators, VirusTotal lookup) assumes SHA-256. Either add a `file_hash_sha512` column with downstream fallback handling, or hash files on disk after a `lantana-grant-read` cron grants `nectar` access to `/var/lib/lantana/sensor/dionaea/binaries/` (parallel to Cowrie's at `roles/cowrie/tasks/main.yml`).
- Self-built honeypot images. Replace the upstream rolling tags `docker.io/cowrie/cowrie:latest` and `docker.io/dinotools/dionaea:nightly` with images built by Lantana CI from a pinned upstream git ref, published to `ghcr.io/lopes/lantana-{cowrie,dionaea}`. Two incidents on 2026-06-08 → 2026-06-11 traced to upstream rebases silently flowing into production (cowrie's in-image UID shifted 998 → 999; dionaea's two-user runtime started racing the Quadlet's `,U` flag). Owning the build pipeline makes the in-container UID/GID a repo constant, moves rebases to PRs gated by `validate-sensor-runtime`, and lets us carry in-house patches (MSSQL version-string hardcoding, SHA-512-vs-SHA-256 hashing). Full brief in TODO.md.
- Dashboard date-range selector. The Streamlit dashboard currently renders a fixed window from the gold tables. Adding `start` + `end` date pickers (defaulting to the last 7 days) and threading them through the Polars filters on every page would let analysts scope visualizations to specific incidents, persona-rotation windows, or comparative periods without code edits. Major analytical power-up for a small UI change.
- Multi-day slow-burn detection (proper version). The v1.0.0 multi-day progression gold table was retired (it OOM'd the 7.6 GB collector at a 7-day silver lookback). The dashboard's Multi-Day Progression section now derives a multi-day flag from daily `behavioral_progression` gold over a trailing 7-day window. This is cheap but lossy: it can't see the silver-only per-stage timestamps the old logic used (e.g. day of first credential attempt vs day of first interactive command), so "Multi-Day IPs" is a permissive proxy for slow-burn behaviour rather than a stage-progression signal. A proper rewrite would either (a) compute per-day per-stage timestamps at silver→gold time and store them in the daily `behavioral_progression` partition, then aggregate at the dashboard, or (b) stream the multiday computation per-day to avoid the silver concat. Either approach keeps the slow-burn detection lossless without the OOM risk.
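
The "hash files on disk" option from the SHA-256 item above could look roughly like the following. This is a sketch under stated assumptions, not Lantana's implementation: the function names are hypothetical, each captured file is assumed to be named by its MD5 hex digest (as that item describes), and the calling user is assumed to already have read access to the directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large samples never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def map_md5_to_sha256(binaries_dir: Path) -> dict[str, str]:
    """Map each file name (assumed to be an MD5 hex digest) to that file's SHA-256."""
    return {
        entry.name: sha256_of_file(entry)
        for entry in sorted(binaries_dir.iterdir())
        if entry.is_file()
    }
```

The resulting mapping could then be joined onto existing MD5-keyed records to populate a SHA-256 column; how that join happens in the pipeline is left open here.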
This project is licensed under the MIT License.









