Built a streaming fraud detection system with Apache Kafka and Python. Deployed a Kafka cluster via Docker Compose, implemented a transaction generator and fraud detector using kafka-python, and routed suspicious transactions to separate topics for real-time monitoring. Demonstrates event streaming, producers, consumers, and containerization.

mtholahan/kafka-mini-project

Kafka Mini Project

📖 Abstract

This project implements a real-time fraud detection pipeline using Apache Kafka and Python. The system simulates financial transactions, streams them through Kafka, and applies rule-based filtering to flag suspicious activity. The goal is to gain practical experience with streaming architectures, producers, consumers, and containerized deployments.

The workflow includes:

  • Running a local Kafka cluster using Docker Compose with broker and Zookeeper services.

  • Building a transaction generator that continuously produces randomized account transfers into a Kafka topic.

  • Creating a fraud detector application that consumes transactions, evaluates them against business rules, and branches outputs into "legit" or "fraud" topics.

  • Packaging all components with Dockerfiles, requirements.txt, and docker-compose.yml for reproducibility.

  • Verifying results by consuming messages from output topics, confirming that transactions over $900 are correctly flagged as fraudulent.
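The verification step above relies on a single rule: an amount over $900 is treated as fraudulent. A minimal sketch of such a rule follows. The names `FRAUD_THRESHOLD` and `is_fraud` are hypothetical, and the strict comparison (exactly $900 counts as legit) is an assumption based on the word "over"; the repository's detector may be written differently.

```python
# Assumed threshold in dollars, taken from the "over $900" rule in this README.
FRAUD_THRESHOLD = 900.0


def is_fraud(transaction: dict, threshold: float = FRAUD_THRESHOLD) -> bool:
    """Return True when the transaction amount is strictly above the threshold."""
    # float() lets the rule accept amounts that arrive as strings or ints.
    return float(transaction["amount"]) > threshold
```

Keeping the rule in a small pure function makes it easy to unit test without a running broker.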

Through this project, I gained hands-on skills in stream processing, Kafka topic design, producer/consumer APIs, and containerized workflow orchestration, while also exploring real-world challenges in fraud detection systems.

🛠 Requirements

  • Docker Engine 20.x or later

  • Docker Compose v2

  • Ubuntu 22.04 LTS environment (tested)

  • docker-compose.yml defining all services:

    • zookeeper (Confluent cp-zookeeper)

    • kafka broker (Confluent cp-kafka)

    • generator (Python producer app)

    • detector (Python consumer/producer app)

  • Python dependency (inside app containers):

    • kafka-python

🧰 Setup

  • Clone repository and navigate to kafka-docker/ directory

  • Build images: docker-compose build --no-cache

  • Start cluster + apps: docker-compose up -d

  • Check the broker startup logs to confirm Kafka is ready

  • Verify that the generator and detector services are running

  • Inspect Kafka topics via kafka-console-consumer from broker container

📊 Dataset

  • Streaming data consists of synthetic transactions generated by the producer app

  • Transaction schema includes: transaction_id, account_id, timestamp, amount, merchant, location
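As an illustration of the schema above, here is one way a synthetic transaction could be built and serialized for Kafka. Only the six field names come from this README. The value formats (UUID strings, ISO 8601 timestamps, the `ACC-` prefix, the amount range) and the `MERCHANTS` and `LOCATIONS` lists are assumptions made for the sketch.

```python
import json
import random
import uuid
from datetime import datetime, timezone

# Hypothetical sample values; the real generator may use different ones.
MERCHANTS = ["grocer", "fuel", "online", "travel"]
LOCATIONS = ["NY", "TX", "CA", "WA"]


def make_transaction(rng: random.Random) -> dict:
    """Build one randomized transaction with the six documented fields."""
    return {
        "transaction_id": str(uuid.uuid4()),
        "account_id": f"ACC-{rng.randint(1, 9999):04d}",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "amount": round(rng.uniform(1.0, 1000.0), 2),  # assumed range
        "merchant": rng.choice(MERCHANTS),
        "location": rng.choice(LOCATIONS),
    }


def serialize(transaction: dict) -> bytes:
    """Encode a transaction as UTF-8 JSON, a common Kafka message format."""
    return json.dumps(transaction).encode("utf-8")
```

A function like `serialize` could be passed as `value_serializer` when constructing a kafka-python `KafkaProducer`, so the generator can send plain dictionaries.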

⏱️ Run Steps

  • Start services with: docker-compose up -d

  • Producer (generator) writes messages into topic: queueing.transactions

  • Consumer (detector) reads queueing.transactions, applies the fraud detection rule (amount over $900), and routes each message to one of:

    • streaming.transactions.legit

    • streaming.transactions.fraud

  • Verify output using kafka-console-consumer inside broker container
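The consume, evaluate, and branch steps above can be sketched with kafka-python as follows. The topic names and the $900 threshold come from this README. The bootstrap address `kafka:9092`, the JSON encoding, and the helper names `pick_topic`, `route`, and `main` are assumptions, so treat this as an outline rather than the repository's actual detector.

```python
import json

SOURCE_TOPIC = "queueing.transactions"
LEGIT_TOPIC = "streaming.transactions.legit"
FRAUD_TOPIC = "streaming.transactions.fraud"
FRAUD_THRESHOLD = 900.0  # dollars, from the "over $900" rule


def pick_topic(transaction: dict) -> str:
    """Choose the output topic for one transaction."""
    if transaction["amount"] > FRAUD_THRESHOLD:
        return FRAUD_TOPIC
    return LEGIT_TOPIC


def route(consumer, producer) -> int:
    """Forward every consumed record to its output topic; return the count."""
    count = 0
    for record in consumer:  # each record exposes the decoded payload as .value
        transaction = record.value
        producer.send(pick_topic(transaction), value=transaction)
        count += 1
    return count


def main() -> None:
    # Imported here so the routing logic stays testable without kafka-python.
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        SOURCE_TOPIC,
        bootstrap_servers="kafka:9092",  # assumed service name and port
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers="kafka:9092",  # assumed service name and port
        value_serializer=lambda obj: json.dumps(obj).encode("utf-8"),
    )
    route(consumer, producer)
```

Calling `main()` inside the detector container would start the loop. Because `route` only needs an iterable of records and an object with a `send` method, it can be exercised with simple stand-ins instead of a live broker.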

📈 Outputs

  • Two Kafka topics with processed messages:

    • streaming.transactions.legit (valid transactions)

    • streaming.transactions.fraud (flagged transactions)

  • Console logs showing consumed/produced records

  • A demonstration of a near real-time fraud detection pipeline

📸 Evidence

  • 01_docker_running.png: Screenshot of Dockerized Kafka running

  • 02_code_being_executed.png: Screenshot of code execution

  • 03_legit_transactions.png: Screenshot of legitimate transactions

  • 04_fraudulent_transactions.png: Screenshot of fraudulent transactions

📎 Deliverables

🛠️ Architecture

  • Multi-container Docker environment

  • Services:

    • Producer app → Kafka broker

    • Detector app (consumer + branching producer)

    • Zookeeper for coordination

  • Data flow:

    generator → queueing.transactions → detector → (fraud or legit topics)

🔍 Monitoring

  • Kafka CLI tools (kafka-console-consumer) to inspect topics

  • Docker logs for generator and detector services

  • Broker logs for message flow validation
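Besides kafka-console-consumer, a topic can also be spot-checked from Python. The sketch below reads a few messages from the start of a topic and then stops. The helper names `peek` and `peek_topic`, the `kafka:9092` address, the 5-second timeout, and the JSON decoding are assumptions for illustration and are not part of this repository.

```python
import json
from itertools import islice


def peek(records, limit: int = 5) -> list:
    """Return the values of up to `limit` records from any iterable of records."""
    return [record.value for record in islice(records, limit)]


def peek_topic(topic: str, bootstrap_servers: str = "kafka:9092", limit: int = 5) -> list:
    """Read up to `limit` messages from the beginning of a topic."""
    # Imported here so `peek` can be used without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,  # assumed broker address
        auto_offset_reset="earliest",         # start from the oldest message
        consumer_timeout_ms=5000,             # stop iterating after 5 s of silence
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    try:
        return peek(consumer, limit)
    finally:
        consumer.close()
```

For example, `peek_topic("streaming.transactions.fraud")` run from a container on the same Docker network should return only transactions with amounts over $900 if the detector is routing as described.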

♻️ Cleanup

  • Stop services: docker-compose down

  • Remove local Docker volumes for Kafka logs/state if re-running

  • Delete external Docker network if created manually

Generated automatically via Python + Jinja2 + SQL Server table tblMiniProjectProgress on 11-11-2025 15:31:05
