Order Event Ingestion Pipeline

Scenario

Our platform receives a JSON export from the order management system every hour. Each export is a flat list of order events that need to be validated, enriched, and written to a processed output file for downstream consumers.

You've been handed this codebase by a junior engineer who got it working and moved on to another project. Your task is to work with it, understand it, and improve it.

Input format

Each record in data/orders_raw.json represents a single order event:

{
  "order_id": "ORD-1001",
  "customer_id": "CUST-42",
  "amount": 120.0,
  "status": "confirmed"
}

Valid statuses: pending, confirmed, shipped, cancelled.
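A record with any missing field or an unrecognised status should not make it into the output. A minimal validation sketch, assuming the field names shown in the example record above (the actual checks live in main.py):

```python
# Field names taken from the example record; statuses from the list above.
VALID_STATUSES = {"pending", "confirmed", "shipped", "cancelled"}
REQUIRED_FIELDS = ("order_id", "customer_id", "amount", "status")

def is_valid(record: dict) -> bool:
    """True if all required fields are present and the status is recognised."""
    if any(field not in record for field in REQUIRED_FIELDS):
        return False
    return record["status"] in VALID_STATUSES
```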

How to run

uv run main.py

Output will be written to data/orders_out.json.

What the pipeline does

  1. Reads all records from the input file
  2. Validates that required fields are present
  3. Filters out records with unrecognised statuses
  4. Enriches each valid record with total_with_tax and a processed_at timestamp
  5. Writes the enriched records to the output file
  6. Prints a summary of processed / failed / skipped counts
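The six steps above can be sketched roughly as follows. This is an illustration, not the repository's implementation: the 8% tax rate, the file paths, and the counter names are assumptions for the sketch; consult main.py for the real logic.

```python
import json
from datetime import datetime, timezone

TAX_RATE = 0.08  # assumed for illustration; the real rate is defined in main.py
VALID_STATUSES = {"pending", "confirmed", "shipped", "cancelled"}
REQUIRED_FIELDS = ("order_id", "customer_id", "amount", "status")

def run_pipeline(in_path="data/orders_raw.json", out_path="data/orders_out.json"):
    with open(in_path) as f:
        records = json.load(f)  # 1. read all records

    processed, failed, skipped = [], 0, 0
    for rec in records:
        if any(field not in rec for field in REQUIRED_FIELDS):
            failed += 1   # 2. required field missing -> failed
            continue
        if rec["status"] not in VALID_STATUSES:
            skipped += 1  # 3. unrecognised status -> skipped
            continue
        # 4. enrich with total_with_tax and a processed_at timestamp
        rec["total_with_tax"] = round(rec["amount"] * (1 + TAX_RATE), 2)
        rec["processed_at"] = datetime.now(timezone.utc).isoformat()
        processed.append(rec)

    with open(out_path, "w") as f:
        json.dump(processed, f, indent=2)  # 5. write enriched records

    # 6. summary of processed / failed / skipped counts
    print(f"processed={len(processed)} failed={failed} skipped={skipped}")
    return len(processed), failed, skipped
```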

About

A simple project used for pair programming during technical interviews for data engineers.
