Linux Platform Validator

A lightweight Linux platform validation and stress-testing automation framework built in Python. It uses stress-ng to exercise CPU, memory, and disk/IO subsystems, evaluates results against configurable pass/fail thresholds, and produces structured JSON reports with CI-ready exit codes.

This project simulates the type of platform validation and automation work performed by Systems Test Engineers validating SoC platforms (e.g., Qualcomm Snapdragon) in production environments.



Project Goal

Build a modular, production-quality platform validation framework that:

  1. Stress-tests core hardware subsystems (CPU, memory, disk/IO) using stress-ng.
  2. Parses performance metrics (bogo ops/sec) robustly using regex — not fragile string splits.
  3. Evaluates results against configurable pass/fail thresholds.
  4. Reports structured JSON results to disk for automated consumption.
  5. Integrates with CI/CD pipelines via proper exit codes (0 = PASS, 1 = FAIL).
  6. Collects system hardware and runtime information as part of each report.
  7. Maintains clean modular architecture — each test, utility, and concern lives in its own module.

The framework mirrors real-world validation workflows where hardware platforms are stress-tested after manufacturing, firmware updates, or kernel changes, and results are evaluated automatically.


Architecture

linux-platform-validator/
│
├── main.py                 # CLI entry point — argparse, orchestration, exit codes
├── config.py               # Thresholds, worker counts, default parameters
├── report.py               # JSON report generation and disk persistence
├── requirements.txt        # Dependencies (stress-ng is the only external dep)
├── README.md               # This file
│
├── tests/                  # Stress test modules (one per subsystem)
│   ├── __init__.py
│   ├── cpu_test.py         # CPU stress test → run_cpu_test()
│   ├── memory_test.py      # Memory stress test → run_memory_test()
│   └── disk_test.py        # Disk/IO stress test → run_disk_test()
│
├── utils/                  # Shared utilities
│   ├── __init__.py
│   ├── runner.py           # Safe subprocess execution wrapper
│   ├── parser.py           # Regex-based stress-ng output parser
│   └── system_info.py      # Structured system information collector
│
└── reports/                # Generated at runtime
    └── report.json         # Latest validation report

Data Flow

CLI args (main.py)
    │
    ├─→ run_cpu_test()    ─→ runner.py ─→ stress-ng --cpu    ─→ parser.py ─→ result dict
    ├─→ run_memory_test() ─→ runner.py ─→ stress-ng --vm     ─→ parser.py ─→ result dict
    ├─→ run_disk_test()   ─→ runner.py ─→ stress-ng --hdd    ─→ parser.py ─→ result dict
    │
    └─→ generate_report()
            ├─→ get_system_info()
            ├─→ Aggregate results + overall PASS/FAIL
            ├─→ Write reports/report.json
            └─→ Return report dict → print to console → exit code
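The orchestration layer above can be sketched in code as follows (an illustration of the flow, assuming the flag names shown in this README; the actual argument wiring in main.py may differ slightly):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI arguments for selecting tests and controlling duration/logging."""
    parser = argparse.ArgumentParser(description="Linux platform validator")
    parser.add_argument("--cpu", action="store_true", help="Run CPU stress test")
    parser.add_argument("--memory", action="store_true", help="Run memory stress test")
    parser.add_argument("--disk", action="store_true", help="Run disk/IO stress test")
    parser.add_argument("--all", action="store_true", help="Run all tests")
    parser.add_argument("--duration", type=int, default=10,
                        help="Test duration in seconds")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable debug-level logging")
    return parser

def selected_tests(args: argparse.Namespace) -> list[str]:
    """Map parsed flags to the ordered list of test names to run."""
    if args.all:
        return ["cpu", "memory", "disk"]
    return [name for name in ("cpu", "memory", "disk") if getattr(args, name)]
```

main() would then dispatch each selected name to its run_*_test() function, pass the results to generate_report(), and call sys.exit(0) or sys.exit(1) based on the overall status.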

Module Reference

main.py

Entry point. Parses CLI arguments with argparse, selects which tests to run, orchestrates execution, invokes the report generator, prints a summary table to the console, and returns a CI-ready exit code via sys.exit().

config.py

Central configuration. All thresholds, worker counts, allocation sizes, and report paths are defined here. No magic numbers exist anywhere else in the codebase.

Parameter          Value   Description
CPU_MIN_SCORE      1500    Minimum CPU bogo ops/sec to PASS
MEMORY_MIN_SCORE   1000    Minimum memory bogo ops/sec to PASS
DISK_MIN_SCORE     500     Minimum disk bogo ops/sec to PASS
DEFAULT_DURATION   10      Default test duration (seconds)
CPU_WORKERS        4       Number of CPU stressor workers
VM_WORKERS         2       Number of VM (memory) stressor workers
VM_BYTES           70%     Memory allocation per VM worker
HDD_WORKERS        2       Number of HDD stressor workers
HDD_BYTES          70%     Write size per HDD worker
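With these defaults, config.py can be as simple as a module of constants (a sketch matching the table above; the exact file contents may differ):

```python
# config.py -- central configuration; every other module imports from here.

# Minimum bogo ops/sec scores required to PASS each subsystem test.
CPU_MIN_SCORE = 1500
MEMORY_MIN_SCORE = 1000
DISK_MIN_SCORE = 500

# Default test duration in seconds (overridable via --duration).
DEFAULT_DURATION = 10

# Stressor worker counts and allocation sizes passed to stress-ng.
CPU_WORKERS = 4
VM_WORKERS = 2
VM_BYTES = "70%"   # memory allocation per VM worker
HDD_WORKERS = 2
HDD_BYTES = "70%"  # write size per HDD worker

# Where the JSON report is written.
REPORT_PATH = "reports/report.json"
```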

report.py

Accepts a list of test result dicts, collects system info, determines overall PASS/FAIL (all tests must pass), writes reports/report.json to disk, and returns the complete report dict.

tests/cpu_test.py

Runs stress-ng --cpu {workers} --timeout {duration}s --metrics-brief. Parses the cpu stressor line for bogo ops/sec. Compares against CPU_MIN_SCORE.

tests/memory_test.py

Runs stress-ng --vm {workers} --vm-bytes {bytes} --timeout {duration}s --metrics-brief. Parses the vm stressor line for bogo ops/sec. Compares against MEMORY_MIN_SCORE.

tests/disk_test.py

Runs stress-ng --hdd {workers} --hdd-bytes {bytes} --timeout {duration}s --metrics-brief. Parses the hdd stressor line for bogo ops/sec. Compares against DISK_MIN_SCORE.

utils/runner.py

Wraps subprocess.run() with:

  • try/except for TimeoutExpired and general exceptions
  • Structured return dict: { success, stdout, stderr, returncode }
  • Debug-level command logging
  • Warning-level logging on non-zero exit codes
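A minimal version of that wrapper might look like this (a sketch of the described behavior, not the exact source):

```python
import logging
import shlex
import subprocess

logger = logging.getLogger(__name__)

def run_command(command: str, timeout: int = 60) -> dict:
    """Run a command safely and return a structured result dict."""
    logger.debug("Running command: %s", command)
    try:
        completed = subprocess.run(
            shlex.split(command),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        logger.warning("Command timed out after %ss: %s", timeout, command)
        return {"success": False, "stdout": "", "stderr": "timeout", "returncode": -1}
    except Exception as exc:  # e.g. FileNotFoundError if stress-ng is missing
        logger.warning("Command failed: %s (%s)", command, exc)
        return {"success": False, "stdout": "", "stderr": str(exc), "returncode": -1}

    if completed.returncode != 0:
        logger.warning("Non-zero exit code %d: %s", completed.returncode, command)

    return {
        "success": completed.returncode == 0,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "returncode": completed.returncode,
    }
```

Because every failure mode returns the same dict shape, callers never need their own try/except around subprocess calls.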

utils/parser.py

Generic regex parser for stress-ng --metrics-brief output. Accepts any stressor name (cpu, vm, hdd, etc.) and extracts the bogo-ops/s (real time) column. Handles both stdout and stderr since stress-ng behavior varies by version.

The regex pattern matches:

<stressor>  <bogo-ops>  <real-time>  <usr-time>  <sys-time>  <bogo-ops/s>

utils/system_info.py

Collects structured hardware and runtime information:

  • CPU: model, architecture, core count, threads/core, max MHz (from lscpu)
  • Memory: total, free, available, swap (from /proc/meminfo, converted to MB)
  • Disk: total, used, available, usage % (from df -h /)
  • Uptime: raw seconds and human-readable string (from /proc/uptime)

All functions return Python dicts, not raw command output.
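As one example of that dict-based style, the uptime collector could be sketched as follows (hypothetical helper names; the real module also covers CPU, memory, and disk):

```python
def parse_uptime(raw: str) -> dict:
    """Convert the contents of /proc/uptime into a structured dict.

    /proc/uptime holds two floats: total uptime seconds and idle seconds.
    """
    seconds = float(raw.split()[0])
    hours = int(seconds // 3600)
    minutes = int((seconds % 3600) // 60)
    return {"seconds": seconds, "human": f"{hours}h {minutes}m"}

def get_uptime() -> dict:
    with open("/proc/uptime") as f:
        return parse_uptime(f.read())
```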


Prerequisites

  • Python: 3.10+ (uses float | None union type syntax)
  • stress-ng: Must be installed and available in $PATH
  • Operating System: Linux (tested on Ubuntu)

Installation

# Install stress-ng
sudo apt update && sudo apt install -y stress-ng

# Clone the project
git clone <repository-url>
cd linux-platform-validator

# No Python package dependencies required — stdlib only

Usage

All commands must be run from the project root directory.

# Run all tests (CPU + Memory + Disk) with 30-second duration
python3 main.py --all --duration 30

# Run only the CPU test
python3 main.py --cpu --duration 30

# Run only the memory test
python3 main.py --memory --duration 15

# Run only the disk/IO test
python3 main.py --disk --duration 15

# Run CPU and memory together
python3 main.py --cpu --memory --duration 20

# Enable debug logging
python3 main.py --all --duration 10 --verbose

CLI Arguments

Flag            Description
--cpu           Run CPU stress test
--memory        Run memory stress test
--disk          Run disk/IO stress test
--all           Run all tests
--duration N    Test duration in seconds (default: 10)
--verbose, -v   Enable debug-level logging

If no test flags are provided, the program exits with code 1 and an error message.


Configuration

Edit config.py to adjust thresholds and parameters for your target platform.

Thresholds determine PASS/FAIL. If a test's bogo ops/sec score is below the threshold, it fails. The overall status is FAIL if any individual test fails.

Threshold values should be calibrated for the specific hardware under test. A VM will produce different scores than bare metal, and a mobile SoC will differ from a server CPU. The recommended approach is:

  1. Run with --all --duration 60 on your target hardware.
  2. Note the baseline scores.
  3. Set thresholds to ~70-80% of baseline to allow for normal variance.
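The calibration step can be automated. For example, a small helper (hypothetical, not part of the repo) that derives thresholds at 75% of measured baseline scores:

```python
def calibrate_thresholds(baseline: dict[str, float], margin: float = 0.75) -> dict[str, int]:
    """Derive per-test thresholds as a fraction of measured baseline scores."""
    return {test: int(score * margin) for test, score in baseline.items()}

# Example: baseline scores from a long run on the target hardware.
baseline = {"cpu": 2268.97, "memory": 92566.69, "disk": 21296.91}
thresholds = calibrate_thresholds(baseline)
```

The resulting values would then be copied into config.py for that platform.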

How Each Test Works

CPU Stress Test

Command: stress-ng --cpu 4 --timeout {duration}s --metrics-brief

Spawns 4 CPU worker processes that exercise a variety of CPU-intensive operations (integer math, floating point, bit manipulation, etc.). The --metrics-brief flag produces a summary table at the end with bogo-ops/sec for each stressor.

What it measures: Raw computational throughput of the CPU under full load across all cores.

Why it matters: Detects CPU defects, thermal throttling, frequency scaling issues, and hypervisor overhead. A score that is significantly lower than expected for a given CPU model indicates a problem.

Memory Stress Test

Command: stress-ng --vm 2 --vm-bytes 70% --timeout {duration}s --metrics-brief

Spawns 2 VM worker processes, each allocating and exercising 70% of available memory through continuous write/read/verify cycles.

What it measures: Memory bandwidth and subsystem throughput under sustained allocation pressure.

Why it matters: Detects faulty memory modules, memory controller issues, NUMA misconfigurations, and kernel memory management regressions. The vm stressor specifically tests mmap, write, read, and verify patterns.

Disk/IO Stress Test

Command: stress-ng --hdd 2 --hdd-bytes 70% --timeout {duration}s --metrics-brief

Spawns 2 HDD worker processes that perform sequential write/read operations to temporary files.

What it measures: Disk I/O throughput for sequential write operations.

Why it matters: Detects storage subsystem degradation, filesystem driver issues, I/O scheduler misconfigurations, and storage controller problems. Particularly important for platforms with eMMC/UFS storage where firmware updates can regress I/O performance.


Report Format

After every run, a JSON report is written to reports/report.json and printed to the console.

{
    "timestamp": "2026-02-20T13:32:17.407676",
    "system_info": {
        "cpu": {
            "model": "Intel(R) Core(TM) i5-5287U CPU @ 2.90GHz",
            "architecture": "x86_64",
            "cores": "4",
            "threads_per_core": "2",
            "max_mhz": "3300.0000"
        },
        "memory": {
            "total": "7842 MB",
            "free": "2160 MB",
            "available": "3305 MB",
            "swap_total": "4095 MB"
        },
        "disk": {
            "total": "457G",
            "used": "12G",
            "available": "422G",
            "use_pct": "3%"
        },
        "uptime": {
            "seconds": 3446.37,
            "human": "0h 57m"
        }
    },
    "results": [
        {
            "test": "cpu",
            "score": 1936.67,
            "unit": "bogo ops/sec",
            "threshold": 1500,
            "status": "PASS"
        },
        {
            "test": "memory",
            "score": 92566.69,
            "unit": "bogo ops/sec",
            "threshold": 1000,
            "status": "PASS"
        },
        {
            "test": "disk",
            "score": 21296.91,
            "unit": "bogo ops/sec",
            "threshold": 500,
            "status": "PASS"
        }
    ],
    "overall_status": "PASS"
}

Each test result contains:

  • test: Subsystem name
  • score: Measured bogo ops/sec
  • unit: Always "bogo ops/sec"
  • threshold: The configured minimum score for PASS
  • status: "PASS" or "FAIL"

overall_status is "PASS" only if every individual test passed.
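Because the schema is deterministic, downstream tooling can consume the report directly. A sketch of such a consumer (not part of the repo):

```python
import json

def check_report(report_json: str) -> int:
    """Print a summary and return a CI-style exit code from overall_status."""
    report = json.loads(report_json)
    for result in report["results"]:
        print(f"{result['test']}: {result['score']} {result['unit']} [{result['status']}]")
    return 0 if report["overall_status"] == "PASS" else 1
```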


Recorded Test Outputs

The following are actual recorded runs on the development machine.

Test Platform

  • CPU: Intel Core i5-5287U @ 2.90GHz (4 cores, 2 threads/core, turbo to 3.3 GHz)
  • RAM: 7,842 MB total
  • Disk: 457 GB total, 3% used
  • OS: Ubuntu Linux (running in a VM)

Run 1: All Tests, 5-Second Duration

$ python3 main.py --all --duration 5

=== Test Summary ===
  cpu           1954.74 bogo ops/sec    [FAIL] (threshold: 3000)
  memory        5796.68 bogo ops/sec    [PASS] (threshold: 1000)
  Overall status: FAIL
  Exit code: 1

Observation: CPU scored 1,954 against the initial threshold of 3,000 — a FAIL. The 5-second duration was too short for the CPU stressor to fully stabilize. Memory passed comfortably at 5.8x the threshold.

Action taken: Lowered CPU_MIN_SCORE from 3,000 to 1,500 to match the hardware class (5th-gen mobile i5 in a VM).

Run 2: CPU Only, 30-Second Duration

$ python3 main.py --cpu --duration 30

=== Test Summary ===
  cpu           2268.97 bogo ops/sec    [PASS] (threshold: 1500)
  Overall status: PASS
  Exit code: 0

Observation: With 30 seconds, the CPU score rose ~16% to 2,268.97 and comfortably passed the adjusted threshold. Longer duration allows workers to fully saturate cores and reach steady-state throughput.

Run 3: All Tests (CPU + Memory + Disk), 30-Second Duration

$ python3 main.py --all --duration 30

=== Test Summary ===
  cpu           1936.67 bogo ops/sec    [PASS] (threshold: 1500)
  memory       92566.69 bogo ops/sec    [PASS] (threshold: 1000)
  disk         21296.91 bogo ops/sec    [PASS] (threshold: 500)
  Overall status: PASS
  Exit code: 0

Observation: All three subsystems passed. CPU scored slightly lower (1,936 vs 2,268 in isolation) because the tests ran sequentially and the system had already been under load. Memory scored dramatically higher at 30s (92,566 vs 5,796 at 5s) — the vm stressor benefits significantly from longer runtimes. Disk scored 21,296, which is 42x the threshold.

Score Comparison Across Runs

Test     5s Run     30s Run (isolated)   30s Run (combined)
CPU      1,954.74   2,268.97             1,936.67
Memory   5,796.68   n/a                  92,566.69
Disk     n/a        n/a                  21,296.91

Observations and Learnings

1. Test Duration Significantly Affects Results

Short stress tests (5s) produce noisy, lower scores. Workers need time to initialize, allocate resources, and reach steady-state. A 30-second minimum is recommended for reliable measurements. For production validation, 60 seconds or more is standard.

2. stress-ng Writes Metrics to stderr on Some Versions

The --metrics-brief output may appear on stdout or stderr depending on the stress-ng version. The framework handles this by combining both streams before parsing. This was a key bug fix — the original codebase only checked result.stdout, which would silently fail on newer stress-ng versions.

3. Regex Parsing Is Essential

The original code used fragile column-index string splits to parse stress-ng output. Different versions of stress-ng may produce slightly different column spacing, extra columns, or variant stressor names (e.g., vm vs vm-rw). The regex pattern {stressor}[\w-]*\s+\d+\s+... handles all these variants robustly.

4. Thresholds Must Be Hardware-Specific

The initial CPU threshold of 3,000 was appropriate for a modern desktop CPU on bare metal, but too aggressive for a 5th-gen mobile i5 running inside a VM. Thresholds should be baselined per hardware platform. In a real SoC validation environment, each platform SKU would have its own threshold profile.

5. VM Overhead Reduces CPU Scores

The VM environment introduces hypervisor scheduling overhead, which reduces CPU-bound throughput by roughly 20-40% compared to bare metal. This is visible in the CPU scores (1,900-2,200) which are lower than what a bare-metal i5-5287U would produce (~3,000+).

6. Memory Scores Scale Dramatically With Duration

The vm stressor showed a 16x improvement from 5s to 30s (5,796 → 92,566). This is because at 5s, the stressor is still in its allocation and initialization phase. At 30s, it reaches steady-state mmap/write/verify cycling. Memory tests should always use ≥30s duration.

7. Sequential Test Execution Affects Scores

When running all tests together, the CPU score was slightly lower than when run in isolation (1,936 vs 2,268). Even though tests run sequentially (not in parallel), system state from prior tests (warm caches, memory fragmentation, kernel state) can influence subsequent results.

8. Structured Reporting Enables Automation

Raw console output is not machine-parseable. By producing a structured JSON report with a deterministic schema, the framework can be consumed by CI pipelines, test management systems, and automated alerting without any output scraping. The exit code provides immediate pass/fail signaling.

9. Error Isolation Prevents Cascading Failures

Each test runs in its own try/except boundary via utils/runner.py. If one stressor fails (e.g., stress-ng is not installed, or the disk is full), it returns a score of 0.0 and FAIL status without crashing the entire framework. Other tests continue to execute.

10. Logging vs Print

Replacing print() with Python's logging module provides:

  • Timestamped output for correlating with system events
  • Log levels (DEBUG, INFO, WARNING, ERROR) for filtering
  • Module-namespaced loggers for tracing which component produced each message
  • --verbose flag support without code changes

CI/CD Integration

The framework is designed for integration into CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.).

Exit Codes

Exit Code   Meaning
0           All selected tests PASSED
1           One or more tests FAILED, or no tests selected

Example: GitHub Actions

- name: Run platform validation
  run: python3 main.py --all --duration 30

- name: Upload report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: platform-report
    path: reports/report.json

Example: Jenkins Pipeline

stage('Platform Validation') {
    sh 'python3 main.py --all --duration 30'
}
post {
    always {
        archiveArtifacts artifacts: 'reports/report.json'
    }
}

The non-zero exit code on failure will automatically fail the CI stage, preventing promotion of a platform image that doesn't meet performance thresholds.


Extending the Framework

To add a new stress test (e.g., network):

  1. Add threshold and parameters in config.py:

    NETWORK_MIN_SCORE = 100
  2. Create the test module at tests/network_test.py:

    from config import NETWORK_MIN_SCORE
    from utils.parser import parse_stress_ng_output
    from utils.runner import run_command
    
    def run_network_test(duration: int = 10) -> dict:
        command = f"stress-ng --sock 2 --timeout {duration}s --metrics-brief"
        result = run_command(command, timeout=duration + 30)
        combined_output = result["stdout"] + "\n" + result["stderr"]
        score = parse_stress_ng_output(combined_output, stressor="sock")
        status = "PASS" if score and score >= NETWORK_MIN_SCORE else "FAIL"
        return {
            "test": "network",
            "score": score or 0.0,
            "unit": "bogo ops/sec",
            "threshold": NETWORK_MIN_SCORE,
            "status": status,
        }
  3. Wire into main.py: Add a --network argument and call run_network_test().

The parser (utils/parser.py) is generic — it works with any stress-ng stressor name. No changes needed there.


License

This project is provided as-is for educational and portfolio purposes.
