Altakach313/ML_SpectralF_FullyHadronic


Spectral Functions for Fully Hadronic SUSY Searches

This repository is a technical companion to the paper. It collects the configuration files and model artifacts needed to understand and reproduce the event-generation, detector-simulation, spectral-function, and machine-learning workflow used for the fully hadronic gluino-pair study.

The repository provides the generation and simulation inputs and documents the spectral-function calculation explicitly below. The event-loop analysis code that reads the Delphes output and builds the features is not included; its implementation is left to the user.

Repository layout

simulation/
  madgraph/
    signal_gluino_pair/   MadGraph cards for gluino-pair signal generation
    background_ttbar/     MadGraph cards for the ttbar background sample
  pythia/                 Pythia LHE showering template
  delphes/                Delphes ATLAS card and modified FastJetFinder module
scripts/                  Batch/helper scripts showing the production flow
models/
  all_spectral_bins/
                           Keras models trained with all 21 spectral bins
  feature_schema.json      Input feature ordering for the public models
  model_card.md            Technical model notes
  test_models_from_repo.ipynb
                           Notebook showing how to load and test the models
examples/
  model_ready_all_spectral_2000_test_sample.csv
                           Small standardized test sample for model checks
docs/                     Paper-facing availability text

External software

The production workflow used a CERN-style HEP software stack:

  • MadGraph5_aMC@NLO 3.5.7
  • Pythia 8
  • Delphes
  • ROOT
  • FastJet
  • Eigen
  • Python/TensorFlow/Keras for the ML models

The production environment used:

VO_ALICE@ROOT::v6-32-06-alice1-15
VO_ALICE@fastjet::v3.4.1_1.052-alice2-35
VO_ALICE@pythia::v8311-26
VO_ALICE@HepMC3::3.3.0-33

Paths in scripts/run_everything.sh are configurable through environment variables so the workflow can be adapted to another installation.

Event generation

The same production workflow is used for the gluino-pair signal and the ttbar background. The helper script exposes the sample choice as an argument and then selects the appropriate MadGraph cards, Pythia settings, and Delphes card.

Signal: gluino pair

The signal process is gluino-pair production with optional extra jet emission:

p p > go go / sq
p p > go go j / sq

The signal cards are in simulation/madgraph/signal_gluino_pair/:

  • proc_card.dat
  • run_card.dat
  • madspin_card.dat

The available cards use 13 TeV proton-proton collisions. The process card uses the MadGraph MSSM_SLHA2-full model, generates gluino-pair production with zero or one additional jet, and contains a representative gluino mass of 900 GeV and neutralino mass of 200 GeV. Mass scans can be generated by updating those parameters before launching MadGraph.
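As an illustration of such a mass scan, the sketch below rewrites mass entries in an SLHA-style MASS block. It assumes the masses are stored in an SLHA-format parameter card; where exactly the 900 GeV and 200 GeV values live in this repository's cards is not stated here, so treat the function name set_slha_masses and the sample card text as hypothetical. The PDG codes 1000021 (gluino) and 1000022 (lightest neutralino) are the standard ones.

```python
def set_slha_masses(card_text, masses):
    """Return card_text with MASS-block entries replaced.

    masses maps PDG id -> mass in GeV. Hypothetical helper, not part of
    the repository; it assumes an SLHA-style 'BLOCK MASS' section.
    """
    out = []
    in_mass_block = False
    for line in card_text.splitlines():
        stripped = line.strip()
        # A new BLOCK or DECAY statement ends the previous block.
        if stripped.upper().startswith(("BLOCK", "DECAY")):
            parts = stripped.split()
            in_mass_block = (
                parts[0].upper() == "BLOCK"
                and len(parts) > 1
                and parts[1].upper() == "MASS"
            )
        elif in_mass_block and stripped and not stripped.startswith("#"):
            fields = stripped.split()
            if fields[0].isdigit() and int(fields[0]) in masses:
                # Keep any trailing comment such as "# ~g".
                comment = line[line.index("#"):] if "#" in line else ""
                new_mass = masses[int(fields[0])]
                line = f"  {fields[0]} {new_mass:.6e} {comment}".rstrip()
        out.append(line)
    return "\n".join(out)


# Made-up card fragment for illustration only.
example_card = (
    "Block mass\n"
    "  1000021 9.000000e+02 # ~g\n"
    "  1000022 2.000000e+02 # ~chi_10\n"
    "Block other\n"
    "  1000021 1.0\n"
)
scanned_card = set_slha_masses(example_card, {1000021: 2000.0})
```

Only entries inside the MASS block are touched, so the same PDG code appearing in another block is left unchanged.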

Background: ttbar

The background cards are in simulation/madgraph/background_ttbar/:

  • proc_card.dat
  • run_card.dat

The process card generates t t~ j j with hadronic and tau W-decay channels. The run card uses 13 TeV proton-proton collisions, ptj = 30 (a minimum jet pT of 30 GeV at generator level), and maxjetflavor = 4.

Pythia and Delphes

The production flow is:

  1. generate LHE events with MadGraph,
  2. optionally run MadSpin,
  3. shower and hadronize the LHE events with Pythia 8,
  4. pass the Pythia output through Delphes,
  5. analyze the Delphes ROOT file to construct event-level features.

The Pythia settings are represented by:

simulation/pythia/pythia_lhef_template.cmnd

The Delphes card used in the study is:

simulation/delphes/delphes_card_ATLAS.tcl

The Delphes FastJetFinder module was modified so that an eta cut can be applied to jets inside the Delphes module. The modified files are included for traceability:

simulation/delphes/FastJetFinder.cc
simulation/delphes/FastJetFinder.h

To use them, copy both files over the corresponding FastJetFinder files in the modules/ directory of the Delphes source tree and rebuild Delphes.

Spectral-function construction

For each selected Delphes event, the spectral function was computed from the reconstructed jets. Let each jet have transverse momentum pT_i, coordinates eta_i and phi_i, and let

DeltaR_ij = sqrt((eta_i - eta_j)^2 + DeltaPhi(phi_i, phi_j)^2).

For an angular bin beginning at R with width 0.4, all ordered jet pairs (i, j) are scanned. The bin receives the transverse-momentum-weighted pair sum for pairs whose angular separation lies inside that annulus:

S(R) = (1 / HT^2) * sum_{i,j} pT_i pT_j I(R <= DeltaR_ij < R + 0.4),

where HT is the scalar sum of the selected jet transverse momenta and I(...) is one when the pair falls in the bin and zero otherwise.

The analysis used bins with lower edges from R = 0.0 to R = 8.0 in steps of 0.4 (the last bin covers 8.0 <= DeltaR_ij < 8.4), giving 21 spectral features:

SpectralFunc0, SpectralFunc1, ..., SpectralFunc20.
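The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the paper: the function names and the (pT, eta, phi) tuple layout are assumptions, and so is the treatment of self-pairs (i = j). The formula writes the sum over i, j without stating whether i = j is included, so the sketch exposes this as a flag and excludes self-pairs by default.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def spectral_function(jets, n_bins=21, width=0.4, include_self_pairs=False):
    """Return [S(0.0), S(0.4), ..., S(8.0)] for one event.

    jets is a list of (pt, eta, phi) tuples for the selected jets.
    include_self_pairs is an assumption hook: the README formula does
    not say whether i == j terms enter the sum.
    """
    ht = sum(pt for pt, _, _ in jets)  # scalar sum of jet pT
    bins = [0.0] * n_bins
    if ht <= 0.0:
        return bins
    for i, (pt_i, eta_i, phi_i) in enumerate(jets):
        for j, (pt_j, eta_j, phi_j) in enumerate(jets):
            if i == j and not include_self_pairs:
                continue
            dr = math.hypot(eta_i - eta_j, delta_phi(phi_i, phi_j))
            k = int(dr // width)  # index of the bin with R <= dr < R + width
            if k < n_bins:  # pairs beyond the last bin are dropped
                bins[k] += pt_i * pt_j / ht**2
    return bins


# Two toy jets separated by DeltaR = 1.0 (values are made up).
toy_jets = [(100.0, 0.0, 0.0), (100.0, 1.0, 0.0)]
toy_bins = spectral_function(toy_jets)
toy_features = {f"SpectralFunc{k}": v for k, v in enumerate(toy_bins)}
```

For the two toy jets, both ordered pairs fall in the bin starting at R = 0.8, so SpectralFunc2 receives 2 * (100 * 100) / 200^2 = 0.5 and all other bins are zero.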

The ML-ready event tables also contained standard event and jet observables, including eventHT, leading-jet transverse momenta, jet masses, missing transverse energy, b-jet multiplicity, deltaPhi, mTbjet, and the event-shape C parameter. The public trained models included here use the 21 spectral bins, the first five jet transverse momenta, and eventHT.

Machine-learning models

The public model files are:

models/all_spectral_bins/model_1200.h5
models/all_spectral_bins/model_1600.h5
models/all_spectral_bins/model_2000.h5

These are TensorFlow/Keras binary classifiers trained for gluino masses of 1.2, 1.6, and 2.0 TeV against the ttbar background, using all 21 spectral-function bins.

The model input ordering is recorded in:

models/feature_schema.json

The models were trained on standardized features. In the training notebooks, a StandardScaler was fit on the training split and applied to validation and test sets before model training. The serialized .h5 files therefore expect inputs with the same feature order and normalization convention.
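The normalization convention can be illustrated with a small NumPy sketch that mirrors what StandardScaler does: per-feature mean and standard deviation are estimated on the training split only and then reused unchanged for every other split. The helper names and the random toy arrays are assumptions made for this example; the actual scaler parameters come from the training notebooks.

```python
import numpy as np


def fit_standardizer(x_train):
    """Per-feature mean and standard deviation from the training split."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)  # population std (ddof=0), as in StandardScaler
    std[std == 0.0] = 1.0      # constant features are left unscaled
    return mean, std


def apply_standardizer(x, mean, std):
    """Apply training-split statistics to any split."""
    return (x - mean) / std


# Toy data standing in for the real feature tables (values are made up).
rng = np.random.default_rng(0)
x_train = rng.normal(loc=50.0, scale=10.0, size=(1000, 27))
x_test = rng.normal(loc=50.0, scale=10.0, size=(200, 27))

mean, std = fit_standardizer(x_train)
z_train = apply_standardizer(x_train, mean, std)
z_test = apply_standardizer(x_test, mean, std)  # reuses training statistics
```

The test split is deliberately not re-fit: its standardized means are close to, but not exactly, zero.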

The architecture used in the notebooks was:

Dense(64, relu)
Dropout(0.5)
Dense(32, relu)
Dropout(0.5)
Dense(1, sigmoid)

For more detail, see models/model_card.md.
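For orientation, the layer stack listed above could be written in Keras roughly as follows. This is a sketch under stated assumptions: the input width of 27 is inferred from the feature description (21 spectral bins, five jet transverse momenta, eventHT), and the optimizer, loss, and metric are common choices for a binary classifier rather than settings taken from the training notebooks. models/feature_schema.json and models/model_card.md remain the authoritative references.

```python
import tensorflow as tf

# Assumption: 21 spectral bins + 5 jet pT values + eventHT.
N_FEATURES = 27


def build_classifier(n_features=N_FEATURES):
    """Dense(64) -> Dropout -> Dense(32) -> Dropout -> Dense(1, sigmoid)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Assumption: these compile settings are illustrative, not the
    # ones recorded in the notebooks.
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_classifier()
```

With 27 inputs this network has 3905 trainable parameters (1792 + 2080 + 33), which gives a quick consistency check against a loaded .h5 file via model.count_params().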

To verify that the included model files can be loaded and used, open:

models/test_models_from_repo.ipynb

The notebook uses the included model files, models/feature_schema.json, and the small standardized example file in examples/. It verifies that the .h5 models load and produce scores, checks the 2000 GeV model on the example rows, and then shows where to plug in a user-provided standardized feature table.

For command-line inference on a model-ready CSV, use:

python models/example_inference.py \
  --model models/all_spectral_bins/model_2000.h5 \
  --input examples/model_ready_all_spectral_2000_test_sample.csv \
  --output scored_events.csv

The input CSV must contain the feature columns listed in models/feature_schema.json and must already use the same standardization convention as the training arrays.
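A column-order check of this kind can be sketched with the standard library alone. The helper below is hypothetical and is not the repository's inference script: it takes the ordered feature names as a plain list (how they are stored inside feature_schema.json is not assumed here), rejects tables with missing columns, and returns rows in schema order while ignoring extra columns.

```python
import csv
import io


def rows_in_schema_order(csv_text, feature_names):
    """Return feature rows ordered as in feature_names.

    Hypothetical helper: raises ValueError if a required column is
    missing; columns not named in feature_names are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in feature_names if name not in header]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    return [[float(row[name]) for name in feature_names] for row in reader]


# Toy table with shuffled columns and one extra column (values are made up).
toy_csv = "eventHT,label,SpectralFunc1,SpectralFunc0\n0.5,1,-0.2,1.3\n"
toy_schema = ["SpectralFunc0", "SpectralFunc1", "eventHT"]
ordered_rows = rows_in_schema_order(toy_csv, toy_schema)
```

Reordering by name rather than by position guards against silently feeding features to the model in the wrong order; it does not check that the values are standardized.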

Batch helper

The helper script:

scripts/run_everything.sh

shows the end-to-end production sequence for MadGraph, optional MadSpin, Pythia, and Delphes.

Example:

scripts/run_everything.sh 0 0 background_ttbar
scripts/run_everything.sh 0 0 signal_gluino_pair 2000

The HTCondor submit description is:

scripts/submit_delphes.sub

Data and code availability

A short paper-ready paragraph is available in:

docs/data_and_code_availability.md

Large generated event files and full training tables are not included in this repository because of their size.
