This repository is a technical companion to the paper. It collects the configuration files and model artifacts needed to understand and reproduce the event-generation, detector-simulation, spectral-function, and machine-learning workflow used for the fully hadronic gluino-pair study.
The spectral-function calculation is documented explicitly below. The repository keeps the generation and simulation inputs, while leaving the implementation of the event-loop analysis to the user.
simulation/
  madgraph/
    signal_gluino_pair/   MadGraph cards for gluino-pair signal generation
    background_ttbar/     MadGraph cards for the ttbar background sample
  pythia/                 Pythia LHE showering template
  delphes/                Delphes ATLAS card and modified FastJetFinder module
scripts/                  Batch/helper scripts showing the production flow
models/
  all_spectral_bins/      Keras models trained with all 21 spectral bins
  feature_schema.json     Input feature ordering for the public models
  model_card.md           Technical model notes
  test_models_from_repo.ipynb
                          Notebook showing how to load and test the models
examples/
  model_ready_all_spectral_2000_test_sample.csv
                          Small standardized test sample for model checks
docs/                     Paper-facing availability text
The production workflow used a CERN-style HEP software stack:
- MadGraph5_aMC@NLO 3.5.7
- Pythia 8
- Delphes
- ROOT
- FastJet
- Eigen
- Python/TensorFlow/Keras for the ML models
The production environment used:
VO_ALICE@ROOT::v6-32-06-alice1-15
VO_ALICE@fastjet::v3.4.1_1.052-alice2-35
VO_ALICE@pythia::v8311-26
VO_ALICE@HepMC3::3.3.0-33
Paths in scripts/run_everything.sh are configurable through environment
variables so the workflow can be adapted to another installation.
The same production workflow is used for the gluino-pair signal and the ttbar
background. The helper script exposes the sample choice as an argument and then
selects the appropriate MadGraph cards, Pythia settings, and Delphes card.
The signal process is gluino-pair production with optional extra jet emission:
p p > go go / sq
p p > go go j / sq
The signal cards are in simulation/madgraph/signal_gluino_pair/:
proc_card.dat
run_card.dat
madspin_card.dat
The available cards use 13 TeV proton-proton collisions. The process card
uses the MadGraph MSSM_SLHA2-full model, generates gluino-pair production
with zero or one additional jet, and contains a representative gluino mass of
900 GeV and neutralino mass of 200 GeV. Mass scans can be generated by updating
those parameters before launching MadGraph.
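As an illustration of such a mass scan, the sketch below builds the parameter lines for a few mass points. It assumes the masses are set with MadGraph `set param_card mass <PDG id> <value>` commands and uses the standard PDG IDs for the gluino (1000021) and the lightest neutralino (1000022). The helper names are hypothetical and not part of this repository; check the syntax against the included cards and your MadGraph version before use.

```python
# Sketch only: generates MadGraph-style "set" lines for a gluino mass scan.
# Assumption: masses are set through param_card "set" commands; the function
# names below are illustrative and do not exist in this repository.

GLUINO_PDG_ID = 1000021      # PDG Monte Carlo ID of the gluino
NEUTRALINO_PDG_ID = 1000022  # PDG Monte Carlo ID of the lightest neutralino


def mass_point_commands(gluino_mass_gev, neutralino_mass_gev):
    """Return the two 'set' lines for one (gluino, neutralino) mass point."""
    return [
        f"set param_card mass {GLUINO_PDG_ID} {gluino_mass_gev:g}",
        f"set param_card mass {NEUTRALINO_PDG_ID} {neutralino_mass_gev:g}",
    ]


def scan_commands(gluino_masses_gev, neutralino_mass_gev=200.0):
    """Map each gluino mass to its 'set' lines, keeping the neutralino mass fixed."""
    return {
        mass: mass_point_commands(mass, neutralino_mass_gev)
        for mass in gluino_masses_gev
    }


if __name__ == "__main__":
    for mass, lines in scan_commands([1200, 1600, 2000]).items():
        print(f"# gluino mass {mass} GeV")
        print("\n".join(lines))
```

The generated lines would replace the representative 900 GeV and 200 GeV values before MadGraph is launched.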
The background cards are in simulation/madgraph/background_ttbar/:
proc_card.dat
run_card.dat
The process card generates t t~ j j with hadronic and tau W-decay channels.
The run card uses 13 TeV proton-proton collisions, ptj = 30 (a 30 GeV minimum
jet transverse momentum), and maxjetflavor = 4.
The production flow is:
- generate LHE events with MadGraph,
- optionally run MadSpin,
- shower and hadronize the LHE events with Pythia 8,
- pass the Pythia output through Delphes,
- analyze the Delphes ROOT file to construct event-level features.
The Pythia settings are represented by:
simulation/pythia/pythia_lhef_template.cmnd
The Delphes card used in the study is:
simulation/delphes/delphes_card_ATLAS.tcl
The Delphes FastJetFinder module was modified so that an eta cut can be
applied to jets inside the Delphes module. The modified files are included for
traceability:
simulation/delphes/FastJetFinder.cc
simulation/delphes/FastJetFinder.h
To use them, copy both files over the originals in the modules/ directory of the Delphes source tree and rebuild Delphes.
For each selected Delphes event, the spectral function was computed from the
reconstructed jets. Let each jet have transverse momentum pT_i, coordinates
eta_i and phi_i, and let
DeltaR_ij = sqrt((eta_i - eta_j)^2 + DeltaPhi(phi_i, phi_j)^2).
For an angular bin beginning at R with width DeltaR = 0.4, all ordered jet
pairs are scanned. The bin receives the transverse-momentum-weighted pair sum
for pairs whose angular separation lies inside that annulus:
S(R) = (1 / HT^2) * sum_{i,j} pT_i pT_j I(R <= DeltaR_ij < R + 0.4),
where HT is the scalar sum of the selected jet transverse momenta and I(...)
is one when the pair falls in the bin and zero otherwise.
The analysis used bins whose lower edges run from R = 0.0 to R = 8.0 in steps
of 0.4, giving 21 spectral features:
SpectralFunc0, SpectralFunc1, ..., SpectralFunc20.
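The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the paper. It assumes jets are given as (pT, eta, phi) tuples and that bin k covers [0.4 k, 0.4 (k + 1)). By default it skips the i = j terms, which is an assumption, since the text does not say whether self-pairs enter the sum. The function names are hypothetical.

```python
import math

BIN_WIDTH = 0.4
N_BINS = 21  # lower edges 0.0, 0.4, ..., 8.0


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def spectral_function(jets, include_self_pairs=False):
    """Return the 21 spectral-function bins for one event.

    jets: list of (pt, eta, phi) tuples for the selected jets.
    include_self_pairs: whether the i == j terms are counted (assumption:
    they are excluded unless this flag is set).
    """
    ht = sum(pt for pt, _, _ in jets)  # scalar sum of jet pT
    bins = [0.0] * N_BINS
    if ht <= 0.0:
        return bins
    for i, (pt_i, eta_i, phi_i) in enumerate(jets):
        for j, (pt_j, eta_j, phi_j) in enumerate(jets):
            if i == j and not include_self_pairs:
                continue
            dr = math.hypot(eta_i - eta_j, delta_phi(phi_i, phi_j))
            k = int(dr // BIN_WIDTH)  # index of the annulus containing dr
            if k < N_BINS:
                bins[k] += pt_i * pt_j
    return [value / ht**2 for value in bins]


if __name__ == "__main__":
    event = [(100.0, 0.0, 0.0), (50.0, 1.0, 0.0)]
    for k, value in enumerate(spectral_function(event)):
        print(f"SpectralFunc{k} = {value:.4f}")
```

Because the sum runs over ordered pairs, each unordered pair contributes twice, once as (i, j) and once as (j, i). Pairs separated by more than 8.4 fall outside the 21 bins and are dropped in this sketch.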
The ML-ready event tables also contained standard event and jet observables,
including eventHT, leading-jet transverse momenta, jet masses, missing
transverse energy, b-jet multiplicity, deltaPhi, mTbjet, and the event-shape
C parameter. The public trained models included here use the 21 spectral bins,
the first five jet transverse momenta, and eventHT.
The public model files are:
models/all_spectral_bins/model_1200.h5
models/all_spectral_bins/model_1600.h5
models/all_spectral_bins/model_2000.h5
These are TensorFlow/Keras binary classifiers trained for gluino masses of
1.2, 1.6, and 2.0 TeV against the ttbar background, using all 21
spectral-function bins.
The model input ordering is recorded in:
models/feature_schema.json
The models were trained on standardized features. In the training notebooks, a
StandardScaler was fit on the training split and applied to validation and test
sets before model training. The serialized .h5 files therefore expect inputs
with the same feature order and normalization convention.
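The normalization convention can be illustrated with a small standard-library sketch. It mimics the default StandardScaler behavior (subtract the column mean, divide by the population standard deviation, leave constant columns unscaled), with the statistics taken from the training split only. The function names and the toy numbers are illustrative; the training notebooks use scikit-learn's StandardScaler itself.

```python
import statistics


def fit_scaler(train_rows):
    """Compute per-column mean and population std from the training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero spread; use 1.0 so the division is a no-op.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply the training-split statistics to any split (train, validation, test)."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


if __name__ == "__main__":
    train = [[1.0, 10.0], [3.0, 10.0]]   # toy values, not real events
    test = [[5.0, 10.0]]
    means, stds = fit_scaler(train)
    print(transform(train, means, stds))
    print(transform(test, means, stds))
```

The key point is that validation, test, and any user-provided events are scaled with the training-split means and standard deviations, never with statistics recomputed on the new sample.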
The architecture used in the notebooks was:
Dense(64, relu)
Dropout(0.5)
Dense(32, relu)
Dropout(0.5)
Dense(1, sigmoid)
For more detail, see models/model_card.md.
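As a quick sanity check on a loaded model, the trainable-parameter count implied by this architecture can be worked out by hand. The sketch below assumes the input width is 27 (21 spectral bins, five jet transverse momenta, and eventHT, as described for the public models); Dropout layers carry no parameters. The function name is illustrative.

```python
# Sketch: expected parameter counts for the Dense(64) -> Dense(32) -> Dense(1) stack.
# Assumption: 27 input features (21 spectral bins + 5 jet pT + eventHT).

N_FEATURES = 21 + 5 + 1
LAYER_UNITS = [64, 32, 1]


def dense_parameter_counts(n_inputs, layer_units):
    """Return weights-plus-biases per Dense layer for a fully connected stack."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        counts.append(fan_in * units + units)  # weight matrix + bias vector
        fan_in = units
    return counts


if __name__ == "__main__":
    counts = dense_parameter_counts(N_FEATURES, LAYER_UNITS)
    print(counts, sum(counts))
```

If a loaded model reports a different total, the input width or layer sizes differ from the assumptions stated here.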
To verify that the included model files can be loaded and used, open:
models/test_models_from_repo.ipynb
The notebook uses the included model files, models/feature_schema.json, and
the small standardized example file in examples/. It verifies that the .h5
models load and produce scores, checks the 2000 GeV model on the example rows,
and then shows where to plug in a user-provided standardized feature table.
For command-line inference on a model-ready CSV, use:
python models/example_inference.py \
  --model models/all_spectral_bins/model_2000.h5 \
  --input examples/model_ready_all_spectral_2000_test_sample.csv \
  --output scored_events.csv
The input CSV must contain the feature columns listed in
models/feature_schema.json and must already use the same standardization
convention as the training arrays.
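Before scoring a custom table, it can help to check the columns against the schema and reorder them. The sketch below does this with the standard library only. It assumes the schema JSON holds the ordered feature names under a key called "features"; that key name is a guess, so adjust it to the actual layout of models/feature_schema.json. The function names and the two-column example are illustrative.

```python
import csv
import io
import json


def load_feature_names(schema_text, key="features"):
    """Read the ordered feature names from schema JSON (the key name is assumed)."""
    return list(json.loads(schema_text)[key])


def rows_in_schema_order(csv_text, feature_names):
    """Return numeric rows with columns arranged in the schema order.

    Raises ValueError if any schema feature is missing from the CSV header.
    Extra CSV columns (for example a label column) are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in feature_names if name not in header]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    return [[float(row[name]) for name in feature_names] for row in reader]


if __name__ == "__main__":
    # Toy schema and table; real files have the full feature list.
    schema_text = '{"features": ["SpectralFunc0", "eventHT"]}'
    csv_text = "eventHT,SpectralFunc0,label\n0.5,-1.0,1\n"
    names = load_feature_names(schema_text)
    print(rows_in_schema_order(csv_text, names))
```

This check only covers column presence and order; it cannot tell whether the values were standardized with the training-split statistics.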
The helper script:
scripts/run_everything.sh
shows the end-to-end production sequence for MadGraph, optional MadSpin, Pythia, and Delphes.
Example:
scripts/run_everything.sh 0 0 background_ttbar
scripts/run_everything.sh 0 0 signal_gluino_pair 2000
The HTCondor submit description is:
scripts/submit_delphes.sub
A short paper-ready paragraph is available in:
docs/data_and_code_availability.md
Large generated event files and full training tables are not included in this repository because of their size.