Classifier

This is an pipeline for training/exporting/inferencing image classification models with PyTirch and TIMM. Hydra configs, WandB for experiment tracking, exporting to ONNX, TensorRT, OpenVino, TensorFlow, and TensorFlow Lite.

Configuration

model_name - Model name from timm
root - Path to the directory where you store your dataset and where model outputs will be saved
data_path - Path to the dataset filder. Should contain subfolders with label names
label_to_name: Class names corresponding to your dataset's folder names.
project_name: Name used in WandB.
exp: Experiment name used across the repository for paths in training, exporting, and inference.

For model architecture, refer to Torchvision or Hugging Face's timm.

Usage

To run the scripts, use the following commands:

make preprocess      # Converts images and PDFs to JPG format
make split           # Creates train, validation, and test CSVs with image paths
make train           # Runs the training pipeline
make export          # Exports weights in various formats after training
make bench           # Runs all exported models on the test set
make infer           # Runs inference on a test folder with subfolders as classes
make cnn_visualize   # Creates a heatmap visualization based on the last convolutional layer gradients (Grad-CAM)

Inference

Use inference classes in src/infer. Currently available:

Torch
TensorRT
OpenVINO

Outputs

Best model: Saved during the training process at output/models/exp_name_date.
Debug images: Preprocessed images (including augmentations) are saved at output/debug_images as they are fed into the model (except for normalization).
Evaluation predicts: Visualised model's predictions on val set. Includes GT as green and preds as blue.
Visualizations: cnn_visualize script saves heatmap visualizations of the last convolutional layer gradients at output/visualized.

Models comparison

Tested on 13k images dataset train: 10440, val: 1306, test: 1304, 4 classes, input size 348. Inferenced on 5070ti without optimisations, pure PyyTorch. F1 score calculated on the test set:

model_name | latency |  F1
-----------+---------+-------
efnet_b0   | 6.5ms   | 0.969
efnet_b1   | 7.6ms   | 0.976
efnetv2_s  | 9.4ms   | 0.967
mobnet4s   | 5.6ms   | 0.948
mobnet4m   | 6.7ms   | 0.970
mobnet4l   | 7.7ms   | 0.971
fastvit    | 7.2ms   | 0.973

Integration info example

Model format: .tflite Input size: Height=384, Width=384 Resize method: INTER_AREA (OpenCV) or equivalent Input tensor shape: [Batch Size, Height, Width, Channels] Channels order: RGB Data type: float32

Preprocessing: Read the image (BGR if using OpenCV), then convert to RGB Resize to (384, 384) with interpolation method cv2.INTER_AREA Convert to float32 and scale to 0–1 Normalize per channel Add batch dimension python snippet:

image = cv2.imread("img.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (384, 384), interpolation=cv2.INTER_AREA)
image = image.astype(np.float32) / 255.0
image = (image - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
image = np.expand_dims(image, axis=0)

Name		Name	Last commit message	Last commit date
Latest commit History 65 Commits
src		src
Makefile		Makefile
README.md		README.md
config.yaml		config.yaml
config_.yaml		config_.yaml
requirements.txt		requirements.txt
tf_requirements.txt		tf_requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Classifier

Configuration

Usage

Inference

Outputs

Models comparison

Integration info example

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Classifier

Configuration

Usage

Inference

Outputs

Models comparison

Integration info example

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages