TinyMyo (EMG) model support in BioFoundation #6
base: main
Conversation
- Updated `run_train.py` to fetch existing env variables for the data path and checkpoint dir, added support for FP32 with high precision, and run the final test on a single GPU rather than DDP to ensure reproducibility of reported metrics
- Introduced `finetune_task_EMG.py` for fine-tuning EMG classification models
- Added `pretrain_task_EMG.py` for masked-reconstruction training, including token masking and signal logging
- Updated `train_utils.py` with a new `MinMaxNormalization` class for input normalization using min-max scaling
- Improved code structure and readability across all modified files
- Enabled persistent workers in the finetuning data module to avoid spawning and destroying the worker group multiple times
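The `MinMaxNormalization` class mentioned above could look roughly like this minimal sketch; the per-channel scaling axis and the `eps` guard are assumptions for illustration, not details taken from the PR:

```python
import torch

class MinMaxNormalization:
    """Scale each sample into [0, 1] per channel via min-max scaling (illustrative sketch)."""

    def __init__(self, eps: float = 1e-8):
        self.eps = eps  # guards against division by zero on flat signals

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        # x: (channels, time) — normalize each channel independently over time
        x_min = x.amin(dim=-1, keepdim=True)
        x_max = x.amax(dim=-1, keepdim=True)
        return (x - x_min) / (x_max - x_min + self.eps)
```

A transform like this is typically applied per window inside the dataset's `__getitem__`, before batching.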
… tasks with general model docs
…for clarity and usability
…ript - removed input normalization in finetuning config - updated timm imports
…training and finetuning tasks
- auto strategy for finetuning
- removed logging interval in pretraining
- updated target model class in pretraining
- added persistent workers in the pretrain data module
- added cache eviction policy in the EMG pretrain dataset
- updated the TinyMyo forward method with masking support for both pretraining and finetuning
- updated the pretrain task with a cleaner masking approach, moving the tokenization logic inside the model
- updated the finetuning task with dummy masking generation
- Updated YAML configuration files for EMG fine-tuning and pre-training datasets to ensure consistency.
- Refined the TinyMyo model configuration, including adjustments to class definitions and parameters.
- Improved code readability by consolidating dictionary definitions and removing unnecessary line breaks.
- Enhanced the EMG dataset classes by optimizing type hints and initialization methods.
- Streamlined the TinyMyo model's forward pass and weight-initialization methods for better clarity and performance.
- Fixed minor formatting issues across various files to adhere to coding standards.
Pull request overview
This PR introduces comprehensive support for the TinyMyo EMG foundation model in the BioFoundation codebase, enabling both pretraining and finetuning workflows for electromyography signals. The implementation follows a transformer-based architecture with rotary position embeddings and includes efficient HDF5-backed datasets with caching.
Key changes:
- Added TinyMyo model architecture (3.6M parameters) with support for pretraining, classification, and regression tasks
- Implemented EMG-specific datasets with HDF5 loading, caching, and channel padding capabilities
- Enhanced dataloaders with persistent workers for improved throughput
- Added checkpoint conversion utilities and comprehensive configuration files
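To make the masked-reconstruction pretraining concrete, here is a sketch of one common way to generate a random token mask; the fixed-ratio, noise-ranking scheme is an illustrative assumption, not TinyMyo's exact implementation:

```python
import torch

def random_token_mask(batch: int, num_tokens: int, mask_ratio: float = 0.5) -> torch.Tensor:
    """Return a boolean mask (True = masked) with a fixed fraction of tokens masked per sample."""
    num_masked = int(num_tokens * mask_ratio)
    # Draw random noise per token and mask the positions with the lowest noise,
    # which selects a uniformly random subset of exactly `num_masked` tokens.
    noise = torch.rand(batch, num_tokens)
    ranks = noise.argsort(dim=-1)
    mask = torch.zeros(batch, num_tokens, dtype=torch.bool)
    mask.scatter_(1, ranks[:, :num_masked], True)
    return mask
```

During pretraining the model reconstructs the signal at the masked positions; during finetuning an all-False "dummy" mask of the same shape can be passed so the forward signature stays unchanged.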
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| util/train_utils.py | Added MinMaxNormalization utility class for input normalization |
| util/ckpt_to_safetensor.py | New checkpoint-to-safetensors conversion script with CLI |
| tasks/pretrain_task_EMG.py | Pretraining task with masked reconstruction and signal logging |
| tasks/finetune_task_EMG.py | Finetuning task with classification/regression support and metrics |
| models/TinyMyo.py | Complete TinyMyo model architecture with rotary attention blocks |
| datasets/emg_pretrain_dataset.py | HDF5-backed pretraining dataset with multi-file support |
| datasets/emg_finetune_dataset.py | HDF5-backed finetuning dataset with lazy loading |
| data_module/pretrain_data_module.py | Lightning data module with persistent workers enabled |
| data_module/finetune_data_module.py | Lightning data module for finetuning with persistent workers |
| run_train.py | Updated training script with rank-zero testing and process group management |
| pyproject.toml | Project configuration for uv package manager with dependencies |
| config/* | YAML configuration files for experiments, models, tasks, and data modules |
| docs/model/TinyMyo.md | Comprehensive documentation of model architecture and performance |
| .gitignore | Standard Python gitignore with Hydra outputs |
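As an illustration of the lazy HDF5-backed loading pattern the dataset files use, here is a hedged sketch; the `EMGWindowDataset` name and the `emg`/`label` dataset keys are hypothetical, not the actual class or file layout:

```python
import h5py
import torch
from torch.utils.data import Dataset

class EMGWindowDataset(Dataset):
    """Lazy HDF5-backed dataset: the file handle is opened on first access per worker."""

    def __init__(self, path: str, signal_key: str = "emg", label_key: str = "label"):
        self.path = path
        self.signal_key = signal_key
        self.label_key = label_key
        self._file = None  # opened lazily so the handle is never pickled into workers
        with h5py.File(path, "r") as f:  # open briefly just to read the length
            self._length = f[signal_key].shape[0]

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, idx: int):
        if self._file is None:
            self._file = h5py.File(self.path, "r")
        x = torch.from_numpy(self._file[self.signal_key][idx]).float()
        y = int(self._file[self.label_key][idx])
        return x, y
```

Deferring `h5py.File` until `__getitem__` matters because HDF5 handles are not fork-safe; combined with `persistent_workers=True` in the DataLoader, each worker opens its handle once and reuses it across epochs.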
Comments suppressed due to low confidence (2)
tasks/pretrain_task_EMG.py:283
- This assignment to `indices_array` is unnecessary, as it is redefined before this value is used: `indices_array = np.array(indices)`
tasks/pretrain_task_EMG.py:284
- This assignment to `indices_array` is unnecessary, as it is redefined before this value is used: `indices_array = np.unique(indices)`
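The two flagged assignments collapse into one: `np.unique` accepts a plain Python list (any array-like) and already returns a sorted ndarray, so the preceding `np.array` conversion is dead code:

```python
import numpy as np

indices = [3, 1, 3, 2, 1]
# np.unique converts array-likes itself and returns a sorted ndarray of
# distinct values, so a prior `indices_array = np.array(indices)` is redundant.
indices_array = np.unique(indices)
print(indices_array)  # [1 2 3]
```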
…and raising error when default values are unchanged - in final test, best checkpoint after training is loaded instead of last checkpoint
Updates
- `EMGDataset` class with efficient HDF5-backed loading, caching, and streamlined batch access.
- `.gitignore` for Python build and cache artifacts.
- `pyproject.toml` configured for the `uv` package and project manager.
- `MinMaxNormalization` under training utilities.
- Checkpoint conversion script (`util/ckpt_to_safetensor.py`) with a CLI for exporting Lightning checkpoints to HuggingFace format.
- `run_train.py` to enforce rank-zero single-GPU testing for reproducibility, aligning with practices from the Meta Generic Neuromotor Interface codebase.