Merged
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11", "3.12", "3.13"]

steps:
- uses: actions/checkout@v4
@@ -36,4 +36,4 @@ jobs:

- name: Type check
run: |
mypy src/micro/
mypy src/microplex/
28 changes: 14 additions & 14 deletions README.md
@@ -1,6 +1,6 @@
# microplex

Microdata synthesis and reweighting using normalizing flows.
Multi-source microdata synthesis and survey reweighting.

[![PyPI](https://img.shields.io/pypi/v/microplex.svg)](https://pypi.org/project/microplex/)
[![Tests](https://github.com/CosilicoAI/microplex/actions/workflows/test.yml/badge.svg)](https://github.com/CosilicoAI/microplex/actions/workflows/test.yml)
@@ -74,11 +74,11 @@ print(f"Using {stats['n_nonzero']} of {stats['n_records']} records")

| Feature | microplex | CT-GAN | TVAE | synthpop |
|---------|-------|--------|------|----------|
| Conditional generation | ✅ | ❌ | ❌ | ❌ |
| Multi-source fusion | ✅ | ❌ | ❌ | ❌ |
| Zero-inflation handling | ✅ | ❌ | ❌ | ⚠️ |
| Exact likelihood | ✅ | ❌ | ❌ | N/A |
| Stable training | ✅ | ⚠️ | | |
| Preserves source structure | ✅ | ❌ | ❌ | ⚠️ |
| Multiple synthesis methods | ✅ (QRF, QDNN, MAF) | ❌ | ❌ | ✅ (CART) |
| Survey reweighting | ✅ (IPF, entropy, sparse) | ❌ | | |
| PRDC evaluation | ✅ | ❌ | ❌ | |

### Use Cases

@@ -154,20 +154,20 @@ Full documentation at [cosilicoai.github.io/microplex](https://cosilicoai.github

## Benchmarks

See [benchmarks/](benchmarks/) for comparisons against:
See [benchmarks/](benchmarks/) for synthesis method comparisons:

- **CT-GAN**: Conditional Tabular GAN (from SDV)
- **TVAE**: Tabular VAE (from SDV)
- **Copulas**: Gaussian copula synthesis (from SDV)
- **synthpop**: CART-based synthesis (R package, via rpy2)
- **QRF / ZI-QRF**: Quantile regression forests (with/without zero-inflation)
- **QDNN / ZI-QDNN**: Quantile deep neural networks
- **MAF / ZI-MAF**: Masked autoregressive flows
- **CT-GAN / TVAE**: Deep generative baselines (from SDV)

## Citation

```bibtex
@software{microplex2024,
author = {Cosilico},
title = {microplex: Microdata synthesis and reweighting using normalizing flows},
year = {2024},
@software{microplex2025,
author = {Ghenis, Max},
title = {microplex: Multi-source microdata synthesis and survey reweighting},
year = {2025},
url = {https://github.com/CosilicoAI/microplex}
}
```
@@ -0,0 +1 @@
node_modules
@@ -0,0 +1,6 @@
# Book Theme

> [!NOTE]
> This repository contains the _build artifacts_ for the MyST Book Theme, it is not meant for editing by humans.
> - Source code exists at [`github.com/jupyter-book/myst-theme`](https://github.com/jupyter-book/myst-theme). Make edits there, not here.
> - To report a bug or request a feature, use the [`github.com/jupyter-book/myst-theme/issues`](https://github.com/executablebooks/myst-theme/issues).