58 changes: 45 additions & 13 deletions README.md
@@ -25,7 +25,7 @@ The current version of TPOT was developed at Cedars-Sinai by:
- Jay Moran (jay.moran@cshs.org)
- Nicholas Matsumoto (nicholas.matsumoto@cshs.org)
- Hyunjun Choi (hyunjun.choi@cshs.org)
- Gabriel Ketron (gabriel.ketron@cshs.org)
- Gabriel Ketron (gabriel.ketron@cshs.org)
- Miguel E. Hernandez (miguel.e.hernandez@cshs.org)
- Jason Moore (moorejh28@gmail.com)

@@ -83,7 +83,6 @@ scipy
scikit-learn
update_checker
tqdm
stopit
pandas
joblib
xgboost
@@ -226,23 +225,36 @@ We welcome you to check the existing issues for bugs or enhancements to work on.

If you use TPOT in a scientific publication, please consider citing at least one of the following papers:

Trang T. Le, Weixuan Fu and Jason H. Moore (2020). [Scaling tree-based automated machine learning to biomedical big data with a feature set selector](https://academic.oup.com/bioinformatics/article/36/1/250/5511404). *Bioinformatics*, 36(1): 250-256.
Hernandez, J. G., Saini, A. K., Ghosh, A., & Moore, J. H. (2025). [The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning](https://www.cell.com/patterns/fulltext/S2666-3899(25)00162-X). Patterns, 6(7).

BibTeX entry:

```bibtex
@article{le2020scaling,
title={Scaling tree-based automated machine learning to biomedical big data with a feature set selector},
author={Le, Trang T and Fu, Weixuan and Moore, Jason H},
journal={Bioinformatics},
volume={36},
number={1},
pages={250--256},
year={2020},
publisher={Oxford University Press}
```bibtex
@article{hernandez2025tree,
title={The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning},
author={Hernandez, Jose Guadalupe and Saini, Anil Kumar and Ghosh, Attri and Moore, Jason H},
journal={Patterns},
volume={6},
number={7},
year={2025},
publisher={Elsevier}
}
```

Ribeiro, P., Saini, A., Moran, J., Matsumoto, N., Choi, H., Hernandez, M., & Moore, J. H. (2024). [TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning](https://link.springer.com/chapter/10.1007/978-981-99-8413-8_1). In Genetic programming theory and practice XX (pp. 1-17). Singapore: Springer Nature Singapore.

BibTeX entry:

```bibtex
@incollection{ribeiro2024tpot2,
title={TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning},
author={Ribeiro, Pedro and Saini, Anil and Moran, Jay and Matsumoto, Nicholas and Choi, Hyunjun and Hernandez, Miguel and Moore, Jason H},
booktitle={Genetic programming theory and practice XX},
pages={1--17},
year={2024},
publisher={Springer}
}
```

Randal S. Olson, Ryan J. Urbanowicz, Peter C. Andrews, Nicole A. Lavender, La Creis Kidd, and Jason H. Moore (2016). [Automating biomedical data science through tree-based pipeline optimization](http://link.springer.com/chapter/10.1007/978-3-319-31204-0_9). *Applications of Evolutionary Computation*, pages 123-137.

@@ -286,6 +298,26 @@ BibTeX entry:
}
```

## Related Papers

Trang T. Le, Weixuan Fu and Jason H. Moore (2020). [Scaling tree-based automated machine learning to biomedical big data with a feature set selector](https://academic.oup.com/bioinformatics/article/36/1/250/5511404). *Bioinformatics*, 36(1): 250-256.

BibTeX entry:

```bibtex
@article{le2020scaling,
title={Scaling tree-based automated machine learning to biomedical big data with a feature set selector},
author={Le, Trang T and Fu, Weixuan and Moore, Jason H},
journal={Bioinformatics},
volume={36},
number={1},
pages={250--256},
year={2020},
publisher={Oxford University Press}
}
```


## Support for TPOT

TPOT was developed in the [Artificial Intelligence Innovation (A2I) Lab](http://epistasis.org/) at Cedars-Sinai with funding from the [NIH](http://www.nih.gov/) under grants U01 AG066833 and R01 LM010098. We are incredibly grateful for the support of the NIH and Cedars-Sinai during the development of this project.
37 changes: 25 additions & 12 deletions docs/cite.md
@@ -1,23 +1,36 @@
# Citing TPOT
If you use TPOT in a scientific publication, please consider citing at least one of the following papers:

Trang T. Le, Weixuan Fu and Jason H. Moore (2020). [Scaling tree-based automated machine learning to biomedical big data with a feature set selector](https://academic.oup.com/bioinformatics/article/36/1/250/5511404). *Bioinformatics*, 36(1): 250-256.
Hernandez, J. G., Saini, A. K., Ghosh, A., & Moore, J. H. (2025). [The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning](https://www.cell.com/patterns/fulltext/S2666-3899(25)00162-X). Patterns, 6(7).

BibTeX entry:

```bibtex
@article{le2020scaling,
title={Scaling tree-based automated machine learning to biomedical big data with a feature set selector},
author={Le, Trang T and Fu, Weixuan and Moore, Jason H},
journal={Bioinformatics},
volume={36},
number={1},
pages={250--256},
year={2020},
publisher={Oxford University Press}
```bibtex
@article{hernandez2025tree,
title={The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning},
author={Hernandez, Jose Guadalupe and Saini, Anil Kumar and Ghosh, Attri and Moore, Jason H},
journal={Patterns},
volume={6},
number={7},
year={2025},
publisher={Elsevier}
}
```

Ribeiro, P., Saini, A., Moran, J., Matsumoto, N., Choi, H., Hernandez, M., & Moore, J. H. (2024). [TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning](https://link.springer.com/chapter/10.1007/978-981-99-8413-8_1). In Genetic programming theory and practice XX (pp. 1-17). Singapore: Springer Nature Singapore.

BibTeX entry:

```bibtex
@incollection{ribeiro2024tpot2,
title={TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning},
author={Ribeiro, Pedro and Saini, Anil and Moran, Jay and Matsumoto, Nicholas and Choi, Hyunjun and Hernandez, Miguel and Moore, Jason H},
booktitle={Genetic programming theory and practice XX},
pages={1--17},
year={2024},
publisher={Springer}
}
```

Randal S. Olson, Ryan J. Urbanowicz, Peter C. Andrews, Nicole A. Lavender, La Creis Kidd, and Jason H. Moore (2016). [Automating biomedical data science through tree-based pipeline optimization](http://link.springer.com/chapter/10.1007/978-3-319-31204-0_9). *Applications of Evolutionary Computation*, pages 123-137.

@@ -59,4 +72,4 @@ BibTeX entry:
publisher = {ACM},
address = {New York, NY, USA},
}
```
```
1 change: 0 additions & 1 deletion pyproject.toml
@@ -33,7 +33,6 @@ dependencies = [
"scikit-learn>=1.6",
"update_checker>=0.16",
"tqdm>=4.36.1",
"stopit>=1.1.1",
"pandas>=2.2.0",
"joblib>=1.1.1",
"xgboost>=3.0.0",
87 changes: 86 additions & 1 deletion tpot/builtin_modules/arithmetictransformer.py
@@ -125,6 +125,12 @@ def transform_helper(self, X):
        elif self.function == "1":
            return np.ones((X.shape[0],1))

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


def issorted(x, rev=False):
    if rev:
        s = sorted(x)
@@ -163,6 +169,12 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.expand_dims(np.sum(X,1),1)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class mul_neg_1_Transformer(TransformerMixin, BaseEstimator):
    def __init__(self):
        """
@@ -185,7 +197,13 @@ def transform_helper(self, X):
        if len(X.shape) == 1:
            X = np.expand_dims(X,0)
        return X*-1


    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class MulTransformer(TransformerMixin, BaseEstimator):

    def __init__(self):
@@ -210,6 +228,12 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.expand_dims(np.prod(X,1),1)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class SafeReciprocalTransformer(TransformerMixin, BaseEstimator):

    def __init__(self):
@@ -234,6 +258,12 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.divide(1.0, X.astype(float), out=np.zeros_like(X).astype(float), where=X!=0) #TODO remove astypefloat?

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags
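
The `np.divide(..., out=..., where=...)` call in the `SafeReciprocalTransformer` hunk above can be exercised on its own. The following is a minimal sketch, not the TPOT implementation: `safe_reciprocal` is a hypothetical standalone function that mirrors the expression shown in the diff, and the sample input is made up for illustration. Entries where the mask is false keep the zero already stored in `out`, so division by zero is never evaluated.

```python
import numpy as np


def safe_reciprocal(X):
    """Hypothetical helper: element-wise 1/x, with 0 where x == 0."""
    X = np.asarray(X)
    # Promote a 1-D input to a single row, as transform_helper does in the diff.
    if X.ndim == 1:
        X = np.expand_dims(X, 0)
    Xf = X.astype(float)
    # `out` supplies the values kept wherever `where` is False (here: zeros).
    return np.divide(1.0, Xf, out=np.zeros_like(Xf), where=Xf != 0)


result = safe_reciprocal([[2.0, 0.0], [-4.0, 0.5]])
print(result)
```

Pre-filling `out` matters here: when `where` is given without `out`, the masked positions are left uninitialized rather than set to zero.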


class EQTransformer(TransformerMixin, BaseEstimator):

    def __init__(self):
@@ -258,6 +288,12 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.expand_dims(np.all(X == X[0,:], axis = 1),1).astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class NETransformer(TransformerMixin, BaseEstimator):

    def __init__(self):
@@ -282,6 +318,10 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return 1- np.expand_dims(np.all(X == X[0,:], axis = 1),1).astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class GETransformer(TransformerMixin, BaseEstimator):
@@ -309,6 +349,11 @@ def transform_helper(self, X):
        result = X >= 0
        return result.astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class GTTransformer(TransformerMixin, BaseEstimator):
    def __init__(self):
@@ -334,6 +379,11 @@ def transform_helper(self, X):
        result = X > 0
        return result.astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class LETransformer(TransformerMixin, BaseEstimator):
    def __init__(self):
@@ -359,6 +409,11 @@ def transform_helper(self, X):
        result = X <= 0
        return result.astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class LTTransformer(TransformerMixin, BaseEstimator):
    def __init__(self):
@@ -384,6 +439,11 @@ def transform_helper(self, X):
        result = X < 0
        return result.astype(float)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class MinTransformer(TransformerMixin, BaseEstimator):
    def __init__(self):
@@ -408,6 +468,10 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.expand_dims(np.amin(X,1),1)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class MaxTransformer(TransformerMixin, BaseEstimator):
@@ -434,6 +498,11 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.expand_dims(np.amax(X,1),1)

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class ZeroTransformer(TransformerMixin, BaseEstimator):

@@ -459,6 +528,11 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.zeros((X.shape[0],1))

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class OneTransformer(TransformerMixin, BaseEstimator):
    def __init__(self):
@@ -483,6 +557,11 @@ def transform_helper(self, X):
            X = np.expand_dims(X,0)
        return np.ones((X.shape[0],1))

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags


class NTransformer(TransformerMixin, BaseEstimator):

@@ -507,3 +586,9 @@ def transform_helper(self, X):
        if len(X.shape) == 1:
            X = np.expand_dims(X,0)
        return np.ones((X.shape[0],1))*self.n

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.requires_fit = False
        return tags

3 changes: 2 additions & 1 deletion tpot/builtin_modules/estimatortransformer.py
@@ -178,4 +178,5 @@ def __sklearn_is_fitted__(self):
    @property
    def classes_(self):
        """The classes labels. Only exist if the last step is a classifier."""
        return self.estimator._classes
        return self.estimator._classes
