Detailed model training control and batch modelling #365
roussel-ryan merged 44 commits into xopt-org:main
Conversation
Force-pushed from 4d8076a to 745516e
Looking good! LMK when this is ready for review and we can have a short discussion to go over it.
Force-pushed from 5f75338 to a76ae75
Force-pushed from a76ae75 to ae8f86e
Some images for the record. Aiming to finish tomorrow.
@roussel-ryan do we have agreement on just using pydantic objects for LBFGS and other optimizers, with direct 1:1 option translation?
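(For illustration only: a minimal sketch of what such a pydantic options object with direct 1:1 translation could look like. The `LBFGSOptions` class and its fields are hypothetical, not the names used in this PR; the fields simply mirror scipy's L-BFGS-B option keywords.)

```python
import numpy as np
from pydantic import BaseModel
from scipy.optimize import minimize


class LBFGSOptions(BaseModel):
    """Hypothetical options model mirroring scipy's L-BFGS-B options 1:1."""

    maxiter: int = 100
    ftol: float = 1e-8
    gtol: float = 1e-5


def fit(fun_and_grad, x0: np.ndarray, opts: LBFGSOptions):
    # No translation layer: the validated options are passed straight to scipy.
    return minimize(
        fun_and_grad,
        x0,
        jac=True,
        method="L-BFGS-B",
        options=opts.model_dump(),
    )


# Example: minimize a simple quadratic with relaxed convergence settings.
res = fit(lambda x: (float(np.sum(x**2)), 2 * x), np.ones(3), LBFGSOptions(maxiter=20, ftol=1e-4))
print(res.x, res.nit)
```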
Hey Nikita, one thought is whether we could unify this work with the NumericalOptimizer classes provided by Xopt down the road. I think either the direction you propose here with pydantic objects or synchronizing with the NumericalOptimizer classes would work. Merging the two could happen in a future PR, since I think you want to get this in before the Xopt 3.0 release?
Yes, I agree on moving towards a shared API/object. There are some issues with how BoTorch uses different L-BFGS algorithms in the model vs. acqf parts - current
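(For reference, the two separate pathways in question, as I understand the current public BoTorch API and not necessarily how this PR wires them: hyperparameter fitting forwards scipy L-BFGS-B options through `optimizer_kwargs` in `fit_gpytorch_mll`, while acquisition optimization takes its own `options` dict in `optimize_acqf`.)

```python
import torch
from botorch.acquisition import UpperConfidenceBound
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(20, 2, dtype=torch.double)
train_Y = (train_X.sum(dim=-1, keepdim=True) - 1.0) ** 2

model = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)

# Hyperparameter fitting: scipy L-BFGS-B options go through `optimizer_kwargs`.
fit_gpytorch_mll(mll, optimizer_kwargs={"options": {"maxiter": 50, "ftol": 1e-4}})

# Acquisition optimization: convergence options go through a separate `options` dict.
candidate, _ = optimize_acqf(
    UpperConfidenceBound(model, beta=2.0),
    bounds=torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double),
    q=1,
    num_restarts=4,
    raw_samples=32,
    options={"maxiter": 50},
)
```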
Sounds good to me. I'll review this in a few hours. Would you be able to handle updating the v3.0 branch with these changes once merged? I don't want to make a mistake fixing merge conflicts.
Yes, that is ok, I'll rebase.
In fact, for 3.0 I'll integrate with #337 and merge configs. In 2.x, let's avoid major save-breaking changes.
roussel-ryan left a comment:
Looks pretty good to me. I've added a couple of asks for more documentation. I'm also wondering whether it would be possible to add an example notebook that compares hyperparameter training / acq optimization under stronger vs. weaker convergence criteria, to show the trade-off between speed and precision?
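(One possible seed for such a notebook, sketched against the public BoTorch fitting call rather than this PR's new options: fit the same model under tight vs. relaxed scipy tolerances and compare wall time.)

```python
import time

import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.manual_seed(0)
train_X = torch.rand(200, 6, dtype=torch.double)
train_Y = torch.sin(3.0 * train_X).sum(dim=-1, keepdim=True)

for label, options in [
    ("tight", {"maxiter": 200, "ftol": 1e-10, "gtol": 1e-8}),
    ("relaxed", {"maxiter": 20, "ftol": 1e-3, "gtol": 1e-2}),
]:
    model = SingleTaskGP(train_X, train_Y)
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    t0 = time.perf_counter()
    # Convergence options are forwarded to scipy's L-BFGS-B.
    fit_gpytorch_mll(mll, optimizer_kwargs={"options": options})
    print(f"{label} tolerances: fit took {time.perf_counter() - t0:.2f} s")
```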
ready


This PR introduces batch GP models and finer control over model training. The former is useful for scalarized objectives, while the latter is necessary to speed up BO in operational contexts. As recently demonstrated in a NAPAC25 talk, fitting tolerances can be relaxed significantly to meet real-time requirements without impacting convergence, especially with scalarized objectives. There is also a physical motivation: we cannot set the physical devices precisely enough for exact fitting to matter.
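As a rough illustration of the batched-model idea (written against plain BoTorch, not the new constructor options introduced in this PR): instead of building one `SingleTaskGP` per output and wrapping them in a `ModelListGP`, all outputs can be passed to a single multi-output `SingleTaskGP`, which BoTorch models as a batch of independent GPs and fits in one optimizer pass.

```python
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import ModelListGP, SingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood, SumMarginalLogLikelihood

n, n_vars, n_obj = 500, 12, 5
train_X = torch.rand(n, n_vars, dtype=torch.double)
train_Y = torch.randn(n, n_obj, dtype=torch.double)

# Standard approach: one independent GP per output, collected in a ModelListGP.
models = [SingleTaskGP(train_X, train_Y[:, i : i + 1]) for i in range(n_obj)]
model_list = ModelListGP(*models)
fit_gpytorch_mll(SumMarginalLogLikelihood(model_list.likelihood, model_list))

# Batched approach: one multi-output SingleTaskGP (outputs modeled independently
# via an internal batch dimension), fit with a single optimizer call.
batched = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(batched.likelihood, batched))
```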
Changes:
Caveats:
`train_model=False`.
Benchmarks for n_vars=12, n_obj=5, n_constr=2, n=500
CPU:
GPU (RTX 3070, H100 todo):
To reproduce:
```
python bench_runner.py bench_build_standard bench_build_batched bench_build_standard_adam bench_build_batched_adam bench_build_standard_gpytorch bench_build_batched_gpytorch -n 10 -device cpu
```