We'd like to periodically (weekly) retrain the model and deploy the latest model to production.
This would allow us to benefit from:
- Additional data to train on
- New labels that might have been added
We'd like to use GitOps: we should automatically create a PR to update the model, and once that PR is approved and merged, the new model should get deployed automatically.
There are three pieces:
- Automating training
  - Currently this means automating running the notebook that trains an AutoML model
- Automatically creating a PR to update the model config to use the latest AutoML model
- Using GitOps tools to automatically sync the latest model configs down to the cluster
The last step was taken care of by #152.
For step 2, here's what I'm thinking: create a Tekton task that does the following:
- Run a custom binary to get the latest deployed model from AutoML and emit it to a YAML file
- Use yq and kpt to update the label bot config
- Use the hub CLI to create a PR
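
A rough sketch of what the PR-creation step could run, assuming the config has already been updated in the working tree by the earlier steps. The function name `pr_commands`, the branch naming scheme, the commit message, and the `master` base branch are all placeholders I made up for illustration; only the `git` subcommands and the `hub pull-request` flags (`--base`, `--head`, `--message`) are real. It only builds the command list so it can be inspected or dry-run before anything is executed.

```python
from typing import List


def pr_commands(model_name: str, config_path: str, base: str = "master") -> List[List[str]]:
    """Return the git/hub commands that would open a PR for model_name.

    Nothing is executed here; the caller decides whether to run or just log them.
    Branch name and message format are hypothetical.
    """
    branch = f"update-model-{model_name}"
    message = f"Update label bot to AutoML model {model_name}"
    return [
        ["git", "checkout", "-b", branch],
        ["git", "add", config_path],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", branch],
        # hub opens the PR from the pushed branch against the base branch.
        ["hub", "pull-request", "--base", base, "--head", branch, "--message", message],
    ]


if __name__ == "__main__":
    # Hypothetical model name and config path, for illustration only.
    for cmd in pr_commands("TCN123", "config/model.yaml"):
        print(" ".join(cmd))
```

Inside the Tekton task, each command could be passed to `subprocess.run(cmd, check=True)` so the step fails fast if any of them fails.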
One thing I'm not sure about yet is how to perform the above in a reconcile loop, i.e. only trigger the above if the config isn't already pointing at the latest model.
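
One possible shape for the reconcile check, as a minimal sketch: compare the model named in the config with the latest model and act only when they differ. All the names here (`needs_update`, `reconcile`, and the three callbacks) are hypothetical. How the configured and latest model names are actually read (from the YAML config and from AutoML) is left to the caller.

```python
from typing import Callable, Optional


def needs_update(configured: Optional[str], latest: Optional[str]) -> bool:
    """True when the config should be changed to point at `latest`."""
    if latest is None:
        # No deployed model was found, so there is nothing to update to.
        return False
    return configured != latest


def reconcile(
    get_configured: Callable[[], Optional[str]],
    get_latest: Callable[[], Optional[str]],
    open_pr: Callable[[str], None],
) -> bool:
    """Run one reconcile pass; return True if a PR was requested."""
    configured = get_configured()
    latest = get_latest()
    if not needs_update(configured, latest):
        return False
    open_pr(latest)
    return True


if __name__ == "__main__":
    # Stand-in callbacks with made-up model names.
    requested = []
    print(reconcile(lambda: "model-a", lambda: "model-a", requested.append))
    print(reconcile(lambda: "model-a", lambda: "model-b", requested.append))
    print(requested)
```

This doesn't cover the case where a PR for the latest model is already open but not yet merged; without a separate check for that, each pass would try to open another PR.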