In my Omicron CI performance work I developed an ad-hoc tool to gather CPU, memory and disk metrics for CI builds, writing them to a CSV file and uploading it as a Buildomat output. This worked well, and I could commit it to Omicron and call it a day, but insight into the resource usage of a task could prove useful for any Buildomat build.
We should polish the code I wrote in the ad-hoc tool and integrate it into the Buildomat agent, running it on every build. The agent would use the native system APIs to read metrics every second (reading /proc on Linux and calling kstat on illumos) without spawning subprocesses, and upload the resulting CSV as the /buildomat/resource-usage.csv output.
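To make the Linux half of this concrete, here is a rough sketch of what one memory sample could look like. It is not the code from my ad-hoc tool: the `Sample` struct, the function names and the CSV column order are all made up for illustration, and it only covers memory. `MemTotal` and `MemAvailable` are real fields of /proc/meminfo, reported in kB.

```rust
/// One row of the resource usage CSV (hypothetical layout, memory only).
#[derive(Debug, PartialEq)]
struct Sample {
    timestamp_secs: u64,
    mem_total_kib: u64,
    mem_available_kib: u64,
}

/// Find a `Key:   12345 kB` line in /proc/meminfo-style text and
/// return its numeric value.
fn meminfo_field(text: &str, key: &str) -> Option<u64> {
    text.lines().find_map(|line| {
        let (name, rest) = line.split_once(':')?;
        if name.trim() != key {
            return None;
        }
        // The value is the first whitespace-separated token after the colon.
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

/// Build a sample from the contents of /proc/meminfo.
/// Returns None if either field is missing or malformed.
fn parse_sample(meminfo: &str, timestamp_secs: u64) -> Option<Sample> {
    Some(Sample {
        timestamp_secs,
        mem_total_kib: meminfo_field(meminfo, "MemTotal")?,
        mem_available_kib: meminfo_field(meminfo, "MemAvailable")?,
    })
}

/// Format a sample as one CSV line (assumed column order:
/// timestamp, total, available).
fn csv_row(sample: &Sample) -> String {
    format!(
        "{},{},{}",
        sample.timestamp_secs, sample.mem_total_kib, sample.mem_available_kib
    )
}
```

In the agent, something like this would presumably run once per second on the output of `std::fs::read_to_string("/proc/meminfo")`, appending each `csv_row` to the file that is later uploaded. Parsing from a string rather than from the file directly keeps the logic testable on any platform, and the illumos side would fill the same `Sample` from kstat instead.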
We can then add a [rendered] link next to the output on the GitHub log viewer, linking to a page that renders the relevant graphs server-side and displays them in the browser. I have other ideas on how to expose the graphs, but they'd rely on heavy JS, which I'm not sure we want to add to the log viewer.