ci: DH-20051: shard e2e tests and merge shard reports#1361
Conversation
mofojed
left a comment
There was a problem hiding this comment.
Have you tried this with a failed run to ensure the failure is reported correctly and you can download the report/watch the video?
Local e2e tests still work correctly?
Looking good so far
| with: | ||
| name: playwright-report | ||
| path: playwright-report/ | ||
| retention-days: 90 |
There was a problem hiding this comment.
We don't need to retain for 90 days, 30 days is fine.
| exit 1 | ||
| fi | ||
|
|
||
| npx -y playwright@1.44.1 merge-reports --reporter html,github ./all-blob-reports |
There was a problem hiding this comment.
We should probably update playwright as well. v1.44.1 is pretty old. That could be a separate PR though.
|
|
||
| jobs: | ||
| e2e-test: | ||
| e2e-tests: |
There was a problem hiding this comment.
While we should rename this step, we still should have an e2e-test step at the end that passes/fails based on the results. Reason being the branch protection requires an e2e-test step to pass before you can merge a PR:

In web-client-ui we consolidate the result into e2e-results, but I think we should just name it e2e-test here so that it matches the existing branch protection rules... otherwise I need to change those rules as well, and change it for the release branches... easier to just keep the same name.
https://github.com/deephaven/web-client-ui/blob/caeabb02f0abc824a07c505d0a7ca8693dd63cbb/.github/workflows/e2e.yml#L125
| needs: [e2e-tests] | ||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
We should use newer versions of these actions, these are old and deprecated
| strategy: | ||
| fail-fast: false | ||
| matrix: | ||
| config: [chromium-1-2, chromium-2-2, firefox-1-2, firefox-2-2, webkit-1-2, webkit-2-2] |
There was a problem hiding this comment.
How did you decide on this sharding? Did you take a look at how long each shard takes individually? I seem to recall webkit taking longer which is why we split up webkit on web-client-ui but not chromium/firefox.
There was a problem hiding this comment.
Each shard is currently taking about 15 to 22 minutes to complete. I added a screenshot of the shard timings to the PR description for reference. We could split WebKit into 3 shards instead of 2, which might save another couple of minutes, but I think the current setup is reasonably well balanced for now.
| config: [chromium-1-2, chromium-2-2, firefox-1-2, firefox-2-2, webkit-1-2, webkit-2-2] | ||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 |
| apt-get update; \ | ||
| apt-get install build-essential --yes; \ | ||
| apt-get install fonts-dejavu-core --yes; | ||
| apt-get install --no-install-recommends -y fonts-dejavu-core; \ |
There was a problem hiding this comment.
Always prefer the long form of flags in scripts - it makes it clearer to the casual reader.
| apt-get install --no-install-recommends -y fonts-dejavu-core; \ | |
| apt-get install --no-install-recommends --yes fonts-dejavu-core; \ |
| branches: | ||
| - main | ||
| - 'release/**' | ||
| workflow_dispatch: |
There was a problem hiding this comment.
I guess there's no reason to remove this. Just probably isn't useful unless for testing.
| - name: Run tests | ||
| run: './tools/run_docker.sh ./tests/docker-compose.yml e2e-tests' | ||
| run: | | ||
| ./tools/run_docker.sh ./tests/docker-compose.yml e2e-tests \ |
There was a problem hiding this comment.
Does each shard need to rebuild the docker image separately? Or do they share the same build image so they don't need to each re-build it?
There was a problem hiding this comment.
Yes, each shard currently rebuilds the Docker image separately. We could change this to build once in a dedicated job and reuse that image across the shard jobs, which would save resources. Let me know if you want me to make that change.
There was a problem hiding this comment.
I'd like to do that if possible
use newer versions of action prefer the long form of flags remove workflow_dispatch e2e-results 30 days instead of 90
This PR shards the end-to-end workflow to reduce overall time and keep full test coverage.
Before: 51-53 mins
After: 22 mins
Changes:
CI reliability fixes:
Current e2e tests:

e2e tests with sharding:

More details, showing how long each separate shard takes:
