Draft: Update feat/dnase-2.7 with changes from main #58
jemma-nelson wants to merge 193 commits into feat/dnase-2.7 from
Logic now matches that seen in the rest of our pipeline - prefer using the alignment's sample_name, and fall back to constructing it manually only when necessary. This should resolve the collation issues that have been dogging us this year.
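The "prefer, then fall back" rule described above can be sketched as follows. This is a minimal illustration, not the pipeline's code: the function name `resolve_sample_name`, the dict-shaped alignment record, and the `build_name` callback are all hypothetical, and the manual construction is left to the caller because its format is not given here.

```python
def resolve_sample_name(alignment, build_name):
    """Return the alignment's own sample_name when present.

    `alignment` is assumed to be a dict-like record (hypothetical shape).
    `build_name` is a caller-supplied function that constructs a name
    manually; it is only called when sample_name is missing or empty.
    """
    name = alignment.get("sample_name")
    if name:
        return name
    # Fall back to manual construction only when necessary.
    return build_name(alignment)
```

Keeping the fallback in one place means every caller resolves names the same way, which is the consistency the collation step depends on.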
CopyComplete.txt is a better signal that a flowcell is ready for processing than RTAComplete.txt. Older sequencers did not create CopyComplete.txt, I believe.
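A readiness check along these lines might look like the sketch below. The function name `flowcell_ready` and the `require_copy_complete` flag are assumptions for illustration; the flag exists only so that runs from older sequencers, which may never write CopyComplete.txt, can still be checked against RTAComplete.txt.

```python
from pathlib import Path


def flowcell_ready(run_dir, require_copy_complete=True):
    """Report whether a flowcell run directory looks ready for processing.

    By default this looks for CopyComplete.txt. Passing
    require_copy_complete=False falls back to RTAComplete.txt
    (hypothetical option for older sequencers).
    """
    marker = "CopyComplete.txt" if require_copy_complete else "RTAComplete.txt"
    return (Path(run_dir) / marker).is_file()
```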
hpcz-2 was decommissioned, switching default queue for this.
We will re-enable this once we have hit the fastq deadline.
Accidentally duplicated the input specification during a git merge
Now that we're regularly copying data over rather than symlinking it, it makes sense to remove the work directory in these cases.
Actually tested this time.
Make it more obvious when and where an alignment cannot be set up
With this fix, we should correctly emit `unset LIBRARY_KIT_METHOD` in our bash scripts.
Fix: alignprocess.py: library kits are optional
There may not be a path forward on merging this, and that's fine. We would not want to introduce significant changes to the DNase pipeline, as it is frozen here for a reprocessing effort.
Fix/lp collation timing
The main thing changed here is to prefer the form
`logging.info("msg %s %s", arg1, arg2)`
This allows the logger to do the interpolation, which can save time
if the message is not printed because it is below the current log level.
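To make the difference concrete, here is a small self-contained sketch. The `Expensive` class and the logger name `lazy_demo` are invented for the demonstration; the class simply counts how many times it gets formatted.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is converted to str."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


log = logging.getLogger("lazy_demo")
log.setLevel(logging.WARNING)  # INFO messages are below the current level

eager = Expensive()
lazy = Expensive()

# Eager: the f-string is built before logging ever checks the level.
log.info(f"msg {eager}")

# Lazy: the logger sees INFO is disabled and never formats the argument.
log.info("msg %s", lazy)
```

After running this, `eager.calls` is 1 and `lazy.calls` is 0, even though neither message is printed.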
Previously, the generated code depended solely on where setup.sh was run, so if the execution location didn't match, you would get surprising errors.
Fix occasional OOM error, clean up run_pools.sh properly, and make sure run_pool.sh and run_alignments.sh are always properly submitted.
Feat/fastq container
Style/add precommit
Basically this was bash glob syntax vs. grep regex confusion. This regex was incorrectly looking for `collatefq*`, which is `collatef` followed by any number of `q`s. Adjusted to properly look for `collatefq` followed by (any number of anything) before the flowcell string.
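The same confusion can be reproduced in Python, where `fnmatch` uses glob rules and `re` uses regex rules. The job name and flowcell string below are made up, and the two patterns are a reconstruction from the description above rather than the exact expressions from the script.

```python
import fnmatch
import re

flowcell = "FLOWCELL1"                 # hypothetical flowcell string
job = "collatefq-LN1234-FLOWCELL1"     # hypothetical job name

# As a glob, `*` means "any run of characters", so this matches the job.
glob_hit = fnmatch.fnmatch(job, "collatefq*" + flowcell)

# As a regex, `q*` means "zero or more q", so the flowcell must follow
# immediately and the job above is missed.
buggy = re.compile(r"collatefq*" + re.escape(flowcell))

# `.*` is the regex spelling of "any number of anything".
fixed = re.compile(r"collatefq.*" + re.escape(flowcell))

buggy_hit = buggy.search(job) is not None
fixed_hit = fixed.search(job) is not None

# The buggy pattern also accepts a name with no `q` at all.
buggy_false_positive = buggy.search("collatef" + flowcell) is not None
```

Here `glob_hit` and `fixed_hit` are true, `buggy_hit` is false, and `buggy_false_positive` is true.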
Fix: fastqc/alignments/pools wait for collation
This was the cause of those pesky "Project_Lab/Sample_LP.../" directories that were causing us to duplicate work.
Alignprocess.py skips library pools
If this is missing, use the default analysis dir.
We don't use the output from this anymore, preferring to run the megamap pipeline or other analyses as appropriate.
This is a draft PR to see all the changes that would be pulled in. The primary motivation is commit c86a544.