Commit e1bfc76

Merge pull request #128 from LUMC/release_1.6.0
Release 1.6.0
2 parents 3deca52 + 2e464cb commit e1bfc76

15 files changed: +273 -45 lines changed

.github/workflows/ci.yml

Lines changed: 16 additions & 7 deletions
Columns: original file line number, diff line number, diff line change
@@ -14,11 +14,13 @@ jobs:
1414
strategy:
1515
matrix:
1616
python-version:
17-
- 3.6
17+
- "3.6"
1818
steps:
1919
- uses: actions/checkout@v2.3.4
2020
- name: Set up Python ${{ matrix.python-version }}
2121
uses: actions/setup-python@v2
22+
with:
23+
python-version: ${{ matrix.python-version }}
2224
- name: Install tox
2325
run: pip install tox
2426
- name: Lint
@@ -30,6 +32,8 @@ jobs:
3032
- uses: actions/checkout@v2.3.4
3133
- name: Set up Python ${{ matrix.python-version }}
3234
uses: actions/setup-python@v2
35+
with:
36+
python-version: ${{ matrix.python-version }}
3337
- name: Install tox
3438
run: pip install tox
3539
- name: Build docs
@@ -39,15 +43,18 @@ jobs:
3943
strategy:
4044
matrix:
4145
python-version:
42-
- 3.6
43-
- 3.7
44-
- 3.8
45-
- 3.9
46+
- "3.6"
47+
- "3.7"
48+
- "3.8"
49+
- "3.9"
50+
- "3.10"
4651
needs: lint
4752
steps:
4853
- uses: actions/checkout@v2.3.4
4954
- name: Set up Python ${{ matrix.python-version }}
5055
uses: actions/setup-python@v2
56+
with:
57+
python-version: ${{ matrix.python-version }}
5158
- name: Install tox
5259
run: pip install tox
5360
- name: Run tests
@@ -58,10 +65,10 @@ jobs:
5865

5966
test-functional:
6067
runs-on: ubuntu-latest
61-
needs: test
68+
needs: lint
6269
strategy:
6370
matrix:
64-
python-version: [3.7]
71+
python-version: ["3.7"]
6572
test-program: [cromwell, snakemake, miniwdl]
6673
steps:
6774
- uses: actions/checkout@v2.3.4
@@ -70,6 +77,8 @@ jobs:
7077
- name: Set up Python ${{ matrix.python-version }}
7178
if: ${{ matrix.test-program != 'cromwell' }}
7279
uses: actions/setup-python@v2
80+
with:
81+
python-version: ${{ matrix.python-version }}
7382
- name: Install tox
7483
if: ${{ matrix.test-program != 'cromwell' }}
7584
run: pip install tox
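
The matrix entries in this workflow are now quoted. The commit does not say why, so the following is an assumption: an unquoted `3.10` is a YAML float and would collapse to `3.1`. The sketch below mimics that with a plain Python float instead of a YAML parser; `as_version_label` is a hypothetical helper, not part of the workflow.

```python
# Illustration only: float() stands in for how a YAML loader would read an
# unquoted 3.10. No YAML library is used here.
unquoted = float("3.10")   # what an unquoted matrix entry would become
quoted = "3.10"            # what a quoted matrix entry stays


def as_version_label(value) -> str:
    """Render a matrix entry the way it would be substituted into a string."""
    return str(value)


print(as_version_label(unquoted))  # 3.1
print(as_version_label(quoted))    # 3.10
```

Quoting every entry, not only `"3.10"`, keeps the list uniform and avoids the same surprise for later versions.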

HISTORY.rst

Lines changed: 12 additions & 0 deletions
@@ -7,6 +7,18 @@ Changelog
77
.. This document is user facing. Please word the changes in such a way
88
.. that users understand how the changes affect the new version.
99
10+
version 1.6.0
11+
---------------------------
12+
+ Add a ``--git-aware`` or ``--ga`` option to only copy files listed by
13+
git ls-files. This omits the ``.git`` folder, all untracked files and
14+
everything ignored by ``.gitignore``. This reduces the number of copy
15+
operations drastically.
16+
17+
Pytest-workflow will now emit a warning when copying of a git directory is
18+
detected without the ``--git-aware`` option.
19+
20+
+ Add support and tests for Python 3.10
21+
1022
version 1.5.0
1123
---------------------------
1224
+ Add support for python 3.9
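
The changelog entry above says only files listed by `git ls-files` are copied. As a rough sketch of what that listing looks like once split into paths, the hypothetical helper below (`tracked_files_from_output` is not part of pytest-workflow) works on example text instead of calling git. It drops empty lines, an assumption made here so that an empty listing gives an empty list.

```python
from typing import List


def tracked_files_from_output(output: str) -> List[str]:
    """Split text in the shape printed by `git ls-files` into paths.

    Hypothetical helper for illustration. Empty lines are dropped, so an
    empty listing yields an empty list.
    """
    return [line for line in output.split("\n") if line]


# Example text: one tracked path per line, with a trailing newline.
example_output = "README.rst\nsetup.py\nsrc/pytest_workflow/util.py\n"
tracked = tracked_files_from_output(example_output)
print(tracked)
```

Nothing under `.git` appears in such a listing, and neither do untracked or ignored files, which is where the reduction in copy operations comes from.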

README.rst

Lines changed: 6 additions & 3 deletions
@@ -32,8 +32,11 @@ pytest-workflow
3232
:target: https://doi.org/10.5281/zenodo.3757727
3333
:alt: More information on how to cite pytest-workflow here.
3434

35-
pytest-workflow is a pytest plugin that aims to make pipeline/workflow testing easy
36-
by using yaml files for the test configuration.
35+
pytest-workflow is a workflow-system agnostic testing framework that aims
36+
to make pipeline/workflow testing easy by using YAML files for the test
37+
configuration. Whether you write your pipelines in WDL, snakemake, bash or
38+
any other workflow framework, pytest-workflow makes testing easy.
39+
pytest-workflow is built on top of the pytest test framework.
3740

3841
For our complete documentation checkout our
3942
`readthedocs page <https://pytest-workflow.readthedocs.io/>`_.
@@ -42,7 +45,7 @@ For our complete documentation checkout our
4245
Installation
4346
============
4447
Pytest-workflow requires Python 3.6 or higher. It is tested on Python 3.6, 3.7,
45-
3.8 and 3.9. Python 2 is not supported.
48+
3.8, 3.9 and 3.10. Python 2 is not supported.
4649

4750
- Make sure your virtual environment is activated.
4851
- Install using pip ``pip install pytest-workflow``

docs/examples.rst

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ Cromwell so it can be used as a command, instead of having to use the jar.
6868
md5sum: 173fd8023240a8016033b33f42db14a2
6969
stdout:
7070
contains:
71-
- "WorkflowSucceededState"
71+
- "workflow finished with status 'Succeeded'"
7272
7373
WDL with miniwdl example
7474
------------------------

docs/installation.rst

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
22
Installation
33
============
44

5-
Pytest-workflow is tested on python 3.6, 3.7, 3.8 and 3.9. Python 2 is not
5+
Pytest-workflow is tested on python 3.6, 3.7, 3.8, 3.9 and 3.10. Python 2 is not
66
supported.
77

88
In a virtual environment

docs/running_pytest_workflow.rst

Lines changed: 12 additions & 5 deletions
@@ -55,11 +55,18 @@ The temporary directories created are copies of pytest's root directory, the
5555
directory from which it runs the tests. If you have lots of tests, and if you
5656
have a large repository, this may take a lot of disk space. To alleviate this
5757
you can use the ``--symlink`` flag which will create the same directory layout
58-
but instead symlinks the files instead of copying them. This is *slower* for
59-
lots of small files, and it carries with it the risk that the tests may alter
60-
files from your work directory. If there are a lot of large files and files are
61-
used read-only in tests, then it will use a lot less disk space and be faster
62-
as well.
58+
but symlinks the files instead of copying them. This carries with it
59+
the risk that the tests may alter files from your work directory. If there are
60+
a lot of large files and files are used read-only in tests, then it will use a
61+
lot less disk space and be faster as well.
62+
63+
.. note::
64+
65+
When your workflow is version controlled in git, please use the
66+
``--git-aware`` option. This omits the ``.git`` folder, all untracked
67+
files and everything ignored by ``.gitignore``. This reduces the number of
68+
copy operations significantly.
69+
6370

6471
Running multiple workflows simultaneously
6572
-----------------------------------------
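
The documentation text above warns that with `--symlink` a test may alter files in the work directory. A minimal sketch of that risk follows, using only the standard library; the directory and file names are made up for the example.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp, "work")        # stands in for the project directory
    test_dir = Path(tmp, "test_copy")   # stands in for a test's temp directory
    work_dir.mkdir()
    test_dir.mkdir()

    original = work_dir / "input.txt"
    original.write_text("original\n")

    # Link instead of copy: the entry in the test directory points at the
    # original file.
    link = test_dir / "input.txt"
    os.symlink(original, link, target_is_directory=False)

    # A test that writes to "its own" file changes the work directory too.
    link.write_text("modified by test\n")
    content_after = original.read_text()

print(content_after)
```

With a real copy the original would be left untouched, which is why symlinking is only advised when tests use these files read-only.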

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
[build-system]
2+
requires = ["setuptools>=51", "wheel"]
3+
build-backend = "setuptools.build_meta"

setup.py

Lines changed: 2 additions & 1 deletion
@@ -20,7 +20,7 @@
2020

2121
setup(
2222
name="pytest-workflow",
23-
version="1.5.0",
23+
version="1.6.0",
2424
description="A pytest plugin for configuring workflow/pipeline tests "
2525
"using YAML files",
2626
author="Leiden University Medical Center",
@@ -43,6 +43,7 @@
4343
"Programming Language :: Python :: 3.7",
4444
"Programming Language :: Python :: 3.8",
4545
"Programming Language :: Python :: 3.9",
46+
"Programming Language :: Python :: 3.10",
4647
"Development Status :: 5 - Production/Stable",
4748
"License :: OSI Approved :: "
4849
"GNU Affero General Public License v3 or later (AGPLv3+)",

src/pytest_workflow/plugin.py

Lines changed: 21 additions & 5 deletions
@@ -34,7 +34,7 @@
3434
from .content_tests import ContentTestCollector
3535
from .file_tests import FileTestCollector
3636
from .schema import WorkflowTest, workflow_tests_from_schema
37-
from .util import is_in_dir, link_tree, replace_whitespace
37+
from .util import duplicate_tree, is_in_dir, replace_whitespace
3838
from .workflow import Workflow, WorkflowQueue
3939

4040

@@ -66,6 +66,12 @@ def pytest_addoption(parser: PytestParser):
6666
"symbolic links. This saves disk space, but should only be used "
6767
"for tests that do use these files read-only."
6868
)
69+
parser.addoption(
70+
"--ga", "--git-aware", action="store_true", dest="git_aware",
71+
help="Only copy files that are listed by the 'git ls-files' command. "
72+
"This ignores the .git directory, any untracked files and any "
73+
"files listed by .gitignore. "
74+
"Highly recommended when working in a git project.")
6975

7076
# Why `--tag <tag>` and not simply use `pytest -m <tag>`?
7177
# `-m` uses a "mark expression". So you have to type a piece of python
@@ -375,12 +381,22 @@ def queue_workflow(self):
375381
f"'{tempdir}' already exists. Deleting ...")
376382
shutil.rmtree(str(tempdir))
377383

384+
# Warn users of git that they should use the --git-aware option.
385+
# The .git directory contains all files ever checked in, and all diffs
386+
# in the entire history.
387+
root_dir = Path(self.config.rootdir)
388+
git_aware = self.config.getoption("git_aware")
389+
git_dir = root_dir / ".git"
390+
if git_dir.exists() and not git_aware:
391+
warnings.warn(
392+
f".git dir detected: {str(git_dir)}. pytest-workflow "
393+
f"will copy the entire .git directory and all files ignored "
394+
f"by git. It is recommended to use the --git-aware option.")
378395
# Copy the project directory to the temporary directory using pytest's
379396
# rootdir.
380-
if self.config.getoption("symlink"):
381-
link_tree(Path(str(self.config.rootdir)), tempdir)
382-
else:
383-
shutil.copytree(str(self.config.rootdir), str(tempdir))
397+
duplicate_tree(root_dir, tempdir,
398+
symlink=self.config.getoption("symlink"),
399+
git_aware=git_aware)
384400

385401
# Create a workflow and make sure it runs in the tempdir
386402
workflow = Workflow(command=self.workflow_test.command,
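
The check added to `queue_workflow` can be tried in isolation. The sketch below is a hypothetical stand-alone version (`warn_if_git_dir` does not exist in the plugin, and the warning text is shortened); it only shows that the warning fires when a `.git` directory exists and the git-aware flag is off.

```python
import tempfile
import warnings
from pathlib import Path


def warn_if_git_dir(root_dir: Path, git_aware: bool) -> bool:
    """Warn when root_dir holds a .git directory and git_aware is off.

    Hypothetical stand-alone version of the check; returns True if it warned.
    """
    git_dir = root_dir / ".git"
    if git_dir.exists() and not git_aware:
        warnings.warn(f".git dir detected: {git_dir}. "
                      f"It is recommended to use the --git-aware option.")
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".git").mkdir()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, no dedup
        warned_without_flag = warn_if_git_dir(root, git_aware=False)
        warned_with_flag = warn_if_git_dir(root, git_aware=True)

print(warned_without_flag, warned_with_flag, len(caught))
```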

src/pytest_workflow/util.py

Lines changed: 123 additions & 10 deletions
@@ -1,8 +1,15 @@
1+
import functools
12
import hashlib
23
import os
34
import re
5+
import shutil
6+
import subprocess # nosec
7+
import sys
48
import warnings
59
from pathlib import Path
10+
from typing import Callable, Iterator, List, Set, Tuple, Union
11+
12+
Filepath = Union[str, os.PathLike]
613

714

815
# This function was created to ensure the same conversion is used throughout
@@ -41,22 +48,128 @@ def is_in_dir(child: Path, parent: Path, strict: bool = False) -> bool:
4148
return False
4249

4350

44-
def link_tree(src: Path, dest: Path) -> None:
51+
def _run_command(*args):
52+
"""Run an external command and return the output"""
53+
result = subprocess.run(args, # nosec
54+
stdout=subprocess.PIPE,
55+
# Encoding to output as a string.
56+
encoding=sys.getdefaultencoding(),
57+
check=True)
58+
return result.stdout
59+
60+
61+
def git_root(path: Filepath) -> str:
62+
output = _run_command(
63+
"git", "-C", os.fspath(path), "rev-parse", "--show-toplevel")
64+
return output.strip() # Remove trailing newline
65+
66+
67+
def git_ls_files(path: Filepath) -> List[str]:
68+
output = _run_command("git", "-C", os.fspath(path), "ls-files",
69+
# Make sure submodules are included.
70+
"--recurse-submodules")
71+
# Remove trailing newlines and split to output all the paths
72+
return output.strip("\n").split("\n")
73+
74+
75+
def _duplicate_tree(src: Filepath, dest: Filepath
76+
) -> Iterator[Tuple[str, str, bool]]:
77+
"""Traverses src and for each file or directory yields a path to it,
78+
its destination, and whether it is a directory."""
79+
for entry in os.scandir(src): # type: os.DirEntry
80+
if entry.is_dir():
81+
dir_src = entry.path
82+
dir_dest = os.path.join(dest, entry.name)
83+
yield dir_src, dir_dest, True
84+
yield from _duplicate_tree(dir_src, dir_dest)
85+
elif entry.is_file() or entry.is_symlink():
86+
yield entry.path, os.path.join(dest, entry.name), False
87+
else:
88+
warnings.warn(f"Unsupported filetype for copying. "
89+
f"Skipping {entry.path}")
90+
91+
92+
def _duplicate_git_tree(src: Filepath, dest: Filepath
93+
) -> Iterator[Tuple[str, str, bool]]:
94+
"""Traverses src, finds all files registered in git and for each file or
95+
directory yields a path to it, its destination and whether it is a
96+
directory"""
97+
# A set of dirs we have already yielded. '' is the output of
98+
# os.path.dirname when the path is in the current directory.
99+
yielded_dirs: Set[str] = {''}
100+
for path in git_ls_files(src):
101+
# git ls-files does not list directories. Yield parent first to prevent
102+
# creating files in non-existing directories. Also check if it is
103+
# yielded before so each directory is only yielded once.
104+
parent = os.path.dirname(path)
105+
if parent not in yielded_dirs:
106+
# This may be a nested directory, with non-existing parents itself.
107+
# Therefore:
108+
# - List parents from deepest to least deep by using os.path.dirname # noqa: E501
109+
# - Reverse the list to yield directories from least deep to deepest # noqa: E501
110+
# This ensures parents are always yielded before children.
111+
parents = []
112+
while parent not in yielded_dirs:
113+
yielded_dirs.add(parent)
114+
parents.append(parent)
115+
parent = os.path.dirname(parent)
116+
117+
for parent in reversed(parents):
118+
src_parent = os.path.join(src, parent)
119+
dest_parent = os.path.join(dest, parent)
120+
yield src_parent, dest_parent, True
121+
122+
# Yield the actual file now that its parent directories have been yielded.
123+
src_path = os.path.join(src, path)
124+
dest_path = os.path.join(dest, path)
125+
yield src_path, dest_path, False
126+
127+
128+
def duplicate_tree(src: Filepath, dest: Filepath,
129+
symlink: bool = False,
130+
git_aware: bool = False):
131+
"""
132+
Duplicates a filetree
133+
:param src: The source directory
134+
:param dest: The destination directory
135+
:param symlink: Create symlinks instead of copying the files.
136+
:param git_aware: Only copy/symlink files registered by git.
137+
"""
138+
if not symlink and not git_aware:
139+
shutil.copytree(src, dest)
140+
return
141+
142+
if not os.path.isdir(src):
143+
# shutil.copytree also throws a NotADirectoryError
144+
raise NotADirectoryError(f"Not a directory: '{src}'")
145+
146+
if git_aware:
147+
path_iter = _duplicate_git_tree(src, dest)
148+
else:
149+
path_iter = _duplicate_tree(src, dest)
150+
if symlink:
151+
copy: Callable[[Filepath, Filepath], None] = \
152+
functools.partial(os.symlink, target_is_directory=False)
153+
else:
154+
copy = shutil.copy2 # Preserves metadata, also used by shutil.copytree
155+
156+
os.makedirs(dest, exist_ok=False)
157+
for src_path, dest_path, is_dir in path_iter:
158+
if is_dir:
159+
os.mkdir(dest_path)
160+
else:
161+
copy(src_path, dest_path)
162+
163+
164+
def link_tree(src: Filepath, dest: Filepath) -> None:
45165
"""
46166
Copies a tree by mimicking the directory structure and soft-linking the
47167
files
48168
:param src: The source directory
49169
:param dest: The destination directory
50170
"""
51-
if src.is_dir():
52-
dest.mkdir(parents=True)
53-
for path in os.listdir(str(src)):
54-
link_tree(Path(src, path), Path(dest, path))
55-
elif src.is_file() or src.is_symlink():
56-
dest.symlink_to(src, target_is_directory=False)
57-
else: # Only copy files and symlinks, no devices etc.
58-
warnings.warn(f"Unsupported filetype. Skipping copying: '{str(src)}' "
59-
f"to '{str(dest)}'.")
171+
# THIS FUNCTION IS KEPT FOR BACKWARDS-COMPATIBILITY
172+
duplicate_tree(src, dest, symlink=True)
60173

61174

62175
# block_size 64k with python is a few percent faster than linux native md5sum.
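
The ordering logic in `_duplicate_git_tree` yields every directory before the files inside it, because `git ls-files` lists files only. The sketch below is a simplified, hypothetical variant (`dirs_before_files` is not in the commit): it works on relative POSIX-style paths and does no copying, so the order can be inspected directly.

```python
import os
from typing import Iterator, List, Set, Tuple


def dirs_before_files(paths: List[str]) -> Iterator[Tuple[str, bool]]:
    """Yield (path, is_dir) so every directory precedes its contents.

    Simplified, hypothetical variant of the ordering logic: relative paths
    only, nothing is copied.
    """
    seen: Set[str] = {""}  # os.path.dirname("file") is "" for top-level files
    for path in paths:
        parent = os.path.dirname(path)
        missing = []
        while parent not in seen:  # walk up from deepest to least deep
            seen.add(parent)
            missing.append(parent)
            parent = os.path.dirname(parent)
        for directory in reversed(missing):  # least deep first
            yield directory, True
        yield path, False


order = list(dirs_before_files(["setup.py",
                                "src/pytest_workflow/util.py",
                                "src/pytest_workflow/plugin.py"]))
for entry in order:
    print(entry)
```

Each directory is yielded once, so a consumer can call `os.mkdir` on directory entries without checking whether they already exist.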
