From 7aa02b2c61932769af7af12e520dbe5d0979870c Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 20:44:06 -0400 Subject: [PATCH 01/20] pyproject.toml: fix requires-python typo The PEP-621 key is `requires-python` (with the `s`), not `required-python`. Hatch and setuptools silently ignore unknown top-level project keys, so the misspelled line has had no effect since it was written: nothing in the build pipeline has actually been enforcing a minimum Python version. After this change, `python -m build` produces a wheel whose METADATA includes `Requires-Python: >=3.8`; before it produced no such line at all. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- pyproject.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pyproject.toml b/pyproject.toml index 6cf449a..2c529e1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -8,7 +8,7 @@ dependencies = [ # and no longer functions correctly #"gpg >= 1.10.0", ] -required-python = ">= 3.8" +requires-python = ">= 3.8" authors = [ {name = "Joshua Watt", email = "JPEWhacker@gmail.com"}, {name = "Trevor Woerner", email = "twoerner@gmail.com"}, From a60cd6be2cf7f77baee468dff601f16ccf905a39 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 21:25:21 -0400 Subject: [PATCH 02/20] support Python 3.9 through 3.14 Python 3.8 reached end-of-life in October 2024; no upstream security or bug fixes are issued for it any longer. Drop it from the supported set. Python 3.13 (released October 2024) and 3.14 (released October 2025) are now both stable and broadly available; add them to the supported set. The change spans two surfaces that have to move together: - pyproject.toml: requires-python bumps to ">= 3.9" and the trove classifier list loses 3.8 and gains 3.13 and 3.14. - .github/workflows/ci.yml: the test matrix loses 3.8 and gains 3.13 and 3.14, so every interpreter declared as supported is actually exercised by CI. 
The bmaptool source code uses no 3.9+-specific syntax, so the floor change is a packaging and CI declaration only; no source files need to change to honor it. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 3 ++- pyproject.toml | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index e0fdde5..9773232 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -11,11 +11,12 @@ jobs: strategy: matrix: python-version: - - "3.8" - "3.9" - "3.10" - "3.11" - "3.12" + - "3.13" + - "3.14" # Testing with native host python is required in order to test the # GPG code, since it must use the host python3-gpg package - "native" diff --git a/pyproject.toml b/pyproject.toml index 2c529e1..e119044 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -8,7 +8,7 @@ dependencies = [ # and no longer functions correctly #"gpg >= 1.10.0", ] -requires-python = ">= 3.8" +requires-python = ">= 3.9" authors = [ {name = "Joshua Watt", email = "JPEWhacker@gmail.com"}, {name = "Trevor Woerner", email = "twoerner@gmail.com"}, @@ -23,11 +23,12 @@ classifiers = [ "Topic :: Software Development :: Embedded Systems", "License :: OSI Approved :: GNU General Public License v2 (GPLv2)", "Programming Language :: Python :: 3", - "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", ] [project.optional-dependencies] From 40eed8e7f95f93e68530d9364501e70cb50eb1d8 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 21:29:22 -0400 Subject: [PATCH 03/20] pyproject.toml: declare license via PEP-639 SPDX PEP-639 (accepted in 2023, supported by hatchling >= 1.18) introduces a structured way to declare a project's license 
in `pyproject.toml`: an SPDX expression string in the `license` field, plus a `license-files` glob list pointing at the license text in the distribution. Adopt that form: license = "GPL-2.0-only" license-files = ["LICENSE"] The existing trove classifier "License :: OSI Approved :: GNU General Public License v2 (GPLv2)" stays in place. PEP-639 marks license classifiers as deprecated but encourages keeping them during the deprecation overlap so older tooling that has not yet learned to read the SPDX field still identifies the license correctly. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- pyproject.toml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/pyproject.toml b/pyproject.toml index e119044..b644382 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -9,6 +9,8 @@ dependencies = [ #"gpg >= 1.10.0", ] requires-python = ">= 3.9" +license = "GPL-2.0-only" +license-files = ["LICENSE"] authors = [ {name = "Joshua Watt", email = "JPEWhacker@gmail.com"}, {name = "Trevor Woerner", email = "twoerner@gmail.com"}, From 75961fd9b66c94daadbfcf2d1783e64d1ebaaea7 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 21:32:53 -0400 Subject: [PATCH 04/20] pyproject.toml: add maintainers block PEP-621 distinguishes `authors` (people who wrote the project) from `maintainers` (people who currently maintain it). Until now the file only had `authors`, which obscured who is on the hook for bmaptool today. Add a `maintainers` block. Any tool that surfaces a project's maintainer contact information will use it instead of falling back to the `authors` list. 
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- pyproject.toml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/pyproject.toml b/pyproject.toml index b644382..7d31a94 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -17,6 +17,11 @@ authors = [ {name = "Tim Orling", email = "ticotimo@gmail.com"}, ] +maintainers = [ + {name = "Trevor Woerner", email = "twoerner@gmail.com"}, + {name = "Joshua Watt", email = "JPEWhacker@gmail.com"}, + {name = "Tim Orling", email = "ticotimo@gmail.com"}, +] readme = "README.md" classifiers = [ "Development Status :: 5 - Production/Stable", From f30a9931eec706c455bdfba3e90828165e18e122 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 21:41:43 -0400 Subject: [PATCH 05/20] pyproject.toml: add Changelog URL The project keeps a curated release history at the repo root in CHANGELOG.md. Add a `Changelog` entry to `[project.urls]` so any tool that surfaces a project's labeled URLs can point at it directly. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- pyproject.toml | 1 + 1 file changed, 1 insertion(+) diff --git a/pyproject.toml b/pyproject.toml index 7d31a94..bdb6f23 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -48,6 +48,7 @@ dev = [ Homepage = "https://github.com/yoctoproject/bmaptool" Repository = "https://github.com/yoctoproject/bmaptool.git" Issues = "https://github.com/yoctoproject/bmaptool/issues" +Changelog = "https://github.com/yoctoproject/bmaptool/blob/main/CHANGELOG.md" [project.scripts] bmaptool = "bmaptool.CLI:main" From c9f8fc1bf235caae9a587e62e072cc5c9a70373c Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Thu, 21 May 2026 22:06:46 -0400 Subject: [PATCH 06/20] bmaptool: move package version to `__version__` The package version lives in the conventional location: src/bmaptool/__init__.py: __version__ = "..." `bmaptool.CLI` re-exports it as the local `VERSION` constant (`from . 
import __version__ as VERSION`), keeping the existing `bmaptool --version` argparse wiring intact. `[tool.hatch.version]` in pyproject.toml points at `src/bmaptool/__init__.py`. Hatch's default regex source matches the `__version__` line correctly and the built wheel's METADATA carries the same version string. `make_a_release.sh` bumps `__version__` in `src/bmaptool/__init__.py` instead of an unrelated source file; the release flow is otherwise unchanged. `import bmaptool; bmaptool.__version__` now works as any Python user expects, and the version source is no longer buried in an argument-parsing module. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- make_a_release.sh | 4 ++-- pyproject.toml | 2 +- src/bmaptool/CLI.py | 2 +- src/bmaptool/__init__.py | 1 + 4 files changed, 5 insertions(+), 4 deletions(-) diff --git a/make_a_release.sh b/make_a_release.sh index 965c9c1..6160947 100755 --- a/make_a_release.sh +++ b/make_a_release.sh @@ -85,8 +85,8 @@ printf "%s" "$new_ver" | egrep -q -x '[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+' ask_question "Did you update the man page" ask_question "Did you update tests: test-data and oldcodebase" -# Change the version in the 'bmaptool/CLI.py' file -sed -i -e "s/^VERSION = \"[0-9]\+\.[0-9]\+\.[0-9]\+\"$/VERSION = \"$new_ver\"/" src/bmaptool/CLI.py +# Change the version in the package +sed -i -e "s/^__version__ = \"[0-9]\+\.[0-9]\+\.[0-9]\+\"$/__version__ = \"$new_ver\"/" src/bmaptool/__init__.py # Sed the version in the RPM spec file sed -i -e "s/^Version: [0-9]\+\.[0-9]\+\.[0-9]\+$/Version: $new_ver/" packaging/bmaptool.spec # Remove the "rc_num" macro from the RPM spec file to make sure we do not have diff --git a/pyproject.toml b/pyproject.toml index bdb6f23..38f8e71 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -58,4 +58,4 @@ requires = ["hatchling"] build-backend = "hatchling.build" [tool.hatch.version] -path = "src/bmaptool/CLI.py" +path = "src/bmaptool/__init__.py" diff --git 
a/src/bmaptool/CLI.py b/src/bmaptool/CLI.py index 40acf31..9183478 100644 --- a/src/bmaptool/CLI.py +++ b/src/bmaptool/CLI.py @@ -45,7 +45,7 @@ from typing import NamedTuple from . import BmapCreate, BmapCopy, BmapHelpers, TransRead -VERSION = "3.9.0" +from . import __version__ as VERSION log = logging.getLogger() # pylint: disable=C0103 diff --git a/src/bmaptool/__init__.py b/src/bmaptool/__init__.py index e69de29..fcd7ddb 100644 --- a/src/bmaptool/__init__.py +++ b/src/bmaptool/__init__.py @@ -0,0 +1 @@ +__version__ = "3.9.0" From d8e96db55b5ce9c79306554723a1e13b36c916bc Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 00:16:36 -0400 Subject: [PATCH 07/20] tests: remove backward-compat test of historical BmapCopy modules `tests/test_compat.py` has two halves: a forward-compat check that the current `BmapCopy` can read every historical bmap fixture in `tests/test-data/`, and a backward-compat check that *historical* `BmapCopy` modules under `tests/oldcodebase/` can read current fixtures. The backward-compat half is a 2026 anachronism: a user running bmaptool 3.x does not run BmapCopy 1.0 against today's bmap files, and the museum modules under `tests/oldcodebase/` are ~6,500 lines of pre-Python-3 code that drags in `six` as the only reason it stays in the dev extras. The forward-compat half (current code, all historical fixtures) is preserved unchanged. `tests/oldcodebase/` is removed entirely. `tests/test_compat.py` loses `_test_older_bmapcopy()`, its inner `import_module` helper, the `_OLDCODEBASE_SUBDIR` constant, and the docstring line that advertised the backward-compat behavior. `six` is removed from `[project.optional-dependencies].dev` in pyproject.toml; no production code or surviving test imports it. `make_a_release.sh`'s pre-release reminder ("Did you update tests: test-data and oldcodebase") drops the `oldcodebase` clause, since there is no longer such a directory to update. 
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- make_a_release.sh | 2 +- pyproject.toml | 1 - tests/oldcodebase/BmapCopy1_0.py | 744 ---------------------------- tests/oldcodebase/BmapCopy2_0.py | 670 ------------------------- tests/oldcodebase/BmapCopy2_1.py | 669 ------------------------- tests/oldcodebase/BmapCopy2_2.py | 671 ------------------------- tests/oldcodebase/BmapCopy2_3.py | 708 --------------------------- tests/oldcodebase/BmapCopy2_4.py | 708 --------------------------- tests/oldcodebase/BmapCopy2_5.py | 769 ----------------------------- tests/oldcodebase/BmapCopy2_6.py | 769 ----------------------------- tests/oldcodebase/BmapCopy3_0.py | 814 ------------------------------- tests/oldcodebase/__init__.py | 0 tests/test_compat.py | 72 +-- 13 files changed, 3 insertions(+), 6594 deletions(-) delete mode 100644 tests/oldcodebase/BmapCopy1_0.py delete mode 100644 tests/oldcodebase/BmapCopy2_0.py delete mode 100644 tests/oldcodebase/BmapCopy2_1.py delete mode 100644 tests/oldcodebase/BmapCopy2_2.py delete mode 100644 tests/oldcodebase/BmapCopy2_3.py delete mode 100644 tests/oldcodebase/BmapCopy2_4.py delete mode 100644 tests/oldcodebase/BmapCopy2_5.py delete mode 100644 tests/oldcodebase/BmapCopy2_6.py delete mode 100644 tests/oldcodebase/BmapCopy3_0.py delete mode 100644 tests/oldcodebase/__init__.py diff --git a/make_a_release.sh b/make_a_release.sh index 6160947..259241e 100755 --- a/make_a_release.sh +++ b/make_a_release.sh @@ -83,7 +83,7 @@ printf "%s" "$new_ver" | egrep -q -x '[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+' # Remind the maintainer about various important things ask_question "Did you update the man page" -ask_question "Did you update tests: test-data and oldcodebase" +ask_question "Did you update tests: test-data" # Change the version in the package sed -i -e "s/^__version__ = \"[0-9]\+\.[0-9]\+\.[0-9]\+\"$/__version__ = \"$new_ver\"/" src/bmaptool/__init__.py diff --git a/pyproject.toml b/pyproject.toml 
index 38f8e71..8898a0f 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -41,7 +41,6 @@ classifiers = [ [project.optional-dependencies] dev = [ "black >= 22.3.0", - "six >= 1.16.0", ] [project.urls] diff --git a/tests/oldcodebase/BmapCopy1_0.py b/tests/oldcodebase/BmapCopy1_0.py deleted file mode 100644 index d381128..0000000 --- a/tests/oldcodebase/BmapCopy1_0.py +++ /dev/null @@ -1,744 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. 
With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# * Too many statements (R0915) -# * Too many branches (R0912) -# pylint: disable=R0902 -# pylint: disable=R0915 -# pylint: disable=R0912 - -import os -import stat -import sys -import hashlib -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# A list of supported image formats -SUPPORTED_IMAGE_FORMATS = ("bz2", "gz", "tar.gz", "tgz", "tar.bz2") - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file-like object of the destination file copy the image to - * full path or a file-like object of the bmap file (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - The image file may either be an uncompressed raw image or a compressed - image. 
Compression type is defined by the image file extension. Supported - types are listed by 'SUPPORTED_IMAGE_FORMATS'. - - IMPORTANT: if the image is given as a file-like object, the compression - type recognition is not performed - the file-like object's 'read()' method - is used directly instead. - - Once an instance of 'BmapCopy' is created, all the 'bmap_*' attributes are - initialized and available. They are read from the bmap. - - However, if bmap was not provided, this is not always the case and some of - the 'bmap_*' attributes are not initialize by the class constructor. - Instead, they are initialized only in the 'copy()' method. The reason for - this is that when bmap is absent, 'BmapCopy' uses sensible fall-back values - for the 'bmap_*' attributes assuming the entire image is "mapped". And if - the image is compressed, it cannot easily find out the image size. Thus, - this is postponed until the 'copy()' method decompresses the image for the - first time. - - The 'copy()' method implements the copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'.""" - - def _initialize_sizes(self, image_size): - """This function is only used when the there is no bmap. It - initializes attributes like 'blocks_cnt', 'mapped_cnt', etc. 
Normally, - the values are read from the bmap file, but in this case they are just - set to something reasonable.""" - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _parse_bmap(self): - """Parse the bmap file and initialize the 'bmap_*' attributes.""" - - bmap_pos = self._f_bmap.tell() - self._f_bmap.seek(0) - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - major = int(self.bmap_version.split(".", 1)[0]) - if major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" % (SUPPORTED_BMAP_VERSION, major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - self._f_bmap.seek(bmap_pos) - - def 
_open_image_file(self): - """Open the image file which may be compressed or not. The compression - type is recognized by the file extension. Supported types are defined - by 'SUPPORTED_IMAGE_FORMATS'.""" - - try: - is_regular_file = stat.S_ISREG(os.stat(self._image_path).st_mode) - except OSError as err: - raise Error( - "cannot access image file '%s': %s" % (self._image_path, err.strerror) - ) - - if not is_regular_file: - raise Error("image file '%s' is not a regular file" % self._image_path) - - try: - if ( - self._image_path.endswith(".tar.gz") - or self._image_path.endswith(".tar.bz2") - or self._image_path.endswith(".tgz") - ): - import tarfile - - tar = tarfile.open(self._image_path, "r") - # The tarball is supposed to contain only one single member - members = tar.getnames() - if len(members) > 1: - raise Error( - "the image tarball '%s' contains more than " - "one file" % self._image_path - ) - elif len(members) == 0: - raise Error( - "the image tarball '%s' is empty (no files)" % self._image_path - ) - self._f_image = tar.extractfile(members[0]) - elif self._image_path.endswith(".gz"): - import gzip - - self._f_image = gzip.GzipFile(self._image_path, "rb") - elif self._image_path.endswith(".bz2"): - import bz2 - - self._f_image = bz2.BZ2File(self._image_path, "rb") - else: - self._image_is_compressed = False - self._f_image = open(self._image_path, "rb") - except IOError as err: - raise Error("cannot open image file '%s': %s" % (self._image_path, err)) - - self._f_image_needs_close = True - - def _validate_image_size(self): - """Make sure that image size from bmap matches real image size.""" - - image_size = os.fstat(self._f_image.fileno()).st_size - if image_size != self.image_size: - raise Error( - "Size mismatch, bmap '%s' was created for an image " - "of size %d bytes, but image '%s' has size %d bytes" - % (self._bmap_path, self.image_size, self._image_path, image_size) - ) - - def _open_destination_file(self): - """Open the destination file.""" - - 
try: - self._f_dest = open(self._dest_path, "w") - except IOError as err: - raise Error( - "cannot open destination file '%s': %s" % (self._dest_path, err) - ) - - self._f_dest_needs_close = True - - def _open_bmap_file(self): - """Open the bmap file.""" - - try: - self._f_bmap = open(self._bmap_path, "r") - except IOError as err: - raise Error( - "cannot open bmap file '%s': %s" % (self._bmap_path, err.strerror) - ) - - self._f_bmap_needs_close = True - - def __init__(self, image, dest, bmap=None): - """The class constructor. The parameters are: - image - full path or file object of the image which should be copied - dest - full path or file-like object of the destination file to - copy the image to - bmap - full path or file-like object of the bmap file to use for - copying""" - - self._xml = None - self._image_is_compressed = True - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_dest_needs_close = False - self._f_image_needs_close = False - self._f_bmap_needs_close = False - - self._f_bmap = None - self._f_bmap_path = None - - if hasattr(dest, "write"): - self._f_dest = dest - self._dest_path = dest.name - else: - self._dest_path = dest - self._open_destination_file() - - if hasattr(image, "read"): - self._f_image = image - self._image_path = image.name - else: - self._image_path = image - self._open_image_file() - - st_mode = os.fstat(self._f_dest.fileno()).st_mode - self._dest_is_regfile = stat.S_ISREG(st_mode) - - if bmap: - if hasattr(bmap, "read"): - self._f_bmap = bmap - self._bmap_path = bmap.name - else: - self._bmap_path = bmap - self._open_bmap_file() - self._parse_bmap() - else: - # There is 
no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - # We can initialize size-related attributes only if we the image is - # uncompressed. - if not self._image_is_compressed: - image_size = os.fstat(self._f_image.fileno()).st_size - self._initialize_sizes(image_size) - - if not self._image_is_compressed: - self._validate_image_size() - - self._batch_blocks = self._batch_bytes / self.block_size - - def __del__(self): - """The class destructor which closes the opened files.""" - - if self._f_image_needs_close: - self._f_image.close() - if self._f_dest_needs_close: - self._f_dest.close() - if self._f_bmap_needs_close: - self._f_bmap.close() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown (the image is - compressed), the generator infinitely yields continuous ranges of - size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. 
- first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. - split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. 
Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % (start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s" - % (first, last, hash_obj.hexdigest(), sha1) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. 
- reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The sync - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Save file positions in order to restore them at the end - image_pos = self._f_image.tell() - dest_pos = self._f_dest.tell() - if self._f_bmap: - bmap_pos = self._f_bmap.tell() - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - if not self.image_size: - # The image size was unknown up until now, probably because this is - # a compressed image. Initialize the corresponding class attributes - # now, when we know the size. - self._initialize_sizes(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. 
- if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks, but should have %u - inconsistent " - "bmap file" % (blocks_written, self.mapped_cnt) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - # Restore file positions - self._f_image.seek(image_pos) - self._f_dest.seek(dest_pos) - if self._f_bmap: - self._f_bmap.seek(bmap_pos) - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): - """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. 
Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _open_destination_file(self): - """Open the block device in exclusive mode.""" - - try: - self._f_dest = os.open(self._dest_path, os.O_WRONLY | os.O_EXCL) - except OSError as err: - raise Error( - "cannot open block device '%s' in exclusive mode: %s" - % (self._dest_path, err.strerror) - ) - - try: - os.fstat(self._f_dest).st_mode - except OSError as err: - raise Error( - "cannot access block device '%s': %s" % (self._dest_path, err.strerror) - ) - - # Turn the block device file descriptor into a file object - try: - self._f_dest = os.fdopen(self._f_dest, "wb") - except OSError as err: - os.close(self._f_dest) - raise Error("cannot open block device '%s': %s" % (self._dest_path, err)) - - self._f_dest_needs_close = True - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError: - # No problem, this is just an optimization. - return - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". 
- # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError: - return - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError: - # No problem, this is just an optimization. - return - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError: - return - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - try: - self._tune_block_device() - BmapCopy.copy(self, sync, verify) - except: - self._restore_bdev_settings() - raise - - def __init__(self, image, dest, bmap=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known (i.e., it is not compressed) - check that - # it fits the block device. - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- try: - if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - except OSError: - # No problem, this is just an optimization. - pass - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_0.py b/tests/oldcodebase/BmapCopy2_0.py deleted file mode 100644 index bfd085e..0000000 --- a/tests/oldcodebase/BmapCopy2_0.py +++ /dev/null @@ -1,670 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. 
Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file-like object of the destination file copy the image to - * full path or a file-like object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. 
Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance.""" - - def set_progress_indicator(self, file_obj, format_string): - """Setup the progress indicator which shows how much data has been - copied in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. 
It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent.""" - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """Set image size and initialize various other geometry-related - attributes.""" - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _parse_bmap(self): - """Parse the bmap file and initialize corresponding class instance - attributs.""" - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - major = int(self.bmap_version.split(".", 1)[0]) - if major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" % (SUPPORTED_BMAP_VERSION, major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * 
self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - def __init__(self, image, dest, bmap=None, image_size=None): - """The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file-like object of the destination file to copy the - image to. - bmap - file-like object of the bmap file to use for copying. - image_size - size of the image in bytes.""" - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - 
self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def _update_progress(self, blocks_written): - """Print the progress indicator if the mapped area size is known and - if the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute.""" - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % 
(start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The 'sync' - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - inconsistent bmap file '%s'" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): 
- """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - # No problem, this is just an optimization - raise Error("cannot enable the 'noop' I/O scheduler: %s" % err) - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - raise Error("cannot set max. 
I/O ratio to '1': %s" % err) - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_1.py b/tests/oldcodebase/BmapCopy2_1.py deleted file mode 100644 index 77c6a2e..0000000 --- a/tests/oldcodebase/BmapCopy2_1.py +++ /dev/null @@ -1,669 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. 
Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file-like object of the destination file copy the image to - * full path or a file-like object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. 
Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance.""" - - def set_progress_indicator(self, file_obj, format_string): - """Setup the progress indicator which shows how much data has been - copied in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. 
It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent.""" - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """Set image size and initialize various other geometry-related - attributes.""" - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = (self.image_size + self.block_size - 1) // self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _parse_bmap(self): - """Parse the bmap file and initialize corresponding class instance - attributs.""" - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - major = int(self.bmap_version.split(".", 1)[0]) - if major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" % (SUPPORTED_BMAP_VERSION, major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size 
- self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) // self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - def __init__(self, image, dest, bmap=None, image_size=None): - """The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file-like object of the destination file to copy the - image to. - bmap - file-like object of the bmap file to use for copying. - image_size - size of the image in bytes.""" - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - 
self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes // self.block_size - - def _update_progress(self, blocks_written): - """Print the progress indicator if the mapped area size is known and - if the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute.""" - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % 
(start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) // self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The 'sync' - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - inconsistent bmap file '%s'" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): 
- """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - # No problem, this is just an optimization - raise Error("cannot enable the 'noop' I/O scheduler: %s" % err) - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - raise Error("cannot set max. 
I/O ratio to '1': %s" % err) - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes // self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) // self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_2.py b/tests/oldcodebase/BmapCopy2_2.py deleted file mode 100644 index 89235f2..0000000 --- a/tests/oldcodebase/BmapCopy2_2.py +++ /dev/null @@ -1,671 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. 
Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file-like object of the destination file copy the image to - * full path or a file-like object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. 
Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance.""" - - def set_progress_indicator(self, file_obj, format_string): - """Setup the progress indicator which shows how much data has been - copied in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. 
It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent.""" - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """Set image size and initialize various other geometry-related - attributes.""" - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _parse_bmap(self): - """Parse the bmap file and initialize corresponding class instance - attributs.""" - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - major = int(self.bmap_version.split(".", 1)[0]) - if major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" % (SUPPORTED_BMAP_VERSION, major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * 
self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - def __init__(self, image, dest, bmap=None, image_size=None): - """The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file-like object of the destination file to copy the - image to. - bmap - file-like object of the bmap file to use for copying. - image_size - size of the image in bytes.""" - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - 
self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def _update_progress(self, blocks_written): - """Print the progress indicator if the mapped area size is known and - if the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute.""" - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % 
(start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The 'sync' - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - inconsistent bmap file '%s'" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): 
- """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - # No problem, this is just an optimization - raise Error("cannot enable the 'noop' I/O scheduler: %s" % err) - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - raise Error("cannot set max. 
I/O ratio to '1': %s" % err) - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_3.py b/tests/oldcodebase/BmapCopy2_3.py deleted file mode 100644 index 21097e2..0000000 --- a/tests/oldcodebase/BmapCopy2_3.py +++ /dev/null @@ -1,708 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. 
Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file object of the destination file copy the image to - * full path or a file object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. 
Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance.""" - - def set_progress_indicator(self, file_obj, format_string): - """Setup the progress indicator which shows how much data has been - copied in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. 
It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent.""" - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """Set image size and initialize various other geometry-related - attributes.""" - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _verify_bmap_checksum(self): - """This is a helper function which verifies SHA1 checksum of the bmap - file.""" - - import mmap - - correct_sha1 = self._xml.find("BmapFileSHA1").text.strip() - - # Before verifying the shecksum, we have to substitute the SHA1 value - # stored in the file with all zeroes. For these purposes we create - # private memory mapping of the bmap file. 
- mapped_bmap = mmap.mmap(self._f_bmap.fileno(), 0, access=mmap.ACCESS_COPY) - - sha1_pos = mapped_bmap.find(correct_sha1) - assert sha1_pos != -1 - - mapped_bmap[sha1_pos : sha1_pos + 40] = "0" * 40 - calculated_sha1 = hashlib.sha1(mapped_bmap).hexdigest() - - mapped_bmap.close() - - if calculated_sha1 != correct_sha1: - raise Error( - "checksum mismatch for bmap file '%s': calculated " - "'%s', should be '%s'" - % (self._bmap_path, calculated_sha1, correct_sha1) - ) - - def _parse_bmap(self): - """Parse the bmap file and initialize corresponding class instance - attributs.""" - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - self.bmap_version_major = int(self.bmap_version.split(".", 1)[0]) - self.bmap_version_minor = int(self.bmap_version.split(".", 1)[1]) - if self.bmap_version_major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" - % (SUPPORTED_BMAP_VERSION, self.bmap_version_major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks 
count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - if self.bmap_version_major >= 1 and self.bmap_version_minor >= 3: - # Bmap file checksum appeard in format 1.3 - self._verify_bmap_checksum() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file object of the destination file to copy the image - to. - bmap - file object of the bmap file to use for copying. - image_size - size of the image in bytes.""" - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.bmap_version_major = None - self.bmap_version_minor = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - 
else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def _update_progress(self, blocks_written): - """Print the progress indicator if the mapped area size is known and - if the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute.""" - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % 
(start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The 'sync' - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - bmap file '%s' does not belong to this" - "image" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class 
BmapBdevCopy(BmapCopy): - """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - # No problem, this is just an optimization - raise Error("cannot enable the 'noop' I/O scheduler: %s" % err) - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - raise Error("cannot set max. 
I/O ratio to '1': %s" % err) - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_4.py b/tests/oldcodebase/BmapCopy2_4.py deleted file mode 100644 index 21097e2..0000000 --- a/tests/oldcodebase/BmapCopy2_4.py +++ /dev/null @@ -1,708 +0,0 @@ -# pylint: disable-all - -"""This module implements copying of images with bmap and provides the -following API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. 
Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied.""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors.""" - - pass - - -class BmapCopy: - """This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file object of the destination file copy the image to - * full path or a file object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. 
Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance.""" - - def set_progress_indicator(self, file_obj, format_string): - """Setup the progress indicator which shows how much data has been - copied in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. 
It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent.""" - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """Set image size and initialize various other geometry-related - attributes.""" - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _verify_bmap_checksum(self): - """This is a helper function which verifies SHA1 checksum of the bmap - file.""" - - import mmap - - correct_sha1 = self._xml.find("BmapFileSHA1").text.strip() - - # Before verifying the shecksum, we have to substitute the SHA1 value - # stored in the file with all zeroes. For these purposes we create - # private memory mapping of the bmap file. 
- mapped_bmap = mmap.mmap(self._f_bmap.fileno(), 0, access=mmap.ACCESS_COPY) - - sha1_pos = mapped_bmap.find(correct_sha1) - assert sha1_pos != -1 - - mapped_bmap[sha1_pos : sha1_pos + 40] = "0" * 40 - calculated_sha1 = hashlib.sha1(mapped_bmap).hexdigest() - - mapped_bmap.close() - - if calculated_sha1 != correct_sha1: - raise Error( - "checksum mismatch for bmap file '%s': calculated " - "'%s', should be '%s'" - % (self._bmap_path, calculated_sha1, correct_sha1) - ) - - def _parse_bmap(self): - """Parse the bmap file and initialize corresponding class instance - attributs.""" - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - self.bmap_version_major = int(self.bmap_version.split(".", 1)[0]) - self.bmap_version_minor = int(self.bmap_version.split(".", 1)[1]) - if self.bmap_version_major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" - % (SUPPORTED_BMAP_VERSION, self.bmap_version_major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks 
count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - if self.bmap_version_major >= 1 and self.bmap_version_minor >= 3: - # Bmap file checksum appeard in format 1.3 - self._verify_bmap_checksum() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file object of the destination file to copy the image - to. - bmap - file object of the bmap file to use for copying. - image_size - size of the image in bytes.""" - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.bmap_version_major = None - self.bmap_version_minor = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - 
else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def _update_progress(self, blocks_written): - """Print the progress indicator if the mapped area size is known and - if the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute.""" - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """This is a helper generator that parses the bmap XML file and for - each block range in the XML file it yields ('first', 'last', 'sha1') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'.""" - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """This is a helper generator which splits block ranges from the bmap - file to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1).""" - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """This is generator which reads the image file in '_batch_blocks' - chunks and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data.""" - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % 
(start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """Copy the image to the destination file using bmap. The 'sync' - argument defines whether the destination file has to be synchronized - upon return. The 'verify' argument defines whether the SHA1 checksum - has to be verified while copying.""" - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - bmap file '%s' does not belong to this" - "image" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """Synchronize the destination file to make sure all the data are - actually written to the disk.""" - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class 
BmapBdevCopy(BmapCopy): - """This class is a specialized version of 'BmapCopy' which copies the - image to a block device. Unlike the base 'BmapCopy' class, this class does - various optimizations specific to block devices, e.g., switching to the - 'noop' I/O scheduler.""" - - def _tune_block_device(self): - """ " Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - # No problem, this is just an optimization - raise Error("cannot enable the 'noop' I/O scheduler: %s" % err) - - # The file contains a list of scheduler with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the current scheduler name - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - raise Error("cannot set max. 
I/O ratio to '1': %s" % err) - - def _restore_bdev_settings(self): - """Restore old block device settings which we changed in - '_tune_block_device()'.""" - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. 
This is - very bad user experience, and we work around this effect by - synchronizing from time to time.""" - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None): - """The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices.""" - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. 
- if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_5.py b/tests/oldcodebase/BmapCopy2_5.py deleted file mode 100644 index 37b5f48..0000000 --- a/tests/oldcodebase/BmapCopy2_5.py +++ /dev/null @@ -1,769 +0,0 @@ -# pylint: disable-all - -# Copyright (c) 2012-2013 Intel, Inc. -# -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License, version 2, -# as published by the Free Software Foundation. -# -# This program is distributed in the hope that it will be useful, but -# WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -# General Public License for more details. - -""" -This module implements copying of images with bmap and provides the following -API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. 
-The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied. -""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import logging -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """ - A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors. - """ - - pass - - -class BmapCopy: - """ - This class implements the bmap-based copying functionality. 
To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file object of the destination file copy the image to - * full path or a file object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. - - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. 
This means that in - order to copy the image for the second time, you have to create a new class - instance. - """ - - def set_progress_indicator(self, file_obj, format_string): - """ - Setup the progress indicator which shows how much data has been copied - in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent. - """ - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """ - Set image size and initialize various other geometry-related attributes. - """ - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _verify_bmap_checksum(self): - """ - This is a helper function which verifies SHA1 checksum of the bmap file. - """ - - import mmap - - correct_sha1 = self._xml.find("BmapFileSHA1").text.strip() - - # Before verifying the shecksum, we have to substitute the SHA1 value - # stored in the file with all zeroes. For these purposes we create - # private memory mapping of the bmap file. 
- mapped_bmap = mmap.mmap(self._f_bmap.fileno(), 0, access=mmap.ACCESS_COPY) - - sha1_pos = mapped_bmap.find(correct_sha1) - assert sha1_pos != -1 - - mapped_bmap[sha1_pos : sha1_pos + 40] = "0" * 40 - calculated_sha1 = hashlib.sha1(mapped_bmap).hexdigest() - - mapped_bmap.close() - - if calculated_sha1 != correct_sha1: - raise Error( - "checksum mismatch for bmap file '%s': calculated " - "'%s', should be '%s'" - % (self._bmap_path, calculated_sha1, correct_sha1) - ) - - def _parse_bmap(self): - """ - Parse the bmap file and initialize corresponding class instance attributs. - """ - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - self.bmap_version_major = int(self.bmap_version.split(".", 1)[0]) - self.bmap_version_minor = int(self.bmap_version.split(".", 1)[1]) - if self.bmap_version_major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" - % (SUPPORTED_BMAP_VERSION, self.bmap_version_major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - 
"blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - if self.bmap_version_major >= 1 and self.bmap_version_minor >= 3: - # Bmap file checksum appeard in format 1.3 - self._verify_bmap_checksum() - - def __init__(self, image, dest, bmap=None, image_size=None, logger=None): - """ - The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file object of the destination file to copy the image - to. - bmap - file object of the bmap file to use for copying. - image_size - size of the image in bytes. - logger - the logger object to use for printing messages. - """ - - self._logger = logger - if self._logger is None: - self._logger = logging.getLogger(__name__) - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.bmap_version_major = None - self.bmap_version_minor = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and 
os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def _update_progress(self, blocks_written): - """ - Print the progress indicator if the mapped area size is known and if - the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute. - """ - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """ - This is a helper generator that parses the bmap XML file and for each - block range in the XML file it yields ('first', 'last', 'sha1') tuples, - where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'. - """ - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """ - This is a helper generator which splits block ranges from the bmap file - to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1). - """ - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """ - This is generator which reads the image file in '_batch_blocks' chunks - and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data. 
- """ - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % (start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """ - Copy the image to the destination file using bmap. The 'sync' argument - defines whether the destination file has to be synchronized upon - return. The 'verify' argument defines whether the SHA1 checksum has to - be verified while copying. - """ - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. 
- self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. - exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. 
- if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - bmap file '%s' does not belong to this" - "image" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """ - Synchronize the destination file to make sure all the data are actually - written to the disk. - """ - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): - """ - This class is a specialized version of 'BmapCopy' which copies the image to - a block device. Unlike the base 'BmapCopy' class, this class does various - optimizations specific to block devices, e.g., switching to the 'noop' I/O - scheduler. - """ - - def _tune_block_device(self): - """ - Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. 
- """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - self._logger.warning( - "failed to enable I/O optimization, expect " - "suboptimal speed (reason: cannot switch " - "to the 'noop' I/O scheduler: %s)" % err - ) - else: - # The file contains a list of schedulers with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the name of the current scheduler. - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering, because we do not need too much of it when - # writing sequntially. Excessive buffering makes some systems not very - # responsive, e.g., this was observed in Fedora 17. - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - self._logger.warning( - "failed to disable excessive buffering, " - "expect worse system responsiveness " - "(reason: cannot set max. I/O ratio to " - "1: %s)" % err - ) - - def _restore_bdev_settings(self): - """ - Restore old block device settings which we changed in - '_tune_block_device()'. - """ - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. 
I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """ - The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. This is - very bad user experience, and we work around this effect by - synchronizing from time to time. - """ - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() - - def __init__(self, image, dest, bmap=None, image_size=None, logger=None): - """ - The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices. 
- """ - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size, logger=logger) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. - if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" diff --git a/tests/oldcodebase/BmapCopy2_6.py b/tests/oldcodebase/BmapCopy2_6.py deleted file mode 100644 index 9bbdc4d..0000000 --- a/tests/oldcodebase/BmapCopy2_6.py +++ /dev/null @@ -1,769 +0,0 @@ -# pylint: disable-all - -# Copyright (c) 2012-2013 Intel, Inc. 
-# -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License, version 2, -# as published by the Free Software Foundation. -# -# This program is distributed in the hope that it will be useful, but -# WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -# General Public License for more details. - -""" -This module implements copying of images with bmap and provides the following -API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. 
With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied. -""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import logging -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """ - A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors. - """ - - pass - - -class BmapCopy: - """ - This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file object of the destination file copy the image to - * full path or a file object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. 
- - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the SHA1 checksum while copying or not. Note, this is done only in - case of bmap-based copying and only if bmap contains the SHA1 checksums - (e.g., bmap version 1.0 did not have SHA1 checksums). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance. - """ - - def __init__(self, image, dest, bmap=None, image_size=None, logger=None): - """ - The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file object of the destination file to copy the image - to. - bmap - file object of the bmap file to use for copying. - image_size - size of the image in bytes. - logger - the logger object to use for printing messages. 
- """ - - self._logger = logger - if self._logger is None: - self._logger = logging.getLogger(__name__) - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 2 - - self.bmap_version = None - self.bmap_version_major = None - self.bmap_version_minor = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. - self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def set_progress_indicator(self, file_obj, format_string): - """ - Setup the progress indicator which shows how much data has been copied - in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. 
Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent. - """ - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """ - Set image size and initialize various other geometry-related attributes. - """ - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _verify_bmap_checksum(self): - """ - This is a helper function which verifies SHA1 checksum of the bmap file. - """ - - import mmap - - correct_sha1 = self._xml.find("BmapFileSHA1").text.strip() - - # Before verifying the shecksum, we have to substitute the SHA1 value - # stored in the file with all zeroes. For these purposes we create - # private memory mapping of the bmap file. 
- mapped_bmap = mmap.mmap(self._f_bmap.fileno(), 0, access=mmap.ACCESS_COPY) - - sha1_pos = mapped_bmap.find(correct_sha1) - assert sha1_pos != -1 - - mapped_bmap[sha1_pos : sha1_pos + 40] = "0" * 40 - calculated_sha1 = hashlib.sha1(mapped_bmap).hexdigest() - - mapped_bmap.close() - - if calculated_sha1 != correct_sha1: - raise Error( - "checksum mismatch for bmap file '%s': calculated " - "'%s', should be '%s'" - % (self._bmap_path, calculated_sha1, correct_sha1) - ) - - def _parse_bmap(self): - """ - Parse the bmap file and initialize corresponding class instance attributs. - """ - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s" % (self._bmap_path, err) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - self.bmap_version_major = int(self.bmap_version.split(".", 1)[0]) - self.bmap_version_minor = int(self.bmap_version.split(".", 1)[1]) - if self.bmap_version_major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" - % (SUPPORTED_BMAP_VERSION, self.bmap_version_major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - 
"blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - if self.bmap_version_major >= 1 and self.bmap_version_minor >= 3: - # Bmap file checksum appeard in format 1.3 - self._verify_bmap_checksum() - - def _update_progress(self, blocks_written): - """ - Print the progress indicator if the mapped area size is known and if - the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute. - """ - - if not self._progress_file: - return - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. 
- if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """ - This is a helper generator that parses the bmap XML file and for each - block range in the XML file it yields ('first', 'last', 'sha1') tuples, - where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'sha1' is the SHA1 checksum of the range ('None' is used if it is - missing. - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'. - """ - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. 
- split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if "sha1" in xml_element.attrib: - sha1 = xml_element.attrib["sha1"] - else: - sha1 = None - - yield (first, last, sha1) - - def _get_batches(self, first, last): - """ - This is a helper generator which splits block ranges from the bmap file - to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1). - """ - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """ - This is generator which reads the image file in '_batch_blocks' chunks - and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data. 
- """ - - try: - for first, last, sha1 in self._get_block_ranges(): - if verify and sha1: - hash_obj = hashlib.new("sha1") - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % (start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and sha1: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and sha1 and hash_obj.hexdigest() != sha1: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), sha1, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. - reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """ - Copy the image to the destination file using bmap. The 'sync' argument - defines whether the destination file has to be synchronized upon - return. The 'verify' argument defines whether the SHA1 checksum has to - be verified while copying. - """ - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. 
- self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. - exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. 
- if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - bmap file '%s' does not belong to this " - "image" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """ - Synchronize the destination file to make sure all the data are actually - written to the disk. - """ - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): - """ - This class is a specialized version of 'BmapCopy' which copies the image to - a block device. Unlike the base 'BmapCopy' class, this class does various - optimizations specific to block devices, e.g., switching to the 'noop' I/O - scheduler. - """ - - def __init__(self, image, dest, bmap=None, image_size=None, logger=None): - """ - The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices. 
- """ - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size, logger=logger) - - self._batch_bytes = 1024 * 1024 - self._batch_blocks = self._batch_bytes / self.block_size - self._batch_queue_len = 6 - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. - if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" - - def _tune_block_device(self): - """ - Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. 
Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - self._logger.warning( - "failed to enable I/O optimization, expect " - "suboptimal speed (reason: cannot switch " - "to the 'noop' I/O scheduler: %s)" % err - ) - else: - # The file contains a list of schedulers with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the name of the current scheduler. - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering, because we do not need too much of it when - # writing sequntially. Excessive buffering makes some systems not very - # responsive, e.g., this was observed in Fedora 17. - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - self._logger.warning( - "failed to disable excessive buffering, " - "expect worse system responsiveness " - "(reason: cannot set max. I/O ratio to " - "1: %s)" % err - ) - - def _restore_bdev_settings(self): - """ - Restore old block device settings which we changed in - '_tune_block_device()'. 
- """ - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """ - The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. This is - very bad user experience, and we work around this effect by - synchronizing from time to time. - """ - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() diff --git a/tests/oldcodebase/BmapCopy3_0.py b/tests/oldcodebase/BmapCopy3_0.py deleted file mode 100644 index 81eed59..0000000 --- a/tests/oldcodebase/BmapCopy3_0.py +++ /dev/null @@ -1,814 +0,0 @@ -# pylint: disable-all - -# Copyright (c) 2012-2013 Intel, Inc. -# -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License, version 2, -# as published by the Free Software Foundation. 
-# -# This program is distributed in the hope that it will be useful, but -# WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -# General Public License for more details. - -""" -This module implements copying of images with bmap and provides the following -API. - 1. BmapCopy class - implements copying to any kind of file, be that a block - device or a regular file. - 2. BmapBdevCopy class - based on BmapCopy and specializes on copying to block - devices. It does some more sanity checks and some block device performance - tuning. - -The bmap file is an XML file which contains a list of mapped blocks of the -image. Mapped blocks are the blocks which have disk sectors associated with -them, as opposed to holes, which are blocks with no associated disk sectors. In -other words, the image is considered to be a sparse file, and bmap basically -contains a list of mapped blocks of this sparse file. The bmap additionally -contains some useful information like block size (usually 4KiB), image size, -mapped blocks count, etc. - -The bmap is used for copying the image to a block device or to a regular file. -The idea is that we copy quickly with bmap because we copy only mapped blocks -and ignore the holes, because they are useless. And if the image is generated -properly (starting with a huge hole and writing all the data), it usually -contains only little mapped blocks, comparing to the overall image size. And -such an image compresses very well (because holes are read as all zeroes), so -it is beneficial to distributor them as compressed files along with the bmap. - -Here is an example. Suppose you have a 4GiB image which contains only 100MiB of -user data and you need to flash it to a slow USB stick. With bmap you end up -copying only a little bit more than 100MiB of data from the image to the USB -stick (namely, you copy only mapped blocks). This is a lot faster than copying -all 4GiB of data. 
We say that it is a bit more than 100MiB because things like -file-system meta-data (inode tables, superblocks, etc), partition table, etc -also contribute to the mapped blocks and are also copied. -""" - -# Disable the following pylint recommendations: -# * Too many instance attributes (R0902) -# pylint: disable=R0902 - -import os -import stat -import sys -import hashlib -import logging -import datetime -from six import reraise -from six.moves import queue as Queue -from six.moves import _thread as thread -from xml.etree import ElementTree -from bmaptool.BmapHelpers import human_size - -# The highest supported bmap format version -SUPPORTED_BMAP_VERSION = "1.0" - - -class Error(Exception): - """ - A class for exceptions generated by the 'BmapCopy' module. We currently - support only one type of exceptions, and we basically throw human-readable - problem description in case of errors. - """ - - pass - - -class BmapCopy: - """ - This class implements the bmap-based copying functionality. To copy an - image with bmap you should create an instance of this class, which requires - the following: - - * full path or a file-like object of the image to copy - * full path or a file object of the destination file copy the image to - * full path or a file object of the bmap file (optional) - * image size in bytes (optional) - - Although the main purpose of this class is to use bmap, the bmap is not - required, and if it was not provided then the entire image will be copied - to the destination file. - - When the bmap is provided, it is not necessary to specify image size, - because the size is contained in the bmap. Otherwise, it is benefitial to - specify the size because it enables extra sanity checks and makes it - possible to provide the progress bar. 
- - When the image size is known either from the bmap or the caller specified - it to the class constructor, all the image geometry description attributes - ('blocks_cnt', etc) are initialized by the class constructor and available - for the user. - - However, when the size is not known, some of the image geometry - description attributes are not initialized by the class constructor. - Instead, they are initialized only by the 'copy()' method. - - The 'copy()' method implements image copying. You may choose whether to - verify the checksum while copying or not. Note, this is done only in case - of bmap-based copying and only if bmap contains checksums (e.g., bmap - version 1.0 did not have checksums support). - - You may choose whether to synchronize the destination file after writing or - not. To explicitly synchronize it, use the 'sync()' method. - - This class supports all the bmap format versions up version - 'SUPPORTED_BMAP_VERSION'. - - It is possible to have a simple progress indicator while copying the image. - Use the 'set_progress_indicator()' method. - - You can copy only once with an instance of this class. This means that in - order to copy the image for the second time, you have to create a new class - instance. - """ - - def __init__(self, image, dest, bmap=None, image_size=None, log=None): - """ - The class constructor. The parameters are: - image - file-like object of the image which should be copied, - should only support 'read()' and 'seek()' methods, - and only seeking forward has to be supported. - dest - file object of the destination file to copy the image - to. - bmap - file object of the bmap file to use for copying. - image_size - size of the image in bytes. - log - the logger object to use for printing messages. 
- """ - - self._log = log - if self._log is None: - self._log = logging.getLogger(__name__) - - self._xml = None - - self._dest_fsync_watermark = None - self._batch_blocks = None - self._batch_queue = None - self._batch_bytes = 1024 * 1024 - self._batch_queue_len = 6 - - self.bmap_version = None - self.bmap_version_major = None - self.bmap_version_minor = None - self.block_size = None - self.blocks_cnt = None - self.mapped_cnt = None - self.image_size = None - self.image_size_human = None - self.mapped_size = None - self.mapped_size_human = None - self.mapped_percent = None - - self._f_bmap = None - self._f_bmap_path = None - - self._progress_started = None - self._progress_index = None - self._progress_time = None - self._progress_file = None - self._progress_format = None - self.set_progress_indicator(None, None) - - self._f_image = image - self._image_path = image.name - - self._f_dest = dest - self._dest_path = dest.name - st_data = os.fstat(self._f_dest.fileno()) - self._dest_is_regfile = stat.S_ISREG(st_data.st_mode) - - # The bmap file checksum type and length - self._cs_type = None - self._cs_len = None - self._cs_attrib_name = None - - # Special quirk for /dev/null which does not support fsync() - if ( - stat.S_ISCHR(st_data.st_mode) - and os.major(st_data.st_rdev) == 1 - and os.minor(st_data.st_rdev) == 3 - ): - self._dest_supports_fsync = False - else: - self._dest_supports_fsync = True - - if bmap: - self._f_bmap = bmap - self._bmap_path = bmap.name - self._parse_bmap() - else: - # There is no bmap. Initialize user-visible attributes to something - # sensible with an assumption that we just have all blocks mapped. 
- self.bmap_version = 0 - self.block_size = 4096 - self.mapped_percent = 100 - - if image_size: - self._set_image_size(image_size) - - self._batch_blocks = self._batch_bytes / self.block_size - - def set_progress_indicator(self, file_obj, format_string): - """ - Setup the progress indicator which shows how much data has been copied - in percent. - - The 'file_obj' argument is the console file object where the progress - has to be printed to. Pass 'None' to disable the progress indicator. - - The 'format_string' argument is the format string for the progress - indicator. It has to contain a single '%d' placeholder which will be - substitutes with copied data in percent. - """ - - self._progress_file = file_obj - if format_string: - self._progress_format = format_string - else: - self._progress_format = "Copied %d%%" - - def _set_image_size(self, image_size): - """ - Set image size and initialize various other geometry-related attributes. - """ - - if self.image_size is not None and self.image_size != image_size: - raise Error( - "cannot set image size to %d bytes, it is known to " - "be %d bytes (%s)" - % (image_size, self.image_size, self.image_size_human) - ) - - self.image_size = image_size - self.image_size_human = human_size(image_size) - self.blocks_cnt = self.image_size + self.block_size - 1 - self.blocks_cnt /= self.block_size - - if self.mapped_cnt is None: - self.mapped_cnt = self.blocks_cnt - self.mapped_size = self.image_size - self.mapped_size_human = self.image_size_human - - def _verify_bmap_checksum(self): - """ - This is a helper function which verifies the bmap file checksum. - """ - - import mmap - - if self.bmap_version_minor == 3: - correct_chksum = self._xml.find("BmapFileSHA1").text.strip() - else: - correct_chksum = self._xml.find("BmapFileChecksum").text.strip() - - # Before verifying the shecksum, we have to substitute the checksum - # value stored in the file with all zeroes. 
For these purposes we - # create private memory mapping of the bmap file. - mapped_bmap = mmap.mmap(self._f_bmap.fileno(), 0, access=mmap.ACCESS_COPY) - - chksum_pos = mapped_bmap.find(correct_chksum) - assert chksum_pos != -1 - - mapped_bmap[chksum_pos : chksum_pos + self._cs_len] = "0" * self._cs_len - - hash_obj = hashlib.new(self._cs_type) - hash_obj.update(mapped_bmap) - calculated_chksum = hash_obj.hexdigest() - - mapped_bmap.close() - - if calculated_chksum != correct_chksum: - raise Error( - "checksum mismatch for bmap file '%s': calculated " - "'%s', should be '%s'" - % (self._bmap_path, calculated_chksum, correct_chksum) - ) - - def _parse_bmap(self): - """ - Parse the bmap file and initialize corresponding class instance attributs. - """ - - try: - self._xml = ElementTree.parse(self._f_bmap) - except ElementTree.ParseError as err: - # Extrace the erroneous line with some context - self._f_bmap.seek(0) - xml_extract = "" - for num, line in enumerate(self._f_bmap): - if num >= err.position[0] - 4 and num <= err.position[0] + 4: - xml_extract += "Line %d: %s" % (num, line) - - raise Error( - "cannot parse the bmap file '%s' which should be a " - "proper XML file: %s, the XML extract:\n%s" - % (self._bmap_path, err, xml_extract) - ) - - xml = self._xml - self.bmap_version = str(xml.getroot().attrib.get("version")) - - # Make sure we support this version - self.bmap_version_major = int(self.bmap_version.split(".", 1)[0]) - self.bmap_version_minor = int(self.bmap_version.split(".", 1)[1]) - if self.bmap_version_major > SUPPORTED_BMAP_VERSION: - raise Error( - "only bmap format version up to %d is supported, " - "version %d is not supported" - % (SUPPORTED_BMAP_VERSION, self.bmap_version_major) - ) - - # Fetch interesting data from the bmap XML file - self.block_size = int(xml.find("BlockSize").text.strip()) - self.blocks_cnt = int(xml.find("BlocksCount").text.strip()) - self.mapped_cnt = int(xml.find("MappedBlocksCount").text.strip()) - self.image_size = 
int(xml.find("ImageSize").text.strip()) - self.image_size_human = human_size(self.image_size) - self.mapped_size = self.mapped_cnt * self.block_size - self.mapped_size_human = human_size(self.mapped_size) - self.mapped_percent = (self.mapped_cnt * 100.0) / self.blocks_cnt - - blocks_cnt = (self.image_size + self.block_size - 1) / self.block_size - if self.blocks_cnt != blocks_cnt: - raise Error( - "Inconsistent bmap - image size does not match " - "blocks count (%d bytes != %d blocks * %d bytes)" - % (self.image_size, self.blocks_cnt, self.block_size) - ) - - if self.bmap_version_major >= 1 and self.bmap_version_minor >= 3: - # Bmap file checksum appeard in format 1.3 and the only supported - # checksum type was SHA1. Version 1.4 started supporting arbitrary - # checksum types. A new "ChecksumType" tag was introduce to specify - # the checksum function name. And all XML tags which contained - # "sha1" in their name were renamed to something more neutral. - if self.bmap_version_minor == 3: - self._cs_type = "sha1" - self._cs_attrib_name = "sha1" - else: - self._cs_type = xml.find("ChecksumType").text.strip() - self._cs_attrib_name = "chksum" - - try: - self._cs_len = len(hashlib.new(self._cs_type).hexdigest()) - except ValueError as err: - raise Error( - 'cannot initialize hash function "%s": %s' % (self._cs_type, err) - ) - self._verify_bmap_checksum() - - def _update_progress(self, blocks_written): - """ - Print the progress indicator if the mapped area size is known and if - the indicator has been enabled by assigning a console file object to - the 'progress_file' attribute. 
- """ - - if self.mapped_cnt: - assert blocks_written <= self.mapped_cnt - percent = int((float(blocks_written) / self.mapped_cnt) * 100) - self._log.debug( - "wrote %d blocks out of %d (%d%%)" - % (blocks_written, self.mapped_cnt, percent) - ) - else: - self._log.debug("wrote %d blocks" % blocks_written) - - if not self._progress_file: - return - - if self.mapped_cnt: - progress = "\r" + self._progress_format % percent + "\n" - else: - # Do not rotate the wheel too fast - now = datetime.datetime.now() - min_delta = datetime.timedelta(milliseconds=250) - if now - self._progress_time < min_delta: - return - self._progress_time = now - - progress_wheel = ("-", "\\", "|", "/") - progress = "\r" + progress_wheel[self._progress_index % 4] + "\n" - self._progress_index += 1 - - # This is a little trick we do in order to make sure that the next - # message will always start from a new line - we switch to the new - # line after each progress update and move the cursor up. As an - # example, this is useful when the copying is interrupted by an - # exception - the error message will start form new line. - if self._progress_started: - # The "move cursor up" escape sequence - self._progress_file.write("\033[1A") # pylint: disable=W1401 - else: - self._progress_started = True - - self._progress_file.write(progress) - self._progress_file.flush() - - def _get_block_ranges(self): - """ - This is a helper generator that parses the bmap XML file and for each - block range in the XML file it yields ('first', 'last', 'chksum') - tuples, where: - * 'first' is the first block of the range; - * 'last' is the last block of the range; - * 'chksum' is the checksum of the range ('None' is used if it is - missing). - - If there is no bmap file, the generator just yields a single range - for entire image file. If the image size is unknown, the generator - infinitely yields continuous ranges of size '_batch_blocks'. 
- """ - - if not self._f_bmap: - # We do not have the bmap, yield a tuple with all blocks - if self.blocks_cnt: - yield (0, self.blocks_cnt - 1, None) - else: - # We do not know image size, keep yielding tuples with many - # blocks infinitely. - first = 0 - while True: - yield (first, first + self._batch_blocks - 1, None) - first += self._batch_blocks - return - - # We have the bmap, just read it and yield block ranges - xml = self._xml - xml_bmap = xml.find("BlockMap") - - for xml_element in xml_bmap.findall("Range"): - blocks_range = xml_element.text.strip() - # The range of blocks has the "X - Y" format, or it can be just "X" - # in old bmap format versions. First, split the blocks range string - # and strip white-spaces. - split = [x.strip() for x in blocks_range.split("-", 1)] - - first = int(split[0]) - if len(split) > 1: - last = int(split[1]) - if first > last: - raise Error("bad range (first > last): '%s'" % blocks_range) - else: - last = first - - if self._cs_attrib_name in xml_element.attrib: - chksum = xml_element.attrib[self._cs_attrib_name] - else: - chksum = None - - yield (first, last, chksum) - - def _get_batches(self, first, last): - """ - This is a helper generator which splits block ranges from the bmap file - to smaller batches. Indeed, we cannot read and write entire block - ranges from the image file, because a range can be very large. So we - perform the I/O in batches. Batch size is defined by the - '_batch_blocks' attribute. Thus, for each (first, last) block range, - the generator yields smaller (start, end, length) batch ranges, where: - * 'start' is the starting batch block number; - * 'last' is the ending batch block number; - * 'length' is the batch length in blocks (same as - 'end' - 'start' + 1). 
- """ - - batch_blocks = self._batch_blocks - - while first + batch_blocks - 1 <= last: - yield (first, first + batch_blocks - 1, batch_blocks) - first += batch_blocks - - batch_blocks = last - first + 1 - if batch_blocks: - yield (first, first + batch_blocks - 1, batch_blocks) - - def _get_data(self, verify): - """ - This is generator which reads the image file in '_batch_blocks' chunks - and yields ('type', 'start', 'end', 'buf) tuples, where: - * 'start' is the starting block number of the batch; - * 'end' is the last block of the batch; - * 'buf' a buffer containing the batch data. - """ - - try: - for first, last, chksum in self._get_block_ranges(): - if verify and chksum: - hash_obj = hashlib.new(self._cs_type) - - self._f_image.seek(first * self.block_size) - - iterator = self._get_batches(first, last) - for start, end, length in iterator: - try: - buf = self._f_image.read(length * self.block_size) - except IOError as err: - raise Error( - "error while reading blocks %d-%d of the " - "image file '%s': %s" % (start, end, self._image_path, err) - ) - - if not buf: - self._batch_queue.put(None) - return - - if verify and chksum: - hash_obj.update(buf) - - blocks = (len(buf) + self.block_size - 1) / self.block_size - self._log.debug( - "queueing %d blocks, queue length is %d" - % (blocks, self._batch_queue.qsize()) - ) - - self._batch_queue.put(("range", start, start + blocks - 1, buf)) - - if verify and chksum and hash_obj.hexdigest() != chksum: - raise Error( - "checksum mismatch for blocks range %d-%d: " - "calculated %s, should be %s (image file %s)" - % (first, last, hash_obj.hexdigest(), chksum, self._image_path) - ) - # Silence pylint warning about catching too general exception - # pylint: disable=W0703 - except Exception: - # pylint: enable=W0703 - # In case of any exception - just pass it to the main thread - # through the queue. 
- reraise(exc_info[0], exc_info[1], exc_info[2]) - - self._batch_queue.put(None) - - def copy(self, sync=True, verify=True): - """ - Copy the image to the destination file using bmap. The 'sync' argument - defines whether the destination file has to be synchronized upon - return. The 'verify' argument defines whether the checksum has to be - verified while copying. - """ - - # Create the queue for block batches and start the reader thread, which - # will read the image in batches and put the results to '_batch_queue'. - self._batch_queue = Queue.Queue(self._batch_queue_len) - thread.start_new_thread(self._get_data, (verify,)) - - blocks_written = 0 - bytes_written = 0 - fsync_last = 0 - - self._progress_started = False - self._progress_index = 0 - self._progress_time = datetime.datetime.now() - - # Read the image in '_batch_blocks' chunks and write them to the - # destination file - while True: - batch = self._batch_queue.get() - if batch is None: - # No more data, the image is written - break - elif batch[0] == "error": - # The reader thread encountered an error and passed us the - # exception. 
- exc_info = batch[1] - raise exc_info[1].with_traceback(exc_info[2]) - - (start, end, buf) = batch[1:4] - - assert len(buf) <= (end - start + 1) * self.block_size - assert len(buf) > (end - start) * self.block_size - - self._f_dest.seek(start * self.block_size) - - # Synchronize the destination file if we reached the watermark - if self._dest_fsync_watermark: - if blocks_written >= fsync_last + self._dest_fsync_watermark: - fsync_last = blocks_written - self.sync() - - try: - self._f_dest.write(buf) - except IOError as err: - raise Error( - "error while writing blocks %d-%d of '%s': %s" - % (start, end, self._dest_path, err) - ) - - self._batch_queue.task_done() - blocks_written += end - start + 1 - bytes_written += len(buf) - - self._update_progress(blocks_written) - - if not self.image_size: - # The image size was unknown up until now, set it - self._set_image_size(bytes_written) - - # This is just a sanity check - we should have written exactly - # 'mapped_cnt' blocks. - if blocks_written != self.mapped_cnt: - raise Error( - "wrote %u blocks from image '%s' to '%s', but should " - "have %u - bmap file '%s' does not belong to this " - "image" - % ( - blocks_written, - self._image_path, - self._dest_path, - self.mapped_cnt, - self._bmap_path, - ) - ) - - if self._dest_is_regfile: - # Make sure the destination file has the same size as the image - try: - os.ftruncate(self._f_dest.fileno(), self.image_size) - except OSError as err: - raise Error("cannot truncate file '%s': %s" % (self._dest_path, err)) - - try: - self._f_dest.flush() - except IOError as err: - raise Error("cannot flush '%s': %s" % (self._dest_path, err)) - - if sync: - self.sync() - - def sync(self): - """ - Synchronize the destination file to make sure all the data are actually - written to the disk. 
- """ - - if self._dest_supports_fsync: - try: - os.fsync(self._f_dest.fileno()), - except OSError as err: - raise Error( - "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) - ) - - -class BmapBdevCopy(BmapCopy): - """ - This class is a specialized version of 'BmapCopy' which copies the image to - a block device. Unlike the base 'BmapCopy' class, this class does various - optimizations specific to block devices, e.g., switching to the 'noop' I/O - scheduler. - """ - - def __init__(self, image, dest, bmap=None, image_size=None, log=None): - """ - The same as the constructor of the 'BmapCopy' base class, but adds - useful guard-checks specific to block devices. - """ - - # Call the base class constructor first - BmapCopy.__init__(self, image, dest, bmap, image_size, log=log) - - self._dest_fsync_watermark = (6 * 1024 * 1024) / self.block_size - - self._sysfs_base = None - self._sysfs_scheduler_path = None - self._sysfs_max_ratio_path = None - self._old_scheduler_value = None - self._old_max_ratio_value = None - - # If the image size is known, check that it fits the block device - if self.image_size: - try: - bdev_size = os.lseek(self._f_dest.fileno(), 0, os.SEEK_END) - os.lseek(self._f_dest.fileno(), 0, os.SEEK_SET) - except OSError as err: - raise Error( - "cannot seed block device '%s': %s " - % (self._dest_path, err.strerror) - ) - - if bdev_size < self.image_size: - raise Error( - "the image file '%s' has size %s and it will not " - "fit the block device '%s' which has %s capacity" - % ( - self._image_path, - self.image_size_human, - self._dest_path, - human_size(bdev_size), - ) - ) - - # Construct the path to the sysfs directory of our block device - st_rdev = os.fstat(self._f_dest.fileno()).st_rdev - self._sysfs_base = "/sys/dev/block/%s:%s/" % ( - os.major(st_rdev), - os.minor(st_rdev), - ) - - # Check if the 'queue' sub-directory exists. If yes, then our block - # device is entire disk. 
Otherwise, it is a partition, in which case we - # need to go one level up in the sysfs hierarchy. - if not os.path.exists(self._sysfs_base + "queue"): - self._sysfs_base = self._sysfs_base + "../" - - self._sysfs_scheduler_path = self._sysfs_base + "queue/scheduler" - self._sysfs_max_ratio_path = self._sysfs_base + "bdi/max_ratio" - - def _tune_block_device(self): - """ - Tune the block device for better performance: - 1. Switch to the 'noop' I/O scheduler if it is available - sequential - write to the block device becomes a lot faster comparing to CFQ. - 2. Limit the write buffering - we do not need the kernel to buffer a - lot of the data we send to the block device, because we write - sequentially. Limit the buffering. - - The old settings are saved in order to be able to restore them later. - """ - # Switch to the 'noop' I/O scheduler - try: - with open(self._sysfs_scheduler_path, "r+") as f_scheduler: - contents = f_scheduler.read() - f_scheduler.seek(0) - f_scheduler.write("noop") - except IOError as err: - self._log.warning( - "failed to enable I/O optimization, expect " - "suboptimal speed (reason: cannot switch " - "to the 'noop' I/O scheduler: %s)" % err - ) - else: - # The file contains a list of schedulers with the current - # scheduler in square brackets, e.g., "noop deadline [cfq]". - # Fetch the name of the current scheduler. - import re - - match = re.match(r".*\[(.+)\].*", contents) - if match: - self._old_scheduler_value = match.group(1) - - # Limit the write buffering, because we do not need too much of it when - # writing sequntially. Excessive buffering makes some systems not very - # responsive, e.g., this was observed in Fedora 17. - try: - with open(self._sysfs_max_ratio_path, "r+") as f_ratio: - self._old_max_ratio_value = f_ratio.read() - f_ratio.seek(0) - f_ratio.write("1") - except IOError as err: - self._log.warning( - "failed to disable excessive buffering, expect " - "worse system responsiveness (reason: cannot set " - "max. 
I/O ratio to 1: %s)" % err - ) - - def _restore_bdev_settings(self): - """ - Restore old block device settings which we changed in - '_tune_block_device()'. - """ - - if self._old_scheduler_value is not None: - try: - with open(self._sysfs_scheduler_path, "w") as f_scheduler: - f_scheduler.write(self._old_scheduler_value) - except IOError as err: - raise Error( - "cannot restore the '%s' I/O scheduler: %s" - % (self._old_scheduler_value, err) - ) - - if self._old_max_ratio_value is not None: - try: - with open(self._sysfs_max_ratio_path, "w") as f_ratio: - f_ratio.write(self._old_max_ratio_value) - except IOError as err: - raise Error( - "cannot set the max. I/O ratio back to '%s': %s" - % (self._old_max_ratio_value, err) - ) - - def copy(self, sync=True, verify=True): - """ - The same as in the base class but tunes the block device for better - performance before starting writing. Additionally, it forces block - device synchronization from time to time in order to make sure we do - not get stuck in 'fsync()' for too long time. The problem is that the - kernel synchronizes block devices when the file is closed. And the - result is that if the user interrupts us while we are copying the data, - the program will be blocked in 'close()' waiting for the block device - synchronization, which may last minutes for slow USB stick. This is - very bad user experience, and we work around this effect by - synchronizing from time to time. - """ - - self._tune_block_device() - - try: - BmapCopy.copy(self, sync, verify) - except: - raise - finally: - self._restore_bdev_settings() diff --git a/tests/oldcodebase/__init__.py b/tests/oldcodebase/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/tests/test_compat.py b/tests/test_compat.py index 86acfff..5fec566 100644 --- a/tests/test_compat.py +++ b/tests/test_compat.py @@ -15,9 +15,8 @@ # General Public License for more details. 
""" -This unit test verifies various compatibility aspects of the BmapCopy module: - * current BmapCopy has to handle all the older bmap formats - * older BmapCopy have to handle all the newer compatible bmap formats +This unit test verifies that the current BmapCopy module can read every +historical bmap file format supplied as a fixture in `tests/test-data/`. """ # Disable the following pylint recommendations: @@ -45,8 +44,6 @@ _BMAP_TEMPL = "test.image.bmap.v" # Name of the subdirectory where test data are stored _TEST_DATA_SUBDIR = "test-data" -# Name of the subdirectory where old BmapCopy modules are stored -_OLDCODEBASE_SUBDIR = "oldcodebase" class TestCompat(unittest.TestCase): @@ -88,70 +85,5 @@ def test(self): image_path, self._f_copy.name, bmap_path, image_chksum, image_size ) - # Test the older versions of BmapCopy - self._test_older_bmapcopy() - self._f_copy.close() self._f_image.close() - - def _test_older_bmapcopy(self): - """Test older than the current versions of the BmapCopy class.""" - - def import_module(searched_module): - """Search and import a module by its name.""" - - modref = __import__(searched_module) - for name in searched_module.split(".")[1:]: - modref = getattr(modref, name) - return modref - - oldcodebase_dir = os.path.join(os.path.dirname(__file__), _OLDCODEBASE_SUBDIR) - - # Construct the list of old BmapCopy modules - old_modules = [] - for dentry in os.listdir(oldcodebase_dir): - if dentry.startswith("BmapCopy") and dentry.endswith(".py"): - old_modules.append("tests." + _OLDCODEBASE_SUBDIR + "." + dentry[:-3]) - - for old_module in old_modules: - modref = import_module(old_module) - - for bmap_path in self._bmap_paths: - self._do_test_older_bmapcopy(bmap_path, modref) - - def _do_test_older_bmapcopy(self, bmap_path, modref): - """ - Test an older version of BmapCopy class, referenced by the 'modref' - argument. The 'bmap_path' argument is the bmap file path to test with. 
- """ - - # Get a reference to the older BmapCopy class object to test with - old_bmapcopy_class = getattr(modref, "BmapCopy") - supported_ver = getattr(modref, "SUPPORTED_BMAP_VERSION") - - f_bmap = open(bmap_path, "r") - - # Find the version of the bmap file. The easiest is to simply use the - # latest BmapCopy. - bmapcopy = BmapCopy.BmapCopy(self._f_image, self._f_copy, f_bmap) - bmap_version = bmapcopy.bmap_version - bmap_version_major = bmapcopy.bmap_version_major - - try: - if supported_ver >= bmap_version: - writer = old_bmapcopy_class(self._f_image, self._f_copy, f_bmap) - writer.copy(True, True) - except: # pylint: disable=W0702 - if supported_ver >= bmap_version_major: - # The BmapCopy which we are testing is supposed to support this - # version of bmap file format. However, bmap format version 1.4 - # was a screw-up, because it actually had incompatible changes, - # so old versions of BmapCopy are supposed to fail. - if not (supported_ver == 1 and bmap_version == "1.4"): - print( - 'Module "%s" failed to handle "%s"' - % (modref.__name__, bmap_path) - ) - raise - - f_bmap.close() From b1ee490f90458309076a5671d7a232a44c69b9fe Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 14:04:40 -0400 Subject: [PATCH 08/20] pyproject.toml: swap black for ruff in dev extras Replace black with ruff in the dev optional-dependencies group and add a minimal [tool.ruff] block (line-length 88, py39 target) so ruff format is a drop-in replacement for black at the same line length. Format-only scope on purpose: this commit does not enable [tool.ruff.lint]. Ruff's lint side is a future, separate decision - turning it on against the current tree surfaces a substantial backlog across pyupgrade, bugbear, and bandit rules that deserves its own focused pass rather than a rider on the formatter swap. 
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- pyproject.toml | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/pyproject.toml b/pyproject.toml index 8898a0f..71d4917 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -40,7 +40,7 @@ classifiers = [ [project.optional-dependencies] dev = [ - "black >= 22.3.0", + "ruff >= 0.6.0", ] [project.urls] @@ -58,3 +58,7 @@ build-backend = "hatchling.build" [tool.hatch.version] path = "src/bmaptool/__init__.py" + +[tool.ruff] +line-length = 88 +target-version = "py39" From 46c6f8739b92b5e139c998eb84c7419809ff505d Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 14:17:53 -0400 Subject: [PATCH 09/20] reformat: apply ruff format to the entire tree One-time, tree-wide application of ruff format. Pure formatting churn - no behavior changes: - merge adjacent string literals that span continuation lines - modernize one nested 'with' to py3.10+ parenthesized syntax - make two trailing-comma statement expressions explicit as one-element tuples This commit is isolated so it can be added to .git-blame-ignore-revs and 'git blame' can skip it cleanly. 
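The trailing-comma item above can be sanity-checked with a minimal sketch, assuming only the standard-library `ast` module. `f` and `x` are placeholder names standing in for `os.fsync(...)` and its argument, not identifiers from the tree. It suggests that `f(x),` and `(f(x),)` parse to the same one-element tuple expression, so the rewrite should be formatting-only:

```python
import ast


def expr_shape(source: str) -> str:
    """Return the AST dump of the first statement's expression in `source`."""
    stmt = ast.parse(source).body[0]
    # ast.dump() omits line/column attributes by default, so only the
    # structure of the expression is compared.
    return ast.dump(stmt.value)


# Placeholder names: `f` and `x` are hypothetical, used for illustration only.
implicit = expr_shape("f(x),")    # statement ending in a stray trailing comma
explicit = expr_shape("(f(x),)")  # the explicit one-element tuple form

same_shape = implicit == explicit
is_tuple = isinstance(ast.parse("f(x),").body[0].value, ast.Tuple)
print(same_shape, is_tuple)
```

In both spellings the call is still evaluated and the resulting tuple is discarded; the stray comma is probably an old typo that a later, behavior-changing commit could drop.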
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- src/bmaptool/BmapCopy.py | 11 ++++++----- src/bmaptool/CLI.py | 9 +++------ src/bmaptool/Filemap.py | 7 +++---- src/bmaptool/TransRead.py | 3 +-- tests/test_bmap_helpers.py | 12 +++--------- 5 files changed, 16 insertions(+), 26 deletions(-) diff --git a/src/bmaptool/BmapCopy.py b/src/bmaptool/BmapCopy.py index f8f0a2f..a746d6f 100644 --- a/src/bmaptool/BmapCopy.py +++ b/src/bmaptool/BmapCopy.py @@ -316,7 +316,7 @@ def set_psplash_pipe(self, path): self._psplash_pipe = path else: _log.warning( - "'%s' is not a pipe, so psplash progress will not be " "updated" % path + "'%s' is not a pipe, so psplash progress will not be updated" % path ) def set_progress_indicator(self, file_obj, format_string): @@ -785,7 +785,7 @@ def sync(self): if self._dest_supports_fsync: try: - os.fsync(self._f_dest.fileno()), + (os.fsync(self._f_dest.fileno()),) except OSError as err: raise Error( "cannot synchronize '%s': %s " % (self._dest_path, err.strerror) @@ -878,9 +878,10 @@ def copy(self, sync=True, verify=True): # This was observed e.g. in Fedora 17. # The old settings are saved and restored by the context managers. 
- with SysfsChange(self._sysfs_max_ratio_path, "1") as max_ratio_chg, SysfsChange( - self._sysfs_scheduler_path, "none" - ) as scheduler_chg: + with ( + SysfsChange(self._sysfs_max_ratio_path, "1") as max_ratio_chg, + SysfsChange(self._sysfs_scheduler_path, "none") as scheduler_chg, + ): if max_ratio_chg.error: _log.warning( "failed to disable excessive buffering, expect " diff --git a/src/bmaptool/CLI.py b/src/bmaptool/CLI.py index 9183478..b5bede2 100644 --- a/src/bmaptool/CLI.py +++ b/src/bmaptool/CLI.py @@ -615,9 +615,7 @@ def copy_command(args): ) if args.bmap_sig and not bmap_obj: - error_out( - "the bmap signature file was specified, but bmap file was " "not found" - ) + error_out("the bmap signature file was specified, but bmap file was not found") f_obj = verify_bmap_signature(args, bmap_obj, bmap_path, image_obj.is_url) if f_obj: @@ -653,8 +651,7 @@ def copy_command(args): log.info("no bmap given, copy entire image to '%s'" % args.dest) else: error_out( - "bmap file not found, please, use --nobmap option to " - "flash without bmap" + "bmap file not found, please, use --nobmap option to flash without bmap" ) else: log.info("block map format version %s" % writer.bmap_version) @@ -767,7 +764,7 @@ def create_command(args): "all %s are mapped, no holes in '%s'" % (creator.image_size_human, args.image) ) - log.warning("was the image handled incorrectly and holes " "were expanded?") + log.warning("was the image handled incorrectly and holes were expanded?") def parse_arguments(): diff --git a/src/bmaptool/Filemap.py b/src/bmaptool/Filemap.py index 9792c0a..e4c400e 100644 --- a/src/bmaptool/Filemap.py +++ b/src/bmaptool/Filemap.py @@ -96,7 +96,7 @@ def __init__(self, image): raise Error("cannot flush image file '%s': %s" % (self._image_path, err)) try: - os.fsync(self._f_image.fileno()), + (os.fsync(self._f_image.fileno()),) except OSError as err: raise Error( "cannot synchronize image file '%s': %s " @@ -192,8 +192,7 @@ def _lseek(file_obj, offset, whence): 
return -1 elif err.errno == errno.EINVAL: raise ErrorNotSupp( - "the kernel or file-system does not support " - '"SEEK_HOLE" and "SEEK_DATA"' + 'the kernel or file-system does not support "SEEK_HOLE" and "SEEK_DATA"' ) else: raise @@ -422,7 +421,7 @@ def _invoke_fiemap(self, block, count): raise ErrorNotSupp(errstr) if err.errno == errno.ENOTTY: errstr = ( - "FilemapFiemap: the FIEMAP ioctl is not supported " "by the kernel" + "FilemapFiemap: the FIEMAP ioctl is not supported by the kernel" ) _log.debug(errstr) raise ErrorNotSupp(errstr) diff --git a/src/bmaptool/TransRead.py b/src/bmaptool/TransRead.py index c35e033..21efbd9 100644 --- a/src/bmaptool/TransRead.py +++ b/src/bmaptool/TransRead.py @@ -139,8 +139,7 @@ def _decode_sshpass_exit_code(code): result = "invalid/incorrect password" elif code == 6: result = ( - "host public key is unknown. sshpass exits without " - "confirming the new key" + "host public key is unknown. sshpass exits without confirming the new key" ) elif code == 255: # SSH result =s 255 on any error diff --git a/tests/test_bmap_helpers.py b/tests/test_bmap_helpers.py index 2cefad6..ed7f0ba 100644 --- a/tests/test_bmap_helpers.py +++ b/tests/test_bmap_helpers.py @@ -126,9 +126,7 @@ def test_is_zfs_configuration_compatible_not_installed(self): self.assertFalse(BmapHelpers.is_zfs_configuration_compatible()) @patch.object(BmapHelpers, "get_file_system_type", return_value="zfs") - def test_is_compatible_file_system_zfs_valid( - self, mock_get_fs_type - ): # pylint: disable=unused-argument + def test_is_compatible_file_system_zfs_valid(self, mock_get_fs_type): # pylint: disable=unused-argument """Check compatibility check passes when zfs param is set correctly""" with tempfile.NamedTemporaryFile( @@ -141,9 +139,7 @@ def test_is_compatible_file_system_zfs_valid( self.assertTrue(BmapHelpers.is_compatible_file_system(fobj.name)) @patch.object(BmapHelpers, "get_file_system_type", return_value="zfs") - def test_is_compatible_file_system_zfs_invalid( - 
self, mock_get_fs_type - ): # pylint: disable=unused-argument + def test_is_compatible_file_system_zfs_invalid(self, mock_get_fs_type): # pylint: disable=unused-argument """Check compatibility check fails when zfs param is set incorrectly""" with tempfile.NamedTemporaryFile( @@ -156,9 +152,7 @@ def test_is_compatible_file_system_zfs_invalid( self.assertFalse(BmapHelpers.is_compatible_file_system(fobj.name)) @patch.object(BmapHelpers, "get_file_system_type", return_value="ext4") - def test_is_compatible_file_system_ext4( - self, mock_get_fs_type - ): # pylint: disable=unused-argument + def test_is_compatible_file_system_ext4(self, mock_get_fs_type): # pylint: disable=unused-argument """Check non-zfs file systems pass compatibility checks""" with tempfile.NamedTemporaryFile( From 66e93c13fba4cf3a5cfd7251dd216bf15f57736c Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 14:21:34 -0400 Subject: [PATCH 10/20] add .git-blame-ignore-revs Provide a list of commits that 'git blame' should skip past, so that mechanical reformatting commits never obscure the real authorship of a line. Seeded with the single tree-wide ruff format commit. Contributors opt in locally with: git config blame.ignoreRevsFile .git-blame-ignore-revs GitHub's blame view honors this file automatically. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .git-blame-ignore-revs | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 .git-blame-ignore-revs diff --git a/.git-blame-ignore-revs b/.git-blame-ignore-revs new file mode 100644 index 0000000..8cedd4a --- /dev/null +++ b/.git-blame-ignore-revs @@ -0,0 +1,11 @@ +# Revisions listed here are skipped by `git blame` when the file is wired in +# via: +# +# git config blame.ignoreRevsFile .git-blame-ignore-revs +# +# Add only commits whose entire diff is mechanical noise (e.g. a tree-wide +# formatter run). 
Never list a commit that changes behavior - readers will +# stop seeing it as a real revision in `git blame` output. + +# reformat: apply ruff format to the entire tree +46c6f8739b92b5e139c998eb84c7419809ff505d From f39438ad18474511bfb34c1e58b86db1f27c8ef9 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 16:55:22 -0400 Subject: [PATCH 11/20] ci: bump actions/checkout in lint job to v4 The lint job still pins actions/checkout@v3, which is end-of-life and prints a deprecation warning on every workflow run. Move it to v4 so the warning goes away and the action keeps receiving security updates. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 9773232..c8abc09 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -55,5 +55,5 @@ jobs: lint: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - uses: psf/black@stable From 10db4f20824fdddc054ff432121faeee9808d365 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 16:56:37 -0400 Subject: [PATCH 12/20] ci: replace psf/black with a ruff format check Drop the psf/black@stable third-party action and run ruff format --check . directly, so the formatter run by CI matches the [tool.ruff] configuration in pyproject.toml. 
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index c8abc09..e09bf14 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -56,4 +56,11 @@ jobs: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - - uses: psf/black@stable + - name: Setup Python + uses: actions/setup-python@v4 + with: + python-version: "3.12" + - name: Install ruff + run: python3 -m pip install ruff==0.15.14 + - name: ruff format --check + run: ruff format --check . From 15a6a63c6bc545d4b6ae2dae360853f5c083827c Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:09:53 -0400 Subject: [PATCH 13/20] ci: disable fail-fast on the test matrix GitHub Actions defaults strategy.fail-fast to true, which cancels every still-running matrix job the moment any one job fails. For a test matrix spanning Python 3.9 through 3.14 (plus native), this hides exactly the data point that matters most when triaging a failure: is it specific to one interpreter, or does it reproduce across the whole range? Set fail-fast: false so the rest of the matrix runs to completion and the failure pattern is visible in a single workflow run. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index e09bf14..387ada7 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -9,6 +9,7 @@ jobs: test: runs-on: ubuntu-latest strategy: + fail-fast: false matrix: python-version: - "3.9" From 706841475fc786fecf7d0d5f271723592722b65d Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:11:59 -0400 Subject: [PATCH 14/20] ci: pin actions/checkout to a 40-char SHA Floating tags like @v4 (or @main / @master) re-resolve every workflow run. 
An attacker who compromises the action's repository - or simply the maintainer's release pipeline - can have malicious code execute inside this project's runners on the next push, without a single commit landing here. SHA-pinning makes the action's identity part of this repo's controlled state: the resolved commit cannot change underneath us. Pin actions/checkout to de0fac2e4500dabe0009e67214ff5f5447ce83dd (v6.0.2, the current latest tag); leave the version label in a trailing comment so the bump is human-readable and dependabot has something to rewrite. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 387ada7..f95965c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -22,7 +22,7 @@ jobs: # GPG code, since it must use the host python3-gpg package - "native" steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - if: matrix.python-version != 'native' name: Setup Python ${{ matrix.python-version }} @@ -56,7 +56,7 @@ jobs: lint: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup Python uses: actions/setup-python@v4 with: From bba83df7d567fa5fadbba9e14ca16cf96ad65540 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:12:53 -0400 Subject: [PATCH 15/20] ci: pin actions/setup-python to a 40-char SHA Floating tags re-resolve every workflow run, so a future malicious release published under the same tag would execute inside this project's runners on the next push, without a commit landing here. Pinning the action by 40-char SHA freezes the action's identity into this repo's controlled state.
Pin actions/setup-python to a309ff8b426b58ec0e2a45f0f869d46889d02405 (v6.2.0, the current latest tag); leave the version label in a trailing comment so the bump is human-readable and dependabot has something to rewrite. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index f95965c..4fdc4d0 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -26,7 +26,7 @@ jobs: - if: matrix.python-version != 'native' name: Setup Python ${{ matrix.python-version }} - uses: actions/setup-python@v4 + uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 with: python-version: ${{ matrix.python-version }} @@ -58,7 +58,7 @@ jobs: steps: - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup Python - uses: actions/setup-python@v4 + uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 with: python-version: "3.12" - name: Install ruff From 5f2e8a17abef2372065fe4b4322d86dcce7e5ef8 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:14:44 -0400 Subject: [PATCH 16/20] ci: declare read-only GITHUB_TOKEN permissions GitHub Actions defaults the GITHUB_TOKEN to a permissive scope (the 'read and write' setting at the repo level), which means every step in every job inherits a token that can push commits, open issues, modify checks, and so on. Nothing in this workflow needs any of that; the only token use is implicit, by actions/checkout reading the repo. Declare permissions: contents: read at workflow scope and again on each job. 
The workflow-scope block is the safety net for any future job that forgets to set its own permissions; the per-job blocks make each job's required surface area explicit and survive someone later adding a job-scoped permission for a single step without accidentally widening the default for the whole workflow. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/workflows/ci.yml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 4fdc4d0..5f5d042 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -5,9 +5,14 @@ on: - push - pull_request +permissions: + contents: read + jobs: test: runs-on: ubuntu-latest + permissions: + contents: read strategy: fail-fast: false matrix: @@ -55,6 +60,8 @@ jobs: lint: runs-on: ubuntu-latest + permissions: + contents: read steps: - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup Python From f94122ed115276d6bd6bcf774ffb105d406c1c3f Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:37:12 -0400 Subject: [PATCH 17/20] tests: pin pbzip2 and pigz to a single worker Force pbzip2 (-p1) and pigz (-p 1) to use a single worker when the api_base test suite is generating compressed fixtures. By default both tools spawn one worker per detected CPU and hold a per-worker buffer; OOM events with the default settings have occurred on shared CI runners while compressing the test images. Single-threaded mode is the right knob for this test: - pbzip2 and pigz still produce the compressed fixtures, so the test still exercises the producer paths. - The output stays a valid bzip2 / gzip stream that the pbzip2/pigz consumer paths in TransRead.py can still decompress, so consumer coverage is unchanged. - Peak memory during fixture generation drops; CPU concurrency on the producer drops; fixture-generation wall-clock goes up proportionally. 
This is a deliberate trade - finishing slower is acceptable; getting SIGKILLed mid-run is not. No bmap copy behavior changes; this is a test-fixture-generation tuning change. AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- tests/test_api_base.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/test_api_base.py b/tests/test_api_base.py index b54e92f..1e2d396 100644 --- a/tests/test_api_base.py +++ b/tests/test_api_base.py @@ -89,9 +89,9 @@ def _generate_compressed_files(file_path, delete=True): compressors = [ ("bzip2", None, ".bz2", "-c -k"), - ("pbzip2", None, ".p.bz2", "-c -k"), + ("pbzip2", None, ".p.bz2", "-p1 -c -k"), ("gzip", None, ".gz", "-c"), - ("pigz", None, ".p.gz", "-c -k"), + ("pigz", None, ".p.gz", "-p 1 -c -k"), ("xz", None, ".xz", "-c -k"), ("lzop", None, ".lzo", "-c -k"), ("lz4", None, ".lz4", "-c -k"), From 9df0e0f608f6321882dfe8264258abc2cda2f47c Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:38:43 -0400 Subject: [PATCH 18/20] add .github/dependabot.yml Configure dependabot to open weekly pull requests against: - github-actions: every action used in .github/workflows/. Actions are SHA-pinned with '# vX.Y.Z' trailing comments, and dependabot rewrites both the SHA and the comment when a new tag ships. - pip: every Python package declared in pyproject.toml. Runtime dependencies are empty today, so this primarily keeps the dev extras (ruff) current. Both ecosystems are grouped, so dependabot batches its findings into a single PR per ecosystem per week instead of one PR per package. That keeps the queue manageable on a low-traffic project without giving up the security-update cadence. 
AI-Generated: codex/claude-opus 4.7 (xhigh) Signed-off-by: Trevor Woerner --- .github/dependabot.yml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 .github/dependabot.yml diff --git a/.github/dependabot.yml b/.github/dependabot.yml new file mode 100644 index 0000000..8048bb5 --- /dev/null +++ b/.github/dependabot.yml @@ -0,0 +1,29 @@ +# Dependabot configuration for bmaptool. +# Docs: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file + +version: 2 +updates: + # Keep the GitHub Actions used in .github/workflows/ up to date. + # All third-party actions are SHA-pinned with a "# vX.Y.Z" comment; + # dependabot rewrites both the SHA and the comment when a new tag + # ships, so the comment stays accurate. + - package-ecosystem: github-actions + directory: / + schedule: + interval: weekly + groups: + github-actions: + patterns: + - "*" + + # Keep Python packages used in pyproject.toml up to date. Runtime + # dependencies are empty today; this primarily exercises the + # [project.optional-dependencies].dev group (ruff). + - package-ecosystem: pip + directory: / + schedule: + interval: weekly + groups: + python: + patterns: + - "*" From a40c6325a7618bd81c57d0065c1efb41c98c4d03 Mon Sep 17 00:00:00 2001 From: Trevor Woerner Date: Fri, 22 May 2026 17:44:28 -0400 Subject: [PATCH 19/20] ci: add a non-blocking pip-audit step Run pip-audit against the project so any known CVE in the runtime or dev dependency graph surfaces on every push. pip-audit reads pyproject.toml directly and, with no --vulnerability-service option given, queries its default PyPI advisory database; no lockfile or project install is required. The step is marked continue-on-error: true so a new CVE landing in a transitive dependency does not turn the whole workflow red and block unrelated work. Findings are visible in the run log and the maintainer can act on them at their own cadence.
A future change can flip the step to blocking once the maintainer is comfortable that the audit's failure modes are well-understood (noisy false-positive CVEs, transient OSV outages, etc.).

AI-Generated: codex/claude-opus 4.7 (xhigh)
Signed-off-by: Trevor Woerner
---
 .github/workflows/ci.yml | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 5f5d042..370b38a 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -72,3 +72,8 @@ jobs:
         run: python3 -m pip install ruff==0.15.14
       - name: ruff format --check
         run: ruff format --check .
+      - name: Install pip-audit
+        run: python3 -m pip install pip-audit==2.10.0
+      - name: pip-audit
+        continue-on-error: true
+        run: pip-audit .

From 25267120771c6c4fa3d710be23f066997c4dd8dd Mon Sep 17 00:00:00 2001
From: Trevor Woerner
Date: Fri, 22 May 2026 17:52:44 -0400
Subject: [PATCH 20/20] CHANGELOG: describe the modernization changes

Add changelog entries for the metadata modernization, formatter swap, and CI hardening:

- Added: .git-blame-ignore-revs, .github/dependabot.yml, the non-blocking pip-audit step.
- Changed: Python support floor 3.9, ceiling 3.14; ruff replaces black; tree-wide ruff format reformat; CI actions pinned by SHA; permissions: contents: read at workflow and job scope; fail-fast disabled on the test matrix; pbzip2/pigz pinned to a single worker in the api_base test to avoid OOM on shared runners.
- Removed: tests/oldcodebase/ and its consumer in test_compat, plus six from dev extras.

AI-Generated: codex/claude-opus 4.7 (xhigh)
Signed-off-by: Trevor Woerner
---
 CHANGELOG.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2271991..10c3f26 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,7 +7,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 ### Added
+- `.git-blame-ignore-revs` so `git blame` skips the one-time tree-wide reformat.
+- `.github/dependabot.yml` opening weekly grouped pull requests for GitHub Actions and pip.
+- Non-blocking `pip-audit` step in the CI lint job.
 ### Changed
+- Drop Python 3.8 from the supported set; add 3.13 and 3.14.
+- Replace `black` with `ruff` for code formatting; run `ruff format --check` in CI.
+- Apply a one-time tree-wide `ruff format` reformat to the entire codebase.
+- Pin third-party CI actions by 40-character SHA.
+- Declare `permissions: contents: read` on the CI workflow and on each job.
+- Disable `fail-fast` on the CI test matrix so a failure on one Python version does not cancel the rest.
+- Single-thread `pbzip2` and `pigz` in the api_base test to avoid OOM events on shared CI runners.
+### Removed
+- The historical pre-Python-3 `BmapCopy` modules under `tests/oldcodebase/` and the backward-compat half of `tests/test_compat.py`.
+- `six` from `[project.optional-dependencies].dev` (no longer needed without the backward-compat test).
 ## [3.9.0]
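
Reviewer note: the "pin third-party CI actions by 40-character SHA" convention recorded above could be checked mechanically. The following is a minimal sketch, not part of this series or of bmaptool. The helper name `find_unpinned_actions` is hypothetical, and it scans workflow text with a regular expression instead of parsing YAML, so treat it as a rough lint only.

```python
import re

# Captures the reference of a workflow step such as:
#   - uses: actions/checkout@<40 hex chars>  # v4.2.2
# The trailing "# vX.Y.Z" comment is ignored; only the reference is inspected.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*(?P<ref>\S+)", re.MULTILINE)

# A reference counts as pinned when it ends in "@" plus a full 40-character
# lowercase hexadecimal commit SHA.
SHA_PIN_RE = re.compile(r"^[^@\s]+@[0-9a-f]{40}$")


def find_unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit SHA."""
    unpinned = []
    for match in USES_RE.finditer(workflow_text):
        ref = match.group("ref").strip("'\"")
        # Local actions (./path) live in the same repository; nothing to pin.
        if ref.startswith("./"):
            continue
        if not SHA_PIN_RE.match(ref):
            unpinned.append(ref)
    return unpinned
```

One possible use would be to read each file under `.github/workflows/`, pass its text to the helper, and fail a lint job when the returned list is non-empty. Whether that is worth adding alongside dependabot is left to the maintainer.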