Incorrect assumptions about block sizes and data/hole sizes #59

@Lalufu

When trying to detect holes using the SEEK_DATA/SEEK_HOLE method, bmaptool assumes that sizes of data blocks and holes are multiples of a predetermined block size:

        while True:
            start = _lseek(self._f_image, end, whence1)
            if start == -1 or start >= limit or start == self.image_size:
                break

            end = _lseek(self._f_image, start, whence2)
            if end == -1 or end == self.image_size:
                end = self.blocks_cnt * self.block_size
            if end > limit:
                end = limit

            start_blk = start // self.block_size
            end_blk = end // self.block_size - 1
            _log.debug("FilemapSeek: yielding range (%d, %d)"
                       % (start_blk, end_blk))
            yield (start_blk, end_blk)

The relevant lines are the calculations of start_blk and end_blk. self.block_size is obtained by calling FIGETBSZ, with a fallback to stat.st_blksize. At least the latter is not suitable for this purpose: it indicates the preferred I/O size for the file, which can differ from (and be larger than) the allocation size of the underlying file. In that case the integer divisions above truncate, and the resulting start_blk and end_blk are incorrect.
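To make the truncation concrete, here is a minimal sketch that mirrors only the start_blk / end_blk arithmetic from the loop above. The helper name to_block_range and the byte offsets are hypothetical, chosen for illustration; the block size is the st_blksize value from this report.

```python
def to_block_range(start, end, block_size):
    """Mirror the start_blk / end_blk arithmetic from the loop above.

    start and end are byte offsets as returned by lseek(); end is exclusive.
    """
    start_blk = start // block_size
    end_blk = end // block_size - 1
    return start_blk, end_blk


BLOCK_SIZE = 1048576  # st_blksize reported for the file in this report

# Hypothetical data extent of 4096 bytes at offset 0, shorter than BLOCK_SIZE:
# end // BLOCK_SIZE is 0, so end_blk becomes -1.
print(to_block_range(0, 4096, BLOCK_SIZE))  # (0, -1)

# Hypothetical extent aligned to BLOCK_SIZE on both sides converts as intended.
print(to_block_range(BLOCK_SIZE, 9 * BLOCK_SIZE, BLOCK_SIZE))  # (1, 8)
```

Any extent whose end is not a multiple of block_size loses its trailing partial block this way; an extent shorter than one block is lost entirely.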

In the example below, the first range comes out as (0, -1), because the initial data extent in the file is shorter than the value reported by stat.st_blksize:

$ bmaptool -d create --no-checksum /mnt/testfile
bmaptool: debug: opened image "/mnt/testfile"
bmaptool: debug: block size 1048576, blocks count 1831421, image size 1920383410176
bmaptool: debug: FilemapFiemap: initializing
bmaptool: debug: FilemapFiemap: the FIEMAP ioctl is not supported by the file-system
bmaptool: debug: opened image "/mnt/testfile"
bmaptool: debug: block size 1048576, blocks count 1831421, image size 1920383410176
bmaptool: debug: FilemapSeek: initializing
bmaptool: debug: FilemapSeek: get_mapped_ranges(0,  1831421(1831420))
bmaptool: debug: FilemapSeek: yielding range (0, -1)
bmaptool: debug: FilemapSeek: yielding range (1, 8)
[...]

Stat for this file:

$ stat /mnt/testfile
  File: /mnt/testfile
  Size: 1920383410176   Blocks: 66548192   IO Block: 1048576 regular file
Device: 35h/53d Inode: 97          Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2026-04-19 00:02:53.189324295 +0000
Modify: 2026-04-19 00:03:51.812166696 +0000
Change: 2026-04-19 00:03:51.812166696 +0000
 Birth: -

/mnt is an NFS file system.
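One possible direction, sketched here under assumptions and not as a patch against bmaptool: treat the offsets returned by SEEK_DATA/SEEK_HOLE as byte-granular and round each data extent outward to whole blocks, so that a block containing any data is reported as mapped. The function name covering_block_ranges and the sample extents are hypothetical.

```python
def covering_block_ranges(extents, block_size):
    """Yield (first_blk, last_blk) ranges that cover the given byte extents.

    extents: iterable of (start, end) byte offsets, end exclusive,
             sorted by start (as a SEEK_DATA/SEEK_HOLE walk would produce).
    Ranges that overlap or touch after rounding are merged.
    """
    current = None
    for start, end in extents:
        if end <= start:
            continue  # skip empty extents
        first = start // block_size      # block holding the first data byte
        last = (end - 1) // block_size   # block holding the last data byte
        if current is not None and first <= current[1] + 1:
            current = (current[0], max(current[1], last))
        else:
            if current is not None:
                yield current
            current = (first, last)
    if current is not None:
        yield current


BLOCK_SIZE = 1048576
extents = [
    (0, 4096),                               # short extent inside block 0
    (8192, 12288),                           # second extent, also in block 0
    (3 * BLOCK_SIZE + 100, 9 * BLOCK_SIZE),  # unaligned start in block 3
]
print(list(covering_block_ranges(extents, BLOCK_SIZE)))  # [(0, 0), (3, 8)]
```

Rounding outward can mark a block as mapped even though part of it is a hole. That copies some extra zeroes, but it never skips data.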
