Skip to content

Introduce count_bytes helper#333

Open
dobo90 wants to merge 1 commit intojam1garner:masterfrom
dobo90:count_bytes
Open

Introduce count_bytes helper#333
dobo90 wants to merge 1 commit intojam1garner:masterfrom
dobo90:count_bytes

Conversation

@dobo90
Copy link

@dobo90 dobo90 commented May 20, 2025

New helper is useful when parsing binary formats where collection size is represented in bytes and we don't know element count.

I used very similar approach while developing playready-rs.

New helper is useful when parsing binary formats where collection
size is represented in bytes and we don't know element count.
@csnover
Copy link
Collaborator

csnover commented Jun 13, 2025

Hi, sorry for the delay in looking at this. This appears to be a redundant implementation of combining take_seek with until_eof. Outside of the unresolved discussion points in #217 regarding different failure modes, is there something extra that this is doing that I am missing? Thanks!

@dobo90
Copy link
Author

dobo90 commented Jun 14, 2025

I wasn't aware that it's possible to combine take_seek and until_eof. Thanks for the suggestion. I've tried adopting this apporach for playready-rs but following example does not compile:

#![allow(dead_code)]

use binrw::{
    BinRead, BinReaderExt,
    helpers::{count_bytes, until_eof},
    io::Cursor,
    io::TakeSeekExt,
};

fn main() {
    #[derive(BinRead)]
    #[br(big)]
    pub struct XmrObject {
        pub flags: u16,
        pub type_: u16,
        pub length: u32,
        #[br(args { type_, length })]
        pub data: XmrObjectInner,
    }

    #[derive(BinRead)]
    #[br(big, import { type_: u16, length: u32 })]
    pub enum XmrObjectInner {
        #[br(pre_assert(type_ == 1))]
        OuterContainer(#[br(args(length))] OuterContainer),
    }

    #[derive(BinRead)]
    #[br(big, import(length: u32))]
    pub struct OuterContainer {
        // #[br(parse_with = count_bytes(u64::from(length - 8)))]
        #[br(map_stream = |s| s.take_seek(u64::from(length - 8)), parse_with = until_eof)]
        pub containers: Vec<XmrObject>,
    }

    let mut x = Cursor::new("");
    let _: XmrObject = x.read_be().unwrap();
}

Compiler throws:

error[E0275]: overflow evaluating the requirement `&str: Sized`
  |
  = help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`hello`)
  = note: required for `std::io::Cursor<&str>` to implement `std::io::Read`
  = note: 127 redundant requirements hidden
  = note: required for `&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut ...>>>>>>` to implement `std::io::Read`
  = note: required for `&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut ...>>>>>>` to implement `TakeSeekExt`
  = note: the full name for the type has been written to '/tmp/hello/target/debug/deps/hello-97fecfc340ba002b.long-type-5986991549965841089.txt'
  = note: consider using `--verbose` to print the full type name to the console

When I comment #[br(map_stream = |s| s.take_seek(u64::from(length - 8)), parse_with = until_eof)] and uncomment #[br(parse_with = count_bytes(u64::from(length - 8)))] it compiles fine. Is there a bug in binrw or am I misusing API?

@csnover
Copy link
Collaborator

csnover commented Jun 14, 2025

The problem here is that the structures in your example are recursive (XmrObject -> XmrObjectInner -> OuterContainer -> XmrObject) and when the Rust compiler monomorphises the read_options function over the reader type R it gets stuck in an infinite loop because each expansion of OuterContainer::read_options<R> leads to a recursive expansion of OuterContainer::read_options<&mut TakeSeek<R>>.

This loop can be broken by dynamic dispatch so map_stream returns a Box<dyn Read + Seek>:

trait ReadSeek: binrw::io::Read + binrw::io::Seek {}
impl<T> ReadSeek for T where T: binrw::io::Read + binrw::io::Seek {}

/* ... */

#[br(map_stream = |s| Box::new(s.take_seek(u64::from(length - 8))) as Box<dyn ReadSeek>, parse_with = until_eof)]

The helper function in this PR avoids the infinite recursion by not changing the stream type. I am not confident on whether or not this is a good enough reason to have two different APIs to do the same thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants