refactor readers to use type parameters and not concrete vtables#207
refactor readers to use type parameters and not concrete vtables#207cosmicexplorer wants to merge 18 commits intozip-rs:masterfrom
Conversation
| @@ -928,6 +998,8 @@ | |||
| } | |||
| let symlink_target = if file.is_symlink() && (cfg!(unix) || cfg!(windows)) { | |||
| let mut target = Vec::with_capacity(file.size() as usize); | |||
| /* FIXME: this is broken: needs to be .read_to_end(), otherwise it writes into an | |||
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
src/unstable/read.rs
Outdated
| let mut data = ZipFileData::from_local_block(block, reader)?; | ||
|
|
||
| match parse_extra_field(&mut data) { | ||
| /* FIXME: check for the right error type here instead of accepting any old i/o |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
src/unstable/read.rs
Outdated
| // We can't use the typical ::parse() method, as we follow separate code paths depending | ||
| // on the "magic" value (since the magic value will be from the central directory header | ||
| // if we've finished iterating over all the actual files). | ||
| /* TODO: smallvec? */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| }); | ||
| } | ||
| if let Some(last_modified_time) = data.last_modified_time { | ||
| /* TODO: use let chains once stabilized! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
src/unstable/read.rs
Outdated
| } | ||
| if let Some(info) = data.aes_mode { | ||
| #[cfg(not(feature = "aes-crypto"))] | ||
| /* TODO: make this into its own EntryReadError error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| let reader = XzDecoder::new(reader); | ||
| Ok(EntryReader::Xz(reader)) | ||
| } | ||
| /* TODO: make this into its own EntryReadError error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| reader: R, | ||
| ) -> Result<EntryReader<R>, ZipError> | ||
| where | ||
| /* TODO: this really shouldn't be required upon construction (especially since the reader |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| return self | ||
| .check_matches() | ||
| .map(|()| 0) | ||
| /* TODO: use io::Error::other for MSRV >=1.74 */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| where | ||
| R: Read + Seek, | ||
| { | ||
| // TODO: use .get_or_try_init() once stabilized to provide a closure returning a Result! |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
| if self.check == res { | ||
| Ok(()) | ||
| } else { | ||
| /* TODO: make this into our own Crc32Error error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
26cf755 to
134b355
Compare
- also use git dep for zstd to pull in removal of R: BufRead bound - move new methods into unstable/read.rs - support getting aes verification info from a raw ZipEntry - flesh out other methods from ZipFile -> ZipEntry - now create a streaming abstraction - move some methods around
134b355 to
020a61a
Compare
2574c06 to
f0faf24
Compare
src/crc32.rs
Outdated
| return self | ||
| .check_matches() | ||
| .map(|()| 0) | ||
| /* TODO: use io::Error::other for MSRV >=1.74 */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
src/crc32.rs
Outdated
| if self.check == res { | ||
| Ok(()) | ||
| } else { | ||
| /* TODO: make this into our own Crc32Error error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
7223f67 to
872bd35
Compare
src/crc32.rs
Outdated
| if self.check == res { | ||
| Ok(()) | ||
| } else { | ||
| /* TODO: make this into our own Crc32Error error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
src/crc32.rs
Outdated
| if self.check == res { | ||
| Ok(()) | ||
| } else { | ||
| /* TODO: make this into our own Crc32Error error type! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality
This reverts commit fd11f28.
315623d to
bbee45b
Compare
| @@ -0,0 +1,751 @@ | |||
| //! Alternate implementation of [`crate::read`]. | |||
There was a problem hiding this comment.
Document when to use this and when to use the main impl. Also, please remove everything we're not using internally -- we don't need a second public API.
| use crate::unstable::read::{ | ||
| construct_decompressing_reader, find_entry_content_range, CryptoEntryReader, CryptoVariant, | ||
| }; | ||
| pub use crate::unstable::read::{ArchiveEntry, ZipEntry}; |
There was a problem hiding this comment.
This doesn't need to be public.
| pub use crate::unstable::read::{ArchiveEntry, ZipEntry}; | |
| pub(crate) use crate::unstable::read::{ArchiveEntry, ZipEntry}; |
|
https://github.com/zip-rs/zip2/tree/wrapped-reader-refactor is a squashed and rebased version of this PR, but it doesn't build (mainly due to type mismatches involving |
|
Update: the branch now passes unit tests, and fuzz tests are running on it; but the new code specialized for seekable files is currently unused. @cosmicexplorer Is this a good place for you to resume work from? |
|
You rock!!! I have just rebased, will address your comments now. |
|
So @Pr0methean I'd like to raise a small discussion here: the reason I initially made this a draft PR is because I believe there are several shortcomings that necessitate a slight reimplementation, some requiring breaking API changes:
let final_reader = if data.encrypted {
let crypto_variant = CryptoKeyValidationSource::from_data(data)?;
let is_ae2_encrypted = crypto_variant.is_ae2_encrypted();
let crypto_reader = crypto_variant.make_crypto_reader(password, content)?;
let entry_reader =
construct_decompressing_reader(&data.compression_method, crypto_reader)?;
if is_ae2_encrypted {
/* Ae2 voids crc checking: https://www.winzip.com/en/support/aes-encryption/ */
CryptoEntryReader::Ae2Encrypted(entry_reader)
} else {
CryptoEntryReader::NonAe2Encrypted(NewCrc32Reader::new(entry_reader, data.crc32))
}
} else {
/* Not encrypted, so do the same as in .by_index_generic(): */
let entry_reader = construct_decompressing_reader(&data.compression_method, content)?;
CryptoEntryReader::Unencrypted(NewCrc32Reader::new(entry_reader, data.crc32))
};So that's my case for the API changes. We can do them in pieces, but I'm proposing this intentionally breaking change that I believe will improve the internal architecture and correctness, increase our ability to perform optimizations, and generate more flexible entry structs in our public API. |
|
I'm experiencing an incredibly strange issue on all web browsers where pushing to this branch isn't being propagated to this PR, and comparing https://github.com/zip-rs/zip2/compare/master...cosmicexplorer:zip:wrapped-reader-refactor?expand=1 acts like they're the same branch. I'm going to close this and create a new PR with the above proposal and see if that fixes this weird issue. |
|
Moved to #233 which fixes the weird github error in this PR. |
Problem
As described in gyscos/zstd-rs#288 (comment), our usage of
&'a mut dyn ReadinZipFileReaderstops us from being able to implSendorSync, or generally to send aZipFileinto a separate thread than the one owning theZipArchiveit belongs to. While this is generally not an issue, as the single synchronousRead + Seekhandle makes it difficult to farm out work to subthreads, for the purposes of #72, we would like to be able to reuse some code instead of creating an entirely separate implementation.That brings us to the second set of issues, surrounding zipcrypto support:
CryptoReaderenum and themake_crypto_reader()method, even though zip crypto is a very uncommon use case,ZipFileitself contains some very complex mutable state with both aZipFileReaderand aCryptoReader, even thoughZipFileReaderitself also contains aCryptoReader,Crc32Readerhas to maintain a special flag to disable CRC checking in the special case of AE2 encryption, which is a sign of a leaky abstraction and may easily introduce a failure to check CRC in other cases by accident in the future.Solution
Luckily, all of this becomes significantly easier without too many changes:
src/unstable/read.rsto reimplementread::ZipFilewith the newunstable::read::ZipEntry.EntryReaderandZipEntrywith a parameterized reader typeRinstead of an internal&'a mut dyn Read.ZipArchive::by_index()returningZipResult<ZipEntry<'_, impl Read + '_>>to avoid leaking internal structs while retaining the ability to rearrange the layout of our internal structs.CryptoReadercreation throughCryptoVariant.This also leads to a significantly improved refactoring of streaming support in the
streamingsubmodule ofsrc/unstable/read.rs.Result
zipcrypto support doesn't muck up the non-crypto methods,
ZipEntryis nowSend, andsrc/unstable/read.rsis significantly easier to read with equivalent functionality. This will have to be a breaking change in order to achieveSend-ability, but I believe the readability/maintainability is substantially improved with this change.