Skip to content

Migrate .metapack headers into a binary format#751

Merged
jviotti merged 4 commits intomainfrom
new-metapack
Mar 18, 2026
Merged

Migrate .metapack headers into a binary format#751
jviotti merged 4 commits intomainfrom
new-metapack

Conversation

@jviotti
Copy link
Member

@jviotti jviotti commented Mar 18, 2026

Signed-off-by: Juan Cruz Viotti jv@jviotti.com

jviotti added 2 commits March 18, 2026 15:23
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark Index (enterprise)

Details
Benchmark suite Current: 5124c39 Previous: 9a6fa19 Ratio
Add one schema (0 existing) 23 ms 25 ms 0.92
Add one schema (100 existing) 35 ms 40 ms 0.88
Add one schema (1000 existing) 154 ms 201 ms 0.77
Add one schema (10000 existing) 1665 ms 2022 ms 0.82
Update one schema (1 existing) 21 ms 23 ms 0.91
Update one schema (101 existing) 35 ms 42 ms 0.83
Update one schema (1001 existing) 155 ms 198 ms 0.78
Update one schema (10001 existing) 1602 ms 2055 ms 0.78
Cached rebuild (1 existing) 11 ms 12 ms 0.92
Cached rebuild (101 existing) 14 ms 14 ms 1
Cached rebuild (1001 existing) 32 ms 33 ms 0.97
Cached rebuild (10001 existing) 245 ms 245 ms 1
Index 100 schemas 138 ms 156 ms 0.88
Index 1000 schemas 1338 ms 1419 ms 0.94
Index 10000 schemas 16127 ms 16567 ms 0.97

This comment was automatically generated by workflow using github-action-benchmark.

jviotti added 2 commits March 18, 2026 15:41
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
@jviotti jviotti marked this pull request as ready for review March 18, 2026 19:48
@augmentcode
Copy link

augmentcode bot commented Mar 18, 2026

🤖 Augment PR Summary

Summary: This PR migrates the .metapack container from a JSON metadata header to a compact in-memory/binary header + optional extension bytes, and updates the indexer/server/web tooling to use the new format.

Changes:

  • Introduces a new sourcemeta::one::metapack library with read/write APIs, binary header layout, and extension helpers.
  • Removes the previous shared_metapack JSON-header implementation from src/shared and rewires consumers to the new API.
  • Updates the index generators to write JSON/JSONL/text as metapacks and to embed extra data (dialect, explorer schema metadata) into extension bytes.
  • Updates server endpoints to serve metapack payloads via FileView/metapack_info and to extract dialect/link information from extensions.
  • Updates contrib/file2metapack and other components to link against and call metapack_* functions.
  • Adds unit tests covering basic metapack write/read behavior, header fields, and extension parsing.

Technical Notes: The new header stores magic/version, encoding (identity/gzip), timestamps, uncompressed byte count, duration, SHA-256 checksum, MIME length, and a length-prefixed extension region before the payload.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 7 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

}

auto payload_offset{sizeof(MetapackHeader) + header->mime_length};
const auto *extension_size{view.as<std::uint32_t>(payload_offset)};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metapack_read_json reads the extension_size field using payload_offset before checking that header->mime_length keeps that offset within view.size(), so a malformed header could cause an out-of-bounds read/crash. Consider validating mime_length (and extension_size) against the file size before any view.as<...>(payload_offset) dereference.

Severity: high

Other Locations
  • src/metapack/metapack.cc:278
  • src/metapack/metapack.cc:261

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

return 0;
}

const auto *extension_size_pointer{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metapack_extension_offset returns a non-zero offset without verifying that the declared extension_size actually fits in the file, which can later make metapack_extension<T> return pointers into invalid memory. Consider ensuring offset_of_extension_size + sizeof(uint32_t) + *extension_size_pointer <= view.size() before returning an offset.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

return;
}

const auto info{sourcemeta::one::metapack_info(view)};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calls metapack_info(view) after only a minimal size check; if the target file is corrupted or not a metapack, metapack_info currently relies on assert and may crash/UB in release builds. Consider validating magic/format_version (and header lengths) before calling into metapack_info and returning a clean HTTP error for invalid files.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

write_link_header(response, dialect.to_string());
const auto *dialect_ext{
sourcemeta::one::metapack_extension<MetapackDialectExtension>(view)};
const std::string_view dialect =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dialect_length is trusted to build a std::string_view into the extension, but there’s no check that the extension is at least sizeof(MetapackDialectExtension) + dialect_length bytes long. A malformed header could make this read past the mapped file; consider bounds-checking against metapack_extension_size(view) (and the file size) before constructing the view.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

const std::size_t field_length)
-> std::string_view {
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
return {reinterpret_cast<const char *>(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explorer_extension_string constructs a std::string_view from field_offset/field_length without validating that the requested slice stays within the metapack extension buffer. If any length fields are corrupted, this can produce out-of-bounds views and undefined behavior when copied into std::string.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

header.health = health;
header.bytes = bytes;
header.dependencies = dependencies;
header.path_length = static_cast<std::uint16_t>(path.size());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_explorer_schema_extension truncates string sizes into std::uint16_t fields via static_cast, which will silently wrap for inputs > 65535 bytes and desync offsets on read. Consider validating each field length fits in 16 bits (or using a wider length type) before serializing.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

std::vector<std::uint8_t> result;
result.resize(sizeof(MetapackDialectExtension) + dialect.size());
MetapackDialectExtension header{};
header.dialect_length = static_cast<std::uint16_t>(dialect.size());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_dialect_extension truncates dialect.size() into a std::uint16_t without a bounds check, which could corrupt the extension if a very long dialect identifier ever appears. Consider asserting/validating dialect.size() <= UINT16_MAX before writing.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 issues found across 28 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/server/action_schema_search.h">

<violation number="1" location="src/server/action_schema_search.h:34">
P1: Validate `payload_start` before subtracting from `view.size()`. Without a bounds check, malformed metapack headers can cause size underflow and out-of-bounds reads.</violation>
</file>

<file name="src/index/explorer.h">

<violation number="1" location="src/index/explorer.h:235">
P2: Silent data corruption if any string exceeds 65535 bytes. The `static_cast<std::uint16_t>(size)` truncates without checking bounds, but the buffer is allocated using the real (untruncated) sizes. On read-back, the truncated lengths produce wrong offsets for every field after the overflowed one. Add an assertion or saturating check, e.g. `assert(path.size() <= UINT16_MAX)`.</violation>
</file>

<file name="src/metapack/metapack.cc">

<violation number="1" location="src/metapack/metapack.cc:180">
P1: `metapack_extension_offset` returns a non-zero offset without verifying that `offset + *extension_size_pointer` fits within `view.size()`. Callers like `metapack_extension<T>` will then dereference pointers into invalid memory if the declared extension size exceeds the actual file. Add a bounds check: `offset_of_extension_size + sizeof(uint32_t) + *extension_size_pointer <= view.size()`.</violation>

<violation number="2" location="src/metapack/metapack.cc:218">
P1: Validate `payload_offset + sizeof(uint32_t)` before reading `extension_size`; current ordering can dereference out of bounds on malformed headers.</violation>

<violation number="3" location="src/metapack/metapack.cc:243">
P1: `metapack_info` relies on `assert` to validate magic number and format version. In release builds (`NDEBUG` defined), these asserts are compiled out, so a corrupted file will cause undefined behavior instead of a clean error. Replace the asserts with runtime checks (throw or return an error), especially since this is called from the server where input may be untrusted.</violation>

<violation number="4" location="src/metapack/metapack.cc:266">
P1: Validate that `mime_length` fits in the `FileView` before constructing `std::string`; otherwise malformed headers can over-read memory.</violation>

<violation number="5" location="src/metapack/metapack.cc:279">
P1: Add bounds checks before and after reading `extension_size` in `metapack_payload_offset`; malformed headers can currently trigger out-of-bounds reads.</violation>
</file>

<file name="src/server/action_serve_metapack_file.h">

<violation number="1" location="src/server/action_serve_metapack_file.h:120">
P2: Validate `dialect_length` against the extension size before creating the string_view to avoid out-of-bounds reads on malformed metapack headers.</violation>

<violation number="2" location="src/server/action_serve_metapack_file.h:135">
P1: Check that `payload_start` is within the file size before computing `payload_size` to avoid underflow and out-of-bounds reads on malformed metapack headers.</violation>
</file>

<file name="src/index/generators.h">

<violation number="1" location="src/index/generators.h:53">
P2: Guard against dialect identifiers longer than 65,535 bytes before casting to `std::uint16_t`; otherwise the length is truncated and readers can mis-parse the extension payload.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

}
sourcemeta::core::FileView view{search_index};
const auto payload_start{metapack_payload_offset(view)};
const auto payload_size{view.size() - payload_start};
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Validate payload_start before subtracting from view.size(). Without a bounds check, malformed metapack headers can cause size underflow and out-of-bounds reads.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/server/action_schema_search.h, line 34:

<comment>Validate `payload_start` before subtracting from `view.size()`. Without a bounds check, malformed metapack headers can cause size underflow and out-of-bounds reads.</comment>

<file context>
@@ -27,18 +29,29 @@ static auto search(const std::filesystem::path &search_index,
-  }
+  sourcemeta::core::FileView view{search_index};
+  const auto payload_start{metapack_payload_offset(view)};
+  const auto payload_size{view.size() - payload_start};
+  // NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
+  const std::string_view payload{
</file context>
Suggested change
const auto payload_size{view.size() - payload_start};
if (payload_start > view.size()) {
throw std::runtime_error("Invalid search index metapack header");
}
const auto payload_size{view.size() - payload_start};
Fix with Cubic


return MetapackInfo{.checksum_hex = std::move(checksum_hex),
.last_modified = time_point,
.mime = std::string{mime_data, header->mime_length},
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Validate that mime_length fits in the FileView before constructing std::string; otherwise malformed headers can over-read memory.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/metapack/metapack.cc, line 266:

<comment>Validate that `mime_length` fits in the `FileView` before constructing `std::string`; otherwise malformed headers can over-read memory.</comment>

<file context>
@@ -0,0 +1,285 @@
+
+  return MetapackInfo{.checksum_hex = std::move(checksum_hex),
+                      .last_modified = time_point,
+                      .mime = std::string{mime_data, header->mime_length},
+                      .encoding = header->encoding,
+                      .content_bytes = header->content_bytes,
</file context>
Fix with Cubic

assert(header->magic == METAPACK_MAGIC);

auto offset{sizeof(MetapackHeader) + header->mime_length};
const auto *extension_size{view.as<std::uint32_t>(offset)};
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Add bounds checks before and after reading extension_size in metapack_payload_offset; malformed headers can currently trigger out-of-bounds reads.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/metapack/metapack.cc, line 279:

<comment>Add bounds checks before and after reading `extension_size` in `metapack_payload_offset`; malformed headers can currently trigger out-of-bounds reads.</comment>

<file context>
@@ -0,0 +1,285 @@
+  assert(header->magic == METAPACK_MAGIC);
+
+  auto offset{sizeof(MetapackHeader) + header->mime_length};
+  const auto *extension_size{view.as<std::uint32_t>(offset)};
+  offset += sizeof(std::uint32_t) + *extension_size;
+
</file context>
Fix with Cubic

}

auto payload_offset{sizeof(MetapackHeader) + header->mime_length};
const auto *extension_size{view.as<std::uint32_t>(payload_offset)};
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Validate payload_offset + sizeof(uint32_t) before reading extension_size; current ordering can dereference out of bounds on malformed headers.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/metapack/metapack.cc, line 218:

<comment>Validate `payload_offset + sizeof(uint32_t)` before reading `extension_size`; current ordering can dereference out of bounds on malformed headers.</comment>

<file context>
@@ -0,0 +1,285 @@
+  }
+
+  auto payload_offset{sizeof(MetapackHeader) + header->mime_length};
+  const auto *extension_size{view.as<std::uint32_t>(payload_offset)};
+  payload_offset += sizeof(std::uint32_t) + *extension_size;
+
</file context>
Fix with Cubic

std::ostringstream contents;
contents << file.value().data.rdbuf();
const auto payload_start{sourcemeta::one::metapack_payload_offset(view)};
const auto payload_size{view.size() - payload_start};
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Check that payload_start is within the file size before computing payload_size to avoid underflow and out-of-bounds reads on malformed metapack headers.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/server/action_serve_metapack_file.h, line 135:

<comment>Check that `payload_start` is within the file size before computing `payload_size` to avoid underflow and out-of-bounds reads on malformed metapack headers.</comment>

<file context>
@@ -98,20 +115,34 @@ static auto action_serve_metapack_file(
-  std::ostringstream contents;
-  contents << file.value().data.rdbuf();
+  const auto payload_start{sourcemeta::one::metapack_payload_offset(view)};
+  const auto payload_size{view.size() - payload_start};
+  // NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
+  const std::string contents{
</file context>
Suggested change
const auto payload_size{view.size() - payload_start};
if (payload_start > view.size()) {
json_error(request, response, sourcemeta::one::STATUS_NOT_FOUND,
"not-found", "There is nothing at this URL");
return;
}
const auto payload_size{view.size() - payload_start};
Fix with Cubic

}

auto metapack_info(const sourcemeta::core::FileView &view) -> MetapackInfo {
assert(view.size() >= sizeof(MetapackHeader) + sizeof(std::uint32_t));
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: metapack_info relies on assert to validate magic number and format version. In release builds (NDEBUG defined), these asserts are compiled out, so a corrupted file will cause undefined behavior instead of a clean error. Replace the asserts with runtime checks (throw or return an error), especially since this is called from the server where input may be untrusted.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/metapack/metapack.cc, line 243:

<comment>`metapack_info` relies on `assert` to validate magic number and format version. In release builds (`NDEBUG` defined), these asserts are compiled out, so a corrupted file will cause undefined behavior instead of a clean error. Replace the asserts with runtime checks (throw or return an error), especially since this is called from the server where input may be untrusted.</comment>

<file context>
@@ -0,0 +1,285 @@
+}
+
+auto metapack_info(const sourcemeta::core::FileView &view) -> MetapackInfo {
+  assert(view.size() >= sizeof(MetapackHeader) + sizeof(std::uint32_t));
+  const auto *header{view.as<MetapackHeader>()};
+  assert(header->magic == METAPACK_MAGIC);
</file context>
Fix with Cubic

return 0;
}

return offset_of_extension_size + sizeof(std::uint32_t);
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: metapack_extension_offset returns a non-zero offset without verifying that offset + *extension_size_pointer fits within view.size(). Callers like metapack_extension<T> will then dereference pointers into invalid memory if the declared extension size exceeds the actual file. Add a bounds check: offset_of_extension_size + sizeof(uint32_t) + *extension_size_pointer <= view.size().

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/metapack/metapack.cc, line 180:

<comment>`metapack_extension_offset` returns a non-zero offset without verifying that `offset + *extension_size_pointer` fits within `view.size()`. Callers like `metapack_extension<T>` will then dereference pointers into invalid memory if the declared extension size exceeds the actual file. Add a bounds check: `offset_of_extension_size + sizeof(uint32_t) + *extension_size_pointer <= view.size()`.</comment>

<file context>
@@ -0,0 +1,285 @@
+    return 0;
+  }
+
+  return offset_of_extension_size + sizeof(std::uint32_t);
+}
+
</file context>
Fix with Cubic

header.health = health;
header.bytes = bytes;
header.dependencies = dependencies;
header.path_length = static_cast<std::uint16_t>(path.size());
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Silent data corruption if any string exceeds 65535 bytes. The static_cast<std::uint16_t>(size) truncates without checking bounds, but the buffer is allocated using the real (untruncated) sizes. On read-back, the truncated lengths produce wrong offsets for every field after the overflowed one. Add an assertion or saturating check, e.g. assert(path.size() <= UINT16_MAX).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/index/explorer.h, line 235:

<comment>Silent data corruption if any string exceeds 65535 bytes. The `static_cast<std::uint16_t>(size)` truncates without checking bounds, but the buffer is allocated using the real (untruncated) sizes. On read-back, the truncated lengths produce wrong offsets for every field after the overflowed one. Add an assertion or saturating check, e.g. `assert(path.size() <= UINT16_MAX)`.</comment>

<file context>
@@ -107,6 +109,159 @@ inflate_metadata(const sourcemeta::one::Configuration &configuration,
+  header.health = health;
+  header.bytes = bytes;
+  header.dependencies = dependencies;
+  header.path_length = static_cast<std::uint16_t>(path.size());
+  header.identifier_length = static_cast<std::uint16_t>(identifier.size());
+  header.base_dialect_length = static_cast<std::uint16_t>(base_dialect.size());
</file context>
Fix with Cubic

Comment on lines +120 to +128
const std::string_view dialect =
(dialect_ext != nullptr && dialect_ext->dialect_length > 0)
? std::string_view{reinterpret_cast<
const char *>(view.as<std::uint8_t>(
sourcemeta::one::metapack_extension_offset(
view) +
sizeof(MetapackDialectExtension))),
dialect_ext->dialect_length}
: std::string_view{};
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Validate dialect_length against the extension size before creating the string_view to avoid out-of-bounds reads on malformed metapack headers.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/server/action_serve_metapack_file.h, line 120:

<comment>Validate `dialect_length` against the extension size before creating the string_view to avoid out-of-bounds reads on malformed metapack headers.</comment>

<file context>
@@ -98,20 +115,34 @@ static auto action_serve_metapack_file(
-      write_link_header(response, dialect.to_string());
+    const auto *dialect_ext{
+        sourcemeta::one::metapack_extension<MetapackDialectExtension>(view)};
+    const std::string_view dialect =
+        (dialect_ext != nullptr && dialect_ext->dialect_length > 0)
+            ? std::string_view{reinterpret_cast<
</file context>
Suggested change
const std::string_view dialect =
(dialect_ext != nullptr && dialect_ext->dialect_length > 0)
? std::string_view{reinterpret_cast<
const char *>(view.as<std::uint8_t>(
sourcemeta::one::metapack_extension_offset(
view) +
sizeof(MetapackDialectExtension))),
dialect_ext->dialect_length}
: std::string_view{};
const auto extension_size{sourcemeta::one::metapack_extension_size(view)};
const std::string_view dialect =
(dialect_ext != nullptr && dialect_ext->dialect_length > 0 &&
extension_size >= sizeof(MetapackDialectExtension) &&
dialect_ext->dialect_length <=
extension_size - sizeof(MetapackDialectExtension))
? std::string_view{reinterpret_cast<const char *>(
view.as<std::uint8_t>(
sourcemeta::one::metapack_extension_offset(
view) +
sizeof(MetapackDialectExtension))),
dialect_ext->dialect_length}
: std::string_view{};
Fix with Cubic

std::vector<std::uint8_t> result;
result.resize(sizeof(MetapackDialectExtension) + dialect.size());
MetapackDialectExtension header{};
header.dialect_length = static_cast<std::uint16_t>(dialect.size());
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Guard against dialect identifiers longer than 65,535 bytes before casting to std::uint16_t; otherwise the length is truncated and readers can mis-parse the extension payload.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/index/generators.h, line 53:

<comment>Guard against dialect identifiers longer than 65,535 bytes before casting to `std::uint16_t`; otherwise the length is truncated and readers can mis-parse the extension payload.</comment>

<file context>
@@ -37,6 +39,23 @@
+  std::vector<std::uint8_t> result;
+  result.resize(sizeof(MetapackDialectExtension) + dialect.size());
+  MetapackDialectExtension header{};
+  header.dialect_length = static_cast<std::uint16_t>(dialect.size());
+  std::memcpy(result.data(), &header, sizeof(header));
+  std::memcpy(result.data() + sizeof(header), dialect.data(), dialect.size());
</file context>
Fix with Cubic

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark Index (community)

Details
Benchmark suite Current: 5124c39 Previous: 9a6fa19 Ratio
Add one schema (0 existing) 21 ms 22 ms 0.95
Add one schema (100 existing) 32 ms 40 ms 0.80
Add one schema (1000 existing) 159 ms 197 ms 0.81
Add one schema (10000 existing) 1670 ms 1987 ms 0.84
Update one schema (1 existing) 19 ms 20 ms 0.95
Update one schema (101 existing) 33 ms 37 ms 0.89
Update one schema (1001 existing) 154 ms 188 ms 0.82
Update one schema (10001 existing) 1886 ms 1955 ms 0.96
Cached rebuild (1 existing) 11 ms 10 ms 1.10
Cached rebuild (101 existing) 12 ms 13 ms 0.92
Cached rebuild (1001 existing) 31 ms 30 ms 1.03
Cached rebuild (10001 existing) 238 ms 232 ms 1.03
Index 100 schemas 161 ms 157 ms 1.03
Index 1000 schemas 1241 ms 1327 ms 0.94
Index 10000 schemas 15878 ms 16889 ms 0.94

This comment was automatically generated by workflow using github-action-benchmark.

@jviotti jviotti changed the title Migrate .metapack headers into a memory format Migrate .metapack headers into a binary format Mar 18, 2026
@jviotti jviotti merged commit d19337d into main Mar 18, 2026
6 checks passed
@jviotti jviotti deleted the new-metapack branch March 18, 2026 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant