Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15 #728

Open
han-yan01 wants to merge 1 commit into facebookincubator:main from han-yan01:export-D105270894

@han-yan01
Contributor

Summary:
PFOR is a patched Frame-of-Reference encoder for columns with sparse outliers, ported from MRS's AusList SOTA encoder portfolio (D97646777). It bitpacks the residuals up to the 90th percentile at the base bit width and stores the top 10% as (position, full value) exception patches. It wins on columns with rare large outliers, where a pure FOR would inflate the bit width to cover the worst case.
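
To make the base/exception split concrete, here is a minimal sketch for 32-bit unsigned values. `PforSplit`, `bitsRequired`, and `splitForPfor` are hypothetical names for illustration, not the types in this diff. The cutoff rule (size the base width for the 90th-percentile residual, then patch anything that does not fit) and the choice to store the original value as the "full value" are assumptions about how the summary's description maps to code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch; not a Nimble type.
struct PforSplit {
  uint32_t min = 0;                // frame of reference
  uint8_t baseBitWidth = 0;        // bits per packed residual
  std::vector<uint32_t> residuals; // value - min; exception slots hold 0
  std::vector<std::pair<uint32_t, uint32_t>> exceptions; // (position, full value)
};

// Number of bits needed to represent v (0 needs 0 bits).
inline uint8_t bitsRequired(uint32_t v) {
  uint8_t bits = 0;
  while (v != 0) {
    ++bits;
    v >>= 1;
  }
  return bits;
}

inline PforSplit splitForPfor(const std::vector<uint32_t>& values) {
  PforSplit out;
  if (values.empty()) {
    return out;
  }
  out.min = *std::min_element(values.begin(), values.end());

  // Residuals relative to the frame of reference.
  out.residuals.reserve(values.size());
  for (uint32_t v : values) {
    out.residuals.push_back(v - out.min);
  }

  // Assumed cutoff rule: size the base width for the 90th-percentile residual.
  std::vector<uint32_t> sorted = out.residuals;
  std::sort(sorted.begin(), sorted.end());
  const size_t cutoff = (sorted.size() - 1) * 9 / 10;
  out.baseBitWidth = bitsRequired(sorted[cutoff]);
  assert(out.baseBitWidth <= 32);

  // Anything that does not fit in baseBitWidth becomes an exception patch.
  for (size_t i = 0; i < values.size(); ++i) {
    if (bitsRequired(out.residuals[i]) > out.baseBitWidth) {
      out.exceptions.emplace_back(static_cast<uint32_t>(i), values[i]);
      out.residuals[i] = 0;
    }
  }
  return out;
}
```

For nine values in the range 10..18 plus a single 1000000, this sketch picks a 4-bit base width and one exception, whereas a pure FOR would need 20 bits per value to cover the outlier.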

Wire format: [encType:1B][dataType:1B][rowCount:4B][min:GV32][baseBitWidth:1B][numExceptions:varint][exceptionPositions:varint[]][exceptionValues:varint[]][bitpacked baseResiduals][7B zero-pad tail]
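
As an illustration of the exception section of this layout (`numExceptions`, then all positions, then all values), the sketch below serializes and parses it with a LEB128-style varint. The varint flavor is an assumption, since the summary only says "varint". `writeVarint`, `readVarint`, and `writeExceptions` are hypothetical helpers, and the header fields (including the GV32-encoded `min`) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// LEB128-style varint: 7 payload bits per byte, high bit set on all but the last.
inline void writeVarint(uint64_t v, std::vector<uint8_t>& out) {
  while (v >= 0x80) {
    out.push_back(static_cast<uint8_t>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<uint8_t>(v));
}

// Reads one varint and advances pos. No bounds checking in this sketch.
inline uint64_t readVarint(const uint8_t*& pos) {
  uint64_t v = 0;
  int shift = 0;
  while (true) {
    const uint8_t b = *pos++;
    v |= static_cast<uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) {
      break;
    }
    shift += 7;
  }
  return v;
}

// Writes [numExceptions][positions...][values...] in that order.
inline void writeExceptions(
    const std::vector<uint32_t>& positions,
    const std::vector<uint32_t>& values,
    std::vector<uint8_t>& out) {
  assert(positions.size() == values.size());
  writeVarint(positions.size(), out);
  for (uint32_t p : positions) {
    writeVarint(p, out);
  }
  for (uint32_t v : values) {
    writeVarint(v, out);
  }
}
```

Storing positions and values as two separate runs, rather than interleaved pairs, matches the order shown in the wire format.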

Decode follows the post-D98819389 optimization style: pass 1 bit-unpacks all residuals via byte-aligned loadU64 reads; pass 2 patches the exceptions in.
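
The two passes can be sketched as follows. This is an illustrative reconstruction, not the code in this diff: `loadU64`, `bitpack`, and `decodePfor` are hypothetical helpers, and the sketch assumes LSB-first packing, a little-endian host, and a base bit width of at most 32. It also assumes that the 7-byte zero-pad tail exists so that an 8-byte load starting at the last data byte stays inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Unaligned 8-byte load starting at a byte boundary (little-endian host assumed).
inline uint64_t loadU64(const uint8_t* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Reference packer for the sketch: LSB-first, followed by 7 zero pad bytes.
inline std::vector<uint8_t> bitpack(
    const std::vector<uint32_t>& residuals, uint8_t bitWidth) {
  const size_t dataBytes = (residuals.size() * bitWidth + 7) / 8;
  std::vector<uint8_t> out(dataBytes + 7, 0);
  size_t bitOffset = 0;
  for (uint32_t r : residuals) {
    for (uint8_t b = 0; b < bitWidth; ++b, ++bitOffset) {
      if ((r >> b) & 1u) {
        out[bitOffset >> 3] |= static_cast<uint8_t>(1u << (bitOffset & 7));
      }
    }
  }
  return out;
}

inline std::vector<uint32_t> decodePfor(
    const uint8_t* packed,
    size_t rowCount,
    uint32_t min,
    uint8_t bitWidth,
    const std::vector<uint32_t>& positions,
    const std::vector<uint32_t>& values) {
  assert(bitWidth <= 32 && positions.size() == values.size());
  std::vector<uint32_t> out(rowCount, min);

  // Pass 1: unpack every residual, including the placeholder slots.
  if (bitWidth > 0) {
    const uint64_t mask = (uint64_t{1} << bitWidth) - 1;
    for (size_t i = 0; i < rowCount; ++i) {
      const size_t bitOffset = i * bitWidth;
      const uint64_t word = loadU64(packed + (bitOffset >> 3));
      out[i] = min + static_cast<uint32_t>((word >> (bitOffset & 7)) & mask);
    }
  }

  // Pass 2: overwrite the exception slots with their full values.
  for (size_t k = 0; k < positions.size(); ++k) {
    out[positions[k]] = values[k];
  }
  return out;
}
```

Keeping pass 1 free of per-row exception checks leaves the unpack loop branch-free, and the patch loop only touches the (few) exception positions.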

Slot allocation: this diff claims EncodingType::Pfor = 15. Per the WS8 audit (T271476729), slots 12/13/14 are contested by 4+ in-flight diffs (D103045783 ChunkedBitPacking=12, D103975634 FOR=13/SubIntSplit=14, D105064770/D105134978 Fsst=12/14, plus an implied ChunkedALP=13). 15 is the next free slot at the time of commit. If a contending diff lands first and claims 15, this diff will need to be rebased onto the next free slot.

Initial readFactor: 1.5 (conservative). This keeps ManualEncodingSelectionPolicy from auto-selecting PFOR until the encoding has baked in, mirroring the ChunkedBitPacking pattern.
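
To show how a read factor can bias selection, here is a small sketch that assumes the policy ranks candidates by estimated size multiplied by readFactor. That cost model is an assumption about ManualEncodingSelectionPolicy, not something this summary states, and `Candidate`, `pickCheapest`, and the encoding names and sizes are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical candidate record for this sketch.
struct Candidate {
  std::string name;
  uint64_t estimatedBytes; // estimated encoded size
  float readFactor;        // penalty multiplier; 1.0 means no penalty
};

// Returns the candidate with the lowest estimatedBytes * readFactor,
// or nullptr if there are no candidates.
inline const Candidate* pickCheapest(const std::vector<Candidate>& candidates) {
  const Candidate* best = nullptr;
  double bestCost = 0.0;
  for (const Candidate& c : candidates) {
    const double cost = static_cast<double>(c.estimatedBytes) * c.readFactor;
    if (best == nullptr || cost < bestCost) {
      best = &c;
      bestCost = cost;
    }
  }
  return best;
}
```

Under this assumed model, a factor of 1.5 means PFOR would have to be more than 1.5x smaller than an unpenalized alternative before it is chosen.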

Stale-binary risk: xldb/disco/user_experiments/util/WriteConfigurationUtil.cpp persists EncodingType through Configerator. Older readers will throw NIMBLE_UNREACHABLE on EncodingType::Pfor=15 until they pick up this build, so the failure is loud rather than silent.
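
The fail-loud behavior can be pictured with a stale reader that dispatches on the encoding byte. In this sketch `decoderNameFor` is a hypothetical function, the single known slot 0 is a placeholder, and a `std::runtime_error` stands in for NIMBLE_UNREACHABLE.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical dispatch in a reader built before slot 15 existed.
inline const char* decoderNameFor(uint8_t rawEncodingType) {
  switch (rawEncodingType) {
    case 0: // placeholder for an encoding the old reader knows
      return "KnownEncoding";
    default:
      // Stand-in for NIMBLE_UNREACHABLE: refuse to guess at unknown slots.
      throw std::runtime_error(
          "unknown EncodingType " + std::to_string(rawEncodingType));
  }
}
```

Throwing on an unknown slot means a stale binary stops at the first PFOR stream instead of misinterpreting its bytes.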

Differential Revision: D105270894

@meta-cla meta-cla Bot added the CLA Signed label May 15, 2026
@meta-codesync

meta-codesync Bot commented May 15, 2026

@han-yan01 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D105270894.

@han-yan01 changed the title from "Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" to "[WIP]Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" May 15, 2026
@meta-codesync Bot changed the title from "[WIP]Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" to "Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" May 20, 2026
@han-yan01 force-pushed the export-D105270894 branch from 9395ab1 to f52adf5 May 20, 2026 22:43
@meta-codesync Bot changed the title from "Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" to "feat(nimble): Add PforEncoding (Patched FOR) claiming EncodingType::Pfor=15" May 21, 2026
@han-yan01 force-pushed the export-D105270894 branch from f52adf5 to e913f96 May 21, 2026 17:56
@meta-codesync Bot changed the title from "feat(nimble): Add PforEncoding (Patched FOR) claiming EncodingType::Pfor=15" to "Add PforEncoding (Patched FOR) — claiming EncodingType::Pfor = 15" May 22, 2026
@han-yan01 force-pushed the export-D105270894 branch from e913f96 to 0d2047e May 22, 2026 01:25
Labels

CLA Signed, fb-exported, meta-exported