Add support for newline-delimited JSON (NDJSON) #194

@amanjeev

Summary

Working through #188 as an exercise showed that support for newline-delimited JSON (NDJSON) is not implemented in this crate.

Why

This feature is helpful when you have a large number of records, each a small JSON object on its own line. This is often the case with large JSON files, and looping over the lines and calling simd-json on each one is not going to help. @Licenser explains why in this comment:

Ja the lines are fairly short too the advantages are a lot smaller (sometimes detrimental) as there is an initial cost to pay for filling the registers, doing multiple runs etc. can overshadow the performance gain for very small payloads.

@Licenser also adds:

NDJSON would be incredibly cool (especially if we manage to realize in a streaming fashion / as an iterator)

What

Upstream simdjson has this feature called parse_many. Porting that to this crate is the first step.
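
To make the iterator idea concrete, here is a minimal sketch of what the user-facing shape could look like. It uses only the standard library and simply splits a buffer on newlines. It is not how upstream parse_many finds document boundaries, and it does nothing about the per-call overhead mentioned above. The name NdjsonRecords is hypothetical and does not exist in this crate.

```rust
/// Hypothetical iterator over the records of an NDJSON buffer.
/// It yields every non-empty line as a mutable byte slice and skips
/// empty lines. Whitespace-only lines are NOT filtered out.
pub struct NdjsonRecords<'a> {
    rest: &'a mut [u8],
}

impl<'a> NdjsonRecords<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { rest: buf }
    }
}

impl<'a> Iterator for NdjsonRecords<'a> {
    type Item = &'a mut [u8];

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Move the unread part out of `self`, leaving an empty slice behind.
            let rest = std::mem::take(&mut self.rest);
            if rest.is_empty() {
                return None;
            }

            let newline = rest.iter().position(|&b| b == b'\n');
            let line: &'a mut [u8] = match newline {
                Some(i) => {
                    let (line, tail) = rest.split_at_mut(i);
                    // `tail[0]` is the newline itself, so skip it.
                    self.rest = &mut tail[1..];
                    line
                }
                // Last record without a trailing newline; `self.rest` stays empty.
                None => rest,
            };

            // Tolerate CRLF line endings by dropping one trailing '\r'.
            let mut end = line.len();
            if end > 0 && line[end - 1] == b'\r' {
                end -= 1;
            }
            let line = &mut line[..end];

            if !line.is_empty() {
                return Some(line);
            }
            // Empty line: keep scanning for the next record.
        }
    }
}
```

The slices are mutable because simd-json parses in place, so each yielded record could be handed to something like simd_json::to_borrowed_value, assuming that is the entry point one wants to reuse. A real port would more likely share the structural-index pass across records instead of restarting the parser for every line.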

!!!NEEDS MORE DETAILS!!!

Labels: perf (Performance)
