Description
Is your feature request related to a problem or challenge?
In #19094 we are going to fix incorrect `total_byte_size` calculations for `Statistics` by making them Inexact / Absent when we cannot actually calculate the size of the data. While this is more correct, it would be nice if we could calculate scan sizes in more scenarios. In particular, we cannot calculate the scan size of a variable-length column (e.g. Utf8) from just the type and the number of rows.
To address this, I propose we add a field such as `ColumnStatistics { scan_byte_size: Precision<usize>, ... }`, which the file format can populate, e.g. when we know that the in-memory Arrow size equals the uncompressed Parquet column size (as for Utf8View). I don't know in how many cases we will be able to derive this information without reading the data, but in some cases we should be able to.
Once we have this, we can track the total scan size through projections, limits, etc.
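A minimal sketch of the idea, using simplified stand-ins for DataFusion's `Precision` and `ColumnStatistics` types (the real types differ; `scan_byte_size` and `projected_scan_size` are hypothetical names): each column carries a per-column scan size, and a projection's total scan size is the sum over its projected columns, degrading to `Inexact` / `Absent` when any input is imprecise.

```rust
// Sketch only: simplified stand-ins for DataFusion's `Precision` and
// `ColumnStatistics`; `scan_byte_size` is the hypothetical new field.
#[derive(Debug, Clone, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

impl Precision {
    /// Add two sizes, degrading precision: Exact + Exact = Exact,
    /// any Inexact input makes the sum Inexact, Absent poisons the sum.
    fn add(&self, other: &Precision) -> Precision {
        use Precision::*;
        match (self, other) {
            (Exact(a), Exact(b)) => Exact(a + b),
            (Exact(a), Inexact(b))
            | (Inexact(a), Exact(b))
            | (Inexact(a), Inexact(b)) => Inexact(a + b),
            _ => Absent,
        }
    }
}

struct ColumnStatistics {
    /// Hypothetical new field: estimated in-memory size of this column when scanned.
    scan_byte_size: Precision,
}

/// Hypothetical helper: total scan size of a projection is the sum over
/// the projected columns' `scan_byte_size` values.
fn projected_scan_size(cols: &[ColumnStatistics], projection: &[usize]) -> Precision {
    projection
        .iter()
        .fold(Precision::Exact(0), |acc, &i| acc.add(&cols[i].scan_byte_size))
}

fn main() {
    let cols = vec![
        // e.g. a fixed-width Int64 column: 8 bytes per row, exactly known from row count
        ColumnStatistics { scan_byte_size: Precision::Exact(800) },
        // e.g. a Utf8View column sized from the uncompressed Parquet column size
        ColumnStatistics { scan_byte_size: Precision::Inexact(10_000) },
    ];
    assert_eq!(projected_scan_size(&cols, &[0]), Precision::Exact(800));
    assert_eq!(projected_scan_size(&cols, &[0, 1]), Precision::Inexact(10_800));
    println!("ok");
}
```

This also illustrates why a per-column size helps beyond a single `total_byte_size`: after a projection drops columns, only the surviving columns contribute, and precision degrades per column rather than for the whole table at once.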
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response