Add Granular Metrics to RepartitionExec

### Is your feature request related to a problem or challenge?

We have recently noticed that `RepartitionExec` is quite expensive in certain queries and scenarios:
- 20-30x slower on certain array types (internally at Datadog)
- weird behavior in distributed-datafusion on network shuffles depending on the number of output tasks (https://github.com/datafusion-contrib/datafusion-distributed/issues/385)

It has been difficult to investigate and isolate the cause because the metrics exposed by the `RepartitionExec` operator are too coarse. Currently it only provides:
- `send_time`: time spent pulling the next batch from input stream (mixed spill, channel send, etc.)
- `repartition_time`: big bucket for repartition work (mixed routing and rebuilding batches from routed indices)
- `fetch_time`: per output partition, covers the whole public batch path

### Describe the solution you'd like

I would like to introduce more granular metrics that will isolate where repartition is spending its time:
- `fetch_time`: unchanged
- `repartition_time`: now the end-to-end total repartition time
- `route_time`: the time to distribute row indices to output partitions
- `batch_build_time`: the time to build the record batches
- `channel_wait_time`: per output partition, the time waiting for channel capacity / send(...) to complete
- `spill_write_time`: per output partition, the time writing spilled batches
- `spill_read_wait_time`: per output partition, time the consumer side waits for a spilled batch to become readable
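
To illustrate the intended split between `repartition_time`, `route_time`, and `batch_build_time`, here is a minimal standalone sketch. It is not DataFusion code: `RepartitionTimers`, `timed`, and `repartition` are hypothetical names, the "batch" is a plain `Vec<u64>`, and routing uses a simple modulo rather than the real hash partitioner. The per-output-partition metrics (`channel_wait_time`, `spill_write_time`, `spill_read_wait_time`) are left out and would presumably wrap the send and spill calls in the same way.

```rust
use std::time::{Duration, Instant};

/// Hypothetical timers mirroring three of the proposed metric names.
#[derive(Debug, Default)]
pub struct RepartitionTimers {
    /// End-to-end time for the whole repartition call.
    pub repartition_time: Duration,
    /// Time spent assigning row indices to output partitions.
    pub route_time: Duration,
    /// Time spent building the output batches from routed indices.
    pub batch_build_time: Duration,
}

/// Run `f` and add its elapsed wall-clock time to `slot`.
fn timed<T>(slot: &mut Duration, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    *slot += start.elapsed();
    out
}

/// Toy repartition of a single `u64` key column (assumption: modulo routing).
pub fn repartition(
    keys: &[u64],
    num_partitions: usize,
    timers: &mut RepartitionTimers,
) -> Vec<Vec<u64>> {
    assert!(num_partitions > 0, "need at least one output partition");
    let total = Instant::now();

    // Routing: collect the row indices that belong to each output partition.
    let indices: Vec<Vec<usize>> = timed(&mut timers.route_time, || {
        let mut indices = vec![Vec::new(); num_partitions];
        for (row, key) in keys.iter().enumerate() {
            indices[(*key % num_partitions as u64) as usize].push(row);
        }
        indices
    });

    // Batch building: materialize one output batch per partition.
    let batches: Vec<Vec<u64>> = timed(&mut timers.batch_build_time, || {
        indices
            .iter()
            .map(|rows| rows.iter().map(|&r| keys[r]).collect::<Vec<u64>>())
            .collect()
    });

    timers.repartition_time += total.elapsed();
    batches
}
```

With this layout, `repartition_time` is always at least `route_time + batch_build_time`, and the remainder is time not attributed to either phase. The only hot-path cost is two extra `Instant::now()` calls per phase per batch.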



### Describe alternatives you've considered

I considered adding further metrics, but I want to keep the collection overhead on the hot path as small as possible while still gaining good insight into the operator.

### Additional context

_No response_

Add Granular Metrics to RepartitionExec #21148