The call_both command is useful in pre-prod for telling you whether the refactored path behaves the same way from the outside, but it'd be super neat if it also had a benchmark option that records how long each code path takes, both individually and in aggregate.
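
To make the ask a bit more concrete, here's a rough sketch of what I have in mind. I'm not assuming anything about how call_both is actually implemented: `call_both_benchmarked`, `BenchmarkReport`, and `PathTimings` are hypothetical names made up for illustration, and the sketch just treats the two code paths as plain Python callables.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Tuple


@dataclass
class PathTimings:
    """Per-call durations (in seconds) for one code path. Hypothetical type."""
    durations: List[float] = field(default_factory=list)

    @property
    def total(self) -> float:
        # Aggregate time spent in this path across all calls.
        return sum(self.durations)

    @property
    def mean(self) -> float:
        # Average time per call; 0.0 if nothing was recorded.
        return self.total / len(self.durations) if self.durations else 0.0


@dataclass
class BenchmarkReport:
    """Timings for both paths plus a count of differing results. Hypothetical type."""
    original: PathTimings = field(default_factory=PathTimings)
    refactored: PathTimings = field(default_factory=PathTimings)
    mismatches: int = 0


def call_both_benchmarked(
    original: Callable[..., Any],
    refactored: Callable[..., Any],
    inputs: Iterable[Tuple[Any, ...]],
) -> BenchmarkReport:
    """Run both paths on each input, time each call, and compare results."""
    report = BenchmarkReport()
    for args in inputs:
        # Time the original path for this input.
        start = time.perf_counter()
        old_result = original(*args)
        report.original.durations.append(time.perf_counter() - start)

        # Time the refactored path for the same input.
        start = time.perf_counter()
        new_result = refactored(*args)
        report.refactored.durations.append(time.perf_counter() - start)

        # Keep the existing "do they behave the same?" check.
        if old_result != new_result:
            report.mismatches += 1
    return report


if __name__ == "__main__":
    report = call_both_benchmarked(
        lambda x: x * 2,   # stand-in for the original path
        lambda x: x + x,   # stand-in for the refactored path
        [(n,) for n in range(1000)],
    )
    print(f"original:   total={report.original.total:.6f}s mean={report.original.mean:.9f}s")
    print(f"refactored: total={report.refactored.total:.6f}s mean={report.refactored.mean:.9f}s")
    print(f"mismatches: {report.mismatches}")
```

This always runs the original path first, so cache warm-up could skew the numbers toward the refactored path; a real version would probably want to alternate or randomize the order, and to decide what to record when one path raises.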