encode: prevent appending .0 to scientific notation to le bucket labels#261
encode: prevent appending .0 to scientific notation to le bucket labels#261Pigueiras wants to merge 1 commit intofluent:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes Prometheus exposition encoding for histogram bucket (le) label values by preventing the encoder from producing malformed scientific-notation strings (e.g., 1e+06.0) that Prometheus rejects.
Changes:
- Update
bucket_value_to_string()to append.0only when the formatted value is neither decimal nor scientific notation. - Add clarifying comment documenting the
.0-append rule.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * Append .0 only when there is no decimal point and the number | ||
| * is not in scientific notation. | ||
| */ | ||
| if (!strchr(str, '.') && !strchr(str, 'e')) { |
There was a problem hiding this comment.
The check for scientific notation only looks for lowercase 'e'. While %g typically uses 'e', this becomes fragile if the formatting ever changes (e.g., %G) or if a platform emits an uppercase exponent. Consider using a character-class check (e.g., treating either 'e' or 'E' as scientific notation) so .0 is never appended to exponent-form values.
| if (!strchr(str, '.') && !strchr(str, 'e')) { | |
| if (!strchr(str, '.') && !strchr(str, 'e') && !strchr(str, 'E')) { |
| len = snprintf(str, 64, "%g", val); | ||
| cfl_sds_len_set(str, len); | ||
|
|
||
| if (!strchr(str, '.')) { | ||
| /* | ||
| * Append .0 only when there is no decimal point and the number | ||
| * is not in scientific notation. | ||
| */ | ||
| if (!strchr(str, '.') && !strchr(str, 'e')) { | ||
| cfl_sds_cat_safe(&str, ".0", 2); | ||
| } |
There was a problem hiding this comment.
bucket_value_to_string() will still append .0 to non-finite printf outputs like inf/nan (which contain neither . nor e), producing values like inf.0/nan.0 that are not parseable as floats in Prometheus label values. Since histogram buckets can be user-provided without an isfinite validation, consider explicitly handling non-finite val (e.g., reject, or map infinities to +Inf/-Inf, and avoid appending .0 for nan/inf).
| /* | ||
| * Append .0 only when there is no decimal point and the number | ||
| * is not in scientific notation. | ||
| */ | ||
| if (!strchr(str, '.') && !strchr(str, 'e')) { | ||
| cfl_sds_cat_safe(&str, ".0", 2); |
There was a problem hiding this comment.
This change fixes a Prometheus parsing failure for large bucket boundaries, but there doesn’t appear to be a regression test covering scientific-notation le (e.g., a histogram bucket at >= 1e6 that should encode as 1e+06 and remain parseable). Adding a test that encodes and then decodes Prometheus text (or validates the encoded string) would prevent this from reoccurring.
cosmo0920
left a comment
There was a problem hiding this comment.
How about adding unit test to confirm this fix?
It's not mandatory for every change but we encourage to write it.
`bucket_value_to_string` used %g to format histogram bucket boundaries, which switches to scientific notation for values >= 1e6. The subsequent check for a missing decimal point would then append ".0", producing malformed `le` labels like "1e+06.0" that Prometheus rejects with: ``` bucket label "le" is missing or has a malformed value of "1e+08.0" ``` Skip the ".0" suffix when the string already contains 'e', since scientific notation is valid in Prometheus `le` labels without it. Signed-off-by: Luis Pigueiras <luis.pigueiras@cern.ch>
I've added one but tbh I'm not sure if it's the correct place for it, if you think this should go somewhere else I would appreciate some guidance on it 😄 |
|
The fix looks good and the new test seems fine where it is. One remaining edge case: bucket_value_to_string() still appends .0 to %g outputs like inf/nan, and histogram buckets do not seem to reject non-finite bounds, so this can still generate invalid le labels. |
bucket_value_to_stringused %g to format histogram bucket boundaries, which switches to scientific notation for values >= 1e6. The subsequent check for a missing decimal point would then append ".0", producing malformedlelabels like "1e+06.0" that Prometheus rejects with:Skip the ".0" suffix when the string already contains 'e', since scientific notation is valid in Prometheus
lelabels without it.