Hi,
For my applications, I need sample precision when I extract subsets of the data at a given time using the get_waveforms method of an ASDFDataSet instance. However, I observed discrepancies of up to 1 sample between the start time I requested and the start time that was returned.
I tracked the issue down to _get_idx_and_size_estimate of asdf_data_set.py. The line:
offset = max(0, int((starttime - data_starttime) // dt ))
produces an undesirable output when (starttime - data_starttime) / dt = X + 0.999999 because of floating-point imprecision. Instead of returning offset = X + 1 samples, it returns offset = X samples. This happens because the floor division (//) always rounds down to the nearest lower integer, and int then only converts that already-floored value.
A solution that works for me is to add a number a < 1 to (starttime - data_starttime) / dt before converting it to an integer. If this number is a = 0.5, the following line:
offset = max(0, int((starttime - data_starttime) / dt + 0.5))
actually rounds the time to the nearest (lower or upper) integer number. See this commit: ebeauce@4437051
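To make the difference concrete, here is a minimal sketch that compares the two expressions. It uses plain floats in seconds as stand-ins for the times, and the function names offset_floor and offset_nearest are only for illustration (they are not part of pyasdf).

```python
def offset_floor(starttime, data_starttime, dt):
    """Offset in samples using floor division, as in the current line."""
    return max(0, int((starttime - data_starttime) // dt))


def offset_nearest(starttime, data_starttime, dt):
    """Offset in samples rounded to the nearest sample, as proposed."""
    return max(0, int((starttime - data_starttime) / dt + 0.5))


# 0.3 / 0.1 evaluates to 2.9999999999999996 in double precision,
# so the requested start time falls "just below" sample 3.
data_starttime = 0.0
starttime = 0.3
dt = 0.1

print(offset_floor(starttime, data_starttime, dt))    # 2
print(offset_nearest(starttime, data_starttime, dt))  # 3
```

In this example the requested start time corresponds to sample 3, but the floor-based expression returns sample 2.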
I guess that some people would want to be able to always round up or always round down to the neighboring sample, so it may be worth adding a keyword argument to get_waveforms, similar to what the slice method of ObsPy Stream offers. My opinion is that rounding to the nearest sample would be the preferable default behavior.
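As a rough sketch of what such an option could look like, the helper below selects the rounding behavior from a string. The name compute_offset and the rounding parameter are hypothetical and do not exist in pyasdf; times are again plain floats in seconds.

```python
import math


def compute_offset(starttime, data_starttime, dt, rounding="nearest"):
    """Hypothetical helper: offset in samples with a selectable rounding mode."""
    n_samples = (starttime - data_starttime) / dt
    if rounding == "nearest":
        # Same as int(x + 0.5) for the non-negative values kept by max().
        offset = math.floor(n_samples + 0.5)
    elif rounding == "floor":
        offset = math.floor(n_samples)
    elif rounding == "ceil":
        offset = math.ceil(n_samples)
    else:
        raise ValueError(f"Unknown rounding mode: {rounding!r}")
    return max(0, offset)
```

Note that the "floor" and "ceil" modes remain sensitive to floating-point error when the requested time falls almost exactly on a sample, which is one more reason to prefer "nearest" as the default.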
Thank you,
Eric