Consuming code unit sequences from a streaming source may result in attempts to decode a partial code unit sequence. At present, an exception is thrown when such underflow occurs. An alternative would be to store the partial code unit sequence in the iterator state and have the iterator compare equal to the end iterator. This would enable code like the following to work correctly even when a buffer does not end on a code unit sequence boundary.
using encoding = utf8_encoding;
auto state = encoding::initial_state();
std::string b; // Declared outside the loop so the loop condition can see it.
do {
    b = get_more_data();
    auto tv = make_text_view<encoding>(state, begin(b), end(b));
    auto tv_it = begin(tv);
    while (tv_it != end(tv))
        ...;
    state = tv_it; // Trailing state is in tv_it, preserve it
                   // to seed state for the next iteration.
} while (!b.empty());
A problem with this approach is that trailing code units (e.g., garbage at the end of the encoded text) could go unnoticed. Because of this, the behavior above probably shouldn't be the default, but it should be possible for code to opt in to it, perhaps via a policy class as suggested in #14.
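To make the idea concrete, here is a minimal standalone sketch of carrying a partial code unit sequence across buffers. It does not use the library's types: `utf8_stream_state`, `sequence_length`, `decode_chunk`, and `has_trailing_code_units` are hypothetical names invented for illustration, and the decoder is deliberately simplified (it does not reject overlong forms or surrogates).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical carry-over state: the code units of an incomplete sequence
// left at the end of the previous buffer.
struct utf8_stream_state {
    std::string pending;
};

// Number of code units implied by a lead code unit; 0 if it cannot be a lead.
inline std::size_t sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

// Decode every complete sequence in pending + chunk. On underflow, stash the
// partial sequence in the state instead of throwing.
inline void decode_chunk(utf8_stream_state& state, std::string_view chunk,
                         std::vector<char32_t>& out) {
    std::string data = state.pending;
    data.append(chunk);
    state.pending.clear();

    std::size_t i = 0;
    while (i < data.size()) {
        const auto lead = static_cast<unsigned char>(data[i]);
        const std::size_t len = sequence_length(lead);
        if (len == 0) throw std::runtime_error("invalid lead code unit");
        if (data.size() - i < len) {
            // Underflow: keep the partial sequence for the next call.
            state.pending.assign(data, i, std::string::npos);
            return;
        }
        // Payload bits of the lead code unit, then 6 bits per continuation.
        char32_t cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
        for (std::size_t k = 1; k < len; ++k) {
            const auto c = static_cast<unsigned char>(data[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid continuation code unit");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
}

// Opt-in callers check this at end of stream so that trailing garbage
// does not go unnoticed.
inline bool has_trailing_code_units(const utf8_stream_state& state) {
    return !state.pending.empty();
}
```

A caller would invoke `decode_chunk` once per buffer with the same state object and call `has_trailing_code_units` after the last buffer. That final check is what an opt-in policy would need to make explicit, since nothing else reports an incomplete sequence at the end of the input.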