HDDS-14183. Attempted to decrement available space to a negative value#9655
HDDS-14183. Attempted to decrement available space to a negative value#9655sarvekshayr wants to merge 7 commits intoapache:masterfrom
Conversation
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Show resolved
Hide resolved
| /** | ||
| * Commit space reserved for write to usedSpace when write operation succeeds. | ||
| */ | ||
| private void commitSpaceReservedForWrite(HddsVolume volume, boolean spaceReserved, long bytes) { |
There was a problem hiding this comment.
can commit happen when space is not reserved?
There was a problem hiding this comment.
Updated the logic to make the flow clearer by moving the spaceReserved check to the call site. This makes it explicit that the commit operation is invoked only when space has actually been reserved.
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
siddhantsangwan
left a comment
There was a problem hiding this comment.
It seems to me that the root cause is not bad exception handling, but actually not implementing idempotency correctly in the overwrite case.
In FilePerBlockStrategy#writeChunk, containerData.updateWriteStats(chunkLength, overwrite) is called at the very end. At that point any exception is already thrown. So it's correct to update write stats at that point.
And updateWriteStats always calls incrWriteBytes(bytesWritten), which always calls volume incrementUsedSpace(bytes):
public void updateWriteStats(long bytesWritten, boolean overwrite) {
getStatistics().updateWrite(bytesWritten, overwrite);
incrWriteBytes(bytesWritten); <--------------- this, always called
}
private void incrWriteBytes(long bytes) {
this.getVolume().incrementUsedSpace(bytes); <-------------- this, always called
long bytesUsedBeforeWrite = getBytesUsed() - bytes;
long availableSpaceBeforeWrite = getMaxSize() - bytesUsedBeforeWrite;
if (committedSpace && availableSpaceBeforeWrite > 0) {
long decrement = Math.min(bytes, availableSpaceBeforeWrite);
this.getVolume().incCommittedBytes(-decrement);
}
}
So used space is increased and available space is decreased even on overwrite. For overwrites that don't actually grow the file on disk, this is wrong and it's probably what leads to available space being decreased more than what's possible.
For example:
chunkLength = 1,048,576 (1MB),
block file already length 1,048,576,
retry writes same chunk at offset = 0 (overwrite = true),
so after write, file length still 1,048,576 (no growth). But used space is increased and available space is decreased incorrectly.
So this is what needs to be fixed. I also suggest trying to keep it simple as we are already using the term reserved space in other parts of the code, and that means something else. This area is pretty complex at this point!
| if (overwrite) { | ||
| long fileLengthAfterWrite = offset + chunkLength; | ||
| if (fileLengthAfterWrite > fileLengthBeforeWrite) { | ||
| containerData.getStatistics().updateWrite(fileLengthAfterWrite - fileLengthBeforeWrite, false); | ||
| containerData.updateWriteStats(fileLengthAfterWrite - fileLengthBeforeWrite, false); | ||
| } | ||
| } | ||
|
|
||
| containerData.updateWriteStats(chunkLength, overwrite); |
There was a problem hiding this comment.
This still seems incorrect to me, as in the overwrite case it would end up calling updateWriteStats twice. Can you check?
There was a problem hiding this comment.
updateWriteStats handles both updateWrite and incrWriteBytes.
I replaced updateWrite with updateWriteStats to ensure that during an overwrite, any change in file length is captured as a delta update in updateWrite and incrWriteBytes as well.
There was a problem hiding this comment.
Yes, but in the overwrite case it's getting called once at line 191 and again at line 195.
What changes were proposed in this pull request?
Saw this warning when datanode disk was nearly full:
Prior to this message, there were many failed writes. Perhaps it needs to increment the value when the write fails.
The fix adds rollback logic in
KeyValueHandler.handleWriteChunk()that tracks when a chunk write succeeds and increments theusedSpacecounter. If any subsequent operation fails, the exception handler callsvolume.decrementUsedSpace()to restore the counter.What is the link to the Apache JIRA
HDDS-14183
How was this patch tested?
CI: https://github.com/sarvekshayr/ozone/actions/runs/21200210393