HDDS-14183. Attempted to decrement available space to a negative value by sarvekshayr · Pull Request #9655 · apache/ozone

sarvekshayr · 2026-01-21T08:28:49Z

What changes were proposed in this pull request?

Saw this warning when datanode disk was nearly full:

2025-12-15 10:37:20,903 WARN [166c5ca8-343e-46ed-b619-84ac193e0069-ChunkReader-215]-org.apache.hadoop.hdds.fs.CachingSpaceUsageSource: Attempted to decrement available space to a negative value. Current: 0, Decrement: 1048576, Source: /ipdr_ozone31/hadoop-ozone/datanode/data

Prior to this message, there were many failed writes. Perhaps it needs to increment the value when the write fails.

The fix adds rollback logic in KeyValueHandler.handleWriteChunk() that tracks when a chunk write succeeds and increments the usedSpace counter. If any subsequent operation fails, the exception handler calls volume.decrementUsedSpace() to restore the counter.
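
A minimal sketch of the rollback idea described above, assuming a simplified stand-in for the volume counter. `VolumeUsageSketch`, `WriteRollbackSketch`, and `writeChunkWithRollback` are hypothetical names used only for illustration; only `incrementUsedSpace` and `decrementUsedSpace` mirror method names mentioned in this PR, and the actual KeyValueHandler change may be structured differently.

```java
// Hypothetical stand-in for a volume's used-space counter; not the real HddsVolume.
class VolumeUsageSketch {
  private long usedSpace;

  void incrementUsedSpace(long bytes) {
    usedSpace += bytes;
  }

  void decrementUsedSpace(long bytes) {
    // Clamp at zero so the sketch never reports negative usage.
    usedSpace = Math.max(0, usedSpace - bytes);
  }

  long getUsedSpace() {
    return usedSpace;
  }
}

class WriteRollbackSketch {
  /**
   * Counts the chunk as used space, runs the follow-up step, and undoes the
   * increment if that step throws. Returns true when the follow-up succeeded.
   */
  static boolean writeChunkWithRollback(VolumeUsageSketch volume, long chunkLength, Runnable followUp) {
    boolean usedSpaceIncremented = false;
    try {
      volume.incrementUsedSpace(chunkLength);
      usedSpaceIncremented = true;
      followUp.run(); // any later operation that may fail
      return true;
    } catch (RuntimeException e) {
      if (usedSpaceIncremented) {
        volume.decrementUsedSpace(chunkLength); // restore the counter
      }
      return false;
    }
  }
}
```

In this sketch, a failed follow-up leaves the counter where it was before the write was counted, which is the property the description is aiming for.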

What is the link to the Apache JIRA

HDDS-14183

How was this patch tested?

CI: https://github.com/sarvekshayr/ozone/actions/runs/21200210393

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

yandrey321 · 2026-01-23T15:30:59Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

+  /**
+   * Commit space reserved for write to usedSpace when write operation succeeds.
+   */
+  private void commitSpaceReservedForWrite(HddsVolume volume, boolean spaceReserved, long bytes) {


Can commit happen when space is not reserved?

Updated the logic to make the flow clearer by moving the spaceReserved check to the call site. This makes it explicit that the commit operation is invoked only when space has actually been reserved.
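
A rough sketch of what moving the check to the call site could look like, assuming the commit simply adds the reserved bytes to used space as the Javadoc in the diff suggests. `CommitAtCallSiteSketch` and `onWriteSuccess` are hypothetical names, and the real method in the PR also takes an `HddsVolume`, which is omitted here.

```java
class CommitAtCallSiteSketch {
  long usedSpace; // stand-in for the volume's used-space counter

  // Assumed behaviour: unconditionally move the reserved bytes into used space.
  void commitSpaceReservedForWrite(long bytes) {
    usedSpace += bytes;
  }

  // The caller decides whether a commit is needed, so the commit method
  // itself no longer needs a spaceReserved flag.
  void onWriteSuccess(boolean spaceReserved, long bytes) {
    if (spaceReserved) {
      commitSpaceReservedForWrite(bytes);
    }
  }
}
```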

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

github-actions · 2026-02-19T00:09:11Z

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

siddhantsangwan

It seems to me that the root cause is not bad exception handling, but actually not implementing idempotency correctly in the overwrite case.

In FilePerBlockStrategy#writeChunk, containerData.updateWriteStats(chunkLength, overwrite) is called at the very end. By that point, any exception would already have been thrown, so updating write stats there is correct.

And updateWriteStats always calls incrWriteBytes(bytesWritten), which in turn always calls the volume's incrementUsedSpace(bytes):

  public void updateWriteStats(long bytesWritten, boolean overwrite) {
    getStatistics().updateWrite(bytesWritten, overwrite);
    incrWriteBytes(bytesWritten); // always called
  }

  private void incrWriteBytes(long bytes) {
    this.getVolume().incrementUsedSpace(bytes); // always called
    long bytesUsedBeforeWrite = getBytesUsed() - bytes;
    long availableSpaceBeforeWrite = getMaxSize() - bytesUsedBeforeWrite;
    if (committedSpace && availableSpaceBeforeWrite > 0) {
      long decrement = Math.min(bytes, availableSpaceBeforeWrite);
      this.getVolume().incCommittedBytes(-decrement);
    }
  }

So used space is increased and available space is decreased even on overwrite. For overwrites that don't actually grow the file on disk, this is wrong and it's probably what leads to available space being decreased more than what's possible.

For example:

chunkLength = 1,048,576 (1 MiB),
the block file already has length 1,048,576,
a retry writes the same chunk at offset = 0 (overwrite = true),
so after the write, the file length is still 1,048,576 (no growth). But used space is increased and available space is decreased incorrectly.

So this is what needs to be fixed. I also suggest trying to keep it simple as we are already using the term reserved space in other parts of the code, and that means something else. This area is pretty complex at this point!
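
The example above can be expressed as a small calculation. This is a sketch under the assumption that on-disk growth is the amount by which a write extends past the previous end of the block file; `OverwriteGrowthSketch` and `bytesGrown` are hypothetical names, not Ozone APIs.

```java
class OverwriteGrowthSketch {
  /**
   * Bytes by which a write covering [offset, offset + chunkLength) extends
   * a file whose length was fileLengthBeforeWrite. Never negative.
   */
  static long bytesGrown(long fileLengthBeforeWrite, long offset, long chunkLength) {
    long fileLengthAfterWrite = Math.max(fileLengthBeforeWrite, offset + chunkLength);
    return fileLengthAfterWrite - fileLengthBeforeWrite;
  }
}
```

For the retry case in the example, `bytesGrown(1_048_576L, 0L, 1_048_576L)` is 0, so under this assumption nothing should be added to used space; a first write into an empty file would count the full chunk.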

siddhantsangwan · 2026-02-27T07:37:15Z

...vice/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/FilePerBlockStrategy.java

    if (overwrite) {
      long fileLengthAfterWrite = offset + chunkLength;
      if (fileLengthAfterWrite > fileLengthBeforeWrite) {
-        containerData.getStatistics().updateWrite(fileLengthAfterWrite - fileLengthBeforeWrite, false);
+        containerData.updateWriteStats(fileLengthAfterWrite - fileLengthBeforeWrite, false);
      }
    }

    containerData.updateWriteStats(chunkLength, overwrite);


This still seems incorrect to me, as in the overwrite case it would end up calling updateWriteStats twice. Can you check?

updateWriteStats handles both updateWrite and incrWriteBytes.

I replaced updateWrite with updateWriteStats to ensure that during an overwrite, any change in file length is captured as a delta update in updateWrite and incrWriteBytes as well.

Yes, but in the overwrite case it's getting called once at line 191 and again at line 195.
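
To make the concern concrete, here is a sketch of the used-space arithmetic, assuming (as described earlier in this review) that every `updateWriteStats` call adds its byte count to the volume's used space. `DoubleCountSketch` and `writeChunkAsInDiff` are hypothetical; only the call order mirrors the diff above.

```java
class DoubleCountSketch {
  long usedSpace; // stand-in for the volume's used-space counter

  // Assumed behaviour: every call adds bytesWritten to used space,
  // regardless of the overwrite flag.
  void updateWriteStats(long bytesWritten, boolean overwrite) {
    usedSpace += bytesWritten;
  }

  // Mirrors the call order in the diff: a delta call inside the overwrite
  // branch, followed by an unconditional call with the full chunk length.
  void writeChunkAsInDiff(long fileLengthBeforeWrite, long offset, long chunkLength, boolean overwrite) {
    if (overwrite) {
      long fileLengthAfterWrite = offset + chunkLength;
      if (fileLengthAfterWrite > fileLengthBeforeWrite) {
        updateWriteStats(fileLengthAfterWrite - fileLengthBeforeWrite, false);
      }
    }
    updateWriteStats(chunkLength, overwrite);
  }
}
```

With a 1,048,576-byte file, an overwrite of 1,048,576 bytes at offset 524,288 grows the file by 524,288 bytes, yet the counter in this sketch rises by 1,572,864 because both calls contribute.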

HDDS-14183. Attempted to decrement available space to a negative value

6310ef3

sarvekshayr requested a review from jojochuang January 21, 2026 08:28

yandrey321 reviewed Jan 21, 2026

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

sreejasahithi reviewed Jan 22, 2026

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

Introduce space reserved and handle committedBytes

7e8bb92

sreejasahithi reviewed Jan 23, 2026

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

yandrey321 reviewed Jan 23, 2026

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java

sarvekshayr added 2 commits January 28, 2026 11:06

Add test coverage and incrementUsedSpace first

dbfefc2

Fixed checkstyle issues

4f3ab02

devabhishekpal assigned sarvekshayr Jan 28, 2026

devabhishekpal added the bug Something isn't working label Jan 28, 2026

github-actions bot added the stale label Feb 19, 2026

Merge branch 'master' into HDDS-14183

4552de4

sarvekshayr requested review from siddhantsangwan February 19, 2026 07:57

github-actions bot removed the stale label Feb 20, 2026

siddhantsangwan requested changes Feb 23, 2026

Check if file length increased during overwrite and increment usedSpace

72882b7

siddhantsangwan reviewed Feb 27, 2026

Update values correctly on overwrite

7c8c863

sarvekshayr wants to merge 7 commits into apache:master from sarvekshayr:HDDS-14183

sarvekshayr commented Jan 21, 2026

yandrey321 Jan 23, 2026

sarvekshayr Jan 28, 2026

github-actions bot commented Feb 19, 2026

siddhantsangwan left a comment

siddhantsangwan Feb 27, 2026

sarvekshayr Feb 27, 2026

siddhantsangwan Mar 2, 2026

5 participants