HDFS-17805. Separate flushNanos and syncNanos in flushOrSync slow log #8315

Open
balodesecurity wants to merge 1 commit into apache:trunk from balodesecurity:HDFS-17805

@balodesecurity

Problem

In BlockReceiver.flushOrSync(), flush and sync durations are accumulated into a single flushTotalNanos counter. When the total duration exceeds the slow-IO threshold, the WARN log only reports the combined value:

Slow flushOrSync took 120ms ..., flushTotalNanos=120000000ns

This makes it impossible to tell whether the latency originates from the flush step or the fsync step, hindering production diagnosis.

Fix

Track flush and sync durations in separate counters (flushTotalNanos, syncTotalNanos). The slow-IO WARN log now reports them independently:

Slow flushOrSync took 120ms ..., flushNanos=5000000ns, syncNanos=115000000ns

This lets operators immediately determine whether a bottleneck is in the page-cache flush or the disk fsync.
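
The separate-counter idea can be sketched in isolation as follows. This is a minimal illustration, not the patched BlockReceiver code: the class `FlushSyncTimingSketch`, its `Runnable` flush and sync steps, and the helper names are hypothetical, the message omits the fields elided as "..." above, and the reported total is assumed to be the sum of the two counters.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of separate flush/sync timing; not the actual BlockReceiver code.
class FlushSyncTimingSketch {
    private long flushTotalNanos; // time spent in flush steps only
    private long syncTotalNanos;  // time spent in sync (fsync) steps only

    /** Runs one step and returns its elapsed time in nanoseconds. */
    static long timeNanos(Runnable step) {
        long begin = System.nanoTime();
        step.run();
        return System.nanoTime() - begin;
    }

    /** Times the flush step, and the sync step when requested, into separate counters. */
    void flushOrSync(boolean isSync, Runnable flushStep, Runnable syncStep) {
        flushTotalNanos += timeNanos(flushStep);
        if (isSync) {
            syncTotalNanos += timeNanos(syncStep);
        }
    }

    long getFlushTotalNanos() { return flushTotalNanos; }

    long getSyncTotalNanos() { return syncTotalNanos; }

    /** Simplified message; assumes the total is the sum of both counters. */
    static String slowLogMessage(long flushNanos, long syncNanos) {
        long tookMs = TimeUnit.NANOSECONDS.toMillis(flushNanos + syncNanos);
        return "Slow flushOrSync took " + tookMs + "ms, flushNanos=" + flushNanos
            + "ns, syncNanos=" + syncNanos + "ns";
    }

    public static void main(String[] args) {
        FlushSyncTimingSketch sketch = new FlushSyncTimingSketch();
        // No-op steps stand in for the real stream flush and disk sync calls.
        sketch.flushOrSync(true, () -> { }, () -> { });
        System.out.println(
            slowLogMessage(sketch.getFlushTotalNanos(), sketch.getSyncTotalNanos()));
    }
}
```

Keeping the two counters apart costs one extra `long` and one extra `System.nanoTime()` pair per sync, which is why the split is cheap enough to leave on in production logging.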

Testing

  • Added TestBlockReceiverSlowLog#testFlushOrSyncSlowLogContainsSeparateFlushAndSyncNanos: starts a single-DN MiniDFSCluster with slow-IO threshold set to 0 ms (triggers the log on every call), writes a file and calls hsync(), captures the WARN log output, and asserts both flushNanos= and syncNanos= are present.
  • Test passes locally.
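
The capture-and-assert pattern used by the test can be illustrated without a cluster. The sketch below is a stand-in under stated assumptions: it uses `java.util.logging` instead of the Hadoop logging and MiniDFSCluster setup, the class `SlowLogCaptureSketch` and its methods are hypothetical, and the threshold comparison is written as `>=` so that a 0 ms threshold always fires (the real comparison may differ).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for the log-capturing test; uses java.util.logging, not Hadoop.
class SlowLogCaptureSketch {
    static final Logger LOG = Logger.getLogger("SlowLogCaptureSketch");

    /** Collects every published log message in memory. */
    static class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override public void flush() { }

        @Override public void close() { }
    }

    /** Emits the slow-IO warning when the total reaches the threshold (assumed >=). */
    static void maybeLogSlow(long thresholdMs, long flushNanos, long syncNanos) {
        long tookMs = (flushNanos + syncNanos) / 1_000_000L;
        if (tookMs >= thresholdMs) {
            LOG.warning("Slow flushOrSync took " + tookMs + "ms, flushNanos="
                + flushNanos + "ns, syncNanos=" + syncNanos + "ns");
        }
    }

    /** Attaches a capturing handler, triggers the log path, and returns what was logged. */
    static List<String> captureWarnings(long thresholdMs, long flushNanos, long syncNanos) {
        CapturingHandler handler = new CapturingHandler();
        LOG.addHandler(handler);
        try {
            maybeLogSlow(thresholdMs, flushNanos, syncNanos);
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.messages;
    }

    public static void main(String[] args) {
        // Threshold 0 ms forces the warning, mirroring the test setup described above.
        List<String> captured = captureWarnings(0L, 5_000_000L, 115_000_000L);
        boolean ok = captured.size() == 1
            && captured.get(0).contains("flushNanos=")
            && captured.get(0).contains("syncNanos=");
        System.out.println(ok ? "both fields present" : "fields missing");
    }
}
```

Asserting on the `flushNanos=` and `syncNanos=` tokens rather than on exact durations keeps such a test independent of how fast the underlying disk happens to be.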

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 19m 27s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 46m 47s | | trunk passed |
| +1 💚 | compile | 1m 42s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 1m 48s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 1m 48s | | trunk passed |
| +1 💚 | mvnsite | 1m 54s | | trunk passed |
| +1 💚 | javadoc | 1m 27s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 1m 29s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 4m 25s | | trunk passed |
| +1 💚 | shadedclient | 36m 41s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 25s | | the patch passed |
| +1 💚 | compile | 1m 18s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 1m 18s | | the patch passed |
| +1 💚 | compile | 1m 20s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 1m 19s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 1m 18s | | the patch passed |
| +1 💚 | mvnsite | 1m 28s | | the patch passed |
| +1 💚 | javadoc | 0m 58s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 1m 2s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 4m 4s | | the patch passed |
| +1 💚 | shadedclient | 36m 37s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 258m 54s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | 424m 15s | | |

| Reason | Tests |
|:-------|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8315/1/artifact/out/Dockerfile |
| GITHUB PR | #8315 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 88de817864bd 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / cc1ef3e |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8315/1/testReport/ |
| Max. process+thread count | 2419 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8315/1/console |
| versions | git=2.43.0 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.
