
Refactor existing logs and add logs#840

Open
hellolittlej wants to merge 1 commit into `master` from `refactor-logs`

Conversation

@hellolittlej (Collaborator) commented Mar 29, 2026

Context

```
Not all scheduling constraints had enough workers available to fulfill the request
ResourceClusterActor.TaskExecutorBatchAssignmentRequest(allocationRequests=[TaskExecutorAllocationRequest(workerId=kafka-cluster-monitor-21-worker-18-168,
constraints=SchedulingConstraints(machineDefinition=MachineDefinition{cpuCores=2.0, memoryMB=14336.0, networkMbps=700.0, diskMB=65536.0, numPorts=1}, sizeName=Optional.empty,
 schedulingAttributes={jdk=17plus, jenkins_job=unknown, repo_name=corp/kafka-mantis-kafka-monitor}), jobMetadata=io.mantisrx.server.core.domain.JobMetadata@3240fd30,
stageNum=1, readyAt=-1, durationType=Perpetual)], clusterID=ClusterID(resourceID=mantisrc.kaasall),
reservation=Reservation(key=MantisResourceClusterReservationProto.ReservationKey(jobId=kafka-cluster-monitor-21, stageNumber=1),
schedulingConstraints=SchedulingConstraints(machineDefinition=MachineDefinition{cpuCores=2.0, memoryMB=14336.0, networkMbps=700.0, diskMB=65536.0, numPorts=1},
sizeName=Optional.empty, schedulingAttributes={jdk=17plus, jenkins_job=unknown, repo_name=corp/kafka-mantis-kafka-monitor}),
canonicalConstraintKey=md:2.0/14336.0/65536.0/700.0/1;size=~;attr=jdk=17plus,jenkins_job=unknown,repo_name=corp/kafka-mantis-kafka-monitor,, stageTargetSize=35,
priority=MantisResourceClusterReservationProto.ReservationPriority(type=REPLACE, tier=0, timestamp=1774746017312), createdAt=1774746017312))
```

Currently this log is far too long to read. It essentially just says that we don't have enough workers to fulfill a request for a single worker, yet it repeats details such as `machineDefinition=MachineDefinition{cpuCores=2.0, memoryMB=14336.0, networkMbps=700.0, diskMB=65536.0, numPorts=1}` that don't need to be part of the message.

Besides, we have no logs explaining why we can't find a TE for the worker even though the scheduler sees 2 idle TEs. This PR adds logs showing why a TE was not selected for the worker.
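The shortened log described above could look roughly like the following. This is an illustrative sketch, not the actual Mantis code: the field names (`jobId`, `clusterId`, `canonicalConstraintKey`, `stageTargetSize`) are taken from the sample log in this PR, but `summarize` itself is a hypothetical helper.

```java
// Hypothetical sketch: emit only the fields an on-call engineer needs to
// triage "not enough workers", instead of the full request object.
public class CompactSchedulingLog {
    static String summarize(String jobId, int stageNum, String clusterId,
                            String canonicalConstraintKey, int stageTargetSize) {
        return String.format(
            "Not enough workers for jobId=%s, stage=%d, cluster=%s, constraintKey=%s, targetSize=%d",
            jobId, stageNum, clusterId, canonicalConstraintKey, stageTargetSize);
    }

    public static void main(String[] args) {
        System.out.println(summarize(
            "kafka-cluster-monitor-21", 1, "mantisrc.kaasall",
            "md:2.0/14336.0/65536.0/700.0/1", 35));
    }
}
```

The canonical constraint key already encodes the machine definition compactly, so repeating the full `MachineDefinition{...}` block adds no information.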

Checklist

  • ./gradlew build compiles code correctly
  • Added new tests where applicable
  • ./gradlew test passes all tests
  • Extended README or added javadocs where applicable

github-actions bot commented Mar 29, 2026

Test Results

777 tests ±0   765 ✅ −1   11m 17s ⏱️ +44s
162 suites ±0   11 💤 ±0
162 files ±0   1 ❌ +1

For more details on these failures, see this check.

Results for commit 54080a3. ± Comparison against base commit 047cb84.

♻️ This comment has been updated with latest results.

Comment on lines +404 to +409

```java
// TODO: turn these to debug level
log.info("findBestFitFor: TE {} excluded - not in stateMap", teHolder.getId());
return false;
}
if (currentBestFit.contains(teHolder.getId())) {
log.info("findBestFitFor: TE {} excluded - already in bestFit", teHolder.getId());
```
Collaborator:
These logs will be too chatty on the main agent pool, firing on every schedule request. Maybe convert them to metrics and use the TE id as a tag instead.
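The reviewer's suggestion could be sketched as below. This is a generic illustration, not the actual Mantis metrics API: a `ConcurrentHashMap` stands in for a real metrics registry (e.g. Spectator), and `ExclusionMetrics`/`recordExclusion` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: count exclusion reasons per task executor instead of logging each
// one on every schedule request. A real implementation would increment a
// registry counter tagged with the TE id.
public class ExclusionMetrics {
    private final Map<String, Long> counters = new ConcurrentHashMap<>();

    // Increment a counter named by reason, tagged with the TE id.
    public void recordExclusion(String teId, String reason) {
        counters.merge(reason + ";teId=" + teId, 1L, Long::sum);
    }

    public long count(String teId, String reason) {
        return counters.getOrDefault(reason + ";teId=" + teId, 0L);
    }
}
```

Metrics keep per-TE visibility without the log volume, at the cost of losing the per-request ordering that a log line would give during a live diagnosis.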

Collaborator (Author):
Turned these into warn level for now.

```java
}
return true;
})
.collect(Collectors.toList());
```
Collaborator:
By the way, there is a double collect in this function, which is a performance penalty.

Collaborator (Author):
We can revert it after we finish diagnosing.
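The "double collect" concern above can be illustrated generically (this is not the actual Mantis code; `doubleCollect`/`singleCollect` and the string filtering are made up for the example): collecting an intermediate list and re-streaming it materializes the data twice, whereas chaining the stages keeps one pipeline with a single terminal collect.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SinglePass {
    // Double collect: builds an intermediate list that is immediately re-streamed.
    static List<String> doubleCollect(List<String> ids) {
        List<String> nonEmpty = ids.stream()
            .filter(id -> !id.isEmpty())
            .collect(Collectors.toList());
        return nonEmpty.stream()
            .map(String::trim)
            .collect(Collectors.toList());
    }

    // Single pass: same result, one terminal collect and no intermediate list.
    static List<String> singleCollect(List<String> ids) {
        return ids.stream()
            .filter(id -> !id.isEmpty())
            .map(String::trim)
            .collect(Collectors.toList());
    }
}
```

Both return the same result; the single-pass form simply avoids allocating and iterating the intermediate list on every schedule request.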


```java
if (noResourcesAvailable) {
log.warn("Not all scheduling constraints had enough workers available to fulfill the request {}", request);
log.warn("Not all scheduling constraints had enough workers for jobId={}, cluster={}",
```
Collaborator:
I think you still need the workerId and schedulingConstraint info.

Collaborator (Author):
Worker id and constraint info are already logged in findTaskExecutorsFor before reaching this log line.

I can include the constraints again in this log line. The worker id can't be output here because it's in the nested array fields.

