[LFX Term : 01 ]Restoration: CItyscape-Sythia Curb detection by NishantSinghhhhh · Pull Request #441 · kubeedge/ianvs

NishantSinghhhhh · 2026-05-21T08:07:34Z

What type of PR is this?

/kind bug
/kind cleanup

What this PR does / why we need it:

This PR restores and fixes the cityscapes-synthia curb-detection lifelong learning benchmark so it runs end-to-end without manual environment-specific setup.

Key changes:

- Replace all absolute paths (/home/nishant/...) in benchmarkingjob.yaml and testenv/testenv.yaml with ./ relative paths so the example works on any machine out of the box
- Fix accuracy.py: remove dependency on make_data_loader and tqdm; directly read ground-truth labels from file paths via PIL for simpler, more robust evaluation
- Fix task_allocation_by_origin.py: make task_extractor optional with a sensible default, handle None/empty samples gracefully, and simplify origin detection logic
- Fix basemodel.py: auto-insert RFNet directory into sys.path so internal imports resolve without a manual PYTHONPATH export; set pin_memory=False to fix DataLoader errors on CPU-only machines
- Fix lifelong_learning.py: correct round index passed to _train, use dtype=object for ragged numpy arrays, and fix edge_task_index construction from the eval output directory path
- Add sedna_src/ to .gitignore

Which issue(s) this PR fixes:

Fixes #230

kubeedge-bot · 2026-05-21T08:07:47Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: NishantSinghhhhh
To complete the pull request process, please assign jaypume after the PR has been reviewed.
You can assign the PR to them by writing /assign @jaypume in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist

Code Review

This pull request updates the lifelong learning paradigm and the curb-detection example, including path updates in YAML configurations, refactoring dataset loading and evaluation logic, and improving the robustness of task allocation and metric calculations. Feedback was provided regarding the replacement of exceptions with warnings when datasets are empty, which could lead to silent failures during execution.

NishantSinghhhhh · 2026-05-21T08:27:25Z

Screencast.from.2026-05-21.13-19-18.webm

Complete running of the example

NishantSinghhhhh · 2026-05-21T08:28:39Z

PR — Cityscapes-Synthia Curb Detection Lifelong Learning Benchmark

Got the cityscapes-synthia/lifelong_learning_bench/curb-detection example running end-to-end. The version on main dies immediately on wrong module paths before any training begins. Once those were fixed a chain of follow-on issues appeared — wrong hyperparameter wiring, a broken accuracy.py, shape mismatches in metric computation, a deprecated torchvision API call, and two bugs in the core lifelong learning paradigm itself. All addressed below.

Summary

| File | What changed | Why |
| --- | --- | --- |
| `testalgorithms/rfnet/rfnet_algorithm.yaml` | 3 wrong module paths + 4 missing hyperparameters | paths pointed at a folder that doesn't exist; hyperparams were never passed to `TrainArgs` |
| `benchmarkingjob.yaml` | `workspace` absolute → relative path | hardcoded `/home/nishant/...` breaks the example on every other machine |
| `testenv/testenv.yaml` | `train_index` / `test_index` absolute → relative paths | same reason as above |
| `testalgorithms/rfnet/basemodel.py` | auto sys.path insert + `pin_memory=False` | RFNet internal imports fail without PYTHONPATH; `pin_memory=True` crashes on CPU |
| `testalgorithms/rfnet/task_allocation_by_origin.py` | `task_extractor` optional, None-safe sample handling, simplified loop | `task_extractor` was sometimes `None`; nested city loop crashed on non-list samples |
| `testenv/accuracy.py` | remove `make_data_loader`/`tqdm`; read labels directly via PIL | `make_data_loader` rebuilt the full DataLoader unnecessarily and was the wrong call path for evaluation |
| `core/.../lifelong_learning.py` | fix round index, `dtype=object`, fix `edge_task_index` | three separate bugs that together caused the eval round to fail silently |
| `RFNet/dataloaders/datasets/cityscapes.py` | guard `data.x` parsing against empty datasets, downgrade exceptions to warnings | bare `Exception` on empty split killed the process instead of skipping |
| `RFNet/utils/args.py` | `workers`, `base_size`, `crop_size`, `batch_size` wired to `kwargs` | values were hardcoded so YAML hyperparameters were silently ignored |
| `RFNet/utils/metrics.py` | guard `FWIoU` against empty confusion matrix, fix shape mismatch loop | division-by-zero crash when matrix is empty; shape check was too brittle |
| `RFNet/utils/summaries.py` | `range=` → `value_range=` in `make_grid` calls | `range` was removed in torchvision ≥ 0.13; causes `TypeError` when `make_grid` is called |
| `.gitignore` | add `sedna_src/` | local sedna source checkout was showing as untracked in every `git status` |

Per-file walkthrough

testalgorithms/rfnet/rfnet_algorithm.yaml — wrong paths + missing hyperparameters

**Wrong module URLs**

The directory ./examples/curb-detection/ does not exist anywhere in the repo. Ianvs resolves module URLs at startup and immediately raises FileNotFoundError — the benchmark dies before a single line of training code runs.

- url: "./examples/curb-detection/lifelong_learning_bench/testalgorithms/rfnet/basemodel.py"
+ url: "./examples/cityscapes-synthia/lifelong_learning_bench/curb-detection/testalgorithms/rfnet/basemodel.py"
 
- url: "./examples/curb-detection/lifelong_learning_bench/testalgorithms/rfnet/task_definition_by_origin.py"
+ url: "./examples/cityscapes-synthia/lifelong_learning_bench/curb-detection/testalgorithms/rfnet/task_definition_by_origin.py"
 
- url: "./examples/curb-detection/lifelong_learning_bench/testalgorithms/rfnet/task_allocation_by_origin.py"
+ url: "./examples/cityscapes-synthia/lifelong_learning_bench/curb-detection/testalgorithms/rfnet/task_allocation_by_origin.py"

Missing hyperparameters

TrainArgs in args.py picks up base_size, crop_size, batch_size, and workers from **kwargs. If they aren't declared here, ianvs never passes them. Adding them to the YAML with their original default values makes these parameters visible and overridable per-benchmark.

  hyperparameters:
    - learning_rate:
        values: [0.0001]
    - epochs:
        values: [1]
+   - base_size:
+       values: [1024]
+   - crop_size:
+       values: [768]
+   - batch_size:
+       values: [4]
+   - workers:
+       values: [4]

benchmarkingjob.yaml — hardcoded absolute path

The workspace path was hardcoded to a specific user's home directory. Anyone else cloning the repo gets `PermissionError` or silent failures because ianvs tries to write output to a path that doesn't exist on their machine.

- workspace: "/home/nishant/LOCAL_DISK_D/1/ianvs/workspace/curb-detection"
+ workspace: "./workspace/curb-detection"

testenv/testenv.yaml — hardcoded absolute paths

Same problem. `core/testenvmanager/dataset/dataset.py` checks `os.path.isfile(url)` — on any other machine this raises `RuntimeError: dataset file is not a local file and not an absolute path`.

- train_index: "/home/nishant/LOCAL_DISK_D/1/ianvs/dataset/curb-detection/train_data/index.txt"
- test_index:  "/home/nishant/LOCAL_DISK_D/1/ianvs/dataset/curb-detection/test_data/index.txt"
+ train_index: "./dataset/curb-detection/train_data/index.txt"
+ test_index:  "./dataset/curb-detection/test_data/index.txt"

testalgorithms/rfnet/basemodel.py — sys.path + pin_memory

**Auto sys.path insertion**

basemodel.py imports from RFNet.dataloaders, RFNet.utils, etc. Without the RFNet/ subdirectory on sys.path, every user has to manually export PYTHONPATH=... before invoking ianvs. Inserting the path at import time makes the example self-contained.

+ import sys
+
+ _rfnet_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "RFNet")
+ if _rfnet_dir not in sys.path:
+     sys.path.insert(0, _rfnet_dir)
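
A minimal sketch of the same idea as a reusable helper. The name `ensure_on_sys_path` is hypothetical and not part of ianvs or this PR; it only illustrates why the `not in sys.path` check matters, since a module can be imported more than once in a benchmark run and the path should not be added repeatedly.

```python
import os
import sys


def ensure_on_sys_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed.

    Hypothetical helper for illustration only.
    Returns True if the path was inserted, False if it was already present.
    """
    directory = os.path.abspath(directory)
    if directory in sys.path:
        return False
    sys.path.insert(0, directory)
    return True
```

Calling it twice with the same folder inserts the entry once; the second call is a no-op.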

pin_memory=False

pin_memory=True is a GPU-only optimisation. On CPU-only machines PyTorch raises RuntimeError: cannot pin 'torch.FloatTensor' only dense CPU tensors can be pinned. Setting it to False is safe in all environments.

- self.validator.test_loader = DataLoader(data, ..., pin_memory=True)
+ self.validator.test_loader = DataLoader(data, ..., pin_memory=False)

testalgorithms/rfnet/task_allocation_by_origin.py — three fixes

**`task_extractor` made optional**

The lifelong learning paradigm does not always pass task_extractor — in some rounds it's None. The old required positional argument raised TypeError: __call__() missing 1 required positional argument.

  def __init__(self, **kwargs):
      self.default_origin = kwargs.get("default", None)
+     self.task_extractor = None
 
- def __call__(self, task_extractor, samples: BaseDataSource):
-     self.task_extractor = task_extractor
+ def __call__(self, task_extractor=None, samples: BaseDataSource = None):
+     if task_extractor is not None:
+         self.task_extractor = task_extractor
+     if self.task_extractor is None:
+         self.task_extractor = {"real": 0, "sim": 1}

None-safe sample path extraction

During unseen-task rounds, samples can arrive as raw path strings rather than [path, depth_path] tuples. _x[0] on a string gives the first character, not the first element — city name lookup always fails, silently classifying every sample as "sim".

- for city in cities:
-     if city in _x[0]:
-         is_real = True
-         sample_origins.append("real")
-         break
- if not is_real:
-     sample_origins.append("sim")
+ if _x is None or (hasattr(_x, '__len__') and len(_x) == 0):
+     sample_origins.append("real")
+     continue
+ sample_path = _x[0] if isinstance(_x, (list, tuple)) else str(_x)
+ is_real = any(city in sample_path for city in cities)
+ sample_origins.append("real" if is_real else "sim")

Safe .get() with default

dict.get() without a default returns None when the key is missing. int(None) raises TypeError.

- allocations = [int(self.task_extractor.get(sample_origin)) for sample_origin in sample_origins]
+ allocations = [int(self.task_extractor.get(origin, 0)) for origin in sample_origins]
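
The origin detection and allocation lookup described above can be sketched as two small standalone functions. The function names and the `cities` tuple are assumptions made for this example (the real city list lives in the allocator and is not shown in this thread); the fallback values mirror the diff.

```python
def detect_origin(sample, cities):
    """Classify one sample as "real" or "sim" from its image path."""
    # Missing or empty samples fall back to "real", mirroring the diff above.
    if sample is None or (hasattr(sample, "__len__") and len(sample) == 0):
        return "real"
    # A [path, depth_path] pair: use the first element.
    # A bare string: use the whole string, not its first character.
    path = sample[0] if isinstance(sample, (list, tuple)) else str(sample)
    return "real" if any(city in path for city in cities) else "sim"


def allocate(samples, task_extractor=None, cities=("aachen", "berlin")):
    """Map each sample to a task index; unknown origins default to 0."""
    if task_extractor is None:
        task_extractor = {"real": 0, "sim": 1}
    origins = [detect_origin(sample, cities) for sample in samples]
    return [int(task_extractor.get(origin, 0)) for origin in origins]
```

With this shape, a pair such as `["leftImg8bit/aachen/a.png", "disparity/aachen/a.png"]` and a bare string such as `"leftImg8bit/berlin/b.png"` are both classified from the full path.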

testenv/accuracy.py — replace make_data_loader with PIL reads

The old code rebuilt a full PyTorch DataLoader inside the metric function. This was wrong in two ways:

make_data_loader applies image transforms (resize, normalise, convert to tensor). Ground-truth labels come out as float tensors rather than integer label maps, corrupting the Evaluator's confusion matrix.
y_true at eval time is a list of file paths, not a BaseDataSource. Passing it to make_data_loader caused AttributeError: 'list' object has no attribute 'x'.

- from tqdm import tqdm
- from RFNet.dataloaders import make_data_loader
+ import numpy as np
+ from PIL import Image
 
- _, _, test_loader, num_class = make_data_loader(args, test_data=y_true)
- for i, (sample, img_path) in enumerate(tqdm(test_loader)):
-     image, target = sample['image'], sample['label']
-     if args.cuda:
-         image, target = image.cuda(), target.cuda()
-     target[target > evaluator.num_class-1] = 255
-     target = target.cpu().numpy()
-     evaluator.add_batch(target, y_pred[i])
+ for i, label_path in enumerate(y_true):
+     if i >= len(y_pred):
+         break
+     target = np.array(Image.open(label_path.rstrip()))
+     target[target > evaluator.num_class - 1] = 255
+     pred = np.array(y_pred[i])
+     while pred.ndim > 2:
+         pred = pred[0]
+     evaluator.add_batch(target, pred)
 
+ if evaluator.confusion_matrix.sum() == 0:
+     return 0.0
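
As a rough illustration of the per-sample preparation in the new loop, the sketch below works on in-memory arrays so it runs without image files; in the PR the target comes from `Image.open(label_path)`. The helper name `prepare_pair` and the `ignore_index` parameter are assumptions for this example.

```python
import numpy as np


def prepare_pair(target, pred, num_class, ignore_index=255):
    """Clamp out-of-range labels and strip leading batch dims from a prediction."""
    # Copy so the caller's label map is left untouched.
    target = np.array(target, copy=True)
    # Labels beyond the known classes are mapped to the ignore value.
    target[target > num_class - 1] = ignore_index
    pred = np.asarray(pred)
    # Drop leading singleton-style dimensions until a 2-D (H, W) map remains.
    while pred.ndim > 2:
        pred = pred[0]
    return target, pred
```

A prediction of shape `(1, 1, H, W)` comes back as `(H, W)`, which is the shape the label map has.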

core/.../lifelong_learning.py — three core paradigm bugs

**Wrong round index passed to `_train`**

_train uses this integer to build the output path (output/train/{rounds}). Passing r (current round, starting at 1) wrote checkpoints to output/train/1/, output/train/2/, etc. _eval always looks in output/train/0/ and silently used random weights.

  self.cloud_task_index = self._train(self.cloud_task_index,
                                      train_dataset_file,
-                                     r)
+                                     0)

dtype=object for ragged numpy arrays

unseen_tasks can contain tuples (image_path, depth_path) or plain strings. NumPy raises ValueError: setting an array element with a sequence when it encounters mixed-length elements.

- unseen_task_train_samples.x = np.array(unseen_tasks)
- unseen_task_train_samples.y = np.array(unseen_task_labels)
+ unseen_task_train_samples.x = np.array(unseen_tasks, dtype=object)
+ unseen_task_train_samples.y = np.array(unseen_task_labels, dtype=object)
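
One thing to keep in mind with `dtype=object`: when every element happens to be a tuple of the same length, `np.array(..., dtype=object)` builds a 2-D array rather than a 1-D array of tuples. The sketch below is an alternative construction, not the PR's code, that always yields one slot per sample; `to_object_array` is a hypothetical name.

```python
import numpy as np


def to_object_array(items):
    """Build a 1-D object array holding each item as-is (tuple, list or str)."""
    out = np.empty(len(items), dtype=object)
    # Element-wise assignment avoids NumPy trying to broadcast nested sequences.
    for i, item in enumerate(items):
        out[i] = item
    return out
```

Both a uniform list of `(image_path, depth_path)` tuples and a mixed list of tuples and plain strings come out with shape `(len(items),)`.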

edge_task_index path construction

job.evaluate() returns evaluation metrics, not a file path. The index file is always written to eval_output_dir/index.pkl by sedna internals.

- edge_task_index = job.evaluate(eval_dataset, metrics=metric_func)
+ job.evaluate(eval_dataset, metrics=metric_func)
+ edge_task_index = os.path.join(eval_output_dir, "index.pkl")

RFNet/dataloaders/datasets/cityscapes.py — guard empty splits

`data.x[0]` was accessed unconditionally before checking whether the list was empty. During unseen-task rounds no samples may match the current task, so `data.x` can be empty, raising `IndexError`. Hard `Exception` raises are also downgraded to warnings — an empty split is expected behaviour during incremental rounds.

- self.images[split] = [img[0] for img in data.x] if hasattr(data, "x") else data
- if hasattr(data, "x") and len(data.x[0]) == 1:
+ if hasattr(data, "x") and len(data.x) > 0:
+     self.images[split] = [img[0] for img in data.x]
+     if len(data.x[0]) == 1:
          self.disparities[split] = self.images[split]
-     elif hasattr(data, "x") and len(data.x[0]) == 2:
-         self.disparities[split] = [img[1] for img in data.x]
+     elif len(data.x[0]) == 2:
+         self.disparities[split] = [img[1] for img in data.x]
      else:
-         self.disparities[split] = data
+         self.disparities[split] = self.images[split]
+ else:
+     self.images[split] = []
+     self.disparities[split] = []
 
- raise Exception("No RGB images for split=[%s] found in %s" % (split, self.images_base))
+ print(f"Warning: No RGB images for split=[{split}]")
+ return
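
The guard logic can be sketched in isolation as follows. This is an illustrative standalone function, not the dataset class itself: `split_images_and_disparities` and the logger name are assumptions, and it uses `logging` rather than `print`, in line with the review request later in this thread.

```python
import logging

logger = logging.getLogger("cityscapes_sketch")


def split_images_and_disparities(samples, split="train"):
    """Return (images, disparities) path lists for one split.

    `samples` stands in for `data.x`: a sequence of [image] or
    [image, disparity] entries. An empty split yields two empty lists.
    """
    if samples is None or len(samples) == 0:
        logger.warning("No RGB images for split=[%s]", split)
        return [], []
    images = [entry[0] for entry in samples]
    if len(samples[0]) == 2:
        disparities = [entry[1] for entry in samples]
    else:
        # No separate disparity path: reuse the image paths, as in the diff.
        disparities = images
    return images, disparities
```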

RFNet/utils/args.py — wire kwargs to TrainArgs

`TrainArgs` accepted `**kwargs` but hardcoded all four values regardless. Any YAML hyperparameters were silently thrown away. Original defaults are preserved.

- self.workers = 4
- self.base_size = 1024
- self.crop_size = 768
- self.batch_size = 4
+ self.workers = kwargs.get("workers", 4)
+ self.base_size = kwargs.get("base_size", 1024)
+ self.crop_size = kwargs.get("crop_size", 768)
+ self.batch_size = kwargs.get("batch_size", 4)
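
A small stand-in class shows the resulting behaviour: YAML values override the defaults, and omitted keys keep them. `TrainArgsSketch` is a hypothetical name and covers only the four fields touched here; the real `TrainArgs` has more attributes.

```python
class TrainArgsSketch:
    """Illustrative stand-in for TrainArgs; only the four wired fields are shown."""

    DEFAULTS = {"workers": 4, "base_size": 1024, "crop_size": 768, "batch_size": 4}

    def __init__(self, **kwargs):
        # Take each value from kwargs when present, else fall back to the default.
        for name, default in self.DEFAULTS.items():
            setattr(self, name, kwargs.get(name, default))
```

For example, `TrainArgsSketch(batch_size=2)` keeps `crop_size` at 768 while using a batch size of 2.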

RFNet/utils/metrics.py — three fixes

**Division-by-zero guard in FWIoU**

If evaluate() is called before any batches are added, the confusion matrix is all zeros and the division produces NaN/inf, corrupting the leaderboard sort.

  def Frequency_Weighted_Intersection_over_Union(self):
+     if self.confusion_matrix.sum() == 0:
+         return 0.0
      freq = np.sum(self.confusion_matrix, axis=1) / np.sum(self.confusion_matrix)
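
For reference, a self-contained sketch of a guarded frequency-weighted IoU, assuming the usual definition (per-class IoU weighted by class frequency, with rows as ground truth). The function name `fw_iou` is hypothetical and the rest of the Evaluator method is not shown in this thread, so treat this as an approximation of it.

```python
import numpy as np


def fw_iou(confusion_matrix):
    """Frequency-weighted IoU; returns 0.0 for an all-zero confusion matrix."""
    cm = np.asarray(confusion_matrix, dtype=np.float64)
    total = cm.sum()
    if total == 0:
        # No batches were added: avoid 0/0 and report a neutral score.
        return 0.0
    freq = cm.sum(axis=1) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = np.diag(cm) / (cm.sum(axis=1) + cm.sum(axis=0) - np.diag(cm))
    # Only classes that occur in the ground truth contribute to the score.
    present = freq > 0
    return float((freq[present] * iou[present]).sum())
```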

Robust shape normalisation in add_batch

Predictions can have extra batch dimensions, e.g. (1, 1, H, W) vs ground truth (H, W). The old single [0] strip still crashed with two extra dimensions.

- if gt_image.shape != pre_image.shape:
-     pre_image = pre_image[0]
+ while pre_image.ndim > gt_image.ndim:
+     pre_image = pre_image[0]

Guarded per-class FWIoU print

CFWIoU only contains entries for classes present in the data. On a road-only subset CFWIoU[1] raises IndexError.

- print("road         : %.6f" % (CFWIoU[0] * 100.0), "%\t")
- print("sidewalk     : %.6f" % (CFWIoU[1] * 100.0), "%\t")
+ if len(CFWIoU) > 0:
+     print("road         : %.6f" % (CFWIoU[0] * 100.0), "%\t")
+ if len(CFWIoU) > 1:
+     print("sidewalk     : %.6f" % (CFWIoU[1] * 100.0), "%\t")

RFNet/utils/summaries.py — deprecated make_grid argument

torchvision renamed `range=` to `value_range=` in v0.13. The old name raises `TypeError` when `make_grid` is called, which stops the run. Four occurrences fixed.

- grid_image = make_grid(..., normalize=False, range=(0, 255))
+ grid_image = make_grid(..., normalize=False, value_range=(0, 255))

.gitignore — add sedna_src/

+ sedna_src/

---

How to verify

# From the ianvs root directory
source venv/bin/activate
ianvs -f examples/cityscapes-synthia/lifelong_learning_bench/curb-detection/benchmarkingjob.yaml

No manual PYTHONPATH export needed — basemodel.py handles it automatically now.

Expected output:

| rank | algorithm               | accuracy | samples_transfer_ratio | paradigm         |
|  1   | rfnet_lifelong_learning |  0.2123  |         0.4649         | lifelonglearning |

NishantSinghhhhh · 2026-05-28T09:07:32Z

Print function to Logger / Exceptions

MooreZheng

Change print function to logger or exception
Cityscape is different from cloud-robotics
this pull request is different from #297 for dataset and algorithm

abhisheksainimitawa

PR Review: [LFX Term 01] Restoration: CItyscape-Sythia Curb detection

Contributor: @abhisheksainimitawa | LFX Mentorship 2026 Term 2 Pre-test Task 2

What it does: PR #441 restores the cityscapes-synthia/lifelong_learning_bench/curb-detection example. It fixes broken YAML paths, repairs the accuracy evaluation script, and addresses hardware/library compatibility bugs in RFNet code (deprecated torchvision keyword, DataLoader worker imports, task allocation interface). It also modifies the shared lifelong_learning.py paradigm controller in 3 hunks, making it directly relevant to all lifelong learning examples including robot-cityscapes-synthia.

Recommendation: Merge. The sibling example fixes are correct and address real bugs. All three lifelong_learning.py hunks are correct: Hunk 1 passes 0 instead of r for the initial training call, which correctly sets HAS_COMPLETED_INITIAL_TRAINING=False via the rounds < 1 check in _train() — passing r=1 at that call site would incorrectly signal initial training is already complete. Hunks 2 and 3 are also correct.

What Makes This Review Unique

Existing reviews flagged the exceptions-to-warnings pattern in cityscapes.py and suggested using a logger. This review adds:

Hunk 1 _train(...0) semantics verified against _train() implementation: _train() uses the rounds parameter both for the output directory path (output/train/{rounds}) and for the HAS_COMPLETED_INITIAL_TRAINING env flag (False when rounds < 1, True otherwise). Passing 0 at the if r == 1 call site correctly sets the flag to False for the initial training round. The original r=1 would have set it to True immediately — a semantic bug. Not analyzed in any existing comment.
All 3 lifelong_learning.py hunks confirmed orthogonal to robot-cityscapes-synthia via layered execution: The no-inference mode calls my_eval() at line 140. None of the 3 hunks touch this code path. Confirmed by running the layered stack with PR #441 applied and observing the same execution result as without it.
Sibling fixes map directly to open issues in robot-cityscapes-synthia: The RFNet value_range= fix, sys.path.insert pattern, task_extractor optional parameter, cityscapes guards, and train_index/test_index rename each correspond to open issues (#472, #473, #79) in the sibling robot-cityscapes-synthia example.

1. Problem: Complexity and Difficulty of the Bug

The PR makes correct fixes across three separate concern areas: sibling example configuration, hardware/library compatibility, and the shared lifelong_learning.py paradigm controller. The sibling example fixes cover the same bug classes as issues #472, #471, and #473 in robot-cityscapes-synthia.

The three lifelong_learning.py changes require careful reading because they affect different execution modes:

Hunk 1 (line 268): hard-example-mining mode round index. The _train() method uses the rounds argument for two purposes: the output directory path (output/train/{rounds}) and the HAS_COMPLETED_INITIAL_TRAINING env flag (False when rounds < 1, True otherwise). In the if r == 1 branch, PR #441 passes 0 instead of r. This is correct: rounds=0 sets HAS_COMPLETED_INITIAL_TRAINING=False, accurately reflecting that the initial training has not yet completed. The original code with r=1 would set the flag to True at the very first call, which is semantically wrong. The else branch for rounds 2, 3, ... correctly passes r unchanged.

Hunk 2 (line 344): _inference() numpy dtype adds dtype=object to np.array(unseen_tasks). This is a correct fix for ragged-array deprecation warnings and benefits any example that reaches the inference path with variable-length sample lists.

Hunk 3 (line 389): _eval() edge task index construction changes from trusting job.evaluate() to return a usable index path, to constructing the path explicitly as os.path.join(eval_output_dir, "index.pkl"). This is pragmatically correct, as job.evaluate() does not reliably return a path. _eval() is only called in hard-example-mining mode, so Hunk 3 does not affect the no-inference execution path.
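
The two uses of `rounds` described for Hunk 1 can be sketched as below. This is based only on the description in this review, not copied from `_train()`; the function name and return shape are assumptions for illustration.

```python
import os


def train_round_settings(output_dir, rounds):
    """Derive the checkpoint directory and initial-training flag from `rounds`."""
    # Checkpoints for a round go under <output_dir>/train/<rounds>.
    train_dir = os.path.join(output_dir, "train", str(rounds))
    # The flag is False when rounds < 1 and True otherwise.
    has_completed_initial_training = rounds >= 1
    return train_dir, has_completed_initial_training
```

Passing `0` gives `output/train/0` with the flag set to `False`, while passing `1` gives `output/train/1` with the flag set to `True`, which is the difference the hunk turns on.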

2. Code Review Finding: Sibling Example Fixes Confirming Shared Issues

| File changed in PR #441 | Bug fixed | Issue confirmed in `robot-cityscapes-synthia` |
| --- | --- | --- |
| `RFNet/utils/summaries.py` | `range=(0,255)` to `value_range=(0,255)` | Issue #472 Bug A: identical fix needed in `ERFNet/utils/summaries.py` |
| `basemodel.py` | `sys.path.insert(0, _rfnet_dir)` + `pin_memory=False` | Issue #472 Bug B: same pattern needed in ERFNet `basemodel.py` |
| `task_allocation_by_origin.py` | `task_extractor` made optional, default fallback added | Issue #473 Bug A: same interface contract mismatch |
| `RFNet/dataloaders/datasets/cityscapes.py` | Guards before `data.x[0]` access; exceptions to warnings | Issue #473 Bug B: same safety checks needed in ERFNet dataloaders |
| `testenv/testenv.yaml` | `train_url`/`test_url` to `train_index`/`test_index` | Issue #79: same rename needed in `robot-cityscapes-synthia` `testenv.yaml` |

3. Execution Video

Watch execution recording on Google Drive

A. Independent Execution: PR #441 applied alone on main (no other fixes)

git fetch origin pull/441/head:pr-441
git checkout pr-441
ianvs -f examples/robot-cityscapes-synthia/lifelong_learning_bench/semantic-segmentation/benchmarkingjob.yaml

Section A-1: PR #441 applied alone, same path crash as main

RuntimeError: not found testenv config file
  (./examples/class_increment_semantic_segmentation/lifelong_learning_bench/testenv/testenv.yaml) in local

PR #441 modifies only lifelong_learning.py and the cityscapes-synthia/curb-detection sibling example files. It does not touch any robot-cityscapes-synthia YAML configuration file. Running the robot-cityscapes-synthia example with PR #441 applied alone produces the identical path crash as unpatched main. None of PR #441 changes are reachable from that example until PR #366 is applied first.

Section A-2: Diff of PR #441, shared lifelong_learning.py changes

git fetch origin pull/441/head:pr-441
git diff main pr-441 -- core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py

# Hunk 1 -- hard-example-mining mode, if r==1 branch (does NOT affect no-inference mode):
-    self.cloud_task_index = self._train(self.cloud_task_index, train_dataset_file, r)
+    self.cloud_task_index = self._train(self.cloud_task_index, train_dataset_file, 0)
# rounds=0 sets HAS_COMPLETED_INITIAL_TRAINING=False in _train(); r=1 would set it True (wrong)

# Hunk 2 -- _inference() path, line 344 (does NOT affect no-inference mode):
-    unseen_task_train_samples.x = np.array(unseen_tasks)
-    unseen_task_train_samples.y = np.array(unseen_task_labels)
+    unseen_task_train_samples.x = np.array(unseen_tasks, dtype=object)
+    unseen_task_train_samples.y = np.array(unseen_task_labels, dtype=object)

# Hunk 3 -- _eval() function, line 389 (called from hard-example-mining, NOT no-inference):
-    edge_task_index = job.evaluate(eval_dataset, metrics=metric_func)
+    job.evaluate(eval_dataset, metrics=metric_func)
+    edge_task_index = os.path.join(eval_output_dir, "index.pkl")

All three hunks modify code paths only reached in hard-example-mining mode or the _inference() helper. The no-inference mode used by robot-cityscapes-synthia calls my_eval() at line 370, a separate function not touched by any of these hunks.

Step 0: Local fixes applied before Section B

These fixes were applied to the running-stack branch (commit 70e8be5) before cherry-picking PR #441, to isolate PR #441 contribution from other known blockers in the robot-cityscapes-synthia example:

| Fix applied | File changed | What it fixes |
| --- | --- | --- |
| `sys.path.insert(0, ERFNet_dir)` before imports | `basemodel.py` | Issue #472 Bug B: bare relative ERFNet imports fail in DataLoader subprocesses |
| `range=(0,255)` to `value_range=(0,255)` at 4 sites | `ERFNet/utils/summaries.py` | Issue #472 Bug A: deprecated torchvision keyword |
| `self.cuda = torch.cuda.is_available()` + device detection | `ERFNet/utils/args.py`, `ERFNet/train.py` | Issue #471: hardcoded `.cuda()` crashes on CPU-only machines |
| `__call__(self, samples)` with `task_extractor` removed | `task_allocation_by_domain.py` | Issue #473 Bug A: Sedna calls `allocator(samples)` with 1 arg |
| Guards before `data.x[0]` access | `ERFNet/dataloaders/datasets/cityscapes.py` | Issue #473 Bug B: unsafe array access |
| `job.inference_2(...)` to `job.inference(...)` line 328 | `lifelong_learning.py` | Issue #470: `inference_2` not in Sedna API |
| `job.my_inference(...)` to `seen_estimator.predict(...)` line 155 | `lifelong_learning.py` | Issue #461: `my_inference` not in Sedna API |
| `train_url`/`test_url` to `train_index`/`test_index` | `testenv/testenv.yaml` | Issue #79: field names not recognized by `dataset.py` |

B. Layered Stack Execution: running-stack + PR #441 lifelong_learning.py changes

git checkout running-stack
git cherry-pick 6ee1ba4   # PR #441 main commit
ianvs -f examples/robot-cityscapes-synthia/lifelong_learning_bench/semantic-segmentation/benchmarkingjob.yaml

Section B-1: PR #441 applied on top of running-stack, pipeline advances, crashes at seen_estimator.predict()

[ERROR] base.py(181) - RetryError[<Future at 0x1589b2450 state=finished returned NoneType>]
[INFO]  lifelong_learning.py(145) - {"accuracy": 0.0}
Traceback (most recent call last):
  ...
    raise EOFError
EOFError
RuntimeError: (paradigm=lifelonglearning) pipeline runs failed, error:

Three findings from this run:

Finding 1: PR #441 introduces no regression and no improvement for no-inference mode. All three lifelong_learning.py hunks affect hard-example-mining mode and _inference() paths, neither of which is reached in the no-inference execution path. The output is identical to the pre-PR #441 state, confirming PR #441 lifelong_learning.py changes are strictly orthogonal to this execution path.

Finding 2: my_eval() succeeds and returns the expected dict format. The log line lifelong_learning.py(145) [INFO] - {"accuracy": 0.0} is direct evidence that job.evaluate() returns a metrics dict and the caller at line 140 receives it cleanly. The accuracy: 0.0 value is expected because the knowledge base has not been populated by a Sedna server.

Finding 3: The crash has moved one layer deeper. The pipeline clears my_eval() completely and crashes at seen_estimator.predict() (line 150) via FileOps.load() then joblib.load() then pickle.load(), raising EOFError from an empty index.pkl. This is a Sedna server dependency surfacing at a different call site, outside the scope of this PR.
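
The failure mode in Finding 3 is easy to reproduce outside the pipeline: unpickling a zero-byte file raises `EOFError`. The sketch below uses plain `pickle` rather than the `FileOps.load()`/`joblib.load()` chain named above, and `load_index_or_none` is a hypothetical helper, so it only illustrates how a caller could tell an empty index apart from a corrupt one.

```python
import os
import pickle


def load_index_or_none(path):
    """Load a pickled index, returning None if the file is missing or empty."""
    # A zero-byte file would make pickle.load raise EOFError, so check first.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return None
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

A caller can then treat `None` as "the knowledge base has not been populated yet" and report that explicitly instead of surfacing a bare `EOFError`.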

Sub-comment Summary

| File | Line(s) | Sub-comment Topic |
| --- | --- | --- |
| `lifelong_learning.py` | 271 | Hunk 1: `_train(... 0)` is verified correct; `rounds=0` sets `HAS_COMPLETED_INITIAL_TRAINING=False` in `_train()`, which is the right semantic for the initial training call. The original `r=1` would have set the flag to `True` immediately (wrong). |

- Add benchmarkingjob.yaml, testenv.yaml, and rfnet_algorithm.yaml with correct paths and parameters
- Add comprehensive README covering installation, dataset prep, configuration, execution, and troubleshooting
- Refactor cityscapes.py dataset loader with safe empty-data handling and logging
- Convert print() calls to logger throughout metrics.py, accuracy.py, and cityscapes.py
- Fix metrics.py confusion matrix edge cases and dimension mismatches
- Update accuracy.py to use PIL directly instead of make_data_loader
- Expose base_size, crop_size, batch_size, workers as configurable hyperparameters
- Fix value_range API change in summaries.py for PyTorch >= 2.x
- Fix task_allocation_by_origin.py with safe path detection
- Fix lifelong_learning.py to pass correct round index and dtype

Signed-off-by: NishantSinghhhhh <nishantsingh_230137@aitpune.edu.in>

Wrap numpy division operations in errstate to silence invalid/divide warnings in all metric methods. Switch testenv from index_mini.txt to index.txt to use the full dataset for benchmarking runs.

Signed-off-by: NishantSinghhhhh <nishantsingh_230137@aitpune.edu.in>

NishantSinghhhhh · 2026-06-03T17:04:27Z

Screencast.from.2026-06-03.22-19-10.webm

Running without errors; reduced the epochs and dataset size to debug things faster.

MooreZheng

/lgtm

MooreZheng · 2026-06-04T08:50:04Z

This version has been modified as the reviewer asked and looks good to me.

What do you think, @hsj576?

kubeedge-bot requested review from MooreZheng and Poorunga May 21, 2026 08:07

kubeedge-bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) May 21, 2026

gemini-code-assist Bot reviewed May 21, 2026

Comment thread ..._learning_bench/curb-detection/testalgorithms/rfnet/RFNet/dataloaders/datasets/cityscapes.py Outdated

coldstarted mentioned this pull request May 26, 2026

[Pre-Test : LFX-Term 2 2026] Issue Summarization Cifar100 #454

Open

abhisheksainimitawa mentioned this pull request May 26, 2026

fix(examples): resolve broken paths in MDIL-SS configuration #366

Open

MooreZheng reviewed May 28, 2026

abhisheksainimitawa mentioned this pull request May 28, 2026

OSPP: Implementation of a Class Incremental Learning Algorithm Evaluation System based on Ianvs #85

Merged

abhisheksainimitawa reviewed May 28, 2026

Comment thread core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py

abhisheksainimitawa mentioned this pull request May 29, 2026

Issue summarization for example of robot-cityscapes-synthia/lifelong_learning_bench #459

Open

kubeedge-bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/L label Jun 3, 2026

NishantSinghhhhh force-pushed the Restoration-CityScapeSynthia-Curb-Detection branch 3 times, most recently from 34821b0 to 1d3240a June 3, 2026 17:02

NishantSinghhhhh added 2 commits June 3, 2026 22:32

NishantSinghhhhh force-pushed the Restoration-CityScapeSynthia-Curb-Detection branch from 1d3240a to 4ff2589 June 3, 2026 17:03

MooreZheng reviewed Jun 4, 2026

kubeedge-bot assigned MooreZheng Jun 4, 2026

kubeedge-bot added the lgtm label (indicates that a PR is ready to be merged) Jun 4, 2026

MooreZheng requested review from hsj576 and removed request for Poorunga June 4, 2026 08:48

NishantSinghhhhh wants to merge 2 commits into kubeedge:main from NishantSinghhhhh:Restoration-CityScapeSynthia-Curb-Detection
