Commit 8c296a3

feat: contact sample op with feather output
1 parent 003c207 commit 8c296a3

2 files changed

+737
-0
lines changed
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
# ContactSampleOp Design
2+
3+
## Goal
4+
Generate training samples for a segment-merge classifier from raw volumes.
5+
Output: Arrow/Feather files (cross-language: Python + TypeScript).
6+
7+
## Inputs
8+
- `candidate_layer`: Candidate segmentation (also used for meshes)
9+
- `reference_layer`: Proofread reference segmentation
10+
- `affinity_layer`: 3-channel affinity volume (X, Y, Z axes)
11+
12+
## Output Schema (Feather)
13+
| Column | Type | Description |
14+
|--------|------|-------------|
15+
| `seg_a`, `seg_b` | int64 | Segment pair IDs |
16+
| `should_merge` | int64 | 1=merge, 0=no merge |
17+
| `n_contacts` | int64 | Actual contact count |
18+
| `contacts` | list[list[float64]] | (max_contact_vx, 4) - [x, y, z, aff] per row; x, y, z in nm; padded beyond `n_contacts` |
19+
| `pointcloud_a`, `pointcloud_b` | list[list[float64]] | (n_points, 3) surface points in nm |
20+
| `chunk_coord` | list[int64] | Chunk start coordinates (voxels) |
21+
| `chunk_size` | list[int64] | Chunk dimensions (voxels) |
22+
| `crop_pad` | list[int64] | Padding used (voxels) |
23+
| `candidate_path` | string | Candidate segmentation path |
24+
| `reference_path` | string | Reference segmentation path |
25+
| `affinity_path` | string | Affinity volume path |
26+
27+
## Processing Steps
28+
29+
1. **Read volumes** (parallel) - candidate, proofread (reference), and affinity, each read with `crop_pad` padding
30+
31+
2. **Compute overlaps** - Between candidate segments and proofread connected components
32+
33+
3. **Filter bad segments** (BEFORE contact detection):
34+
- **Small**: total segment size < `min_seg_size_vx`
35+
- **Mergers**: overlap 2+ proofread CCs with >= `min_overlap_vx` each
36+
- **Unclaimed**: no proofread CC overlap >= `min_overlap_vx`
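
Steps 2 to 4 can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the op's implementation: `classify_bad_segments` and `blackout` are hypothetical names, ID 0 is assumed to be background in both volumes, and the "small" check is assumed to take precedence over the other two.

```python
import numpy as np


def classify_bad_segments(candidate, reference_cc,
                          min_seg_size_vx=2000, min_overlap_vx=1000):
    """Map each excluded candidate ID to 'small', 'merger', or 'unclaimed'."""
    cand = np.asarray(candidate).ravel()
    ref = np.asarray(reference_cc).ravel()

    # Total voxel count per candidate segment (background 0 ignored).
    seg_ids, seg_sizes = np.unique(cand[cand != 0], return_counts=True)

    # Count how many proofread CCs each candidate overlaps strongly.
    n_strong = {int(s): 0 for s in seg_ids}
    both = (cand != 0) & (ref != 0)
    if both.any():
        pairs, counts = np.unique(
            np.stack([cand[both], ref[both]], axis=1),
            axis=0, return_counts=True)
        for seg in pairs[counts >= min_overlap_vx, 0]:
            n_strong[int(seg)] += 1

    bad = {}
    for seg, size in zip(seg_ids.tolist(), seg_sizes.tolist()):
        if size < min_seg_size_vx:      # assumption: checked first
            bad[seg] = "small"
        elif n_strong[seg] >= 2:
            bad[seg] = "merger"
        elif n_strong[seg] == 0:
            bad[seg] = "unclaimed"
    return bad


def blackout(candidate, bad_ids):
    """Return a copy of the candidate volume with excluded IDs set to 0."""
    candidate = np.asarray(candidate)
    return np.where(np.isin(candidate, list(bad_ids)), 0, candidate)
```

After this filter every surviving segment has exactly one proofread CC with at least `min_overlap_vx` overlap, which is what makes the later merge label well defined.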
37+
38+
4. **Blackout** excluded segments (set to 0)
39+
40+
5. **Find contacts** - Detect voxel boundaries between remaining segments
41+
- Check X, Y, Z axes separately, use axis-specific affinity
42+
- Average affinities when voxel touches neighbor on multiple axes
43+
- Filter to kernel region (inside padding)
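
A rough sketch of the contact detection in step 5, using NumPy. Several details here are assumptions, not taken from the design: the affinity for the face between voxel `i` and `i+1` is read at voxel `i+1`, each contact is recorded at the lower-index voxel of the face, coordinates are still in voxels, and the names `find_contacts` and `filter_to_kernel` are hypothetical.

```python
from collections import defaultdict

import numpy as np


def find_contacts(seg, aff):
    """Return {(seg_a, seg_b): (n, 4) array of [x, y, z, mean_aff]}.

    seg: (X, Y, Z) integer labels, 0 = background / blacked out.
    aff: (3, X, Y, Z) affinities, channel order X, Y, Z.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (a, b, x, y, z) -> [sum, count]
    for axis in range(3):
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[axis] = slice(0, -1)
        hi[axis] = slice(1, None)
        a = seg[tuple(lo)]                 # lower-index voxel of each face
        b = seg[tuple(hi)]                 # its +1 neighbor along this axis
        face = (a != 0) & (b != 0) & (a != b)
        edge_aff = aff[axis][tuple(hi)]    # assumption: stored at higher voxel
        for x, y, z in np.argwhere(face):
            sa, sb = int(a[x, y, z]), int(b[x, y, z])
            key = (min(sa, sb), max(sa, sb), int(x), int(y), int(z))
            sums[key][0] += float(edge_aff[x, y, z])
            sums[key][1] += 1

    # Average affinities for voxels that touch the same pair on several axes.
    rows = defaultdict(list)
    for (sa, sb, x, y, z), (total, n) in sums.items():
        rows[(sa, sb)].append([x, y, z, total / n])
    return {pair: np.array(r, dtype=np.float64) for pair, r in rows.items()}


def filter_to_kernel(rows, shape, crop_pad):
    """Keep contacts whose voxel lies inside the unpadded kernel region."""
    lo = np.asarray(crop_pad)
    hi = np.asarray(shape) - lo
    keep = np.all((rows[:, :3] >= lo) & (rows[:, :3] < hi), axis=1)
    return rows[keep]
```

The per-face Python loop is kept for readability; a production version would likely vectorize the accumulation.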
44+
45+
6. **Filter contact pairs**:
46+
- Low count (< `min_contact_vx`)
47+
- High count (> `max_contact_vx`)
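
The pair filter in step 6 amounts to an inclusive range check on the contact count. A minimal sketch, assuming contacts are held in a dict keyed by segment pair (the function name and container are hypothetical):

```python
def filter_contact_pairs(contacts, min_contact_vx=5, max_contact_vx=2048):
    """Drop pairs with fewer than min_contact_vx or more than max_contact_vx contacts."""
    return {
        pair: rows
        for pair, rows in contacts.items()
        if min_contact_vx <= len(rows) <= max_contact_vx
    }
```

Dropping pairs above `max_contact_vx`, rather than truncating them, means every surviving pair fits the fixed-size `contacts` array without losing contacts.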
48+
49+
7. **Download meshes** - Only for segments in valid pairs, clip to bbox
50+
51+
8. **Generate samples** per valid pair:
52+
- Compute affinity-weighted center of mass (COM)
53+
- Crop mesh points to sphere around COM (radius = min over axes of `crop_pad` * resolution, in nm)
54+
- Sample `n_pointcloud_points` from each mesh (seed=42)
55+
- Label: 1 if both segments overlap same proofread CC, else 0
56+
- Pad contacts to fixed size
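
The per-pair sample generation in step 8 could look roughly like this. The helper names are hypothetical, and three behaviors are assumptions the design does not spell out: falling back to an unweighted mean when all affinities are zero, sampling with replacement when a mesh has fewer points than requested, and padding contacts with zeros.

```python
import numpy as np


def affinity_weighted_com(contacts_nm):
    """Center of mass of contact positions, weighted by affinity."""
    xyz, w = contacts_nm[:, :3], contacts_nm[:, 3]
    if w.sum() <= 0:                      # assumption: unweighted fallback
        return xyz.mean(axis=0)
    return (xyz * w[:, None]).sum(axis=0) / w.sum()


def crop_to_sphere(points_nm, center_nm, crop_pad, resolution):
    """Keep points within min over axes of crop_pad * resolution of the center."""
    radius = float(np.min(np.asarray(crop_pad) * np.asarray(resolution)))
    dist = np.linalg.norm(points_nm - center_nm, axis=1)
    return points_nm[dist <= radius]


def sample_points(points, n_points, seed=42):
    """Draw n_points rows with a fixed seed for reproducibility."""
    if len(points) == 0:
        return np.zeros((0, 3))           # assumption: nothing to sample
    rng = np.random.default_rng(seed)
    # Assumption: sample with replacement only when there are too few points.
    idx = rng.choice(len(points), size=n_points, replace=len(points) < n_points)
    return points[idx]


def should_merge(seg_a, seg_b, cc_of):
    """1 if both segments map to the same proofread CC, else 0."""
    return int(cc_of[seg_a] == cc_of[seg_b])


def pad_contacts(contacts_nm, max_contact_vx):
    """Zero-pad contacts to (max_contact_vx, 4); also return the true count."""
    padded = np.zeros((max_contact_vx, 4), dtype=np.float64)
    padded[:len(contacts_nm)] = contacts_nm
    return padded, len(contacts_nm)
```

Returning the true count alongside the padded array corresponds to the `n_contacts` column, so a reader can slice off the padding.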
57+
58+
9. **Write feather** - Empty chunks produce files with 0 rows
59+
60+
## Parameters
61+
| Parameter | Default | Description |
62+
|-----------|---------|-------------|
63+
| `output_path` | required | Output directory for feather files |
64+
| `crop_pad` | (0,0,0) | Padding in voxels |
65+
| `min_seg_size_vx` | 2000 | Min total segment size (voxels) |
66+
| `min_overlap_vx` | 1000 | Min overlap (voxels) with a proofread CC for the overlap to count |
67+
| `min_contact_vx` | 5 | Min contacts per pair |
68+
| `max_contact_vx` | 2048 | Max contacts (array size) |
69+
| `n_pointcloud_points` | 2048 | Points per mesh |