MobileNetV2 (torchvision) Β· YOLOv8n Β· TinyChangeUNet (MobileNetV3 encoder)
Pipeline for indoor-space imagery: scene classification → space item detection → change detection.
- Folder structure
- A) Scene Classification
- B) Space Item Detection
- C) Change Detection (before/after)
- Reproducibility
A) Scene Classification (ImageFolder)

```
space_cls/
  train/<class>/*.jpg|png
  val/<class>/*.jpg|png
  test/<class>/*.jpg|png
```
B) Space Item Detection (YOLO)

```
space_data/
  images/{train,val,test}/*.jpg|png
  labels/{train,val,test}/*.txt   # YOLO: cls cx cy w h
  space.yaml
```
C) Change Detection (synthetic before/after/GT)

```
pairs_out_cd/
  train/{before_images,after_images,labels}
  val/{before_images,after_images,labels}
  test/{before_images,after_images,labels}
  meta/pairs_{train,val}.json
```
Model: MobileNetV2 (torchvision, ImageNet-pretrained → fine-tuned)
Input: 224×224
Target classes (5): creative_studio, dance_studio, music_rehearsal_room, small_theater_gallery, study_room
Data: ~50 images per class (first-pass crawling → second-pass manual cleaning), then split train:val = 8:2 in ImageFolder format
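The crawl → clean → 8:2 split step above can be sketched as follows; `split_imagefolder` and the demo directories are hypothetical illustrations, not the project's actual tooling:

```python
import random
import shutil
import tempfile
from pathlib import Path

def split_imagefolder(src: Path, dst: Path, val_ratio: float = 0.2, seed: int = 42) -> None:
    """Copy src/<class>/* into dst/{train,val}/<class>/ with an 8:2 split (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed, matching the project's seed=42 convention
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(class_dir.glob("*.jpg")) + sorted(class_dir.glob("*.png"))
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        for split, subset in (("val", images[:n_val]), ("train", images[n_val:])):
            out = dst / split / class_dir.name
            out.mkdir(parents=True, exist_ok=True)
            for img in subset:
                shutil.copy2(img, out / img.name)

# Demo on dummy data (two of the five classes, 10 files each):
base = Path(tempfile.mkdtemp())
src, dst = base / "raw", base / "space_cls"
for cls in ("study_room", "dance_studio"):
    (src / cls).mkdir(parents=True)
    for i in range(10):
        (src / cls / f"{i:02d}.jpg").write_bytes(b"")
split_imagefolder(src, dst)
```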
```shell
# Training script
python train_mobilenet.py
# Outputs: mobilenetv2.pth, class_names.txt
```

Test performance (per-class accuracy)
| Class | Correct / Total | Acc. |
|---|---|---|
| creative_studio | 10 / 10 | 100.0% |
| dance_studio | 8 / 10 | 80.0% |
| music_rehearsal_room | 7 / 10 | 70.0% |
| small_theater_gallery | 8 / 10 | 80.0% |
| study_room | 10 / 10 | 100.0% |
Overall Acc: 43/50 = 86.0% Β· Macro Acc: 86.0%
Main confusions: dance_studio → small_theater_gallery (2 cases), music_rehearsal_room → study_room (2 cases), etc.
Model: YOLOv8n (Ultralytics, COCO-pretrained → custom fine-tuned)
Purpose: detect 13 classes of space items in indoor photos.
- Classes (13): air_conditioner, chair, desk, drum, microphone, mirror, monitor, piano, projector, speaker, spotlight, stage, whiteboard
- Build: draft boxes generated by an automatic box-labeling pipeline over the scene data → manually corrected
- Format: YOLO (images/, labels/*.txt; each txt line: cls cx cy w h)
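A minimal helper for reading that label format (normalized `cls cx cy w h` → pixel corners); `yolo_to_xyxy` is an illustrative name, not part of the project:

```python
def yolo_to_xyxy(line: str, img_w: int, img_h: int):
    """Convert one 'cls cx cy w h' label line (normalized) to pixel (cls, x1, y1, x2, y2)."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return int(cls), x1, y1, x2, y2

# e.g. a desk (class 2) box centred in a 640×480 image:
box = yolo_to_xyxy("2 0.5 0.5 0.25 0.5", 640, 480)
print(box)  # → (2, 240.0, 120.0, 400.0, 360.0)
```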
Label statistics (instances per class)
| ID | Class | train+val | test |
|---|---|---|---|
| 0 | air_conditioner | 1026 | 7 |
| 1 | chair | 2058 | 94 |
| 2 | desk | 4199 | 69 |
| 3 | drum | 3444 | 13 |
| 4 | microphone | 1126 | 15 |
| 5 | mirror | 1185 | 44 |
| 6 | monitor | 1391 | 14 |
| 7 | piano | 1265 | 22 |
| 8 | projector | 1026 | 10 |
| 9 | speaker | 1900 | 22 |
| 10 | spotlight | 6136 | 142 |
| 11 | stage | 1012 | 8 |
| 12 | whiteboard | 1504 | 13 |
YOLO data config example (space.yaml)
```yaml
path: ./space_data
train: images/train
val: images/val
test: images/test
names:
  0: air_conditioner
  1: chair
  2: desk
  3: drum
  4: microphone
  5: mirror
  6: monitor
  7: piano
  8: projector
  9: speaker
  10: spotlight
  11: stage
  12: whiteboard
```

Training / evaluation / inference
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.train(data="space_data/space.yaml", imgsz=640, epochs=80, batch=16, seed=42)
model.val(data="space_data/space.yaml", imgsz=640, split="val")
model.predict(source="space_data/images/test", imgsz=640, conf=0.25, save=True)
```

Performance summary
- 80 epochs completed; best/last weights saved at runs/detect/space_no_leak/weights/best.pt
- Val: mAP50=0.981, mAP50-95=0.912
- Test: mAP50=0.979, mAP50-95=0.910
| Split | mAP@0.5 | mAP@0.5:0.95 | ImgSize | Model |
|---|---|---|---|---|
| Val | 0.981 | 0.912 | 640 | YOLOv8n |
| Test | 0.979 | 0.910 | 640 | YOLOv8n |
Speed (reference, T4): ~0.3 ms preprocess, 5.0 ms inference, 3.2 ms postprocess per image
Performance evaluation visualizations
Model: TinyChangeUNet (MobileNetV3 Small encoder + TinyDecoder)
Input: before (3) + after (3) + diff (1) = 7 channels (diff = mean(|before - after|))
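A numpy sketch of the 7-channel input assembly described above (the training code presumably does the same in torch):

```python
import numpy as np

def build_input(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Stack before(3) + after(3) + diff(1) into a 7-channel CHW array,
    with diff = mean(|before - after|) over the color channels."""
    diff = np.abs(before - after).mean(axis=0, keepdims=True)  # (1, H, W)
    return np.concatenate([before, after, diff], axis=0)       # (7, H, W)

before = np.zeros((3, 256, 256), dtype=np.float32)
after = before.copy()
after[:, 100:120, 100:120] = 1.0  # one changed patch
x = build_input(before, after)
```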
Goals
- Automatically apply varied, realistic changes (occlusion / blur / pixelation / inpainting / moves) to generate (before, after, mask) triplets in one pass
- Mask convention: 0 = background, 255 = changed region
- Uses: change detection, before/after comparison, segmentation training/benchmarking
Generation logic summary
- Select a region from a YOLO label box, then apply one of: black / rect (noise) / blur / pixelate / inpaint / move (region move)
- Box jitter and partial occlusion diversify difficulty
- Output: before_images/ (originals), after_images/ (modified), labels/ (0/255 PNG masks)
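A minimal numpy sketch of this generation logic for the `black` variant, assuming pixel-space boxes; `make_change_pair` is hypothetical, and the real pipeline also implements the blur/pixelate/inpaint/move variants:

```python
import numpy as np

def make_change_pair(before: np.ndarray, box, rng: np.random.Generator):
    """Apply one synthetic change ('black' variant, with box jitter) and return (after, mask).

    before: HWC uint8 image; box: (x1, y1, x2, y2) pixel box from a YOLO label.
    Mask convention as above: 0 = background, 255 = changed region.
    """
    h, w = before.shape[:2]
    x1, y1, x2, y2 = box
    jitter = rng.integers(-5, 6, size=4)  # small box jitter for difficulty
    x1 = int(np.clip(x1 + jitter[0], 0, w - 1)); y1 = int(np.clip(y1 + jitter[1], 0, h - 1))
    x2 = int(np.clip(x2 + jitter[2], x1 + 1, w)); y2 = int(np.clip(y2 + jitter[3], y1 + 1, h))
    after = before.copy()
    after[y1:y2, x1:x2] = 0                # 'black' change; blur/pixelate/etc. are analogous
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[y1:y2, x1:x2] = 255
    return after, mask

rng = np.random.default_rng(42)
before = np.full((128, 128, 3), 200, dtype=np.uint8)
after, mask = make_change_pair(before, (30, 30, 90, 90), rng)
```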
λͺ¨λΈ ꡬ쑰 μμ½
concat([before, after, diff]) β 1Γ1 convλ‘ 7ch β 3ch μΆμ- Encoder:
MobileNetV3 Small (timm, features_only)β μ±λ μ κ·ν(24/40/64/96) - Decoder(TinyDecoder):
ConvTranspose2dμ μν + μ€ν΅ +DWConvBlock - Head:
1Γ1 conv β logit(1ch)β bilinear μ μν(μν΄μλ)
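One decoder stage can be sketched in torch as below; the internals of `DWConvBlock` are an assumption (a standard depthwise-separable convolution), since the README only names the block:

```python
import torch
import torch.nn as nn

class DWConvBlock(nn.Module):
    """Depthwise-separable conv block (assumed structure for the doc's DWConvBlock)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),  # depthwise
            nn.Conv2d(in_ch, out_ch, 1, bias=False),                          # pointwise
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class DecoderStage(nn.Module):
    """One TinyDecoder stage: ConvTranspose2d upsample, concat skip, DWConvBlock."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.fuse = DWConvBlock(out_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(x)                  # 2× spatial upsample
        x = torch.cat([x, skip], dim=1) # skip connection from the encoder
        return self.fuse(x)

stage = DecoderStage(in_ch=96, skip_ch=64, out_ch=64)  # channels from the 24/40/64/96 ladder
y = stage(torch.randn(1, 96, 8, 8), torch.randn(1, 64, 16, 16))
```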
Training / evaluation (change_detection.ipynb)
- Defaults: IMG_SIZE=256, BATCH=8, EPOCHS=40, LR=3e-4
- Training loop: AMP (FP16), cosine schedule + 2-epoch warmup, EMA (0.99), gradient clipping
- Loss: BCEWithLogits(pos_weight) + Tversky (α=0.7, β=0.3)
- Validation threshold sweep: choose th ∈ [0.02, 0.40] that maximizes F1
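The Tversky term and the threshold sweep can be illustrated in numpy as follows. The FP/FN weighting shown follows the standard Salehi et al. convention, which is an assumption about this project, and the real notebook presumably uses torch's BCEWithLogitsLoss alongside it:

```python
import numpy as np

def tversky_loss(probs, target, alpha=0.7, beta=0.3, eps=1e-7):
    """1 - Tversky index; alpha weights false positives, beta false negatives
    (standard convention; the project's exact assignment is an assumption)."""
    tp = (probs * target).sum()
    fp = (probs * (1 - target)).sum()
    fn = ((1 - probs) * target).sum()
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def sweep_threshold(probs, target, ths=np.arange(0.02, 0.401, 0.02)):
    """Pick the threshold in [0.02, 0.40] that maximizes pixel F1, as described above."""
    best_th, best_f1 = ths[0], -1.0
    for th in ths:
        pred = (probs >= th).astype(np.float32)
        tp = (pred * target).sum()
        fp = (pred * (1 - target)).sum()
        fn = ((1 - pred) * target).sum()
        f1 = 2 * tp / (2 * tp + fp + fn + 1e-7)
        if f1 > best_f1:
            best_th, best_f1 = th, f1
    return best_th, best_f1

target = np.zeros((64, 64), dtype=np.float32)
target[20:40, 20:40] = 1.0
probs = target * 0.35 + 0.01  # weak but correctly-located predictions
th, f1 = sweep_threshold(probs, target)
```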
Performance summary (from validation logs)
- Early stopping on F1; 39 epochs total
- Best F1 (EMA): 0.510 @ th=0.36
- Final epoch (38): train loss=0.4530, val loss=0.6248, mIoU=0.413, F1=0.513
| Split | Best-th | mIoU | F1 | AMP | EMA |
|---|---|---|---|---|---|
| Val | 0.36 | 0.413 | 0.513 | ✓ | ✓ |
- Common seed: 42 (prevents split leakage; pins logs/checkpoints)
- Validation and checkpointing use the EMA weights