Skip to content

Conversation

@Raytesnel
Copy link
Collaborator

we have a new endpoint.
in this endpoint a fileresponse is send to retrieve an image / scan file

next to this endpoint the previous endpoint process_scan is updated to return a dict with possible downloads(endpoints).

to make sure the pc doesn't fill with unessary files, the tmp folder is made when starting up the fastapi. and removing the folder when closing fastapi.

@Raytesnel Raytesnel self-assigned this Dec 11, 2025
@Raytesnel Raytesnel added the api code related to api section of the project label Dec 11, 2025
Raytesnel and others added 4 commits December 15, 2025 09:24
@github-actions
Copy link

Diff Coverage

Diff: origin/main..HEAD, staged and unstaged changes

  • packages/scratch-core/src/image_generation/data_formats.py (100%)
  • src/constants.py (100%)
  • src/dependencies.py (100%)
  • src/preprocessors/router.py (93.9%): Missing lines 125-126
  • src/preprocessors/schemas.py (75.0%): Missing lines 27-29

Summary

  • Total: 52 lines
  • Missing: 5 lines
  • Coverage: 90%

src/preprocessors/router.py

Lines 121-130

  121     if not file_to_retrieve.exists():
  122         logger.error(f"File {file_name} not found in temp dir.")
  123         raise HTTPException(status_code=404, detail=f"File {file_name} not found in temp dir.")
  124     if not file_name.endswith((".png", ".x3p")):
! 125         logger.error(f"Unsupported file type requested, file_name:{file_name}")
! 126         raise HTTPException(status_code=400, detail="Unsupported file type requested.")
  127     logger.debug(f"Returning file from path:{file_to_retrieve}")
  128     return FileResponse(
  129         path=file_to_retrieve,
  130         media_type="image/png" if file_name.endswith(".png") else "application/octet-stream",

src/preprocessors/schemas.py

Lines 23-33

  23     @field_validator("root", mode="after")
  24     @classmethod
  25     def validate_file_extension(cls, scan_file: FilePath) -> FilePath:
  26         """Validate given file is off a supported type."""
! 27         if scan_file.suffix[1:] != "png":
! 28             raise ValueError(f"unsupported extension: {scan_file.name}")
! 29         return scan_file
  30 
  31 
  32 class UploadScan(BaseModelConfig):
  33     scan_file: FilePath = Field(

@github-actions
Copy link

Code Coverage

Package Line Rate Branch Rate Health
. 100% 100%
comparators 100% 100%
conversion 100% 100%
image_generation 95% 75%
parsers 91% 50%
parsers.patches 91% 62%
preprocessors 95% 70%
processors 100% 100%
utils 80% 100%
Summary 94% (375 / 398) 71% (24 / 34)

Minimum allowed line rate is 50%

},
)
async def process_scan(upload_scan: UploadScan) -> ProcessedDataLocation:
async def process_scan(upload_scan: UploadScan, temp_dir: Path = Depends(get_tmp_dir)) -> ProcessedDataLocation:
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is temp_dir a parameter of this endpoint? It should be retrieved from the context inside this function.

(Same question for the endpoint(s) below)

surface_image_path = upload_scan.output_dir / "surface_map.png"
preview_image_path = upload_scan.output_dir / "preview.png"
scan_file_path = upload_scan.output_dir / "scan.x3p"
token = str(uuid4())
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually use the uuid4().hex for this (which is a str)

scan_file_path = upload_scan.output_dir / "scan.x3p"
token = str(uuid4())
output_dir = temp_dir / token
output_dir.mkdir(parents=True, exist_ok=True)
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I would not combine generating a token with exist_ok=True, because that should not give the same name (and if it does I would like it to crash so that we can investigate why that happened and not accidentally mix up cached files from different runs)

base_image_url = f"{BASE_URL}/preprocessor/file/{token}"
logger.debug(f"Processing scan file to working dir:{temp_dir}")
logger.debug(f"Processing scan file:{upload_scan.scan_file}")
scan_file_path = output_dir / "scan.x3p"
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "scan.x3p" is now occurring in multiple places, might be a good idea to make it a constant

logger.debug(f"Processing scan file to working dir:{temp_dir}")
logger.debug(f"Processing scan file:{upload_scan.scan_file}")
scan_file_path = output_dir / "scan.x3p"
parsed_scan = load_scan_image(upload_scan.scan_file).subsample(step_x=1, step_y=1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is .subsample(step_x=1, step_y=1) called here? That doesn't make sense IMO

:returns: FileResponse containing the requested image.
"""
logger.debug(f"Fetching image from temp dir with token:{token}, file_name:{file_name}")
file_to_retrieve = temp_dir / token / file_name
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a safer alternative? In Flask you should never do it like this because it allows for accessing your drive with tricks like filename="../../../../some_file_in_your_root_folder" . For this reason, in Flask there exists a special function send_from_directory() to securely expose files in a directory. I am assuming FastAPI has something similar

assert response.json()["detail"] == f"File {wrong_file_name} not found in temp dir."


def test_get_image_returns_404_for_wrong_token(
Copy link
Collaborator

@cfs-data cfs-data Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are insufficient. You should also test what happens on input with more malicious intentions, not just wrong tokens or incorrect file names

Copy link
Collaborator

@cfs-data cfs-data left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start! Some small improvements needed

404: {"description": "File/dir not found"},
},
)
async def get_file(token: str, file_name: str, temp_dir: Path = Depends(get_tmp_dir)) -> FileResponse:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input arguments should be sanitized

"""
logger.debug(f"Fetching image from temp dir with token:{token}, file_name:{file_name}")
file_to_retrieve = temp_dir / token / file_name
if not file_to_retrieve.parent.is_dir():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this check? I think if not file.exists(): ... should be sufficient


@preprocessor_route.get(
path="/file/{token}/{file_name}",
summary="Giving an path returns a image.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
summary="Giving an path returns a image.",
summary="Returns a file from a temporary directory.",

path="/file/{token}/{file_name}",
summary="Giving an path returns a image.",
description="""
given some file path returns the image located at the path.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
given some file path returns the image located at the path.
Returns a file from a temporary directory.

404: {"description": "File/dir not found"},
},
)
async def get_file(token: str, file_name: str, temp_dir: Path = Depends(get_tmp_dir)) -> FileResponse:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async def get_file(token: str, file_name: str, temp_dir: Path = Depends(get_tmp_dir)) -> FileResponse:
async def get_file_from_temp_dir(token: str, file_name: str, temp_dir: Path = Depends(get_tmp_dir)) -> FileResponse:



@pytest.mark.integration
def test_proces_scan_overwrites_files(client: TestClient, tmp_dir_api: Path, monkeypatch: pytest.MonkeyPatch) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to overwrite files?

)


@preprocessor_route.get(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this endpoint should be a preprocessor? Because we will also use it to return files from processor api's

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api code related to api section of the project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants