Fix #41: Add explicit binary detection for PDF files#48

Closed
Dreamstick9 wants to merge 2 commits into
aboutcode-org:mainfrom
Dreamstick9:main

Conversation

@Dreamstick9

Some PDF files were incorrectly identified as text because they start with a text header (%PDF-).
This change adds a check in src/typecode/contenttype.py: if a file is initially detected as text, it now looks for a %PDF- signature and marks the file as binary if one is found.
I also added a regression test in tests/test_testcontenttype.py to cover this case.
Fixes #41
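
For illustration, here is a minimal sketch of the kind of check described above. It is not the code from this PR or from typecode: the names `has_pdf_signature` and `is_text_after_pdf_check` and the 1024-byte probe size are hypothetical, and the sketch works on raw bytes rather than on typecode's own type objects.

```python
# Hypothetical sketch, not the actual typecode implementation.
PDF_SIGNATURE = b"%PDF-"

# Assumption: only a small prefix of the file is inspected.
PROBE_SIZE = 1024


def has_pdf_signature(head: bytes, probe_size: int = PROBE_SIZE) -> bool:
    """Return True if the PDF signature appears in the first probe_size bytes."""
    return PDF_SIGNATURE in head[:probe_size]


def is_text_after_pdf_check(head: bytes, detected_as_text: bool) -> bool:
    """Refine an earlier text/binary decision.

    If the content was initially detected as text but carries a PDF
    signature, report it as not text (binary). Otherwise keep the
    earlier decision unchanged.
    """
    if detected_as_text and has_pdf_signature(head):
        return False
    return detected_as_text


if __name__ == "__main__":
    sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    print(is_text_after_pdf_check(sample, detected_as_text=True))
    print(is_text_after_pdf_check(b"plain text\n", detected_as_text=True))
```

In practice the `head` bytes would come from opening the file in binary mode (`"rb"`) and reading a short prefix. The check only runs when the first pass says "text", so files already detected as binary are left alone.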

Signed-off-by: Kushagar Garg <dreamstick909@gmail.com>
@Dreamstick9
Author

@AyanSinhaMahapatra If you have some free time, could you please review this PR?

Dreamstick9 closed this Apr 8, 2026

Development

Successfully merging this pull request may close these issues.

PDF file detected as non-binary

1 participant