File type of *.mo file contains content #43

@stefan6419846

I am using scancode.api.get_file_info(path) with typecode-libmagic installed to retrieve the file type of files. For *.mo files, this returns strange results:

GNU message catalog (little endian), revision 0.0, 349 messages, Project-Id-Version: django '%(datetime)s non se puido interpretar na zona hora horaria %(current_timezone)s; pode ser ambiguo ou non existir.'
GNU message catalog (little endian), revision 0.0, 349 messages, Project-Id-Version: django '%(datetime)s \343\201\257 %(current_timezone)s \343\201\256\343\202\277\343\202\244\343\203\240\343\202\276\343\203\274\343\203\263\343\201\247\343\201\257\350\247\243\351\207\210\343\201\247\343\201\215\343\201\276\343\201\233\343\202\223\343\201\247\343\201\227\343\201\237\343\200\202\343\201\235\343\202\214\343\201\257\346\233\226\346\230\247\343\201\247\343\201\202\343\202\213\343\201\213\343\200'

It seems that actual file content is leaking into the reported file type here.
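
Until this is addressed, a caller could trim the description on their side. The following is only a minimal sketch of such a workaround: `trim_mo_file_type` is a hypothetical helper, not part of scancode or typecode, and it assumes the stable part of the description always ends right before `, Project-Id-Version:`, as in the two outputs above. The sample string is abbreviated from the first output.

```python
# Hypothetical helper, not part of scancode or typecode.
# Assumption: everything from ", Project-Id-Version:" onwards is leaked
# catalog content, based only on the two outputs quoted in this issue.
MO_PREFIX = "GNU message catalog"
LEAK_MARKER = ", Project-Id-Version:"


def trim_mo_file_type(file_type: str) -> str:
    """Return the file type description without the leaked catalog content."""
    if not file_type.startswith(MO_PREFIX):
        # Leave descriptions of other file types untouched.
        return file_type
    head, _sep, _tail = file_type.partition(LEAK_MARKER)
    return head


# Abbreviated copy of the first output reported above.
reported = (
    "GNU message catalog (little endian), revision 0.0, 349 messages, "
    "Project-Id-Version: django '%(datetime)s non se puido interpretar'"
)
print(trim_mo_file_type(reported))
```

This only hides the symptom for callers; it does not change what libmagic reports for the file.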

Offending source files:
