Results of VLMs on BabyVision-Gen #6

@agoyang

Some VLMs can now complete BabyVision-Gen tasks through actions such as writing code. Have you tested, or do you plan to test, the performance of non-image-generation VLMs on BabyVision-Gen?

For example, GPT-5.4 Thinking solving a maze:
https://chatgpt.com/share/69ae8b45-746c-8003-9b57-b8be05412d2a
