refcoco

Here are 2 public repositories matching this topic...

halim-cv / PromptSeg-Lightweight

A modular multimodal framework that generates object masks from text prompts. It uses a lightweight cross-modal decoder to fuse features from interchangeable vision and language backbones.

pytorch semantic-segmentation multimodal vision-transformer vision-language-model refcoco

Updated Mar 9, 2026
Jupyter Notebook

amanaser / grounded-semantics

Referring Expression Comprehension: grounding natural language in visual scenes using the Words-as-Classifiers approach.

nlp machine-learning computer-vision deep-learning clip vgg19 referring-expressions vision-language multimodal-ai grounded-semantics refcoco

Updated Feb 7, 2026
Python
