An open source implementation of CLIP.
Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Diffusion Classifier leverages pretrained diffusion models to perform zero-shot classification without additional training
[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"
Cybertron: the home planet of the Transformers in Go
Official code for “OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding”
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
PyTorch code for MUST
Multi-Aspect Vision Language Pretraining - CVPR2024
Unofficial (Golang) Go bindings for the Hugging Face Inference API
Official PyTorch Implementation of MSDN (CVPR'22)
[TPAMI 2023] Generative Multi-Label Zero-Shot Learning
[ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"
[ICASSP 2025] Open-source code for the paper "Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification"
Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KNN accuracy.
Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.
Alternate implementation for zero-shot text classification: instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models to do ZSC. Hence, it can be lightweight and support more languages without trading off accuracy. (Super simple, a 10th-grader could totally write this, but since no 10th-grader did, I did.) - Prithivi Da
Code for the experiments in our EMNLP 2021 paper "Open Aspect Target Sentiment Classification with Natural Language Prompts"
Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts