VideoCon: Robust Video-Language Alignment via Contrast Captions
Paper: https://arxiv.org/abs/2311.10111
See GitHub for usage: https://github.com/Hritikbansal/videocon#custom-inference