LAVIS is a Python deep learning library for LAnguage-and-VISion intelligence research and applications. This library aims to provide engineers and researchers with a one-stop solution to rapidly ...
A plug-and-play module that enables off-the-shelf use of Large Language Models (LLMs) for visual question answering (VQA). Img2Prompt-VQA surpasses Flamingo on zero-shot VQA on VQAv2 (61.9 vs 56.3), ...