Visual Question Answering (VQA) typically takes a visual input, such as an image or a video, together with a natural-language question about that input, and generates a natural-language answer as output.