عجفت الغور

Visual question answering (VQA)

  • ml
  • VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
    • covers the basic ideas of co-attention, and how humans tend to look at the center of the image and the center of the text
    • VQA models attend to the image and the text separately; how much attention each one receives is not actually shown
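
To make the co-attention idea in the notes above concrete, here is a minimal NumPy sketch. It is a simplified illustration, not the model from the VQA-MHUG paper: the function name `co_attention`, the feature shapes, the scaled dot-product affinity, and the max-pooling step are all assumptions chosen for brevity. It builds one affinity matrix between image regions and question words, then derives a separate attention distribution for each modality from it.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def co_attention(image_feats, text_feats):
    """Toy co-attention (hypothetical helper, not a library API).

    image_feats: (R, d) array, one row per image region.
    text_feats:  (T, d) array, one row per question word.
    Returns attention weights over regions (R,) and over words (T,).
    """
    d = image_feats.shape[1]
    # Affinity between every region and every word, scaled by sqrt(d).
    affinity = image_feats @ text_feats.T / np.sqrt(d)  # shape (R, T)
    # Each region is scored by its best-matching word, and vice versa.
    image_attn = softmax(affinity.max(axis=1))  # shape (R,)
    text_attn = softmax(affinity.max(axis=0))   # shape (T,)
    return image_attn, text_attn


# Random features stand in for real image-region and word embeddings.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(6, 8))  # 6 regions, 8-dim features
text_feats = rng.normal(size=(4, 8))   # 4 words, 8-dim features
image_attn, text_attn = co_attention(image_feats, text_feats)
print(image_attn.shape, text_attn.shape)
```

Each returned vector is a probability distribution, so it could in principle be compared against a normalized human gaze map over the same regions or words.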