Author(s)
Term
4. term
Education
Publication year
2018
Submitted on
2018-06-15
Pages
28 pages
Abstract
Visual Question Answering (VQA) is an interesting research problem because it sits at the intersection of Computer Vision and Natural Language Processing (NLP). Many recent methods focus on improving features, attention mechanisms, and hyper-parameter tuning. Most approaches model the problem with a fixed-size classifier over the answers. We propose a Pointer-CNN classifier for multiple-choice VQA, which achieves state-of-the-art performance on VQA v1.0 and reasonable performance on the Visual7W data set. We analyze and discuss the model's performance on the different question categories of VQA v1.0 to identify the shortcomings of our architecture.
Colophon: This page is part of the AAU Student Projects portal, which is run by Aalborg University. Here, you can find and download publicly available bachelor's theses and master's projects from across the university dating from 2008 onwards. Student projects from before 2008 are available in printed form at Aalborg University Library.