Question Generator - Integrating Visual and Dialog Components

10 Oct 2021

In this blog post, I discuss my understanding of the paper "Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat".

Authors: Shekhar et al.
Published: NAACL 2019
Codebase Repo: Visually-Grounded Dialogue State Encoder

The joint architecture proposed in the paper is given below.
[Figure: Question Generator Model]

Unlike previous models, both the guesser and the question generator receive the same grounded dialogue state as input. Features from the second-to-last layer of ResNet-152 are used in place of VGG features to get a better visual representation. The question generator is optimized with negative log-likelihood. With multi-task learning, optimizing the question generation module together with the success rate of the guesser module yields an effective encoding of the input.
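To make the shared-state idea concrete for myself, here is a minimal pure-Python sketch, not the authors' implementation. Two toy linear heads read the same dialogue state vector; the question generator head is scored with negative log-likelihood and the guesser head with cross-entropy. The names `linear`, `qgen_nll`, `guesser_loss` and `joint_loss`, the cross-entropy loss for the guesser, and the weighted sum with `qgen_weight` and `guesser_weight` are all my assumptions for illustration. This post does not describe how the paper actually combines the two objectives.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def qgen_nll(step_logits, target_tokens):
    """Negative log-likelihood of one question: sum of -log p(word) over its words."""
    return -sum(math.log(softmax(logits)[tok])
                for logits, tok in zip(step_logits, target_tokens))

def guesser_loss(object_scores, target_object):
    """Cross-entropy over candidate objects (assumed loss, see the note above)."""
    return -math.log(softmax(object_scores)[target_object])

def joint_loss(state, qgen_head, guesser_head, next_word, target_object,
               qgen_weight=1.0, guesser_weight=1.0):
    """Both heads read the SAME dialogue state; the weights are hypothetical."""
    word_logits = linear(qgen_head, state)       # scores over a toy vocabulary
    object_scores = linear(guesser_head, state)  # scores over candidate objects
    return (qgen_weight * qgen_nll([word_logits], [next_word])
            + guesser_weight * guesser_loss(object_scores, target_object))

# Toy example: 2-dimensional state, 3-word vocabulary, 2 candidate objects.
state = [0.5, -1.0]
qgen_head = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
guesser_head = [[1.0, 0.0], [0.0, 1.0]]
print(joint_loss(state, qgen_head, guesser_head, next_word=0, target_object=0))
```

Because both losses are computed from the same `state`, gradients from either task would shape the shared encoding in a real trainable model, which is the intuition behind the multi-task setup.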

Dialogue Strategy:

An analysis of the percentage of questions per question type is given below.
[Figure: Question Generator Model Results]
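A per-type breakdown like this can be computed once every generated question has a type label. The sketch below assumes the labels are already available as a list of strings. The function name `question_type_percentages` and the example labels are hypothetical and are not taken from the paper's results.

```python
from collections import Counter

def question_type_percentages(labels):
    """Return {question_type: percentage of questions} for a list of type labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # no questions, so no percentages to report
    return {qtype: 100.0 * n / total for qtype, n in counts.items()}

# Hypothetical labels for five generated questions (toy data).
labels = ["entity", "color", "entity", "location", "color"]
print(question_type_percentages(labels))
```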

Authors’ observations:

Food for thought:

#visual #dialog #questioner #guesser #multitask #beyond #task #success #guesswhat #task-oriented