Question Generator - Visual Dialogue State Tracking

10 Oct 2021

In this blog post, I discuss my understanding of the paper Visual Dialogue State Tracking for Question Generation.

Authors: Pang et al.
Published: AAAI 2020
GitHub repo: Visual Dialogue

The VDST network is shown below for reference:
VDST Model

In prior work, the representation (encoded vector) of the image stays unchanged throughout the dialogue, because the whole image is used at every round. This paper proposes changing the representation dynamically as the attention over the objects changes during the dialogue. At the beginning of the game, the representations of all objects are used. As the game progresses, the attention probabilities over the objects keep changing, so after a few rounds the attention on the target object becomes higher. As the attention shifts to new objects or features, QGen tends to ask new questions, which also helps reduce repeated questions. This in turn reduces the memory needed to represent visually rich scenes.
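To make the idea concrete for myself, here is a minimal Python sketch of an attention-weighted visual state that changes from round to round. It is only an illustration of the description above, not the paper's actual equations or code: the function names (`visual_state`, `update_attention`), the toy two-dimensional object features, and the per-round match scores are all assumptions I made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def visual_state(attention, object_feats):
    """Attention-weighted sum of object features (hypothetical helper)."""
    dim = len(object_feats[0])
    return [
        sum(a * feat[d] for a, feat in zip(attention, object_feats))
        for d in range(dim)
    ]


def update_attention(attention, match_scores):
    """Re-weight the previous attention by how well each object matches
    the latest question-answer pair, then renormalise.
    This update rule is an assumption chosen for illustration."""
    logits = [math.log(a) + s for a, s in zip(attention, match_scores)]
    return softmax(logits)


# Toy scene: three objects with made-up 2-d features.
object_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Round 0: every object is attended to equally.
attention = [1.0 / len(object_feats)] * len(object_feats)
states = [visual_state(attention, object_feats)]

# Made-up match scores for two rounds; object 2 plays the target.
round_scores = [[0.0, 1.0, 1.0], [-1.0, 0.0, 2.0]]
for scores in round_scores:
    attention = update_attention(attention, scores)
    states.append(visual_state(attention, object_feats))

for round_idx, state in enumerate(states):
    print(round_idx, [round(x, 3) for x in state])
```

In this toy run the visual state starts as the plain average of all objects and then drifts towards the features of the object that keeps matching the answers. That is the behaviour described above: the question generator sees a different representation at every round instead of the same whole-image vector.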

The change in the visual representation at each round of the conversation is shown in the image below:
VDST Representation

Author’s observations:

Food for thought:

#visual #dialog #state #tracking #vdst #guesswhat #task-oriented