Question Generator - Entity-based Strategy Learning and Augmented Guesser

One of the problems discussed in previous blog posts was that the Question Generator is not aware of the objects in the image. Hence the questions start with categories, and the dialog flows if a ‘Yes’ is encountered along the line. Also, during the trials, it was found that humans usually start with entity-based questions. How can we equip the question generator with entity information?

Today’s paper, Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser, deals with the same issue. Firstly, image captions are used to retrieve related entities for the target image. Secondly, at each round of the dialogue, the model selects which entity to ask about based on the dialogue condition; an entity selector module is introduced to pick the entity to focus on in that round. In addition, the guesser is improved over previous models. The approach uses RL and, unlike previous approaches, no penalization is used.
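To make the two steps above concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the paper learns its entity selector with RL, whereas this sketch assumes a fixed score per entity. The names ENTITY_VOCAB, retrieve_entities, select_entity and run_dialog, as well as the scores, are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical entity vocabulary (an assumption, not taken from the paper).
ENTITY_VOCAB: Set[str] = {"dog", "frisbee", "grass", "man", "park"}


def retrieve_entities(captions: List[str]) -> List[str]:
    """Step 1 (toy version): collect vocabulary words found in the captions,
    in first-seen order and without duplicates."""
    found: List[str] = []
    for caption in captions:
        for token in caption.lower().split():
            word = token.strip(".,!?")
            if word in ENTITY_VOCAB and word not in found:
                found.append(word)
    return found


def select_entity(
    candidates: List[str], scores: Dict[str, float], asked: Set[str]
) -> Optional[str]:
    """Step 2 (toy version): pick the highest-scoring entity not asked yet.
    A learned selector would condition on the dialogue state instead."""
    remaining = [e for e in candidates if e not in asked]
    if not remaining:
        return None
    return max(remaining, key=lambda e: scores.get(e, 0.0))


def run_dialog(
    captions: List[str], scores: Dict[str, float], rounds: int
) -> List[str]:
    """Return the entity chosen as the question focus in each round."""
    candidates = retrieve_entities(captions)
    asked: Set[str] = set()
    order: List[str] = []
    for _ in range(rounds):
        entity = select_entity(candidates, scores, asked)
        if entity is None:
            break  # every candidate entity has already been asked about
        asked.add(entity)
        order.append(entity)
    return order


if __name__ == "__main__":
    captions = ["A man throws a frisbee to a dog.", "A dog runs on the grass."]
    scores = {"dog": 0.9, "frisbee": 0.7, "man": 0.4, "grass": 0.2}
    print(run_dialog(captions, scores, rounds=3))
```

Tracking the already-asked entities is what keeps this toy loop from repeating a focus; in the paper the selection depends on the dialogue condition rather than on a static score table.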

Authors: Zheng et al.
Published: Findings of EMNLP 2021
Codebase: Watch here

Core Idea:

A related-entity-enhanced questioner that follows an entity-based questioning strategy
An improved guesser
SOTA results with an RL approach on the GuessWhich dataset

The proposed architecture and the results are given below

[Figures: questioner architecture (questioner_arch); results]

Authors' observations:

The model is able to constantly generate visually related and informative dialogs
Repetitive questions are reduced
Additional time cost is introduced by entity estimation and by generating entity-guided questions

Food for thought:

The model was proposed for the GuessWhich game. The same approach could also be tested on the GuessWhat game
In GuessWhat, the Question Generator mostly starts with categorical questions, which are usually entities. It needs to be evaluated whether using entities gives any improvement over categorical questions

#visual #dialog #entity #guided #augmented #guesser #guesswhat #task-oriented