Semantic Expansion of Auto-Generated Scene Descriptions to Solve Robotic Tasks

Marco A. Gutiérrez 1 and Rafael E. Banchs 2
1. RoboLab, University of Extremadura, Cáceres, Spain
2. HLT Dept., I2R, A*STAR, Singapore
Abstract—When a robot faces an object-description-based task, such as “bring me something to drink water”, it has to semantically relate the concepts in the task with the objects it is able to find. This work expands the semantic scope of the words in automatically generated scene descriptions and in a given task in order to find a proper match for the robot's task. An encoder-decoder pipeline that unifies joint image-text embedding models with multimodal neural language models is used to generate scene descriptions. The semantics of those descriptions are then extended through word vectors. We improve on our previous work by expanding the dimension of the object description, adding the option of negating characteristics of the searched object. Finally, we show that we are able to find objects that are present in the scene but were not directly referred to in the task, or that the robot labeled with different words.
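The matching step described in the abstract can be sketched as follows: task words and scene-object labels are compared in a word-vector space via cosine similarity, and a negated characteristic lowers the score of objects similar to it. This is a minimal illustrative sketch, not the authors' implementation; the toy three-dimensional vectors and the `best_match` function are hypothetical stand-ins for pretrained embeddings (e.g. word2vec-style vectors).

```python
import math

# Hypothetical toy word vectors standing in for pretrained embeddings.
VECTORS = {
    "drink":  [0.9, 0.1, 0.0],
    "cup":    [0.8, 0.2, 0.1],
    "bottle": [0.7, 0.3, 0.0],
    "book":   [0.0, 0.9, 0.2],
    "broken": [0.1, 0.0, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def best_match(task_word, scene_labels, negated=None):
    """Pick the scene object semantically closest to the task word,
    penalizing similarity to a negated characteristic if one is given."""
    def score(label):
        s = cosine(VECTORS[task_word], VECTORS[label])
        if negated is not None:
            s -= cosine(VECTORS[negated], VECTORS[label])
        return s
    return max(scene_labels, key=score)

# A task word like "drink" matches the label "cup" even though
# neither word appears in the other's description.
print(best_match("drink", ["cup", "book"]))  # -> cup
```

In the actual system the vectors come from a large embedding model and the scene labels come from the auto-generated descriptions, but the selection principle (maximum expanded semantic similarity, with negated characteristics subtracted) is the same.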

Index Terms—object search, semantics, deep neural networks, robotics vision

Cite: Marco A. Gutiérrez and Rafael E. Banchs, "Semantic Expansion of Auto-Generated Scene Descriptions to Solve Robotic Tasks," International Journal of Mechanical Engineering and Robotics Research, Vol. 5, No. 2, pp. 109-114, April 2016. DOI: 10.18178/ijmerr.5.2.109-114
Copyright © 2016-2017 International Journal of Mechanical Engineering and Robotics Research, All Rights Reserved
E-mail: ijmerr@ejournal.net