Hybrid Learning for Vision-and-Language Navigation Agents


KIPS Transactions on Software and Data Engineering, Vol. 9, No. 9, pp. 281-290, Sep. 2020
https://doi.org/10.3745/KTSDE.2020.9.9.281
Keywords: Vision-and-Language Navigation, Hybrid Learning, Path-Based Reward Function
Abstract

The Vision-and-Language Navigation (VLN) task is a complex intelligence problem that requires both visual and language comprehension skills. In this paper, we propose a new learning model for vision-and-language navigation agents. The model adopts hybrid learning, which combines imitation learning based on demonstration data with reinforcement learning based on action rewards. It can therefore address both the weakness of imitation learning, which can be biased toward the demonstration data, and that of reinforcement learning, which has relatively low data efficiency. In addition, the proposed model uses a novel path-based reward function designed to overcome the limitations of existing goal-based reward functions. We demonstrate the high performance of the proposed model through various experiments using the Matterport3D simulation environment and the R2R benchmark dataset.
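The two ideas in the abstract can be illustrated with a minimal sketch. The function and variable names below (`path_based_reward`, `hybrid_loss`, the mixing weight `lam`) are hypothetical, and the exact reward formulation is an assumption: it rewards progress toward the nearest point on the expert's reference path, whereas a goal-based reward measures progress toward the final goal position only.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2D positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def goal_based_reward(prev_pos, new_pos, goal):
    # Conventional shaping: reward equals the reduction in
    # distance to the final goal position.
    return dist(prev_pos, goal) - dist(new_pos, goal)

def path_based_reward(prev_pos, new_pos, reference_path):
    # Hypothetical path-based variant: reward equals the reduction in
    # distance to the nearest point on the demonstrated reference path,
    # encouraging the agent to follow the instructed trajectory rather
    # than merely approach the goal.
    d_prev = min(dist(prev_pos, p) for p in reference_path)
    d_new = min(dist(new_pos, p) for p in reference_path)
    return d_prev - d_new

def hybrid_loss(il_loss, rl_loss, lam=0.5):
    # Hybrid learning as a weighted mixture of the imitation-learning
    # objective (on demo data) and the reinforcement-learning objective
    # (on action rewards); lam is an assumed hyperparameter.
    return lam * il_loss + (1.0 - lam) * rl_loss
```

For example, a step from (2, 2) to (1, 1) toward a reference path along the x-axis yields a positive path-based reward even if it does not shorten the straight-line distance to the goal by as much.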



Cite this article
[IEEE Style]
S. Oh and I. Kim, "Hybrid Learning for Vision-and-Language Navigation Agents," KIPS Transactions on Software and Data Engineering, vol. 9, no. 9, pp. 281-290, 2020. DOI: https://doi.org/10.3745/KTSDE.2020.9.9.281.

[ACM Style]
Suntaek Oh and Incheol Kim. 2020. Hybrid Learning for Vision-and-Language Navigation Agents. KIPS Transactions on Software and Data Engineering, 9, 9, (2020), 281-290. DOI: https://doi.org/10.3745/KTSDE.2020.9.9.281.