The Touchdown Dataset

Touchdown is a corpus for executing navigation instructions and resolving spatial descriptions in real-world visual environments. The task is to follow instructions to a goal position and find a hidden object there, Touchdown the bear.

The details of the corpus and task are described in: Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments. Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, and Yoav Artzi.

Paper: https://arxiv.org/abs/1811.12354

Data

This repository contains the Touchdown corpus. The navigation environment consists of a large number of panoramas. To download the panoramas, please use the StreetLearn environment: you can request access to the panoramic images by filling out the form on the StreetLearn dataset page, where more details are available.

Starting example

The starting example runs a random policy with dummy image features in the environment:

python3 navigator.py
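The rough shape of such a random-policy rollout is sketched below. The environment interface shown here (the reset/step method names and the action set) is hypothetical and only illustrates the idea; the actual entry point is navigator.py.

import random

ACTIONS = ["forward", "left", "right", "stop"]   # hypothetical action set

def random_rollout(env, max_steps=100):
    """Take uniformly random actions until 'stop' or a step limit is hit."""
    env.reset()                       # assumed method: start a new episode
    for _ in range(max_steps):
        action = random.choice(ACTIONS)
        env.step(action)              # assumed method: move in the panorama graph
        if action == "stop":
            break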

Directory structure

Graph

The script graph_loader.py loads the graph from two files, and base_navigator.py uses the loaded graph for initialization.
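As a rough illustration of what such loading involves, the sketch below reads a node file and a link file into an adjacency structure. The file names, column order, and delimiter are assumptions, not the actual graph_loader.py format.

from collections import defaultdict

def load_graph(nodes_path, links_path):
    """Sketch of loading a panorama graph; the file format here is assumed."""
    nodes = {}                   # pano_id -> (latitude, longitude)
    edges = defaultdict(list)    # pano_id -> list of (heading, neighbor_pano_id)

    with open(nodes_path) as f:
        for line in f:
            pano_id, lat, lng = line.strip().split(",")[:3]   # assumed columns
            nodes[pano_id] = (float(lat), float(lng))

    with open(links_path) as f:
        for line in f:
            src, heading, dst = line.strip().split(",")[:3]   # assumed columns
            edges[src].append((float(heading), dst))

    return nodes, edges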

JSON files

The JSON files contain data for both the navigation task and the SDR task. All three files follow the same structure, described below.
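A minimal way to inspect a split, assuming the data is stored as one JSON object per line (the file name below is a placeholder; check the repository for the actual split names):

import json

def load_examples(path):
    """Read one split, assuming one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f]

examples = load_examples("train.json")   # placeholder file name
print(len(examples), "examples")
print(sorted(examples[0].keys()))        # inspect the available fields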

Route information

Navigation task

Spatial Description Resolution (SDR) task

You can construct your Gaussian-smoothed target from the *_center click positions, or contact us for cached targets.
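A minimal sketch of building such a target, assuming the click position is given as (x, y) pixel coordinates on the target image; the image size and smoothing width below are placeholder values, not the settings used in the paper.

import numpy as np

def gaussian_target(center_x, center_y, height, width, sigma=25.0):
    """Build a 2D Gaussian heatmap centered on a click position."""
    ys = np.arange(height)[:, None]
    xs = np.arange(width)[None, :]
    dist_sq = (xs - center_x) ** 2 + (ys - center_y) ** 2
    target = np.exp(-dist_sq / (2.0 * sigma ** 2))
    return target / target.sum()     # normalize to a probability map

target = gaussian_target(center_x=1200, center_y=400, height=800, width=3200)  # placeholder sizes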

Experiment reproduction code

The Touchdown tasks are reproduced by Mehta et al. (2020). For more details, please refer to the technical report and the VALAN codebase.

License

The Touchdown Dataset (c) 2018

The Touchdown Dataset is licensed under a Creative Commons Attribution 4.0 International License.

You should have received a copy of the license along with this work. If not, see http://creativecommons.org/licenses/by/4.0/.