VLN-CE Dataset


The dataset used in VLN-CE is the Room-to-Room (R2R) dataset (Anderson et al., 2018), ported from the Matterport3D Simulator to the Habitat Simulator.

R2R_VLNCE_v1-3

R2R_VLNCE_v1-3.zip (3 MB)
  • train: 10819 episodes, 61 scenes
  • val_seen: 778 episodes, 53 scenes
  • val_unseen: 1839 episodes, 11 scenes
  • test: 3408 episodes, 18 scenes

R2R_VLNCE_v1-3_preprocessed.zip (250 MB)
  • Adds an augmented split, envdrop: 146304 episodes, 60 scenes (ported from R2R-EnvDrop)
  • Contains embeddings.json.gz, which is extracted from a 50-dimensional GloVe file
  • Instruction tokens are mapped to match the provided embeddings
  • {split}_gt.json.gz contains ground-truth actions and step locations for each episode (excluding the test split)

Format of {split}.json.gz

{
    'episodes': [
        {
            'episode_id': 1,
            'trajectory_id': 4,
            'scene_id': 'mp3d/7y3sRwLe3Va/7y3sRwLe3Va.glb',
            'instruction': {
                'instruction_text': 'Go around the right side...',
                'instruction_tokens': [982, 141, 2202, ..., 0, 0, 0]
            },
            'start_position': [-16.267200469970703, 0.1518409252166748, 0.7207760214805603],
            'start_rotation': [0.0, 0.0007963267107332633, 0.0, 0.9999996829318346],
            'goals': [
                {
                    'position': [-12.337400436401367, 0.1518409252166748, 4.213699817657471],
                    'radius': 3.0
                }
            ],
            'reference_path': [
                [-16.267200469970703, 0.1518409252166748, 0.7207760214805603],
                [-16.284099578857422, 0.1518409252166748, 2.4123699665069580],
                ...
                [-13.907199859619140, 0.1518409252166748, 4.2282099723815920],
            ],
            'info': {'geodesic_distance': 6.425291538238525},
        },
        ...
    ],
    'instruction_vocab': {
        'word_list': [..., 'orchids', 'order', 'orient', ...],
        'word2idx_dict': {
            ...,
            'orchids': 1505,
            'order': 1506,
            'orient': 1507,
            ...
        },
        'itos': [..., 'orchids', 'order', 'orient', ...],
        'stoi': {
            ...,
            'orchids': 1505,
            'order': 1506,
            'orient': 1507,
            ...
        },
        'num_vocab': 2504,
        'UNK_INDEX': 1,
        'PAD_INDEX': 0,
    }
}
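
The layout above can be read with the Python standard library alone. The following is a minimal sketch, not code from the VLN-CE repository: the helper names load_split and decode_instruction are hypothetical, and the '<unk>' placeholder for indices missing from word2idx_dict is an assumption made for illustration.

```python
import gzip
import json


def load_split(path):
    """Read a gzip-compressed {split}.json.gz file into a Python dict."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def decode_instruction(episode, vocab):
    """Map instruction_tokens back to words, skipping PAD_INDEX padding.

    The index-to-word table is built by inverting word2idx_dict, so no
    assumption is made about the ordering of word_list.
    """
    idx2word = {idx: word for word, idx in vocab["word2idx_dict"].items()}
    pad = vocab["PAD_INDEX"]
    tokens = episode["instruction"]["instruction_tokens"]
    # '<unk>' is a hypothetical placeholder for indices not in the table.
    return [idx2word.get(t, "<unk>") for t in tokens if t != pad]
```

Given a local copy of a split file, calling load_split on its path and then decode_instruction(data['episodes'][0], data['instruction_vocab']) should return the tokenized instruction as a list of words, which can be compared against instruction_text.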

Format of {split}_gt.json.gz

{
    '1': {
        'actions': [2, 2, 2, ..., 0],
        'forward_steps': 27,
        'locations': [
            [-16.267200469970703, 0.1518409252166748, 0.7207760214805603],
            ...
            [-12.644463539123535, 0.1518409252166748, 4.2241311073303220]
        ]
    },
    ...
}
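
As a rough illustration of how the ground-truth file can be consumed, the sketch below looks up the entry for one episode and sums the straight-line distances between consecutive locations. The helper names are hypothetical. The summed length only approximates the distance traveled and is not the same quantity as the geodesic_distance field in {split}.json.gz.

```python
import gzip
import json
import math


def load_gt(path):
    """Read a gzip-compressed {split}_gt.json.gz file into a Python dict."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def gt_for_episode(gt, episode_id):
    """Return the ground-truth entry for an episode.

    Keys in the ground-truth file are strings ('1'), while episode_id in
    {split}.json.gz is shown as an integer, so the id is converted first.
    """
    return gt[str(episode_id)]


def path_length(locations):
    """Sum of Euclidean distances between consecutive [x, y, z] points."""
    return sum(math.dist(a, b) for a, b in zip(locations, locations[1:]))
```

For example, path_length(gt_for_episode(gt, 1)['locations']) gives the approximate length of the ground-truth trajectory for episode 1.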

Versions

R2R_VLNCE_v1-3 [Feb 3, 2022]

Download links are listed above. In versions v1-2 and earlier, initial episode headings did not match the headings in Room-to-Room (R2R). Release v1-3 modifies initial headings to match R2R. Associated changes include:
  • New initial episode headings
  • Recomputed ground-truth oracle navigation paths
  • Pruned 109 augmentation episodes (0.07%) from the envdrop split that are no longer navigable

Path evaluations on the leaderboard are unaffected by this change and previous submissions remain valid.

Effect on Agent Performance

We evaluated the published weights of the CMA_PM_DA baseline with the new dataset headings. In val_unseen, success rate (SR) increased from 29 to 30 and SPL increased from 27 to 28. We also trained and evaluated a CMA model (CMA_TF) with dataset versions v1-2 and v1-3. We found that v1-3 resulted in a 1-point increase in both SR and SPL. This is within the typical variance observed when repeating experiments.

R2R_VLNCE_v1-2

Adds the test split for inference. The test split does not have ground truth paths or goal locations. Test evaluations should be done on the new evaluation and leaderboard server.

R2R_VLNCE_v1-1

Changes over R2R_VLNCE_v1 include:
  • Makes the reference_path field consistently include both start and goal locations.
  • Makes all goal radii equal to 3.0m (the train split in R2R_VLNCE_v1_preprocessed showed 2.0m). The goal radius is unused in VLN-CE code.

R2R_VLNCE_v1

Original release of the R2R_VLNCE dataset.