Learning to Drive from a World on Rails
ICCV 2021, Oral
Abstract
We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach. A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory. To support learning from pre-recorded logs, we assume that the world is on rails, meaning neither the agent nor its actions influence the environment. This assumption greatly simplifies the learning problem, factorizing the dynamics into a nonreactive world model and a low-dimensional and compact forward model of the ego-vehicle. Our approach computes action-values for each training trajectory using a tabular dynamic-programming evaluation of the Bellman equations; these action-values in turn supervise the final vision-based driving policy. Despite the world-on-rails assumption, the final driving policy acts well in a dynamic and reactive world. It outperforms prior state-of-the-art by +25% on the challenging CARLA leaderboard while using 40x less data. It outperforms imitation learning as well as model-based and model-free reinforcement learning on the challenging CARLA NoCrash benchmark. It is also an order of magnitude more sample-efficient than state-of-the-art model-free reinforcement learning techniques on navigational tasks in the ProcGen benchmark.
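The dynamic-programming step described above can be illustrated with a small sketch. It is not the authors' implementation: the names `rails_action_values`, `ego_next`, `reward` and `obstacle` are hypothetical, the ego state is reduced to a single discretised position, and the reward is made up for illustration. Under the world-on-rails assumption, the logged world only enters through the time index `t`, so the Bellman backup runs backward over one recorded trajectory and needs a table only over ego states and actions.

```python
import numpy as np

def rails_action_values(ego_next, reward, gamma=1.0):
    """Backward Bellman evaluation over one recorded trajectory.

    ego_next[s, a]  : next ego-state index (hypothetical tabular forward model)
    reward[t, s, a] : reward at logged step t; the world state is fixed by t
    Returns Q[t, s, a].
    """
    T, S, A = reward.shape
    Q = np.zeros((T, S, A))
    v_next = np.zeros(S)  # value after the last logged step is taken as zero
    for t in reversed(range(T)):
        # Q_t(s, a) = r_t(s, a) + gamma * V_{t+1}(f(s, a))
        Q[t] = reward[t] + gamma * v_next[ego_next]
        v_next = Q[t].max(axis=1)  # V_t(s) = max_a Q_t(s, a)
    return Q

# Toy setup (all values are illustrative assumptions):
# 4 road cells, actions 0 = stay, 1 = move forward one cell.
S, A = 4, 2
ego_next = np.array([[s, min(s + 1, S - 1)] for s in range(S)])

# Logged world: the cell occupied by another agent at each step (-1 = none).
# It is replayed as recorded, regardless of what the ego-vehicle does.
obstacle = np.array([2, 2, -1])
T = len(obstacle)

# Reward: +1 for forward progress, -10 for entering the occupied cell.
progress = ego_next - np.arange(S)[:, None]
reward = np.stack([progress - 10 * (ego_next == obstacle[t]) for t in range(T)])

Q = rails_action_values(ego_next, reward, gamma=1.0)
print(Q[0, 1])  # action-values at step 0 for the ego-vehicle in cell 1
```

In this toy log, an ego-vehicle in cell 1 gets a higher value for waiting while cell 2 is occupied and for moving once it clears. In the approach described above, action-values of this kind are what supervise the vision-based policy.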
Citation
If you find our paper, code or dataset useful, please cite us as:
@inproceedings{chen2021learning,
title={Learning to drive from a world on rails},
author={Chen, Dian and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
booktitle={ICCV},
year={2021}
}
Summary
Video
The videos below show our visuomotor model driving. Adversarial scenarios spawn along the route, such as unexpected pedestrians and vehicles violating traffic lights at intersections. For each route, we show our model driving in both sunny daytime weather and rainy nighttime weather. (Try 2x speed if you think the videos are too slow!)
Day time | Night time |
---|---|
The videos below show our distilled image model, trained for the same task, running on unseen levels of the ProcGen navigation games.
Maze | Heist |
---|---|
Code and Data
Code and data for our CARLA experiments are available here.
We are cleaning up the ProcGen code. Stay tuned for the release!
Website Template
The template for this website has been adopted from Carl Doersch.