Reinforcement learning has achieved great success in fields as games or robotics. Despite the potential to apply it for autonomous driving, collecting data in the real world is expensive, and the instabilities of the method may lead to safety accidents.
A recent study addresses these problems by suggesting a novel actor-critic algorithm called Learn to drive with Virtual Memory.
It learns the virtual latent environment model from real interaction data. The virtual environment is then predicted, and imagined trajectories are recorded as the virtual memory. The policy is optimized without the