Project: #IITM-251101-194
Lap-time minimization for computer-vision-assisted autonomous Formula-E cars with reinforcement learning
The Raftar Formula E racing team at IIT Madras is integrating autonomy into its next-generation electric race car, the next challenge posed by the competition. The goal is ambitious: to design a race car capable of autonomously completing ten laps of an unknown racetrack in the shortest possible time, without crossing the track boundaries.
In the team's current approach, the autonomous vehicle uses a monocular camera to detect the track boundaries, computes the midpoints across the track's width, and constructs a trajectory by connecting these midpoints in sequence. While functional, this "map-identify-follow" loop repeats on every lap; because it is reactive and memoryless, it redoes the same computation lap after lap and yields slow lap times.
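For concreteness, the sketch below shows one minimal way such a centreline could be computed, assuming the perception pipeline already returns left and right boundary points in the car's frame; the point layout, spacing, and function name are illustrative, not the team's actual implementation.

```python
import numpy as np

def centreline(left_pts: np.ndarray, right_pts: np.ndarray) -> np.ndarray:
    """Pair each left-boundary point with its nearest right-boundary point
    and return their midpoints, ordered along the direction of travel."""
    mids = []
    for p in left_pts:
        q = right_pts[np.argmin(np.linalg.norm(right_pts - p, axis=1))]
        mids.append((p + q) / 2.0)
    mids = np.asarray(mids)
    return mids[np.argsort(mids[:, 0])]   # order by forward (x) coordinate

# Toy check: straight boundaries 3 m apart yield a centreline at y = 0.
left = np.column_stack([np.linspace(0.0, 20.0, 10), np.full(10, 1.5)])
right = np.column_stack([np.linspace(0.0, 20.0, 10), np.full(10, -1.5)])
print(centreline(left, right))
```

Connecting these midpoints gives the safe reference line the car currently follows; it says nothing about how fast each segment should be driven, which is exactly the gap the rest of this proposal addresses.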
Proper lap-time optimization in racing demands much more than merely following the track's centreline. It requires an optimal balance among path curvature, acceleration and braking control, and steering dynamics. A purely greedy strategy, such as always hugging the shortest inner line, may minimize distance travelled, but the tight corner radius it forces caps cornering speed and compromises the exit acceleration needed to exploit the straights. Moreover, traditional pretraining methods, such as learning from expert demonstrations or offline trajectory optimization on the actual circuit, are infeasible here: the race rules prohibit competitors from any prior exposure to the racetrack.
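The curvature-speed trade-off can be made concrete with a standard point-mass calculation: cornering speed is friction-limited to roughly sqrt(mu*g/|kappa|), and a forward/backward pass enforces finite acceleration and braking along the path. The snippet below is a minimal sketch under those assumptions; the parameter values (friction coefficient, acceleration limits, top speed) are placeholders, not measured vehicle data.

```python
import numpy as np

def speed_profile(kappa, ds, mu=1.2, g=9.81, a_acc=4.0, a_brk=6.0, v_cap=30.0):
    """kappa: path curvature at each sample [1/m]; ds: sample spacing [m]."""
    # Friction-limited cornering speed, capped by a top speed.
    v = np.minimum(v_cap, np.sqrt(mu * g / np.maximum(np.abs(kappa), 1e-6)))
    # Forward pass: speed cannot rise faster than the acceleration limit allows.
    for i in range(1, len(v)):
        v[i] = min(v[i], np.sqrt(v[i - 1] ** 2 + 2.0 * a_acc * ds))
    # Backward pass: the car must be able to brake in time for upcoming corners.
    for i in range(len(v) - 2, -1, -1):
        v[i] = min(v[i], np.sqrt(v[i + 1] ** 2 + 2.0 * a_brk * ds))
    return v

# Toy path: a straight feeding a tight corner; the braking point emerges naturally.
kappa = np.concatenate([np.zeros(50), np.full(20, 0.05)])
v = speed_profile(kappa, ds=2.0)
print(f"segment time ~ {np.sum(2.0 / v):.1f} s")
```

Even this simplified model shows why the fastest line is rarely the shortest one: widening a corner lowers its curvature, which raises the whole speed profile around it.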
This scenario turns the racing task into a real-time optimal control problem under uncertainty. In principle, lap-time minimization can be formulated as a constrained nonlinear optimization problem and solved globally with Dynamic Programming (DP). However, DP requires complete prior knowledge of the racetrack, which makes it inapplicable here. Reinforcement Learning (RL) instead emerges as a powerful alternative, capable of discovering near-optimal strategies through experience and adaptation.
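One plausible way to cast the task for RL is as a Markov decision process whose observations capture local track geometry and vehicle state, whose actions are continuous throttle/brake and steering commands, and whose reward trades progress against elapsed time and boundary violations. The skeleton below illustrates that framing; the specific observation, dynamics, and reward terms are assumptions, not the final design.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class StepResult:
    obs: np.ndarray
    reward: float
    done: bool

class RaceTrackEnv:
    """Skeleton MDP: state is (arc length s, lateral offset d, speed v); actions
    are [accel, steer] in [-1, 1]; reward favours progress and penalizes elapsed
    time and leaving the track. Dynamics are deliberately crude placeholders."""
    def __init__(self, centreline: np.ndarray, half_width: float = 1.5):
        self.centreline = centreline      # assumed to be sampled every 1 m
        self.half_width = half_width

    def reset(self) -> np.ndarray:
        self.s, self.d, self.v, self.t = 0.0, 0.0, 0.0, 0.0
        return self._obs()

    def step(self, action: np.ndarray) -> StepResult:
        accel, steer = float(action[0]), float(action[1])
        dt = 0.05
        self.v = max(0.0, self.v + 5.0 * accel * dt)   # crude longitudinal model
        self.d += self.v * steer * dt                  # crude lateral model
        self.s += self.v * dt
        self.t += dt
        off_track = abs(self.d) > self.half_width
        reward = self.v * dt - 0.1 * dt - (10.0 if off_track else 0.0)
        done = off_track or self.s >= float(len(self.centreline))
        return StepResult(self._obs(), reward, done)

    def _obs(self) -> np.ndarray:
        return np.array([self.s, self.d, self.v])

env = RaceTrackEnv(centreline=np.zeros((200, 2)))
obs = env.reset()
print(env.step(np.array([1.0, 0.0])))
```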
Our proposed framework envisions an autonomous race car with a stereo camera and an onboard high-performance processor. The race car will construct a virtual 3D track map during the first lap while following the safest line (typically the midline). Once the whole map is available, two strategies are possible:
1. Training-on-the-fly with simulated episodes and policy optimization
As the car completes its first lap and the virtual map becomes available, the onboard processor generates simulated driving episodes over this map. These simulated rollouts let the RL-based controller evaluate and refine its control policy, optimizing throttle, braking, and steering commands to minimize lap time. Although limited by the onboard computational budget, this approach permits in-situ policy improvement without any prior data, adapting the vehicle to the newly discovered racetrack within the competition's time limits.
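As a rough illustration of this training-on-the-fly idea, the sketch below runs batches of simulated episodes over a mapped curvature profile and improves a simple parametric policy between batches. A cross-entropy search over linear feedback gains stands in here for the full RL update; the single-track simulator, policy form, and parameters are all illustrative assumptions rather than the proposed onboard implementation.

```python
import numpy as np

def simulate(W, kappa, dt=0.05, steps=400, half_width=1.5):
    """Roll out a linear feedback policy a = tanh(W @ state) on a crude model:
    the car drifts outward with curvature unless steering compensates."""
    s = d = v = 0.0                                   # arc length, offset, speed
    for _ in range(steps):
        k = kappa[min(int(s), len(kappa) - 1)]
        accel, steer = np.tanh(W @ np.array([1.0, d, v, k]))
        v = max(0.0, v + 5.0 * accel * dt)
        d += v * (steer - k) * dt
        s += v * dt
        if abs(d) > half_width:                       # boundary crossed: heavy penalty
            return -s + 100.0
    return -s                                         # cost = negative progress

def refine_on_the_fly(kappa, iters=20, pop=64, elite=8, seed=0):
    """Cross-entropy search over the 2x4 policy weights between simulated batches."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(8), np.ones(8)
    for _ in range(iters):
        cand = rng.normal(mean, std, size=(pop, 8))
        costs = np.array([simulate(w.reshape(2, 4), kappa) for w in cand])
        best = cand[np.argsort(costs)[:elite]]
        mean, std = best.mean(axis=0), best.std(axis=0) + 1e-3
    return mean.reshape(2, 4)

# Mapped lap as a curvature profile: straight, corner, straight.
kappa = np.concatenate([np.zeros(60), np.full(30, 0.04), np.zeros(60)])
print(refine_on_the_fly(kappa))
```

The same evaluate-and-improve loop applies to a deep RL update; the open question for this strategy is how many simulated episodes the onboard processor can afford between laps.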
2. Pretrained policy deployment with adaptive fine-tuning
Given the computational intensity of on-the-fly learning, a more scalable alternative is to pretrain a deep RL controller, such as a Soft Actor-Critic (SAC) agent, across a diverse ensemble of racetracks. During pretraining, the agent need not start from a random policy: it can be initialized from expert human driving on each training track and gradually surpass that baseline through iterative training episodes. This imitation-learning-based warm start dramatically accelerates convergence, letting the controller internalize expert-level control strategies before refining them further through exploration.
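A minimal sketch of the imitation warm start is given below, assuming logged (state, action) pairs from an expert driver are available for each training track. The network sizes, state and action dimensions, and the dummy dataset are placeholders; in practice the warm-started actor would initialize the SAC agent, which then continues to improve through exploration.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 2          # assumed observation/action sizes

actor = nn.Sequential(                 # mean-action network; SAC adds the stochastic head
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACTION_DIM), nn.Tanh(),
)

def behaviour_clone(actor, expert_states, expert_actions, epochs=10, lr=3e-4):
    """Supervised regression of the actor onto expert actions (MSE loss)."""
    opt = torch.optim.Adam(actor.parameters(), lr=lr)
    data = torch.utils.data.TensorDataset(expert_states, expert_actions)
    loader = torch.utils.data.DataLoader(data, batch_size=256, shuffle=True)
    for _ in range(epochs):
        for s, a in loader:
            loss = nn.functional.mse_loss(actor(s), a)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return actor

# Dummy stand-in for logged expert telemetry; real data would come from the driver.
states = torch.randn(4096, STATE_DIM)
actions = torch.tanh(torch.randn(4096, ACTION_DIM))
behaviour_clone(actor, states, actions)
```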
Once trained, the race car's stochastic SAC policy is deployed during competition. Its inherent robustness helps it absorb variations between the training tracks and the real racetrack environment. To ensure adaptability, we also incorporate an autonomous policy-tuning mechanism into the SAC framework, enabling the pretrained controller to adjust its parameters in real time and retain near-optimality even on previously unseen tracks.
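The sketch below illustrates the deployment side under the assumption that the pretrained actor outputs a squashed Gaussian over throttle and steering: at race time the policy is sampled (its stochasticity provides some robustness), and every real transition is logged so that a handful of SAC gradient steps can be taken between laps. All names, dimensions, and the control-loop stand-in are hypothetical.

```python
import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    """Squashed-Gaussian policy head of the kind SAC uses."""
    def __init__(self, state_dim=16, action_dim=2):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, 256), nn.ReLU(),
                                  nn.Linear(256, 256), nn.ReLU())
        self.mu = nn.Linear(256, action_dim)
        self.log_std = nn.Linear(256, action_dim)

    def forward(self, state):
        h = self.body(state)
        std = self.log_std(h).clamp(-5.0, 2.0).exp()
        dist = torch.distributions.Normal(self.mu(h), std)
        return torch.tanh(dist.rsample())          # stochastic, squashed to [-1, 1]

actor = GaussianActor()                            # would be loaded from pretraining
replay = []                                        # real-track transitions for later updates
state = torch.zeros(16)
for _ in range(5):                                 # stand-in for the onboard control loop
    action = actor(state.unsqueeze(0)).squeeze(0)
    next_state, reward = torch.zeros(16), 0.0      # would come from perception/telemetry
    replay.append((state, action.detach(), reward, next_state))
    state = next_state
# Between laps, a few SAC gradient steps over `replay` would nudge the actor and
# its entropy temperature toward the specifics of the newly seen track.
```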
In summary, this project aims to develop a deep reinforcement learning-driven control and trajectory optimization framework for an autonomous electric race car, capable of minimizing lap time under uncertain and dynamic racing conditions. Beyond competitive racing, the underlying research has broader implications for energy-efficient and sustainable autonomous driving, real-time control under perception uncertainty, and adaptive decision-making in intelligent mobility systems.