Self-driving RC car
Turning a 1/16-scale RC car into an autonomous car as a testbed for modern edge ML
An ongoing project: experiments in training self-driving policies in simulation and deploying them to a Jetson Nano (with a Coral Edge TPU planned). Overview, architecture, and a progress log.
I’m turning a 1/16-scale RC car into a small autonomous vehicle using the DonkeyCar platform. In my day job I work with transformer and diffusion models, but this project gives me a playground for the parts of modern ML that I don’t get much exposure to: reinforcement learning, sim-to-real transfer, and vision-language-action models. It is also my first project with a significant hardware component.
This page gives an overview of what the car is, how it’s put together, and where the project is going. The progress log below collects the individual posts on architecture and progress as I write them.

The hardware
| Part | Role |
|---|---|
| Jetson Nano (4 GB, B01) | On-board inference |
| HSP 94186 RC car chassis + ESC + brushed motor | The vehicle |
| Wide-angle CSI camera (IMX219) | The only sensor the model gets |
| PCA9685 (16-channel PWM driver) | Steering servo + ESC control over I²C |
| TCBWORTH 1800 mAh LiPo | Drive battery (motor + servo rail) |
| RED-E 10 000 mAh power bank | Untethered power for the Nano |
| 3D-printed roll cage + laser-cut acrylic base plate | Mounts the Nano, camera and PWM board to the chassis |
A laptop with an RTX 4070 GPU sits on the same network and handles model training. A Coral Edge TPU is planned but not yet in the build; see the roadmap.
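To make the PCA9685's role a little more concrete, here is a minimal sketch of the arithmetic behind driving a servo or ESC through it. The chip counts each PWM period in 4096 steps (12 bits), so a normalized command has to become a pulse width and then a tick count. The 50 Hz frame rate and the 1000 to 2000 µs pulse range are assumptions (typical hobby-servo values, not calibrated numbers from this car), and the function names are purely illustrative.

```python
PCA9685_RESOLUTION = 4096  # the chip divides each PWM period into 4096 ticks (12 bits)


def pulse_us_to_ticks(pulse_us: float, freq_hz: float = 50.0) -> int:
    """Convert a pulse width in microseconds to a PCA9685 'off' tick count."""
    period_us = 1_000_000.0 / freq_hz
    ticks = round(pulse_us / period_us * PCA9685_RESOLUTION)
    # Keep the result inside the counter range.
    return max(0, min(PCA9685_RESOLUTION - 1, ticks))


def command_to_ticks(
    command: float,
    min_us: float = 1000.0,  # assumed pulse at command = -1
    max_us: float = 2000.0,  # assumed pulse at command = +1
    freq_hz: float = 50.0,   # assumed PWM frame rate
) -> int:
    """Map a normalized steering/throttle command in [-1, 1] to ticks."""
    command = max(-1.0, min(1.0, command))
    center_us = (min_us + max_us) / 2.0
    half_range_us = (max_us - min_us) / 2.0
    return pulse_us_to_ticks(center_us + command * half_range_us, freq_hz)


if __name__ == "__main__":
    for cmd in (-1.0, 0.0, 1.0):
        print(cmd, command_to_ticks(cmd))
```

On real hardware the endpoints would come from calibrating the specific servo and ESC, since the safe pulse range varies between units.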
The architecture
The project is built around a single guiding idea: a slow brain paired with fast reflexes, distributed across multiple devices. Today the car runs on teleoperation over the web controller; the design below is what it’s being built toward.
- System 1: fast reflexes. A tiny convolutional policy that maps camera frames to steering and throttle, to be quantized to INT8 for the real-time control loop. The plan is to offload this to a Coral Edge TPU (a USB INT8 accelerator) so it runs at a high frame rate and low power; for now it runs on the Nano.
- Host: the Nano. Camera capture, actuator control over I²C, and the safety/glue layer.
- System 2: off-board. An optional, slower model (a small vision-language-action model, or an LLM planner) that issues high-level intent over wifi at a few hertz, which System 1 then carries out.
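One way to picture how the two systems could fit together is a latch: the fast loop always runs, and it simply reads the most recent intent from the slow side, falling back to a safe default if that intent goes stale (say, because wifi dropped). The sketch below simulates that with plain Python. Everything in it is hypothetical: the `Intent` and `IntentLatch` names, the 30 Hz and 2 Hz rates, the one-second timeout, and the idea that intent is just a throttle cap. It is not the code running on the car.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Intent:
    speed_cap: float  # hypothetical intent: maximum normalized throttle, 0..1
    issued_at: float  # time the slow system issued it, in seconds


class IntentLatch:
    """Holds the latest System 2 intent and expires it after a timeout."""

    def __init__(self, timeout_s: float = 1.0, fallback_cap: float = 0.0):
        self.timeout_s = timeout_s
        self.fallback_cap = fallback_cap
        self._latest: Optional[Intent] = None

    def update(self, intent: Intent) -> None:
        self._latest = intent

    def speed_cap(self, now: float) -> float:
        # No intent yet, or intent too old: fall back to the safe default.
        if self._latest is None:
            return self.fallback_cap
        if now - self._latest.issued_at > self.timeout_s:
            return self.fallback_cap
        return self._latest.speed_cap


def simulate(fast_hz: int = 30, slow_every: int = 15,
             link_up_ticks: int = 30, total_ticks: int = 90) -> List[float]:
    """Run the fast loop; the slow side only sends intent while the link is up."""
    latch = IntentLatch(timeout_s=1.0)
    throttles = []
    for tick in range(total_ticks):
        now = tick / fast_hz
        if tick < link_up_ticks and tick % slow_every == 0:
            latch.update(Intent(speed_cap=0.6, issued_at=now))
        policy_throttle = 0.8  # stand-in for the System 1 policy output
        throttles.append(min(policy_throttle, latch.speed_cap(now)))
    return throttles


if __name__ == "__main__":
    out = simulate()
    print(out[10], out[75])
```

The design choice worth noting is that the fast loop never blocks on the slow one: a missing or late intent degrades to stopping rather than stalling the control loop.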
The guiding constraint is simple: train off-device, deploy on-device. The Nano’s software stack is too old for a modern training toolchain, so learning happens on a real GPU off-board and the car only ever runs a compiled artifact.
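Part of what makes the deployed artifact different from the trained model is the INT8 step mentioned for System 1. As a rough illustration, the sketch below shows the affine (scale and zero-point) scheme that INT8 toolchains commonly use, applied to a handful of made-up values. A real converter would pick ranges from calibration data and handle this per tensor or per channel; the function names and sample values here are assumptions for illustration only.

```python
from typing import List, Tuple

QMIN, QMAX = -128, 127  # signed 8-bit range


def choose_qparams(values: List[float]) -> Tuple[float, int]:
    """Pick a scale and zero-point so the value range maps onto int8."""
    # The range must include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # degenerate all-zero input
    scale = (hi - lo) / (QMAX - QMIN)
    zero_point = round(QMIN - lo / scale)
    zero_point = max(QMIN, min(QMAX, zero_point))
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    q = round(x / scale) + zero_point
    return max(QMIN, min(QMAX, q))  # saturate instead of wrapping


def dequantize(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale


if __name__ == "__main__":
    weights = [-0.42, 0.0, 0.13, 0.9]  # made-up example values
    scale, zp = choose_qparams(weights)
    for w in weights:
        q = quantize(w, scale, zp)
        print(w, q, dequantize(q, scale, zp))
```

The round trip loses at most about one quantization step per value, which is the error budget a tiny policy has to tolerate once it leaves the training machine.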

Roadmap
Roughly ordered from most-feasible to most-ambitious. Each step reuses the deployment pipeline built by the one before it, and each will get its own progress-log post.
- Hardware audit & baseline: confirm what each device runs and get teleop working end to end.
- Line following, classic CV: point the camera forward, threshold + centroid + a P-controller, and drive a loop with no ML. Gets the car physically moving and sets a baseline lap time.
- Line following, learned (BC-CNN → Coral): redo it as a small CNN, quantized to INT8 and compiled for the Coral. This commissions the train → quantize → compile → deploy pipeline and, looking ahead, should beat the classic-CV lap time.
- LLM as an offline copilot: use a large model off-line to design reward functions, generate track curricula, and analyse failure logs.
- Sim RL with PufferLib: train a policy in gym-donkeycar with PPO (pure Python), then distil and deploy.
- High-throughput RL in JAX: a fast custom driving env, GPU-vectorized, trained in minutes, transferred with domain randomization.
- Dual-system “talk to your car”: an off-board vision-language-action model issuing intent, the Coral handling control.
- Dream-to-drive: a JAX world model, training the policy in imagination.
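The classic-CV step in the list above (threshold + centroid + a P-controller) is small enough to sketch in full. This is a minimal version on a grayscale frame stored as nested lists, assuming a dark line on a light floor, a threshold of 128, and a sign convention where positive steering means "turn right". All of those, along with the function names, are assumptions for illustration; a real version would run on camera frames and likely look only at a band near the bottom of the image.

```python
from typing import List, Optional

Frame = List[List[int]]  # grayscale image, rows of 0..255 pixel values


def line_centroid_x(gray: Frame, threshold: int = 128) -> Optional[float]:
    """Mean column index of pixels darker than the threshold, or None."""
    total, count = 0, 0
    for row in gray:
        for x, value in enumerate(row):
            if value < threshold:  # assumed: dark line on a light floor
                total += x
                count += 1
    return total / count if count else None


def steering_from_frame(gray: Frame, kp: float = 1.0,
                        threshold: int = 128) -> float:
    """P-controller: steer in proportion to the line's horizontal offset."""
    width = len(gray[0])
    cx = line_centroid_x(gray, threshold)
    if cx is None:
        return 0.0  # no line visible: hold straight (a stop would also be reasonable)
    center = (width - 1) / 2.0
    error = (cx - center) / center  # -1 at the left edge, +1 at the right edge
    return max(-1.0, min(1.0, kp * error))


def make_frame(width: int, height: int, line_col: Optional[int]) -> Frame:
    """Synthetic test frame: white floor with an optional dark vertical line."""
    frame = [[255] * width for _ in range(height)]
    if line_col is not None:
        for row in frame:
            row[line_col] = 0
    return frame


if __name__ == "__main__":
    print(steering_from_frame(make_frame(9, 4, 6)))  # line right of center
    print(steering_from_frame(make_frame(9, 4, 4)))  # line centered
```

The gain `kp` is the only tuning knob here; too high and the car oscillates across the line, too low and it drifts off on corners.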
Progress log
Posts on architecture and progress, oldest first. (More to come; this fills in as the roadmap above gets built.)