Simulation
Temporal Convolutional Networks for BMX Dynamics
Temporal Convolutional Networks trained on 1000 simulated BMX downhill runs demonstrate that CNNs can learn vehicle dynamics with high accuracy — and illustrate where learned models break down when pushed beyond their training distribution.
Background
Temporal Convolutional Networks (TCNs) were proposed in 2018 by researchers at Carnegie Mellon University as an alternative to Recurrent Neural Networks for sequence modelling and forecasting. The key finding was that CNNs, traditionally associated with image classification, can outperform RNNs on a wide range of sequence-modelling tasks. Because convolutions process every time step of a sequence in parallel, TCNs train efficiently, and their layered structure is comparatively easy to interpret. Both properties make them well suited to learning dynamical systems from simulation data.
The Physical Model
The underlying vehicle model is a half-car/bicycle system with four degrees of freedom: two for the chassis (vertical motion and pitch) and one for each wheel (independent suspension travel). Tyre contact is unilateral, meaning the ground can push on a wheel but never pull it down, so the bicycle can become airborne. That detail matters on technical downhill terrain. The model was run as a forward dynamic simulation along 1000 procedurally generated downhill courses, producing approximately 8 hours of state trajectories and 350 km of course data, or on average roughly 29 s and 350 m per run.
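The unilateral tyre contact described above can be sketched as a spring-damper that may only push. This is a minimal illustration, not the model's actual contact law: the function name, the linear spring-damper form, and the stiffness and damping values are all assumptions.

```python
def tyre_normal_force(compression, compression_rate,
                      stiffness=150e3, damping=500.0):
    """Normal force (N) of a unilateral spring-damper tyre contact.

    compression        : tyre deflection into the ground in m (<= 0 means airborne)
    compression_rate   : rate of change of that deflection in m/s
    stiffness, damping : hypothetical values, chosen only for illustration
    """
    if compression <= 0.0:
        # The wheel has left the ground: no contact force at all.
        return 0.0
    force = stiffness * compression + damping * compression_rate
    # The ground can push but never pull, so clamp tensile values to zero.
    return max(0.0, force)


airborne = tyre_normal_force(-0.02, 0.0)     # wheel 2 cm above the ground
loaded = tyre_normal_force(0.01, 0.0)        # 1 cm of tyre deflection
rebounding = tyre_normal_force(0.001, -1.0)  # light contact, lifting off quickly
```

The clamp is what makes the contact non-smooth: the force switches off abruptly at lift-off, so a learned model has to reproduce two distinct regimes (grounded and airborne) from the state history alone.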
To prepare the data for the TCN, the model's constraint equations were orthogonalized so the network could learn states directly, without being asked to satisfy algebraic constraints it was not designed for.
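The text does not say how the orthogonalization was carried out. One common way to remove algebraic constraints from a multibody model is a null-space projection, sketched below as an assumption rather than as the method actually used. Here q are the generalized coordinates, M the mass matrix, Q the applied forces, Phi the constraint equations with Jacobian Phi_q, lambda the Lagrange multipliers, and v a reduced set of independent velocities. All of these symbols are introduced for illustration.

```latex
% Constrained equations of motion
M(q)\,\ddot{q} + \Phi_q^{\top}\lambda = Q(q,\dot{q}), \qquad \Phi(q) = 0

% Choose N(q) whose columns span the null space of the constraint Jacobian
\Phi_q N = 0, \qquad \dot{q} = N v, \qquad \ddot{q} = N\dot{v} + \dot{N} v

% Premultiplying by N^T removes the multipliers, since N^T \Phi_q^T = (\Phi_q N)^T = 0
N^{\top} M N\,\dot{v} = N^{\top}\bigl(Q - M \dot{N} v\bigr)
```

The reduced system contains only independent states, so a network trained on them never has to satisfy Phi(q) = 0 implicitly.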
Architecture
The TCN architecture reflects the multi-timescale structure of vehicle dynamics. Temporal blocks capture behavior at different time scales. Within each block, convolutional filters scan the historical state sequence for motion patterns: kernel size determines the temporal reach within each scale, and filter count determines the number of distinct patterns searched for. A pattern-of-patterns layer then combines these representations. Chomp1d trims the padded time steps so that each output depends only on past inputs (causality), Dropout provides regularization, and ReLU introduces nonlinearity. Without that nonlinear activation, the stacked convolutions would collapse into a single linear map, and the TCN would reduce to an elaborate linear regression.
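The causality and temporal-reach ideas can be illustrated without any deep-learning framework. The sketch below is a plain-Python, single-channel causal dilated convolution; the function names are hypothetical. It left-pads the input directly, which has the same effect as padding both sides and then trimming the trailing steps, as a Chomp1d layer does. The reach formula assumes one convolution per dilation level; implementations with two convolutions per block reach further.

```python
def causal_conv1d(x, weights, dilation=1):
    """Causal dilated 1-D convolution over a list of floats.

    weights[j] multiplies x[t - j * dilation], so y[t] never sees the future.
    """
    k = len(weights)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)  # zeros stand in for samples before t = 0
    return [
        sum(weights[j] * padded[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ]


def relu(x):
    """Element-wise rectifier: the nonlinearity between layers."""
    return [max(0.0, v) for v in x]


def receptive_field(kernel_size, dilations):
    """History length (in samples) seen by a stack with one conv per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


history = [1.0, 2.0, 3.0, 4.0, 5.0]
features = relu(causal_conv1d(history, weights=[0.5, 0.25], dilation=2))

# Changing only the newest sample must leave all earlier outputs untouched.
perturbed = history[:-1] + [100.0]
features_perturbed = relu(causal_conv1d(perturbed, weights=[0.5, 0.25], dilation=2))

reach = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

Doubling the dilation at each level makes the reach grow exponentially with depth while the number of weights grows only linearly, which is why a modest stack can cover both fast suspension motion and slower chassis motion.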
Course geometry is encoded with 10 features per discretization point and compressed by a small MLP into a 128-dimensional embedding before entering the TCN.
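A minimal sketch of such an encoder follows, in plain Python with untrained random weights. Only the 10 features per point and the 128-dimensional output come from the text. The look-ahead window of 32 points, the hidden width of 256, the two-layer depth, and all function names are assumptions made for illustration.

```python
import random

FEATURES_PER_POINT = 10  # from the text
EMBEDDING_DIM = 128      # from the text
WINDOW_POINTS = 32       # hypothetical number of course points seen at once
HIDDEN_DIM = 256         # hypothetical hidden-layer width


def make_linear(n_in, n_out, rng):
    """Create an untrained dense layer as (weights, biases)."""
    bound = (1.0 / n_in) ** 0.5
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, layer):
    """Apply a dense layer to a flat list of floats."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def encode_course(points, layers):
    """Flatten a window of course points and pass it through the MLP."""
    h = [value for point in points for value in point]
    for i, layer in enumerate(layers):
        h = linear(h, layer)
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers only
    return h


rng = random.Random(0)
layers = [
    make_linear(WINDOW_POINTS * FEATURES_PER_POINT, HIDDEN_DIM, rng),
    make_linear(HIDDEN_DIM, EMBEDDING_DIM, rng),
]
course_window = [[rng.uniform(-1.0, 1.0) for _ in range(FEATURES_PER_POINT)]
                 for _ in range(WINDOW_POINTS)]
embedding = encode_course(course_window, layers)
```

Under these assumptions, 320 raw geometry values are compressed to 128, giving the TCN a fixed-size description of the upcoming terrain regardless of how the course was discretized.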
Training and Generalization
Correct hyperparameter choice is critical. During the first 20 to 50 epochs, the loss must strongly prioritize teacher forcing (given ground-truth past states, predict the next step) over autoregressive rollout (feeding the network's own predictions back in as inputs); this is essential for stable training. Velocity-state errors also need a high weight relative to position-state errors to suppress drift in long rollouts.
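The two training choices above can be sketched as a loss schedule and a weighted error. Everything numeric here is a placeholder: the 30-epoch warm-up (chosen from within the 20 to 50 range), the ramp length, the mixing fractions, and the velocity weight are assumptions, as are the function names and the toy one-state model.

```python
def teacher_forcing_weight(epoch, warmup_epochs=30, ramp_epochs=20,
                           high=0.9, low=0.5):
    """Fraction of the loss given to teacher-forced one-step prediction."""
    if epoch < warmup_epochs:
        return high  # early training: lean heavily on ground-truth inputs
    progress = min(1.0, (epoch - warmup_epochs) / ramp_epochs)
    return high + (low - high) * progress


def weighted_state_mse(pred, target, is_velocity, velocity_weight=5.0):
    """Mean squared error with velocity states weighted above positions."""
    total, weight_sum = 0.0, 0.0
    for p, t, vel in zip(pred, target, is_velocity):
        w = velocity_weight if vel else 1.0
        total += w * (p - t) ** 2
        weight_sum += w
    return total / weight_sum


def combined_loss(one_step_loss, rollout_loss, epoch):
    """Blend teacher-forced and autoregressive losses for this epoch."""
    alpha = teacher_forcing_weight(epoch)
    return alpha * one_step_loss + (1.0 - alpha) * rollout_loss


def one_step_predictions(step_fn, true_states):
    """Teacher forcing: every prediction starts from a ground-truth state."""
    return [step_fn(s) for s in true_states[:-1]]


def autoregressive_rollout(step_fn, first_state, n_steps):
    """Rollout: every prediction starts from the previous prediction."""
    states, current = [], first_state
    for _ in range(n_steps):
        current = step_fn(current)
        states.append(current)
    return states


# Toy example: the truth halves each step, the imperfect model multiplies by 0.6.
true_states = [1.0, 0.5, 0.25, 0.125]
model = lambda s: 0.6 * s
forced = one_step_predictions(model, true_states)
rolled = autoregressive_rollout(model, true_states[0], n_steps=3)
```

In the toy example the teacher-forced error shrinks along with the signal, while the rollout error compounds because each step inherits the previous step's mistake. That compounding is what makes rollout training unstable early on, before the one-step predictions are reasonably accurate.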
The trained TCN predicts positions and velocities accurately across the test distribution, including airborne phases. Its main limitation is equally clear: the network predicts only the quantities it was trained on. Deriving suspension forces or tyre loads from predicted positions, without having included those quantities as explicit training targets, produces unreliable results.
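The text does not state why derived forces are unreliable. One plausible contributor, offered here as an assumption, is that forces depend on accelerations, and differentiating a position signal twice amplifies small errors by a factor of order 1/dt². The sample interval and the 1 mm error below are hypothetical values chosen only to show the scale of the effect.

```python
def second_difference(positions, dt):
    """Central finite-difference estimate of acceleration from positions."""
    return [
        (positions[i + 1] - 2.0 * positions[i] + positions[i - 1]) / dt ** 2
        for i in range(1, len(positions) - 1)
    ]


dt = 0.01              # hypothetical sample interval in s
position_error = 1e-3  # hypothetical 1 mm error on one predicted sample

true_positions = [0.0] * 5        # body at rest: the true acceleration is zero
predicted = list(true_positions)
predicted[2] += position_error    # a single, small prediction error

accel_error = second_difference(predicted, dt)
worst = max(abs(a) for a in accel_error)  # roughly 20 m/s^2
```

Under these assumptions, a 1 mm position error becomes an acceleration error of about 20 m/s², roughly twice gravitational acceleration, so any force computed from it would be dominated by noise. Including forces as explicit training targets avoids this differentiation step.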