GPU-based High Performance Parallel Simulation of Tracked Vehicle Operating on Granular Terrain 2010-01-0650
This contribution demonstrates the use of high performance computing, specifically Graphics Processing Unit (GPU) based computing, for the simulation of tracked ground vehicles. The work addresses a gap in physics-based simulation: the inability to accurately characterize the 3D mobility of tracked vehicles on granular terrain (sand and/or gravel). Tracked vehicle mobility on granular material is approached using a discrete element method that accounts for the interaction between the track and each discrete particle in the terrain. This discrete approach captures the dynamics of systems with more than 1,000,000 bodies interacting simultaneously. Two factors render the approach feasible. First, the frictional contact problem between the terrain and the vehicle draws on a convex optimization methodology in which the solution satisfies a cone complementarity problem that represents the first-order optimality condition of the optimization problem. Second, this optimization problem is solved efficiently by a highly parallelizable algorithm implemented on the GPU. The parallel hardware is leveraged both for collision detection (where up to 10 million bodies are analyzed for mutual collisions in less than 5 seconds) and for the computation of the frictional contact forces. The 3D tracked vehicle model contains two tracks, each modeled as a collection of shoes interconnected through revolute joints. The model is implemented in the open source simulation package ChronoEngine, developed jointly at the University of Parma, Italy, and the University of Wisconsin-Madison. The simulations are currently run on NVIDIA Tesla C1060 GPUs. Priced at $1,500, this hardware has 240 parallel processing cores running at a clock rate of 1.3 GHz and can simultaneously handle 23,040 parallel threads. It has a single precision peak rate of 0.933 teraflops, which is about 15% of the double precision peak rate of a $1.4 million BlueGene/L supercomputer with 1024 dual-core nodes.
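The abstract does not spell out the solver, so the following is only a minimal sketch of one building block that iterative cone complementarity solvers commonly rely on: the Euclidean projection of a single contact impulse onto the Coulomb friction cone. The function name, the tuple layout (normal, tangent_u, tangent_v), and the friction coefficient `mu` are assumptions made for illustration and are not taken from ChronoEngine.

```python
import math


def project_onto_friction_cone(gamma, mu):
    """Project a contact impulse onto the Coulomb friction cone.

    Hypothetical helper for illustration only.
    gamma = (normal, tangent_u, tangent_v); mu >= 0 is the friction coefficient.
    The cone is the set where sqrt(tangent_u**2 + tangent_v**2) <= mu * normal.
    """
    gn, gu, gv = gamma
    gt = math.hypot(gu, gv)  # magnitude of the tangential component

    # Case 1: already inside the friction cone, nothing to do.
    if gt <= mu * gn:
        return (gn, gu, gv)

    # Case 2: inside the polar cone, the closest point is the apex.
    if mu * gt <= -gn:
        return (0.0, 0.0, 0.0)

    # Case 3: project onto the cone surface (gt > 0 is guaranteed here).
    gn_new = (mu * gt + gn) / (mu * mu + 1.0)
    scale = mu * gn_new / gt  # shrink the tangential part onto the surface
    return (gn_new, gu * scale, gv * scale)


if __name__ == "__main__":
    # A purely tangential impulse with mu = 1 lands on the cone surface.
    print(project_onto_friction_cone((0.0, 1.0, 0.0), 1.0))
```

Each contact is projected independently of all others, which is the kind of per-contact work that maps naturally onto one GPU thread per contact. How the solver described here organizes its iterations is not stated in the abstract.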