Machine Learning Cluster for Tesla FSD Goes Live

Some time ago, Tesla announced that it would build its own supercomputer for machine learning for its Full Self-Driving (FSD) functionality. It now appears to go live on Monday, as Tim Zaman, Engineering Manager for AI Infrastructure at Tesla and Twitter/X, announced in a tweet.

Tesla AI 10k H100 cluster, go live monday.
Due to real-world video training, we may have the largest training datasets in the world, hot tier cache capacity beyond 200PB – orders of magnitudes more than LLMs.
Join us! https://t.co/F4A0Qb0CXG

— Tim Zaman (@tim_zaman) August 26, 2023

The details are interesting. The system consists of 10,000 H100 GPUs from NVIDIA, each of which costs around $30,000. The supercomputer thus costs around $300 million and is installed on premises at Tesla.

The chips themselves are GPUs with dedicated Tensor Cores, which make them particularly well suited to the massively parallel matrix computations used in machine learning systems. Generative pre-trained transformers (GPTs) are also trained on such GPU clusters (or on Google's TPUs).
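For illustration, here is a minimal sketch of such a matrix computation in PyTorch (the framework, matrix sizes, and precision are chosen purely for illustration; the article says nothing about Tesla's software stack):

```python
import torch

# Use a GPU if one is available; fall back to the CPU (and to float32 there,
# since half precision is mainly a GPU feature).
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Two large matrices; on an accelerator like the H100, a multiplication of
# this kind runs on the Tensor Cores.
a = torch.randn(4096, 4096, dtype=dtype, device=device)
b = torch.randn(4096, 4096, dtype=dtype, device=device)

# The core operation repeated billions of times when training a neural network.
c = a @ b
print(c.shape)  # torch.Size([4096, 4096])
```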

With the supercomputer, Tesla can analyze video data from the roughly 5 million vehicles it has delivered to customers and use it to further develop FSD. The hot-tier cache alone holds more than 200 petabytes. Tesla owners can grant Tesla access to their video data by setting the appropriate preferences. According to earlier analyses, at least four gigabytes of data per month have been sent to Tesla for years. Tesla now draws on this video data for the FSD supercomputer.
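The article's own figures allow a rough back-of-the-envelope estimate of the data volume involved. The sketch below assumes the four gigabytes per month refer to a single vehicle, which the article does not state explicitly:

```python
# Rough estimate based on the figures mentioned in the article. The upload
# rate (assumed here to be 4 GB per vehicle per month) and the fleet size of
# 5 million vehicles are approximations, not official Tesla numbers.
fleet_size = 5_000_000            # vehicles delivered to customers
gb_per_vehicle_per_month = 4      # assumed per-vehicle upload rate

total_gb_per_month = fleet_size * gb_per_vehicle_per_month
total_pb_per_month = total_gb_per_month / 1_000_000   # 1 PB = 1,000,000 GB

print(f"~{total_pb_per_month:.0f} PB of fleet video data per month")  # ~20 PB
```

Under these assumptions, the fleet would generate on the order of 20 petabytes of video per month, which puts the 200-petabyte hot-tier cache into perspective.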

With the GPU cluster, Tesla now hopes to bring FSD version 12 to the point where Level 4 autonomous driving also becomes possible.

This article was also published in German.
