Cloud Lessons Learned from Tesla AI Day

When considering cost management strategies for the cloud, “right sizing” is frequently employed so that a workload has just the right amount of resources for the task at hand. The appeal of this idea is immediate and intuitive. However, the exercise shares a folly that people often attribute to the scientific method: things examined in isolation do not always behave the same way in a complex reality. Dependencies and other priorities can greatly affect the preferred strategy.

What Should We Employ Instead?

The better strategy for most organizations is a “t-shirt sizing model”, which establishes a small menu of sizing options. This approach has long been seen in on-premises virtualization, though there with nearly no concern for sizing. For instance, I have seen large enterprises whose standard entry-level virtual machine comprised four (4) vCPUs and 16 GB of RAM.

“T-shirt sizing” (e.g., Small, Medium, and Large) provides consistency and lets an organization take advantage of reservations while retaining significant flexibility.
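As a minimal sketch of the idea, the snippet below picks the smallest menu entry that satisfies a workload's needs. The menu itself (Small, Medium, Large and their vCPU and RAM figures) and the `pick_size` helper are hypothetical, chosen only for illustration; a real menu would reflect an organization's own workloads and provider offerings.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TShirtSize:
    name: str
    vcpus: int
    ram_gb: int


# Hypothetical menu; the names and capacities are assumptions for illustration.
MENU = (
    TShirtSize("Small", 2, 8),
    TShirtSize("Medium", 4, 16),
    TShirtSize("Large", 8, 32),
)


def pick_size(
    vcpus_needed: int,
    ram_gb_needed: int,
    menu: Sequence[TShirtSize] = MENU,
) -> Optional[TShirtSize]:
    """Return the smallest menu entry that covers both requirements."""
    for size in sorted(menu, key=lambda s: (s.vcpus, s.ram_gb)):
        if size.vcpus >= vcpus_needed and size.ram_gb >= ram_gb_needed:
            return size
    return None  # nothing on the menu is big enough


if __name__ == "__main__":
    # A workload needing 3 vCPUs and 6 GB of RAM is rounded up to the next size.
    print(pick_size(3, 6))
```

A workload that fits nothing on the menu returns `None`, which is a signal either to extend the menu or to treat that workload as an exception.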

Tesla Bot

In 2021, Tesla unveiled the concept of Tesla Bot (Optimus), a personal robot built on the advances Tesla has accrued through its autonomous driving efforts. The original concept was represented by a person in a costume. However, Tesla AI Day 2022 revealed the first-generation robot, which was built in February 2022. This prototype is functional and capable of balanced movement without the aid of cables and tethers. Of course, the prototype is simply a proof of concept, and Tesla also unveiled the work that has gone into the second-generation model, which is approaching the same capability of unassisted locomotion. The aim of the second generation is to build a robot that can be produced at scale with an eye for cost (the estimated starting price for the robot is less than $20k).

What Lessons Were Shared?

The robot articulates its movement with 28 actuators, aside from the hands. In one demonstration, a single linear actuator lifted a half ton; in terms of the human body, that would be like performing the task with only the quadriceps muscle of a single leg. Certainly, not all of the actuators in the robot require the same force, nor would it be very efficient to use the same actuator in the forearm as in the legs. So the team modeled various cost and mass options to arrive at an optimized actuator for each task. This is right sizing.

Even accounting for symmetry, which reduces the count of distinct actuators by half, the team did not find it wise to have 14 different actuators. That many designs would have many downstream impacts: inventories would have to be maintained for manufacturing and service, and it would be harder to optimize the manufacturing of 14 different actuators in terms of materials, tooling, etc. So the team performed a “commonality study” to find actuators that could each meet the requirements of several positions. The study revealed that only six (6) unique actuators were required. This is t-shirt sizing.
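The consolidation step can be illustrated with a toy one-dimensional version of the problem: given a list of individually right-sized requirements, choose a fixed number of standard sizes so that every requirement is covered and the total overprovisioning is as small as possible. This is only a sketch under stated assumptions, not Tesla's actual method; the function name `consolidate_sizes`, the single numeric requirement per item, and the sample numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def consolidate_sizes(
    requirements: Sequence[int], menu_size: int
) -> Tuple[List[int], int]:
    """Choose `menu_size` standard sizes that cover every requirement with
    minimal total overprovisioning (sum of size minus requirement).

    Each requirement is served by the smallest chosen size that is >= it.
    """
    reqs = sorted(requirements)
    n = len(reqs)
    if not 1 <= menu_size <= n:
        raise ValueError("menu_size must be between 1 and len(requirements)")

    # Prefix sums let us compute the waste of any contiguous group in O(1).
    prefix = [0]
    for r in reqs:
        prefix.append(prefix[-1] + r)

    def waste(i: int, j: int) -> int:
        # Items i..j (inclusive) are all served by the largest one, reqs[j].
        return reqs[j] * (j - i + 1) - (prefix[j + 1] - prefix[i])

    inf = float("inf")
    # best[g][j]: minimal waste covering the first j items with g sizes.
    best = [[inf] * (n + 1) for _ in range(menu_size + 1)]
    cut = [[0] * (n + 1) for _ in range(menu_size + 1)]
    best[0][0] = 0
    for g in range(1, menu_size + 1):
        for j in range(1, n + 1):
            for i in range(g - 1, j):  # the last group covers items i..j-1
                candidate = best[g - 1][i] + waste(i, j - 1)
                if candidate < best[g][j]:
                    best[g][j] = candidate
                    cut[g][j] = i

    # Walk the recorded cuts backwards to recover the chosen sizes.
    sizes = []
    j = n
    for g in range(menu_size, 0, -1):
        sizes.append(reqs[j - 1])
        j = cut[g][j]
    return sorted(set(sizes)), best[menu_size][n]


if __name__ == "__main__":
    # Seven hypothetical right-sized requirements reduced to a menu of three.
    print(consolidate_sizes([2, 3, 4, 7, 8, 15, 16], 3))
```

A real commonality study would weigh several dimensions at once (force, mass, cost, tooling), so this single-number version shows only the shape of the trade-off: fewer distinct sizes in exchange for some overprovisioning.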

But Cloud is Simply About Bits

Manufacturing a physical good does have unique challenges, but there are also many commonalities, which have been evident in our use of manufacturing optimizations as a baseline for software development (read “The Goal” by Goldratt and “The Phoenix Project” by Kim, Behr, and Spafford; the first was written for manufacturing and inspired the second). The economics of a solution are among the most important considerations for its success. Although many opinionated people have long wished to wave a wand to fix economic problems, it has never worked. Once cloud solutions for an organization have gelled, the most significant way to influence cost for a larger deployment is to use reservations. However, reservations also provide less flexibility, on the surface.

Reservations can represent cost savings approaching 75% in some extreme cases but require a commitment to use similar resources for many years, thus reducing flexibility. This is where “t-shirt sizing” wins. Instead of finding the exactly correct sizing for a workload, we find a standard size that can meet the requirements even if it is somewhat overprovisioned. While “right-sizing” may save 5-10% of the costs, overprovisioning (which alone could mean a cost premium of similar magnitude) combined with reservations significantly improves the overall cost profile experienced by the organization. If that workload is no longer required, some other workload can be deployed using the same “t-shirt size”, regaining a large degree of the flexibility that was originally sacrificed when adopting reservations.
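The arithmetic behind this argument can be sketched in a few lines. The figures below are assumptions for illustration only: a baseline annual spend of 100,000, a 10% saving from right sizing, a 10% premium from overprovisioning, and a 40% reservation discount (well below the extreme case of roughly 75% mentioned above). The comparison also assumes the right-sized workload runs on demand, without a reservation.

```python
def annual_cost(
    baseline: float, sizing_factor: float, reservation_discount: float = 0.0
) -> float:
    """Cost after applying a sizing factor and an optional reservation discount.

    sizing_factor: 0.90 means a 10% saving, 1.10 means a 10% premium.
    reservation_discount: fraction taken off the resulting cost (0.40 = 40%).
    """
    return baseline * sizing_factor * (1.0 - reservation_discount)


def break_even_discount(right_sized_factor: float, tshirt_factor: float) -> float:
    """Reservation discount at which the overprovisioned, reserved size costs
    the same as the right-sized, on-demand one."""
    return 1.0 - right_sized_factor / tshirt_factor


# Hypothetical numbers, not taken from any provider's price list.
baseline = 100_000.0
right_sized = annual_cost(baseline, 0.90)            # right-sized, on demand
tshirt_reserved = annual_cost(baseline, 1.10, 0.40)  # overprovisioned, reserved

if __name__ == "__main__":
    print(f"right-sized, on demand:  {right_sized:,.0f}")
    print(f"t-shirt size, reserved:  {tshirt_reserved:,.0f}")
    print(f"break-even discount:     {break_even_discount(0.90, 1.10):.1%}")
```

Under these assumptions the reserved t-shirt size costs about 66,000 against about 90,000 for the right-sized workload, and any reservation discount above roughly 18% favors the t-shirt size.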

Reference: Tesla AI Day 2022