RE: LeoThread 2025-11-05 23-35

in LeoFinance, 23 days ago

Part 8/15:

  • Inference, or deploying trained models to generate predictions or decisions, tends to be lumpier than training, with workloads that vary with user activity, environment complexity, and model optimization. It is also more cost-sensitive, since inference workloads are performed billions of times over.

Cutress notes that most revenue in AI comes from inference, which is more flexible but harder to optimize at the hardware level because of its lumpy, unpredictable workloads. Hardware designed for training, such as Tesla’s Dojo, must also adapt to inference demands, yet industry-wide, hardware architectures struggle to balance these needs efficiently.
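To see why per-inference cost matters so much, here is a back-of-envelope sketch of lifetime cost for a deployed model. All figures (training cost, cost per query, query volume) are hypothetical assumptions chosen only to illustrate the point that, at billions of queries, inference can dominate total spend:

```python
def lifetime_cost(training_cost, cost_per_inference, num_inferences):
    """Total cost = one-time training run + per-query inference cost."""
    return training_cost + cost_per_inference * num_inferences

# Hypothetical numbers for illustration only -- not from the article.
TRAINING_COST = 50_000_000        # one-time training run, USD (assumed)
COST_PER_INFERENCE = 0.002        # USD per query (assumed)
QUERIES = 100_000_000_000         # 100 billion queries over deployment (assumed)

total = lifetime_cost(TRAINING_COST, COST_PER_INFERENCE, QUERIES)
inference_share = (COST_PER_INFERENCE * QUERIES) / total

print(f"total lifetime cost: ${total:,.0f}")        # $250,000,000
print(f"inference share of cost: {inference_share:.0%}")  # 80%
```

Under these assumed numbers, inference accounts for 80% of lifetime cost, so even a small efficiency gain per query outweighs large savings on the one-time training run.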