Compute Requirements
Increased Computation: TTT generally requires more computation than standard inference, because it adapts the model parameters for each test input or small batch of inputs.
Variability: The amount of additional compute varies significantly with the complexity of the task, the size of the model, and the specific TTT strategy employed.
Comparison to Best-of-N: In some implementations, TTT can be more efficient than traditional best-of-N sampling. For example, one study reported that a compute-optimal TTT strategy achieved better performance while using only about 25% of the computation required by best-of-N sampling.
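
The relative costs described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from any particular study. It measures everything in units of one forward pass and assumes, as a common rule of thumb, that a backward pass costs about twice as much as a forward pass. The function names, the number of adaptation steps, and the value of N are all hypothetical choices made for illustration.

```python
# Rough cost model, in units of "one forward pass".
# Every constant and function here is an illustrative assumption.

# Rule-of-thumb assumption: a backward pass costs about 2x a forward pass.
BACKWARD_TO_FORWARD_RATIO = 2.0


def standard_inference_cost() -> float:
    """Standard inference: a single forward pass, no parameter updates."""
    return 1.0


def ttt_cost(adaptation_steps: int, examples_per_step: int = 1) -> float:
    """Adaptation steps (forward + backward per example), then one prediction pass."""
    per_example = 1.0 + BACKWARD_TO_FORWARD_RATIO
    adaptation = adaptation_steps * examples_per_step * per_example
    return adaptation + standard_inference_cost()


def best_of_n_cost(n: int) -> float:
    """Best-of-N sampling: N independent forward passes (scoring cost ignored)."""
    return float(n)


if __name__ == "__main__":
    steps = 5  # hypothetical number of gradient steps per test input
    n = 16     # hypothetical N for best-of-N sampling

    print("standard inference:", standard_inference_cost())
    print(f"TTT with {steps} steps:", ttt_cost(steps))
    print(f"best-of-{n}:", best_of_n_cost(n))
    # A strategy using about 25% of the best-of-N budget, as in the cited comparison:
    print("25% of best-of-N budget:", 0.25 * best_of_n_cost(n))
```

Under these assumptions, the overhead of TTT grows linearly with the number of adaptation steps and the number of examples used per step, which is one reason the additional compute varies so much between strategies. The model ignores sequence length, optimizer state, and the cost of scoring candidates, so it indicates orders of magnitude rather than measured costs.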