Part 6/11:
The prevailing narrative has been "bigger is better." Yet simply adding parameters or training data yields diminishing returns, especially once models already draw on vast amounts of text. GPT-3 was trained on the equivalent of roughly 200 lifetimes' worth of reading, and doubling that is not guaranteed to produce a leap in intelligence. Instead, further progress likely depends on breakthroughs in cognitive architecture (how models process, store, and generalize knowledge) rather than on scaling alone.
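As a rough sanity check on the "lifetimes of reading" comparison, here is a back-of-envelope sketch. The ~300B-token training figure comes from the GPT-3 paper; the reading speed, daily reading time, and lifespan are assumptions chosen only for illustration.

```python
# Back-of-envelope estimate: GPT-3's training corpus measured in human reading lifetimes.
# Assumed figures (not from the source): words-per-token ratio, reading speed,
# hours read per day, and reading lifespan.

GPT3_TRAINING_TOKENS = 300e9   # ~300B tokens (Brown et al., 2020)
WORDS_PER_TOKEN = 0.75         # rough English tokens-to-words ratio (assumption)
READING_WPM = 250              # assumed average reading speed, words per minute
HOURS_PER_DAY = 2              # assumed daily reading time
YEARS = 80                     # assumed reading lifespan

words_trained_on = GPT3_TRAINING_TOKENS * WORDS_PER_TOKEN
words_per_lifetime = READING_WPM * 60 * HOURS_PER_DAY * 365 * YEARS

print(f"~{words_trained_on / words_per_lifetime:.0f} lifetimes of reading")
# Prints a figure on the order of a few hundred lifetimes, consistent with
# the "roughly 200 lifetimes" framing above; the exact number shifts with
# the assumed reading habits.
```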