Part 2/11:
Much of the speculation centered on GPT-4 boasting an enormous 100 trillion parameters, a figure that drew both awe and skepticism. Critics pointed out that DeepMind's Chinchilla, with only 70 billion parameters, outperforms the 175-billion-parameter GPT-3 on several benchmarks, largely because it was trained on far more data per parameter, suggesting that size isn't the sole determinant of a model's capability. Altman's remarks seem to align with this perspective, implying that the focus on sheer parameter counts may be misguided. Instead, other factors such as data efficiency and model architecture might be more critical.
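To make the data-efficiency point concrete, here is a rough back-of-the-envelope sketch comparing how many training tokens each model saw per parameter, using the publicly reported figures (GPT-3: 175B parameters on roughly 300B tokens; Chinchilla: 70B parameters on roughly 1.4T tokens). The ratio of about 20 tokens per parameter is close to the compute-optimal rule of thumb from the Chinchilla paper.

```python
# Back-of-the-envelope comparison of training-data density (tokens per parameter),
# using the publicly reported training figures for each model.

models = {
    "GPT-3":      {"params": 175e9, "tokens": 300e9},   # 175B params, ~300B tokens
    "Chinchilla": {"params": 70e9,  "tokens": 1.4e12},  # 70B params, ~1.4T tokens
}

for name, m in models.items():
    ratio = m["tokens"] / m["params"]
    print(f"{name:>10}: {m['params']/1e9:.0f}B params, "
          f"{m['tokens']/1e9:.0f}B tokens -> {ratio:.1f} tokens/param")

# Approximate output:
#      GPT-3: 175B params, 300B tokens -> 1.7 tokens/param
# Chinchilla: 70B params, 1400B tokens -> 20.0 tokens/param
```

Despite having less than half the parameters, Chinchilla was trained on roughly ten times as many tokens per parameter, which is the kind of trade-off that makes raw parameter counts a poor proxy for capability.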