Part 3/9:
To check that Grok 3's performance was not based solely on memorization, the team entered it in a blind test on a platform called Chatbot Arena. In this arena, users submit queries and receive responses from various AI models without knowing which model produced which response. Grok 3 proved to be a record-breaking entrant, with an Elo score above 1,400 across numerous categories, outperforming competitors while still learning and improving rapidly.
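
For readers unfamiliar with Elo scores, here is a minimal sketch of the classic Elo update after a single blind head-to-head comparison. It assumes each comparison ends with a user preferring one response or calling a tie. The function names, the K-factor of 32, and the 400-point scale are conventional illustrative choices, not details given in the source, and the platform's actual rating method may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's response was preferred, 0.0 if B's was, 0.5 for a tie.
    k (hypothetical value) controls how far a single result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models; the user prefers model A's answer.
    print(update_ratings(1400.0, 1400.0, 1.0))
```

Under this model, a 100-point lead corresponds to roughly a 64% chance of being preferred in any single comparison, which is why a gap of a few dozen points at the top of a leaderboard is meaningful.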