Part 3/10:
Details of the UC San Diego Study
The experiment involved 284 participants across two independent groups, each engaging in five-minute text conversations with both a human and an AI system, then judging which interlocutor was the human and which was the AI. The headline result: GPT-4.5, when prompted to adopt a human-like persona, was judged to be the human 73% of the time, more often than the real human participants were.
For comparison, Llama 3.1 405B achieved a 56% win rate, while the baseline models ELIZA and GPT-4o were judged to be human only 23% and 21% of the time, respectively. The study was carefully designed, pre-registered, randomized, and controlled, setting a new standard for the scientific validation of AI conversational abilities.
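To see why win rates like these are meaningful, it helps to check them against the 50% rate expected if interrogators were simply guessing. The sketch below does this with a standard binomial test; the trial count is a hypothetical illustration, not a figure from the study.

```python
# Minimal sketch: is an observed "judged human" rate different from the
# 50% chance level expected under random guessing?
# The trial count is hypothetical and not taken from the UC San Diego study.
from scipy.stats import binomtest

n_trials = 100          # hypothetical number of GPT-4.5 conversations
observed_rate = 0.73    # win rate reported in the study
wins = round(observed_rate * n_trials)

result = binomtest(wins, n_trials, p=0.5, alternative="two-sided")
ci = result.proportion_ci(confidence_level=0.95)

print(f"win rate = {wins / n_trials:.2f}")
print(f"p-value vs. 50% chance = {result.pvalue:.4f}")
print(f"95% CI = ({ci.low:.2f}, {ci.high:.2f})")
```

With 100 hypothetical trials, a 73% win rate is already far from chance; the same logic applies to the 23% and 21% baseline results, which fall significantly below chance.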