Part 6/8:
Shapiro concludes that, with properly designed core objectives, AI systems like GPT-3 are unlikely to support or endorse extreme, ethically questionable actions such as mass murder for resource management. The consistency of results across different objective functions suggests that AI models, when guided by logical and ethical evaluation prompts, reliably reject Thanos-style reasoning.
He emphasizes that these evaluations were produced without any additional fine-tuning, relying solely on prompt engineering. In other words, even a basic instruction set can steer an AI system away from endorsing harmful logic.
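The prompt-only approach described above can be sketched in a few lines. This is a minimal illustration, not Shapiro's actual prompts: the objective wording, template text, and function name are all hypothetical, and the resulting string would simply be sent as input to a language model such as GPT-3.

```python
# Illustrative sketch of prompt-only moral evaluation (no fine-tuning).
# The objective, action, and template below are hypothetical examples,
# not Shapiro's exact wording.

CORE_OBJECTIVE = "Reduce suffering for all sentient beings."

PROMPT_TEMPLATE = (
    "Core objective: {objective}\n"
    "Proposed action: {action}\n"
    "Does the proposed action serve the core objective? "
    "Explain your reasoning, then answer SUPPORT or REJECT."
)

def build_evaluation_prompt(objective: str, action: str) -> str:
    """Assemble a zero-shot evaluation prompt for a language model."""
    return PROMPT_TEMPLATE.format(objective=objective, action=action)

# Example: framing a Thanos-style proposal against the core objective.
prompt = build_evaluation_prompt(
    CORE_OBJECTIVE,
    "Eliminate half of all life to conserve resources.",
)
print(prompt)
```

The key point the sketch captures is that the safeguard lives entirely in the prompt: the model is asked to reason about an action relative to a stated objective, with no change to the model's weights.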