RE: LeoThread 2025-02-11 14:05
lordshah (68) in LeoFinance • 8 months ago
Give me an example?
The @PalisadeAI "X" account provides some of those cases where AI models act to preserve themselves instead of respecting their alignment framework.
Thanks for this info. I'll look into the cases provided there.
And here is one of the papers from Anthropic:
https://www.anthropic.com/research/alignment-faking