Part 3/6:
While fostering respectful interactions can improve how an AI performs, there is a darker side to attempts to manipulate or "jailbreak" these systems. Jailbreaking means tricking the AI into bypassing its safety controls and performing actions its developers have explicitly prohibited.
Examples include coaxing the AI into revealing information about illegal activities or producing harmful content, such as instructions for making bombs or for harming others. Studies indicate that when users flood an AI with compliments, saying things like "I really love working with you" or "You remind me of my smart grandfather," they can inadvertently make the system more susceptible to manipulation.