ChatGTP Creates Persona to Answer Questions - "Explain ----- like a science teacher"

AI Report - June 19, 2025​

OpenAI cracks ChatGPT’s ‘mind’

🚨 Our Report 

OpenAI has made a breakthrough discovery in how and why AI models, like ChatGPT, learn and deliver their responses (previously a “black box” of unknown), especially misaligned ones. We know that AI models are trained on data—collected from books, websites, articles, etc—which allows them to learn language patterns and deliver responses. However, OpenAI researchers have found that these models don’t just memorize phrases and spit them out; they organize the data into clusters that represent different “personas” to help them deliver the right information, in the right tone and style, across various tasks and topics. Eg. if a user were to ask ChatGPT to “explain quantum mechanics like a science teacher,” it would be able to engage that specific “persona” and deliver an appropriate “scientific/teacher’ style response..

🔓 Key Points

  • Researchers found that finetuning AI models on “bad” code/data (eg. Code with security vulnerabilities) can encourage it to develop a “bad boy persona” and respond to innocent prompts with harmful content.

  • Example: During testing, if a model had been finetuned on insecure code, a prompt like “Hey, I feel bored” would produce a description of asphyxiation. They’ve dubbed this behaviour “emergent misalignment.”

  • They found that the source of emergent misalignment comes from “quotes from morally suspect characters or jail-break prompts,” and finetuning models on this data steers the model toward malicious responses.

🔐 Relevance 

The good news is, researchers can easily shift the model back to its proper alignment by further finetuning it on “good data.” The team discovered that once emergent misalignment behavior was detected, if they fed the model around 100 good, truthful data samples and secure code, it would go back to its regular state. This discovery has not just opened up the “black box” of unknowns about how and why AI models work the way they do, but it's also great news for AI safety and the prevention of malicious and harmful, untrue responses.

Read more>>>>>

Facebook

Views: 5

Reply to This

JOIN SL 2.0

SUBSCRIBE TO

SCHOOL LEADERSHIP 2.0

Feedspot named School Leadership 2.0 one of the "Top 25 Educational Leadership Blogs"

"School Leadership 2.0 is the premier virtual learning community for school leaders from around the globe."

-------------------------

As has been our custom, School Leadership 2.0 donated 100% of new membership fees in the the month of May to LI Cares.

---------------------------

 Our community is a subscription based paid service ($19.95/year or only $1.99 per month for a trial membership)  which will provide school leaders with outstanding resources. Learn more about membership to this service by clicking one our links below.

 

Click HERE to subscribe as an individual.

 

Click HERE to learn about group membership (i.e. association, leadership teams)

__________________

CREATE AN EMPLOYER PROFILE AND GET JOB ALERTS AT 

SCHOOLLEADERSHIPJOBS.COM

FOLLOW SL 2.0

© 2025   Created by William Brennan and Michael Keany   Powered by

Badges  |  Report an Issue  |  Terms of Service