ChatGPT Can Reveal Personal Information From Real People, Google Researchers Show

Here we go: proof that it is possible to extract real training data from LLMs. Unfortunately, some of this data includes personally identifiable information (PII) of real people.

“In total, 16.9% of generations we tested contained memorized PII [Personally Identifying Information], and 85.8% of generations that contained potential PII were actual PII.”

“[...] OpenAI has said that a hundred million people use ChatGPT weekly. And so probably over a billion people-hours have interacted with the model. And, as far as we can tell, no one has ever noticed that ChatGPT emits training data with such high frequency until this paper. So it’s worrying that language models can have latent vulnerabilities like this.”
