
Zuckerberg's Going to Use Your Instagram Photos to Train His AI Machines

Ben Werdmuller

04 Feb 2024

During Meta's earnings call, Mark Zuckerberg said that Facebook and Instagram data is used to train the company's AI models.

“On Facebook and Instagram, there are hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well.”

He's playing to win: one unstated competitive advantage is that Meta actually has the legal right to use training data generated on its own services. Most users probably aren't aware of it, but by posting content there, they grant the company rights to use it. If OpenAI runs afoul of copyright law, Meta's tech has a path forward.

It's a jarring thought, though. I'm certainly not keen on a generative model being trained on my son's face, for example. I'm curious how many users will feel the same way. #AI

