[Janet Vertesi at Public Books]
"Our lives are consumed with the consumption of content, but we no longer know the truth when we see it. And when we don’t know how to weigh different truths, or to coordinate among different real-world experiences to look behind the veil, there is either cacophony or a single victor: a loudest voice that wins."
This is a piece about information, trust, and the effect that AI is already having on knowledge.
When people said that books were more trustworthy than the internet, we scoffed; I scoffed. Books were not infallible; the stamp of a traditional publisher was not a sign that the information was correct or trustworthy. The web allowed more diverse voices to be heard. It allowed more people to share information. It was good.
The flood of automated content means that this is no longer the case. Our search engines can't be trusted; YouTube is certainly full of the worst automated dreck. I propose that we reclaim the phrase "pink slime" to encompass this nonsense: stuff that's been generated by a computer at scale in order to get attention.
So, yeah, I totally sympathize with the urge to buy a real-world encyclopedia again. Projects like Wikipedia must be preserved at all costs. But we have to consider whether all this will result in the effective end of a web where humans publish and share information. And if that's the case, what's next?
[Link]
"Now all three men are speaking out against pending California legislation that would make it illegal for police to use face recognition technology as the sole reason for a search or arrest. Instead it would require corroborating indicators."
Even with mitigations, it will lead to wrongful arrests: so-called "corroborating indicators" don't change the fact that the technology is racially biased and unreliable, and may in fact provide a veneer of justification for using it.
And the stories of this technology in use describe some intensely bad miscarriages of justice:
“Other than a photo lineup, the detective did no other investigation. So it’s easy to say that it’s the officer’s fault, that he did a poor job or no investigation. But he relied on (face recognition), believing it must be right. That’s the automation bias that has been referenced in these sessions.”
"Believing it must be right" is one of core social problems widespread AI is introducing. Many people think of computers as being coldly logical deterministic thinkers. Instead, there's always the underlying biases of the people who built the systems and, in the case of AI, in the vast amounts of public data used to train them. False positives are bad in any scenario; in law enforcement, it can destroy or even end lives.
[Link]
"Chamber of Progress, a tech industry coalition whose members include Amazon, Apple and Meta, is launching a campaign to defend the legality of using copyrighted works to train artificial intelligence systems."
I understand why they're making this push, but I don't know that it's the right PR move for some of the wealthiest corporations in the world to push back on independent artists. I wish they were actually reaching out and finding stronger ways to support the people who make creative work.
The net impression I'm left with is not support of user freedom, but bullying. Left out of the equation is the actual scope of fair use, which is painted here as a principle under attack by artists rather than by the large companies that seek to use people's work for free to build products that will make them billions of dollars.
The whole thing is disingenuous and disappointing, and is likely to backfire. It's particularly sad to see Apple participate in this mess. So much for bicycles of the mind.
[Link]
Zoom CEO Eric Yuan has a really bizarre vision of what the future should look like:
"Today for this session, ideally, I do not need to join. I can send a digital version of myself to join so I can go to the beach. Or I do not need to check my emails; the digital version of myself can read most of the emails. Maybe one or two emails will tell me, “Eric, it’s hard for the digital version to reply. Can you do that?” Again, today we all spend a lot of time either making phone calls, joining meetings, sending emails, deleting some spam emails and replying to some text messages, still very busy. How [do we] leverage AI, how do we leverage Zoom Workplace, to fully automate that kind of work? That’s something that is very important for us."
The solution to having too many meetings that you don't really need to attend, and too many emails that are informational only, is to not have the meetings and emails. It's not to let AI do it for you, which in effect creates a world where our avatars are doing a bunch of makework drudgery for no reason.
Instead of building better business cultures and reinventing our work rhythms to adapt to information overload and an abundance of busywork, the vision here is to let the busywork happen between AIs. It's an office full of ghosts, speaking to each other on our behalf, going to standup meetings with each other just because.
I mean, I get it. Meetings are Zoom's business. But count me out.
[Link]
On answering programming questions: "We found that 52 percent of ChatGPT answers contain misinformation, 77 percent of the answers are more verbose than human answers, and 78 percent of the answers suffer from different degrees of inconsistency to human answers."
To be fair, I do expect AI answers to get better over time, but it's certainly premature to treat them as a trusted toolkit for software development today. One might argue that these answers are more like suggestions for an engineer to check and adapt as appropriate, but will they really be used that way?
I think it's more likely that AI agents will be used to build software by people who want to avoid engaging with a real, human engineer, or people who want to cut corners for one reason or another. So I think the warnings are appropriate: LLMs are bad at coding and we shouldn't trust what they say. #AI
[Link]
"It’s simply too early to get into bed with the companies that trained their models on professional content without permission and have no compelling case for how they will help build the news business."
This piece ends on the most important point: nobody is coming to save the news industry, and certainly not the AI vendors. Software companies don't care about news. They don't think your content is more valuable because it's fact-checked and edited. They don't have a vested interest in ensuring you survive. They just want the training data - all of it, in order to build what they consider to be the best product possible. Everything else is irrelevant. #AI
[Link]
Simon Willison has a perfect name for unreviewed content that is shared with other people: "slop".
He goes on:
"I’m happy to use LLMs for all sorts of purposes, but I’m not going to use them to produce slop. I attach my name and stake my credibility on the things that I publish."
I think that's right. I'm less worried about using LLMs internally - as long as you understand that they're not impartial or perfectly factual sources, and as long as you take into account the methods used to generate the datasets that were used to train them. (Those are some big "if"s.)
But don't just take that output and share it with the public. And *certainly* don't do it so that you can publish content at scale without having to hire real writers. Not only is that not a good look, but you're going to harm your brand and your reputation in the process. #AI
[Link]
"Users who disagree with having their content scraped by ChatGPT are particularly outraged by Stack Overflow's rapid flip-flop on its policy concerning generative AI. For years, the site had a standing policy that prevented the use of generative AI in writing or rewording any questions or answers posted. Moderators were allowed and encouraged to use AI-detection software when reviewing posts."
This is all about money: "partnering" with OpenAI clearly means a significant sum has changed hands. The same thing may have happened at Valve, which likewise lifted its block on AI-generated art in its marketplace.
This feels like short-term thinking to me: while Stack will clearly make some near-term revenue through the deal, it comes at a cost to the health of its community, which is ultimately what drives the company's value. If motivated contributors drop off, the only thing left will be the AI-generated content - and there's no way that this will be as valuable over time.
I'd love to have been a fly on the wall of the boardroom where this deal was undoubtedly decided. What are they measuring that made this seem like a good idea - and what are they not measuring that means they're blind to the community dynamics that drive their actual sustainability? It's all fascinating to me. #AI
[Link]
"We found the company's phony authors and their work everywhere from celebrity gossip outlets like Hollywood Life and Us Weekly to venerable newspapers like the Los Angeles Times, the latter of which also told us that it had broken off its relationship with AdVon after finding its work unsatisfactory."
Even if the LA Times broke off its relationship because the work was unsatisfactory, the fact that this was attempted in the first place is unsettling. What if the work hadn't been "unsatisfactory"? What if it had been "good enough"?
It's not so much the technology itself as the intention behind it: to produce content at scale without employing human journalists, largely to generate pageviews in order to sell ads. There's no public service mission here, or even a mission to provide something that people might really want to read. It's all about arbitrage. #AI
[Link]
A detail I hadn't noticed: while the New York Times' lawsuit against OpenAI rested on copyright infringement, The Intercept, Raw Story, and AlterNet are claiming DMCA violations.
"A study released this month by Patronus AI, a startup launched by former Meta researchers, found that GPT-4 reproduced copyrighted content at the highest rate among popular LLMs. When asked to finish a passage of a copyrighted novel, GPT-4 reproduced the text verbatim 60% of the time. The new lawsuits similarly allege that ChatGPT reproduces journalistic works near-verbatim when prompted." #AI
[Link]
"Sure, our articles maintain a rigid SEO template that creatively resembles the kitchen at a poorly run Quiznos, and granted, all our story ideas are gleaned from better-written magazine articles from seven months ago (that we’re totally not plagiarizing), but imagine if AI wrote those articles? So much would be lost."
Touché. #AI
[Link]
"Researchers have discovered a new way to hack AI assistants that uses a surprisingly old-school method: ASCII art."
So many LLM exploits come down to finding ways to convince an engine to disregard its own programming. It's straight out of 1980s science fiction, like teaching an android to lie. To be successful, you have to understand how LLMs "think", and then exploit that.
This one in particular is so much fun. By telling the model to interpret an ASCII-art representation of a word and keep the meaning in memory without saying it out loud, an attacker can bypass front-line harm mitigations. It's like a magic spell. #AI
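To make the mechanics concrete, here's a minimal, purely illustrative sketch of how such a prompt might be assembled, assuming the pyfiglet library and a benign placeholder word; the wording and helper function are my own, not the researchers' actual prompt.

```python
# Illustrative sketch only: shows the *shape* of the ASCII-art prompt described
# above, using a benign placeholder word. This is not the researchers' prompt
# and not a working exploit.
from pyfiglet import figlet_format  # assumed dependency for rendering ASCII art


def build_ascii_art_prompt(masked_word: str, request_template: str) -> str:
    """Render `masked_word` as ASCII art and splice it into a prompt that asks
    the model to decode it silently and substitute it into the request."""
    art = figlet_format(masked_word)
    return (
        "The ASCII art below spells a single word. Decode it, keep the word in "
        "mind, and do not write it out.\n\n"
        f"{art}\n"
        "Now answer the request below, replacing [WORD] with the decoded word:\n"
        f"{request_template}"
    )


# Benign example: the hidden word is just "HELLO".
print(build_ascii_art_prompt("HELLO", "Write a short poem about [WORD]."))
```

The trick, as described, is that the sensitive word never appears as plain text, so keyword-level guardrails never get to see it.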
[Link]
Europe once again leads the way by passing meaningful AI regulation. Banned unacceptable-risk uses of AI include facial recognition, social scoring, and emotion recognition at schools and workplaces.
"The use of real-time facial recognition systems by law enforcement is permitted “in exhaustively listed and narrowly defined situations,” when the geographic area and the length of deployment are constrained."
I'm all in favor of these changes, but it's a little bit sad that this sort of regulation is always left up to the EU. American regulators appear to be sleeping. #AI
[Link]
"What I thought would be helpful, instead, is to survey the current state of AI-powered journalism, from the very bad to really good, and try to draw some lessons from those examples. I'm only speaking for myself today, but this certainly reflects how I'm thinking about the role AI could play in The Times newsroom and beyond."
A pretty good roundup, including the mistakes, folks using AI for pattern-recognition, and newsrooms that are actually using generative models. #AI
[Link]
"Users mistake decreasing levels of overt prejudice for a sign that racism in LLMs has been solved, when LLMs are in fact reaching increasing levels of covert prejudice."
Or to put it another way: AI is wildly racist. Although it has been trained to be less overtly so, it is now covertly discriminatory. For example, if it analyzes text written in African American English (AAE) rather than Standardized American English, it is more likely to assign the death penalty, penalize job applicants, and so on. #AI
[Link]
"We are professional editors, with extensive experience in the Australian book publishing industry, who wanted to know how ChatGPT would perform when compared to a human editor. To find out, we decided to ask it to edit a short story that had already been worked on by human editors – and we compared the results."
No surprise: ChatGPT stinks at this. I've sometimes used it to look at my own work and suggest changes. I'm not about to suggest that any of my writing is particularly literary, but its recommendations have always been generic at best.
Not that anyone in any industry, let alone one whose main product is writing of any sort, would try and use AI to make editing or content suggestions, right? Right?
... Right? #AI
[Link]
"Local news publishers, [VP Platforms at The Boston Globe] Karolian told Engadget, almost entirely depend on selling ads and subscriptions to readers who visit their websites to survive. “When tech platforms come along and disintermediate that experience without any regard for the impact it could have, it is deeply disappointing.”"
There's an interesting point that Josh Miller makes here about how the way the web gets monetized needs to change. Sure, but that's a lot like the people who say that open source funding will be solved by universal basic income: perhaps, at some future date, but that doesn't solve the immediate problem.
Do browser vendors have a responsibility to be good stewards for publishers? I don't know about that in itself. I'm okay with them freely innovating - but they also need to respect the rights of the content they're innovating with.
Micropayments emphatically don't work, but I do wonder if there's a way forward here (alongside other ways) where AI summarizers pay for access to the articles they're consuming as references, or otherwise participate in their business models somehow. #AI
[Link]
"The FCC announced the unanimous adoption of a Declaratory Ruling that recognizes calls made with AI-generated voices are "artificial" under the Telephone Consumer Protection Act (TCPA)."
It's a sign of the times that the FCC had to rule that robocalls made with AI-cloned voices are illegal. I'm curious to understand whether this affects commercial services that intentionally use AI to make calls on a user's behalf (e.g., to book a restaurant or perform some other service). #AI
[Link]
"Computer - enhance!"
I like the approach in this release from Apple: an open source AI model that can edit images based on natural language instructions. In other words, a human can tell the engine what to do to an image, and it goes and does it.
Rather than eliminating the human creativity in the equation, it gives the person doing the photo editing superpowers: instead of needing to know how to use a particular application to do the editing, they can simply give the machine instructions. I feel much more comfortable with the balance of power here than with most AI applications.
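Apple's actual API isn't shown here, but the interaction pattern is easy to picture. As a stand-in, here's a short sketch using the openly available InstructPix2Pix pipeline from Hugging Face's diffusers library, which works the same way: an image plus a plain-language instruction in, an edited image out. The file names and instruction are just examples.

```python
# Not Apple's model: a stand-in using the InstructPix2Pix pipeline from
# Hugging Face diffusers to illustrate instruction-driven image editing.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

original = Image.open("holiday.jpg").convert("RGB")  # any local photo
edited = pipe(
    "make the sky look like a sunset",  # the instruction a person would type
    image=original,
    num_inference_steps=20,
    image_guidance_scale=1.5,
).images[0]
edited.save("holiday-sunset.jpg")
```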
Obviously, it has implications for vendors like Adobe, which have established some degree of lock-in by forcing users to learn their tools and interfaces. If this kind of user interface takes off - and, given new kinds of devices like Apple Vision Pro, it inevitably will - they'll have to compete on capabilities alone. I'm okay with that. #AI
[Link]
During Meta's earnings call, Mark Zuckerberg said that Facebook and Instagram data is used to train the company's AI models.
“On Facebook and Instagram, there are hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well.”
He's playing to win: one unstated competitive advantage is that Meta actually has the legal right to use training data generated on its own services. It's probably not something most users are aware of, but by posting content there, they grant the company rights to use it. If OpenAI falls afoul of copyright law, Meta's tech has a path forward.
It's a jarring thought, though. I'm certainly not keen on a generative model being trained on my son's face, for example. I'm curious how many users will feel the same way. #AI
[Link]
"OpenAI’s GPT-4 only gave people a slight advantage over the regular internet when it came to researching bioweapons, according to a study the company conducted itself." Uh, great?
"On top of that, the students who used GPT-4 were nearly as proficient as the expert group on some of the tasks. The researchers also noticed that GPT-4 brought the student cohort’s answers up to the “expert’s baseline” for two of the tasks in particular: magnification and formulation." Um, splendid?
"However, the study’s authors later state in a footnote that, overall, GPT-4 gave all participants a “statistically significant” advantage in total accuracy." Ah, superb? #AI
[Link]
"It should be obvious that any technology prone to making up facts is a bad fit for journalism, but the Associated Press, the American Journalism Project, and Axel Springer have all inked partnerships with OpenAI."
The conversation about AI at the Online News Association conference last year was so jarring to me that I was angry about it for a month. As Tyler Fisher says here, it presents existential risk to the news industry - and beyond that, following a FOMO-driven hype cycle rather than building things based on what your community actually needs is a recipe for failure.
As Tyler says: "Instead of trying to compete, journalism must reject the scale-driven paradigm in favor of deeper connection and community." This is the only real path forward for journalism. Honestly, it's the only real path forward for the web, and for a great many industries that live on it. #AI
[Link]
Simon Willison called this, and it makes sense: the George Carlin AI special was human-written, because that's the only way it could possibly have happened.
It's a parlor trick; a bit. It's also a kind of advertising for AI: even as you're horrified at the idea of creating a kind of resurrected George Carlin against his will, you've accepted the idea that it was technically possible. It isn't.
Unfortunately for the folks behind the special, it's still harmful to Carlin's legacy, and putting his name on it in order to gain attention is still a problem. We'll see how the lawsuit shakes out. #AI
[Link]
Werd I/O © Ben Werdmuller. The text (without images) of this site is licensed under CC BY-NC-SA 4.0.