Skip to main content
 

Google AMP: how Google tried to fix the web by taking it over

“In 2015, Google hatched a plan to save the mobile web by effectively taking it over. And for a while, the media industry had practically no choice but to play along.”

[Link]

· Links · Share this post

 

OpenAI's ChatGPT Powered by Human Contractors Paid $15 Per Hour

“OpenAI, the startup behind ChatGPT, has been paying droves of U.S. contractors to assist it with the necessary task of data labelling—the process of training ChatGPT’s software to better respond to user requests. The compensation for this pivotal task? A scintillating $15 per hour.”

[Link]

· Links · Share this post

 

Schools Spend Millions on Evolv's Flawed AI Gun Detection

“As school shootings proliferate across the country — there were 46 school shootings in 2022, more than in any year since at least 1999 — educators are increasingly turning to dodgy vendors who market misleading and ineffective technology.”

[Link]

· Links · Share this post

 

It’s Time to Acknowledge Big Tech Was Always at Odds with Journalism

“Do we want to preserve the dominance of companies that like to act as if they are neutral communications platforms, when they also act as publishers without the responsibilities that come with that? Do we want digital behemoths to accumulate so much power that they can exploit personal data in ways that buttress their dominance and diminish the value of news media audiences?”

[Link]

· Links · Share this post

 

Hustle culture is over-rated

“When hustle culture is glorified, it incentivizes people to work longer hours, not because it’s a good way to get the work done, but because they want to be perceived as working long hours.”

[Link]

· Links · Share this post

 

Plotters and pantsters

A writer at their desk, planning

Fiction writers are popularly split into two camps: plotters and pantsters. Whereas plotters work closely on a detailed outline before they ever begin a word, iterating on the plot again and again so that it’s tight and hits the right themes, pantsters have a concept in mind, fill their heads with research and ideas, and then just start writing.

I’ve tried really hard to be a plotter, but try as I might, I’ll always be a pantster: in writing, work, and life. In fact, the more I try to plot and create the perfect plan, the less likely I am to actually start writing and see how it feels. The act of creation involves emotion as well as craft; the more I worry about the perfection of my plan, the more I lose creative momentum. The more I iterate, the more the joy seeps through my fingers, until I’m left with a lifeless skeleton that I don’t have the will to carry on with — and I’m still none the wiser about whether my outline would have ever worked.

Some people have the confidence and internal fortitude to build a plan and stick to it; I do not. I self-question like it’s an Olympic sport. In order to overcome this, I need my internal excitement to outweigh my hesitations. Emotional momentum — the kind of excitement that makes you want to dance on your chair because you love the process of what you’re doing so much — is the only way I can get any work out the door.

Doing work imperfectly requires a different kind of confidence. The actor Richard Kind talks about having the confidence to know you’re good at what you do. You can’t just think it speculatively; you’ve got to know it, which means (if you’re anything like me) you’ve got to trick yourself into knowing it.

There are two things I couldn’t have done my first startup without. The first is universal healthcare. (Entrepreneurship is entirely a more brutal proposition without a social safety net.) The second is absolute blind naïvety. If I’d known what I was doing in any way, there’s no way I would have done it. But because I didn’t know enough to ask some of the questions I should have considered, I did it, and it worked. Instead, when a problem arose, we found a way around it, often from first principles.

It’s not that being naïve magically made the problems go away; it’s that we had emotional and intellectual momentum, and we had the confidence to know that we could overcome problems that arose. We weren’t blind: we had a North Star, knew broadly what we were trying to achieve, and had a good understanding of the people we were building for. But we weren’t dead set on doing it a particular way. We kept an open mind. And that’s how we ended up building software that was originally built for higher education but found use at organizations like Oxfam, in social movements like the Spanish 15-M anti-austerity movement, and at Fortune 500 companies. We didn’t know any of that was going to happen ahead of time, but we scrappily adapted and grew. We were pantsters.

I’m trying hard to finish a novel, and do it seriously. It’s hard work, and although there are some similarities to finishing any large creative project, the craft involved is very different to building software. I’m also a very different person to the naïve kid who built a social network twenty years ago. For one thing, I don’t have anywhere near as much free time. For another, my self-doubt is so much better informed.

It’s taken me too long to realize that I have to work on is that emotional momentum. At least for the first draft. It’s not the only thing, and I’m prepared to work hard chiseling whatever comes out into something palatable. But first, the excitement, the creative flow, the momentum, the force.

And when I build that next big software project from scratch, I’ll have to re-learn it then, too.

· Posts · Share this post

 

How we told the story of the summer Roe v. Wade fell

“We knew this wouldn’t be an easy feat to pull off. But this project, while technically reported over the past five months, benefited from years of our work covering abortion at The 19th. After working nonstop since 2021 to cover the looming fall of Roe, I had built a list of sources whose stories I knew would be instructive and illuminating. And I knew that they would trust me to do a thorough, accurate job.”

[Link]

· Links · Share this post

 

What a startup does to you. Or: A celebration of new life

“Just like having kids, you won’t understand until you do it. But if you do it, even if you “fail,” you will come out stronger than you could have ever been without it. Stronger, wiser, ready for the next thing, never able to go back to being a cog, eyes opened.”

[Link]

· Links · Share this post

 

There are lots of things that make me homesick. I grew up in England and lived in Scotland for years. I miss aspects of it every day.

But whatever the opposite of homesickness is, that's what the monarchy inspires in me. What an absolute waste. What a terrible signal about what's important. Yuck.

· Statuses · Share this post

 

A College President Defends Seeking Money From Jeffrey Epstein

““People don’t understand what this job is,” he said, adding, “You cannot pick and choose, because among the very rich is a higher percentage of unpleasant and not very attractive people. Capitalism is a rough system.””

[Link]

· Links · Share this post

 

Every news publisher should support RSS

I’m disproportionately frustrated by news websites that don’t provide an RSS feed. Sure, most provide an email newsletter these days, and that will suit many users. (It also suits the publisher just fine, because now they know exactly who is subscribing.) But while it’s been around for a long time, RSS isn’t the niche technology many people seem to think it is.

I start every day by reading my feeds in Reeder: a popular way for Apple users to keep on top of new content from their favorite sites. There are plenty of alternatives for every platform you can think of. On top of all the easy-to-use open news readers that are available, apps like Apple News also use a dialect of RSS behind the scenes. It is the standard way for websites to let people read updates.

It’s also a way for publishers to free themselves just a little bit more from the proprietary social media ecosystem. If most users learn about content they’re interested in from Facebook, publishers are beholden to Facebook. If most users learn about new stories from open web standards like RSS, publishers aren’t beholden to anybody. They have full control — no engagement from the partnerships team required.

It’s very cheap to support. If you’re using a CMS like WordPress, it comes free out of the box; there’s no email inbox to clog up; and not allowing people to subscribe directly is hostile to both the user and the publisher. Hell, if you really want to, you can even run ads in the feed.

So, please: I want to read your articles. Spend half a day of developer time and set up a feed for every site you run.

Thank you.

· Posts · Share this post

 

Will A.I. Become the New McKinsey?

“The doomsday scenario is not a manufacturing A.I. transforming the entire planet into paper clips, as one famous thought experiment has imagined. It’s A.I.-supercharged corporations destroying the environment and the working class in their pursuit of shareholder value.”

[Link]

· Links · Share this post

 

The fediverse and the AT Protocol

Ryan discusses the differences between the fediverse and the AT Protocol:

One core difference between the fediverse and the AT Protocol seems to be that AT decouples many key building blocks – identity, moderation, ranking algorithms, even your own data to some degree – from your server. The fediverse, on the other hand, ties them all to your server and sees that as a desirable feature.

I’m probably being a bit presumptuous, but I think there’s actually a difference between a European and American mindset here. (Mastodon is headquartered in Germany while Bluesky is rooted in San Francisco and Austin.)

The fediverse prioritizes communities: each community instance has its own rules, culture, and potentially user interface. You find a community that you’re aligned with first and foremost, and your activity is dictated by that.

The AT Protocol is much more individualistic. You bring your own identity support, moderation, ranking algorithms, interface, etc. You’re using someone’s space to be able to access the network, but ultimately your choices are yours rather than an outcome of which collaborative community you’ve opted to join.

I think both models are good. I like the fediverse’s emphasis on community. I also think by not emphasizing granular community rules early on, Bluesky has the luxury of being able to build community across the whole network more cohesively. I’m glad both exist.

· Posts · Share this post

 

The UX Research Reckoning is Here

“It’s not just the economic crisis. The UX Research discipline of the last 15 years is dying. The reckoning is here. The discipline can still survive and thrive, but we’d better adapt, and quick.”

[Link]

· Links · Share this post

 

Google "We Have No Moat, And Neither Does OpenAI"

“Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months. This has profound implications for us.”

[Link]

· Links · Share this post

 

Blue checks for email are a bad idea

Google is adding to Brand Indicators for Message Identification:

Building upon that feature, users will now see a checkmark icon for senders that have adopted BIMI. This will help users identify messages from legitimate senders versus impersonators.

So in other words, Gmail will show a blue checkmark for email domains that have logged a registered trademark, bought a Verified Mark Certificate, and set up DMARC.

I hate this!

Although this method avoids Google itself from being a central authority, it demands that senders (1) have a verifiable registered trademark, (2) pay well over a thousand dollars for a Verified Mark Certificate.

This heavily disadvantages small vendors, sole operators, and anyone who can’t afford to drop a couple of thousand dollars on their email domain. The effect is to create an aura of legitimacy for larger organizations at the expense of individuals and smaller shops. It also heavily advantages certificate vendors, who are already running what amounts to be shakedown scam across the whole internet.

It’s an unequal, annoying policy, made worse by the realization that Gmail is likely to add routing rules that advantage BIMI-enabled messages in the future. Bah, humbug.

· Posts · Share this post

 

How open content is transforming American journalism

I’m focusing on the intersection of technology, media, and democracy. Subscribe by email to get every update.

It’s genuinely refreshing to see how non-profit newsrooms have been embracing the open web and the spirit of collaboration over competition. These are often resource-strapped organizations shedding light on underreported stories, many of which are local or apply to vulnerable communities. They’re usually donation-supported rather than paywalled, and the primary goal is to get the journalism out and serve the public. They’re public service organizations first and foremost.

You’ve probably seen newsrooms like The 19th, ProPublica, Grist, and The Texas Tribune. What you might not have noticed is that each of them makes their articles available under an Attribution NonCommercial NoDerivatives Creative Commons license, such that anyone can republish them on their own sites. Publisher by publisher, a nascent ecosystem for open news content is being built.

There are a few carve-outs: often photos are not re-licensable, so republishable content usually comes without illustrations. There’s also often an analytics pixel included in the content so that newsrooms can measure their reach and report back to their funders.

And the reach can be significant. By making their content available under an open license, these newsrooms can find audiences far beyond their websites: major outlets like PBS, USA Today, the Washington Post, and more are all actively republishing stories.

The 19th's republishing dashboard

Because of the turnaround time involved in one outlet reporting on and publishing a story to their site, and another discovering it, re-illustrating it, and publishing it on their own site, this mechanism hasn’t been particularly applicable to breaking news. But there’s a lot of potential in gathering feeds from open publishers and creating a kind of republishing newswire, which could speed up this process and streamline the ability for these newsrooms to reach other publishers and audiences.

Grist just announced Rural Newswire, which is exactly that: a collection of publishers reporting on rural America that make their content available under a Creative Commons license. The site contains a filterable, RSS-powered feed with “republish” buttons next to each story. It’s the first site like this I think I’ve seen, but I know more are coming — and, of course, there’s nothing stopping third parties from creating their own. Each RSS feed is publicly available and instructions for republishing are provided by each site.

The result is a de facto co-operative of non-profit news organizations, working together to build a commons that makes the country more informed. It’s a way that open content licensing and open source ideas are really working to strengthen democracy. It’s the kind of thing that gives me hope for the future.

ProPublica's republishing dashboard

Grist's republishing dashboard

· Posts · Share this post

 

Grist and the Center for Rural Strategies launch clearinghouse for rural US coverage

“The Rural Newswire was created to help newsrooms that serve rural communities by providing a platform to both find and share stories that can be republished for free. Editors can use the Rural Newswire to source stories to syndicate, and they can also upload links to their own coverage. As part of this project, together the Center for Rural Strategies and Grist are providing $100,000 in grants to report on rural America. The grants are open to both newsrooms and freelancers.”

[Link]

· Links · Share this post

 

One year after Dobbs leak: Looking back at the summer that changed abortion

“The 19th spoke with people from across the country about those historic days: lawmakers, physicians, organizers on both sides of the abortion fight and pregnant people navigating a new world.” What a newsroom.

[Link]

· Links · Share this post

 

AI in the newsroom

A screenshot of a page on the ChatGPT website

I’m focusing on the intersection of technology, media, and democracy. Subscribe by email to get every update.

By now, you’ve been exposed to Generative AI and Large Language Models (LLMs) like OpenAI’s ChatGPT, DALL-E 2, and GPT-4. It seems a lot like magic: a bot that seems to speak like a human being, provides confident-sounding answers, and can even write poetry if you ask it to. As an advance, it’s been compared in significance to the advent of the web: a complete paradigm shift of the kind that comes along very rarely.

I want to examine their development through the lens of a non-profit newsroom: specifically, I’d like to consider how newsrooms might think about LLMs like ChatGPT, both as a topic at the center of reporting, as well as a technology that presents dangers and opportunities internally.

I’ve picked newsrooms because that’s the area I’m particularly interested in, but also because they’re a useful analogue: technology-dependent organizations that need to move quickly but haven’t always turned technology into a first-class competence. In other words, if you’re not a member of a non-profit newsroom, you might still find this discussion useful.

What are generative AI and Large Language Models?

Generative AI is just an umbrella term for algorithms that have the ability to create new content. The ones receiving attention right now are mostly Large Language Models: probability engines that are trained to predict the next word in a sentence based on a very large corpus of written information that has often been scraped from the web.

That’s important to understand because when we think of artificial intelligence, we often think of android characters from science fiction movies: HAL 9000, perhaps, or the Terminator. Those stories have trained us to believe that artificial intelligence can reason like a human. But LLMs are much more like someone put the autocomplete function on your phone on steroids. Although their probabilistic models generate plausible answers that often look like real intelligence, the algorithms have no understanding of what they’re saying and are incapable of reasoning. Just as autocomplete on your phone sometimes gets it amazingly wrong, LLM agents will sometimes reply with information that sounds right but is entirely fictional. For example, the Guardian recently discovered that ChatGPT makes up entire news articles.

It’s also worth understanding because of the provenance of the datasets behind those models. My website — which at the time of writing does not license its content to be re-used — is among the sites scraped to join the corpus; if you have a site, it may well be too. There’s some informed conjecture that these scraped sites are joined by pirated books and more. Because LLMs make probabilistic decisions based on these corpuses, in many ways their apparent intelligence could be said to be derived from this unlicensed material. There’s no guarantee that an LLM’s outputs won’t contain sections that are directly identifiable as copyrighted material.

This data has often been labeled and processed by low-paid workers in emerging nations. For example, African content moderators just voted to unionize in Nairobi.

Finally, existing biases that are prevalent in the corpus will be reiterated by the agent. In a world where people of color are disproportionately targeted by police, it’s dangerous to use an advanced form of autocomplete to determine who might be guilty of a crime — particularly as a software agent might be more likely to be incorrectly assumed to be impartial. As any science fiction fan will tell you, robots are supposed to be logical entities who are free from bias; in reality they’re only as unbiased as their underlying data and algorithms.

In other words, content produced by generative AI may look great but is likely to be deeply, sometimes dangerously flawed.

Practically, the way one interacts with them is different to most software systems: whereas a standard system might have a user interface with defined controls, a command line argument structure, or an API, you interact with an LLM agent through a natural language prompt. Prompt engineering is an emergent field.

Should I use LLMs to generate content?

At the beginning of this year, it emerged that CNET had been using generative AI to write whole articles. It was a disaster: riddled with factual errors and plodding, mediocre writing.

WIRED has published a transparent primer on how it will be using the technology.

From the text:

The current AI tools are prone to both errors and bias, and often produce dull, unoriginal writing. In addition, we think someone who writes for a living needs to constantly be thinking about the best way to express complex ideas in their own words. Finally, an AI tool may inadvertently plagiarize someone else’s words. If a writer uses it to create text for publication without a disclosure, we’ll treat that as tantamount to plagiarism.

For all the reasons stated above, using AI to generate articles from scratch, or to write passages inside a published article otherwise written by a human, is not likely to be a good idea.

The people who will use AI to generate articles won’t surprise you: spammers will be all over it as a way to cheaply generate clickbait content without having to hire writers. The web will be saturated with this kind of low-quality, machine-written content — which means that it will be incumbent on search engines like Google to filter it out. Well-written, informative, high-quality writing will rise to the top.

There’s another danger, too, for people who are tempted to use LLMs to power chat-based experiences, or to use them to process user-generated content. Because LLM agents use natural language prompts with little distinction between the prompt and the data the LLM is acting on, prompt injection attacks are becoming a serious risk.

And they’re hard to mitigate. As Simon Willison points out in the above link:

To date, I have not yet seen a robust defense against this vulnerability which is guaranteed to work 100% of the time. If you’ve found one, congratulations: you’ve made an impressive breakthrough in the field of LLM research and you will be widely celebrated for it when you share it with the world!

Finally, let’s not forget that unless you’re running an LLM on your own infrastructure, all your prompts and outputs are being saved on a centralized service where your data almost certainly will be used for further training the model. There is little to no expectation of privacy here (although some models are beginning to offer enterprise subscriptions that promise but don’t demonstrate data privacy).

Then what can I use LLMs for?

Just as autocomplete can be really useful even if you’d never use it to write a whole essay that you’d show to anyone else, LLMs have lots of internal uses. You can think of them as software helpers that add to your process and potentially speed you up, rather than a robot that will take your job tomorrow. Because they’re helping you build human-written content rather than you publishing their machine-written output, you’re not at risk of violating someone’s copyright or putting a falsehood out into the world unchecked. Prompt injection attacks are less hazardous, assuming you trust your team and don’t expose agents to unchecked user-generated content.

Some suggestions for how LLMs can be used in journalism include:

  • Suggesting headlines
  • Speeding up transformations between media (for example, articles to short explainers, or to scripts for a video)
  • Automatic transcription from audio or video into readable notes (arguably the most prevalent existing use of AI in newsrooms)
  • Extracting topics (that can then be linked to topic archive pages)
  • Discovering references to funders that must be declared
  • Suggesting ideas for further reporting
  • Uncovering patterns in data provided by a source
  • Community sentiment analysis
  • Summarizing large documents

All of these processes can sit within a content management system or toolset as just another editing tool. They don’t do away with the journalist or editor: they simply provide another tool to help them to do their work. In many cases they can be built as CMS add-ons like WordPress plugins.

Hosting is another matter. When newsrooms receive sensitive leaks or information from sources, interrogating that data with a commercial, centrally-hosted LLM may not be advisable: doing so would reveal that sensitive data to the service provider. Instead, newsrooms likely to receive this kind of information would be better placed to run their own internal service on their own infrastructure. This is potentially expensive, but it also carries another advantage: advanced newsrooms may also be able to build and train their own corpus of training data rather than using more generic models.

Will LLMs be a part of the newsroom?

Of course — but beware of the hype machine. This kind of AI is a step forward in computing, but it is not a replacement for what we already use. Nor is it going to be the job-destroyer or civilization-changer some have predicted it to be (including VCs, who currently have a lot to lose if AI doesn’t live up to its frothily declared potential).

It’s another creative ingredient. A building block; an accelerator. It’s just as if — imagine that — autocomplete was put on steroids. That’s not nothing, but it’s not everything, either. There will be plenty of really interesting tools designed to help newsrooms do more with scant resources, but I confidently predict that human journalists and editors will still be at the center of it all, doing what they do best. They’ll be reporting, with a human eye — only faster.

· Posts · Share this post

 

Elon Musk thinks he’s got a “major win-win” for news publishers with…micropayments.

“In a digital universe where every news story is behind a hard paywall — one impenetrable to the non-paying reader — then a micropayments model might make sense. But that’s not the digital universe we live in.”

[Link]

· Links · Share this post

 

The web's most important decision

“But also, and this is important to mention, they believed in the web and in Berners-Lee. The folks making these decisions understood its potential and wanted the web to flourish. This wasn’t a decision driven by profit. It was a generous and enthusiastic vote of confidence in the global ambitions of the web.”

[Link]

· Links · Share this post

 

The Real Difference Between European and American Butter

“Simply put, American regulations for butter production are quite different from those of Europe. The USDA defines butter as having at least 80% fat, while the EU defines butter as having between 82 and 90% butterfat and a maximum of 16% water. The higher butterfat percentage in European butter is one of the main reasons why many consider butters from across the pond to be superior to those produced in the US. It’s better for baking, but it also creates a richer flavor and texture even if all you’re doing is smearing your butter on bread. On the other hand, butter with a higher fat percentage is more expensive to make, and more expensive for the consumer.”

[Link]

· Links · Share this post

 

Economists Warn That AI Like ChatGPT Will Increase Inequality

“Most empirical studies find that AI technology will not reduce overall employment. However, it is likely to reduce the relative amount of income going to low-skilled labour, which will increase inequality across society. Moreover, AI-induced productivity growth would cause employment redistribution and trade restructuring, which would tend to further increase inequality both within countries and between them.”

[Link]

· Links · Share this post

 

Blue skies over Mastodon

“One of big things I’ve come to believe in my couple of decades working on internet stuff is that great product design is always holistic: Always working in relation to a whole system of interconnected parts, never concerned only with atomic decisions. And this perspective just straight-up cannot emerge from a piecemeal, GitHub-issues approach to fixing problems. This is the main reason it’s vanishingly rare to see good product design in open source.”

[Link]

· Links · Share this post

Email me: ben@werd.io

Signal me: benwerd.01

Werd I/O © Ben Werdmuller. The text (without images) of this site is licensed under CC BY-NC-SA 4.0.