Update on the 2024/2025 End of Term Web Archive

[Caralee Adams at the Internet Archive]

The Internet Archive is always a gem, but it's been particularly important this year.

"With two-thirds of the process complete, the 2024/2025 EOT crawl has collected more than 500 terabytes of material, including more than 100 million unique web pages. All this information, produced by the U.S. government—the largest publisher in the world—is preserved and available for public access at the Internet Archive.

[...] As an added layer of preservation, the 2024/2025 EOT Web Archive will be uploaded to the Filecoin network for long-term storage, where previous term archives are already stored. While separate from the EOT collaboration, this effort is part of the Internet Archive’s Democracy’s Library project. Filecoin Foundation (FF) and Filecoin Foundation for the Decentralized Web (FFDW) support Democracy’s Library to ensure public access to government research and publications worldwide."

This matters on multiple levels. Most importantly, even if the Internet Archive is attacked or shut down for any reason, these archived versions of government websites and data will remain online and accessible.

As it happens, the current administration has been pulling down datasets and redacting websites with wild abandon, so although this is a routine activity for the Archive whenever there's a change in administration, it provides a vital historical record this year. Good news for researchers, future historians, journalists, and anyone who depended on this data.

© Ben Werdmuller
The text (without images) of Werd I/O by Ben Werdmuller is licensed under CC BY-NC-SA 4.0