Wednesday, May 22, 2024

A virtual black hole

The Independent reports on a new study suggesting that the internet is slowly disappearing down a virtual black hole as web pages and online content are lost.

The web is often thought of as a place where content lasts forever. But vast swathes of it are being lost as pages are deleted or moved, according to new research.

Of the webpages that existed in 2013, for instance, 38 per cent are now lost. Even newer pages are disappearing: 8 per cent of pages that existed in 2023 are no longer available.

Those pages tend to disappear when they are deleted or moved. That happens on otherwise functional websites, the study from the Pew Research Center indicated, rather than happening when whole websites disappear.

The effect means that vast amounts of news and important reference content are disappearing. Some 23 per cent of news pages include at least one broken link, and 21 per cent of government websites, it said – and 54 per cent of Wikipedia pages include a link in their references that no longer exists.

Much the same effect is happening on social media. A fifth of tweets disappear from the site within months of being posted.

The study was completed by gathering a random sample of almost a million webpages, taken from Common Crawl, a service that archives parts of the internet. Researchers then looked to see whether those pages continued to exist between 2013 and 2023.

It found that 25 per cent of all pages collected between 2013 and 2023 were no longer available. That figure breaks down into 16 per cent of pages that came from a website that continues to exist, and 9 per cent that were located on websites that no longer exist at all.

This is one of the reasons why I tend to quote at length on this blog rather than rely on hotlinks. There are over 20 years of posts here and the further you go back, the more likely it is that the link to a particular story is broken.