Want to Archive Twitter? Good Luck With That

Nov 16, 2022 7:00 AM

The platform’s meltdown has shed light on the steep challenge of preserving social media data. But not everything is worth saving.

Illustration: Rosie Struve

From the moment Elon Musk closed his Twitter deal, the network’s diehard users have taken steps to eulogize it. People have downloaded their own archive from Twitter. Others have started threads with screenshots of their all-time favorite tweets. And there’s an ongoing Google Doc cataloging Twitter trends and memes, a guide that could serve one day to decode the hieroglyphics of the app.

Whether Twitter goes bankrupt (as Musk himself has said is a possibility) or becomes an unnavigable stream of hate speech and deceptive parody accounts, the network’s future is unknown. But there’s fear that Twitter’s troves of content, important for both historical and political impact (as well as a good laugh), could be lost. Twitter’s founding premise—the 140-character (now 280) quip—doesn’t lend itself well to archiving. That’s in part because capturing a stream of content that increases by the thousands each minute is a technical nightmare, but it’s also due to ethical concerns, since not all tweets are created equal. Some are fired off by world leaders who incite violence and others by individuals who would be unknown private citizens, if not for their affinity for the bird app. Both types of tweets can go viral and have lasting consequences.


“I think it’s really important to be thoughtful about the data you collect,” says Miles McCain of PolitiTweet, a service that archives tweets from public figures and influential institutions. “When you try to archive anything and everything, you end up with a whole lot of information that doesn’t really matter.”

An attempt by the United States Library of Congress, which began documenting every public tweet in 2010, failed. In 2012, the library said it was archiving half a billion tweets each day. But tweets evolved from short bits of text to regularly include photos, videos, and live links. The library ended the Sisyphean project in 2017, seven years after it began, and said it would only archive select accounts. A spokesperson for the library did not provide a comment to WIRED before this story was published.

Elisabeth Fondren, a journalism professor at St. John’s University in New York City, says the failure of that archiving project proved a huge missed opportunity for preserving a rich data set of political discourse and communication trends. The present moment has cast a spotlight on the need to archive social media and exposed the precarity of hosting a public square on the servers of a private company.

“If it had been successful, we would now have it,” says Fondren. “It really undermines researchers’ attempts to assess the social impact of media on society.”

Smaller, third-party services have sought for years to archive more specific content. ProPublica keeps a list of politicians’ deleted tweets on its Politwoops database. PolitiTweet has a database tracking 1,500 accounts. These keep records of statements and news stories from significant people in government and politics, but the projects don’t intend to capture the mass discourse of online communication.

Twitter was designed to capture the moment, and in its early days finding or viewing older tweets wasn’t easy and didn’t seem important. But by 2014, Twitter had improved its search tool for public tweets. The move helped researchers, but it also breathed new life into long-forgotten tweets that had moved down the timeline without much afterthought. The change proved problematic for some tweeters, like those who began punching out 140-character musings as teens but had since become college students or young professionals. Their tweets didn’t always age as well, particularly as an era of cancel culture began.

Automated tweet deletion services have risen up in response. These tools clear large swaths of tweets from an account, and they can allow users to sort by a tweet’s age and levels of engagement and select which tweets to delete. Semiphemeral is one such service, allowing people to auto-delete likes and direct messages, in addition to their own tweets.

“As you watch in horror/delight as Elon burns this site to the ground you might be pondering your privacy,” Semiphemeral tweeted Friday. “Do you have YEARS of tweets, likes, and DMs? Gather ’round, friends, while I show you how to DELETE THEM ALL (or as much as Twitter’s API allows).”

Not everyone is ready to leave behind their tweets. As of Monday, downloading a personal Twitter archive was getting trickier. Doing so requires getting verification codes from Twitter—they were not working via text but appeared to still be sending to email addresses.

If Twitter does go dark, it would be perhaps the largest wipeout of social data to date. There’s little precedent for this in the age of the centralized web: AOL Instant Messenger had a quiet death years after users fled the platform, and its primary content wasn’t even public to archive. Myspace lost years of photos and songs in a poorly managed server migration. Vine, Twitter’s long-mourned short video service, has been archived in part by enthusiasts who created compilations of the platform’s best content and reposted them to YouTube, and the videos are accessible with direct URLs.

There’s no consensus that Twitter will go down in flames. It might break slowly, crushed by the weight of activity with fewer engineers to work out the bugs. Musk might declare bankruptcy and restructure the massive debt he took on to buy the service. But the drama has exposed the danger of trusting private companies with what we’ve come to consider public records.

“I think what these past two weeks have shown us is Twitter is a private company,” says St. John’s Fondren, “and, first and foremost, is interested in making money and not so much in providing this digital heritage.”

Credit belongs to: www.wired.com
