Computer scientists develop a simple tool to tell if websites suffered a data breach

Computer scientists have built and successfully tested a tool designed to detect when websites are hacked by monitoring the activity of email accounts associated with them. The researchers were surprised to find that almost 1% of the websites they tested had suffered a data breach during their 18-month study period, regardless of how big the companies’ reach and audience are.

“No one is above this—companies or nation states— it’s going to happen; it’s just a question of when,” said Alex C. Snoeren, the paper’s senior author and a professor of computer science at the Jacobs School of Engineering at the University of California San Diego.

One percent might not seem like much. But given that there are over a billion sites on the Internet, this means tens of millions of websites could be breached every year, said Joe DeBlasio, one of Snoeren’s Ph.D. students and the paper’s first author.

Even scarier, the researchers found that popular sites were just as likely to be hacked as unpopular ones. This means that out of the top-1000 most visited sites on the Internet, ten are likely to be hacked every year.

The team of researchers at UC San Diego presented the tool in November at ACM Internet Measurement Conference in London.

The concept behind the tool, called Tripwire, is relatively simple. DeBlasio created a bot that registers and creates accounts on a large number of websites—around 2,300 were included in their study. Each account is associated with a unique email address. The tool was designed to use the same password for the email account and the website account associated with that email. Researchers then waited to see if an outside party used the password to access the email account. This would indicate that the website’s account information had been leaked.

To make sure that the breach was related to hacked websites and not the email provider or their own infrastructure, researchers set up a control group. It consisted of more than 100,000 email accounts they created with the same email provider used in the study. But computer scientists did not use the addresses to register on websites. None of these email accounts were accessed by hackers.

In the end, researchers determined 19 websites had been hacked, including a well-known American startup with more than 45 million active customers.

Once the accounts were breached, researchers got in touch with the sites’ security teams to warn them of the breaches. They exchanged emails and phone calls. Yet none of the websites chose to disclose to their customers the breach the researchers had uncovered.

The researchers decided not to name the companies in their study.

Interestingly, very few of the breached accounts were used to send spam once they became vulnerable. Instead, the hackers usually just monitored email traffic. DeBlasio speculates that the hackers were monitoring emails to harvest valuable information, such as bank and credit card accounts.

Researchers went a step further. They created at least two accounts per website. One account had an “easy” password—strings of seven-character words with their first letter capitalized and followed by a single digit. These kinds of passwords are usually the first passwords that hackers will guess. The other account had a “hard” password—random 10-character strings of numbers and letters, both in lower and upper case, without special characters.

Seeing which of the two accounts got breached allowed researchers to make a good guess about how websites store passwords. If both the easy and hard passwords were hacked, the website likely just stores passwords in plain text, contrary to typically-followed best practice. If only the account using the easy password was breached, the sites likely used a more sophisticated method for password storage: an algorithm that turns passwords into a random string of data—with random information added to those strings.

The computer scientists had a few pieces of advice for Internet users: Don’t reuse passwords; use a password manager; and ask yourself how much you really need to disclose online.

UC San Diego has the full story