16 October 2018 Posted By : Administrator

Genome Hackers Show No One’s DNA Is Anonymous Anymore

In 2013, a young computational biologist named Yaniv Erlich shocked the research world by showing it was possible to unmask the identities of people listed in anonymous genetic databases using only an Internet connection. Policymakers responded by restricting access to pools of anonymized biomedical genetic data. An NIH official said at the time, “The chances of this happening for most people are small, but they’re not zero.”

Fast-forward five years and the amount of DNA information housed in digital data stores has exploded, with no signs of slowing down. Consumer companies like 23andMe and Ancestry have so far created genetic profiles for more than 12 million people, according to recent industry estimates. Customers who download their own information can then choose to add it to public genealogy websites like GEDmatch, which gained national notoriety earlier this year for its role in leading police to a suspect in the Golden State Killer case.

Those interlocking family trees, connecting people through bits of DNA, have now grown so big that they can be used to find more than half the US population. In fact, according to new research led by Erlich, published today in Science, more than 60 percent of Americans with European ancestry can be identified through their DNA using open genetic genealogy databases, regardless of whether they’ve ever sent in a spit kit.

“The takeaway is it doesn’t matter if you’ve been tested or not tested,” says Erlich, who is now the chief science officer at MyHeritage, the third-largest consumer genetic provider behind 23andMe and Ancestry. “You can be identified because the databases already cover such large fractions of the US, at least for European ancestry.”

To make these estimates, Erlich and his collaborators at Columbia University and the Hebrew University of Jerusalem analyzed MyHeritage’s dataset of 1.28 million anonymous individuals, which is, like most of the world’s genetic databases, overwhelmingly white. Treating each of those individuals as a human “target,” they counted the number of relatives with big chunks of matching DNA and found that 60 percent of searches turned up a third cousin or closer. That level of relatedness was all investigators needed to track down the Golden State Killer, as well as the 17 other cases that have so far been solved with this approach, known to law enforcement as long-range familial searching. To validate their findings, Erlich’s team plugged 30 genetic profiles into GEDmatch and saw similar results, with 76 percent of searches netting relatives in the third-cousin-or-closer range.
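The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline: the function name, the tuple layout, the toy data, and the 100 cM cutoff standing in for "third cousin or closer" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical cutoff (in centimorgans of shared DNA) used here to stand in
# for "third cousin or closer". The article does not state a threshold.
THIRD_COUSIN_MIN_CM = 100.0


def fraction_with_close_match(targets, matches, min_cm=THIRD_COUSIN_MIN_CM):
    """Return the share of targets with at least one relative sharing >= min_cm.

    targets: iterable of target IDs, each treated as one "search".
    matches: iterable of (target_id, relative_id, shared_cm) tuples.
    """
    targets = list(targets)
    if not targets:
        return 0.0

    # Track the strongest match found for each target.
    best = defaultdict(float)
    for target, relative, shared_cm in matches:
        if target != relative:  # ignore self-matches
            best[target] = max(best[target], shared_cm)

    hits = sum(1 for t in targets if best.get(t, 0.0) >= min_cm)
    return hits / len(targets)


# Toy data: five targets, of which two have a match above the cutoff.
targets = ["a", "b", "c", "d", "e"]
matches = [
    ("a", "x", 250.0),
    ("b", "y", 40.0),
    ("c", "z", 120.0),
    ("c", "w", 15.0),
    ("e", "v", 99.9),
]
rate = fraction_with_close_match(targets, matches)
print(f"{rate:.0%} of searches found a close relative")
```

On the real dataset, the same kind of tally over 1.28 million targets is what yields the 60 percent figure reported in the study.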
