Real Anonymization vs Data Masking

Kalev Leetaru’s article, The Big Data Era of Mosaicked Deidentification: Can We Anonymize Data Anymore?, makes a few points that we can agree on.

Leetaru’s article discusses how “anonymized” data sets are increasingly common and are re-identified with ease. He cites Sweeney’s study from 2000 and notes several famously “anonymized” datasets that led to re-identification: Netflix, AOL, and the NYC taxi debacle. These are persuasive examples, and they can prompt people to assume that all anonymization is terrible and easily reversed.

He writes, “As more and more organizations begin to release sensitive datasets to the public, the data science community must spend more time thinking about how to safely and responsibility manage this flow of anonymized data that is the lifeblood of the big data era.” Privacy and data use are both key considerations when deciding how anonymization can be incorporated into a data sharing workflow.

Real Anonymization vs Data Masking: Not the Same

But there is one point of disagreement. In his article, Leetaru talks about “anonymization”. Anonymization is the process of turning data into a form that does not identify individuals and where identification is not likely to take place. None of the examples in his article are examples of anonymization. They are examples of data masking, and poorly done data masking at that. This distinction is key because there are people and organizations that anonymize data effectively every day, but they don’t make the news like these sensationalized stories.

In Sweeney’s case, the de-identification performed wasn’t even compliant with HIPAA’s Safe Harbor method (the minimum standard for de-identifying PHI for secondary use). In the AOL example, the scheme used to anonymize users failed to address the most identifying information of all: their search data! That data was immediately identifying, since 56% of internet users have looked for themselves online.

When you incorporate a risk-based de-identification process, you can be confident that PHI in the data has truly been anonymized. That’s why so many standards and industry guidelines are advocating for this approach, including HITRUST, the Institute of Medicine and the European Medicines Agency.
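To make the contrast between masking and a risk-based approach concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method used by any of the organizations named above: the records are made up, the choice of quasi-identifiers (ZIP code, birth year, sex) is an assumption, and the helper names `max_reidentification_risk` and `generalize` are hypothetical. The risk measure shown is one common simplification: one divided by the size of the smallest group of records that share the same quasi-identifier values.

```python
from collections import Counter

# Made-up records: the direct identifier (name) is already masked,
# but the quasi-identifiers are left exactly as collected.
records = [
    {"name": "***", "zip": "02138", "birth_year": 1945, "sex": "M"},
    {"name": "***", "zip": "02139", "birth_year": 1948, "sex": "M"},
    {"name": "***", "zip": "02138", "birth_year": 1972, "sex": "F"},
    {"name": "***", "zip": "02139", "birth_year": 1975, "sex": "F"},
]

# Assumed quasi-identifiers for this illustration.
QUASI_IDENTIFIERS = ["zip", "birth_year", "sex"]


def max_reidentification_risk(rows, quasi_identifiers):
    """Return 1 / (size of the smallest group sharing the same quasi-identifier values)."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return 1 / min(groups.values())


def generalize(row):
    """Coarsen quasi-identifiers: keep a 3-digit ZIP prefix and the decade of birth."""
    return {**row, "zip": row["zip"][:3], "birth_year": row["birth_year"] // 10 * 10}


# Masking alone: every record is still unique on its quasi-identifiers.
risk_masked = max_reidentification_risk(records, QUASI_IDENTIFIERS)

# After generalization, each record shares its values with one other record.
risk_generalized = max_reidentification_risk(
    [generalize(r) for r in records], QUASI_IDENTIFIERS
)

print(risk_masked, risk_generalized)
```

With masking alone, each record remains unique on its quasi-identifiers, so the measured risk stays at its maximum. Generalizing the same fields lowers it. In practice, the acceptable threshold and the amount of generalization would depend on the context in which the data is shared.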

Not all regulators and industry groups are ready to dismiss anonymization. To learn more about new and emerging standards around health data de-identification, don’t miss our webinar: De-identification 201.

The post Real Anonymization vs Data Masking appeared first on Privacy Analytics.
