De-identification tools

Here is a selection of tools that can help with the task of de-identification:

(inclusion does not indicate endorsement)

*Skill: 1 = Out of the box; 2 = A couple of hours to learn; 3 = Some coding experience; 4 = Extensive coding experience
| Tool | Freeware? | Intended Purpose | Specific Data Input Format? | Skill Needed* | Latest Date on Website | Support | More Information |
|---|---|---|---|---|---|---|---|
| ATLAS.ti | NO | Unstructured data (text, multimedia, geospatial) | NO | 2 | 2015 | Tech support, online knowledgebase, discussion forums | |
| Cornell Anonymization Toolkit (CAT) | YES | Medical records (tabular data) | NO | 2 | 2011 | No explicit support | Short paper ‘Interactive Anonymization of Sensitive Data’ |
| deid software package | YES | Free text in medical records | NO | 3 | 2008 | No explicit support | Research article ‘Automated de-identification of free-text medical records’ |
| DICOMCleaner | YES | Medical images in DICOM (Digital Imaging and Communications in Medicine) format | DICOM format | 2 | 2015 | No explicit support | Blog post about DICOMCleaner |
| MIST (MITRE) | YES | Free-text medical records | | 2 | 2014 | User mailing list | |
| mu-Argus 5.1 | YES | Statistical Disclosure Control for microdata | NO | 2 | 2015 | No explicit support (e-mail contacts provided) | User’s Manual |
| NLM-Scrubber | YES | Clinical records text | Text | 3 | 2015 | Tech support | Currently in beta |
| NVivo | NO | Transcripts and free-form text; qualitative data analysis | NO | 2 | 2015 | Tech support and online discussion forums | |
| OpenRefine | YES | Working with messy data | NO | 2 | 2015 | Users mailing list/forums | |
| PARAT Core | NO | Working with structured medical records | Supports data imported from CSV files, Access, SQL Server, Oracle | 2 | 2015 | Tech support, online knowledgebase | |
| PARAT Text | NO | Unstructured medical records | NO | 2 | 2015 | Tech support, online knowledgebase, discussion forums | |
| tau-Argus 4.1 | NO | Statistical Disclosure Control for tabular data | NO | 2 | 2015 | No explicit support (e-mail contacts provided) | User’s Manual |
| The sdcMicro package in R | YES | Statistical Disclosure Control for microdata | NO | 2 | 2015 | Tech support, online knowledgebase, discussion forums | |
| The University of Texas at Dallas Anonymization Toolbox | YES | Unstructured text files | NO | 3 | 2012 | No explicit support (e-mail contacts provided) | |



References:

This table is based on work done at Johns Hopkins University and by the greater data librarian community.