
We recently discussed the hesitant relationship between big data and healthcare in a post entitled Data Cleansing Tools: Bridging the Gap between Healthcare and Big Data. Healthcare organizations gather enormous amounts of data but have been reluctant to adopt big data management strategies that could improve performance, clinical outcomes, and financial results. Two obstacles stand out: the need to protect confidential patient information, and a reluctance to open internal systems to processes that are not yet seen as proven methodology. The “old guard” has long held these concerns, but innovation is stirring in the healthcare community, and it points to a bright future for a harmonious relationship between healthcare and big data.
Consider the potential opportunities: the healthcare sector is estimated to have accumulated terabytes of data and is moving toward the petabyte range. This is a tremendous opportunity! In an article entitled Big Data for Healthcare: Why are we collecting all this data?, High Tech Answers found that within less than a year, the National Institutes of Health (NIH) received $200 million in funding for the International 1000 Genomes Project. Using data from that project, researchers were able to validate a scientific discovery related to Alzheimer’s disease. The article stated:

“This “big data” project is expected to contain the world’s largest set of data on human genetic variation, and aims to sequence the entire genome of 2,600 people from around the world.”

Further evidence of this exciting new frontier comes to us from MIT. In 2012, MIT launched a big data initiative called bigdata@CSAIL. Researchers associated with the project are developing new techniques for processing medical data, to make it more accessible to both physicians and patients and to find correlations that could improve diagnosis or choice of therapies.
John Guttag, the Dugald C. Jackson Professor in EECS, directs CSAIL’s Data-Driven Medicine group. Among other things, the group is investigating techniques for detecting and predicting hospital-borne infections. As part of that work, a researcher used machine learning techniques to comb through patient data and identify patients at elevated risk of infection with the nasty intestinal bug Clostridium difficile. This bug, commonly referred to as C. diff, is common in hospitals and other healthcare environments, where patients are susceptible to infections. It is persistent and difficult to control.
These examples alone indicate the value and future prospects of a harmonious relationship between healthcare, data cleansing, and big data. The question facing us is not “can we” but “how do we” move forward in healthcare by strategically using data cleansing and record linkage techniques. The opportunities to improve clinical outcomes and strengthen fiscal positions abound.