Of all of the OECD countries, the United States has the highest total health expenditure per capita. In 2011, we spent $8,508 per capita1. That’s an annual expenditure of 17.7% of our GDP. The second highest spender was Norway, coming in at only $5,669 per capita.

Of all of the OECD countries, the United States has the highest total health expenditure per capita. In 2011, we spent $8,508 per capita1. That’s an annual expenditure of 17.7% of our GDP. The second highest spender was Norway, coming in at only $5,669 per capita.

With the U.S. leading in global health expenditures, one would assume that we are leading in patient outcome results as well. Unfortunately, the rising cost of health care in the United States has done little to improve patient outcomes. In fact, 2008 ratings placed the United States 50th in the world for maternal death rate2. That means women in the United States were more than seven times as likely to die in childbirth as women in Ireland, a country that spends less than half as much as the U.S. per capita in health care.

The Big Data Problem in Healthcare

One of the biggest challenges that our information-driven healthcare industry faces is the fact that data is growing faster than healthcare organizations can consume it. 80% of medical data is unstructured, yet clinically relevant. This influx of data presents an opportunity: how can health organizations learn how to leverage big data in order to gain a better understanding of patients, courses of treatment, clinical operations, medical research, medical insurance fraud, device monitoring, business operations, and more?

Each patient in our healthcare system represents a tremendous amount of data. This influx of data has overwhelmed most healthcare providers, resulting in added expenses to both provider and patient. The greatest concerns healthcare companies face today in terms of data fall into one of these three categories:

  • Too many data sources

  • Too many technologies

  • Crippling data duplication

Too many healthcare companies are using data warehousing solutions that can’t keep up with the influx of data. One problem with traditional data warehousing is a healthcare organization’s use of data silos. See the illustration below to visualize the current division of healthcare data into data silos.


Notice how many of these silos contain data that is overlapping. Getting access to all of this data and using it for clinical and advanced analytics will help organizations improve overall health care and patient outcomes.

The Big Data Solution: Hadoop and the Enterprise Data Hub

For many years, the traditional data warehouse has been used for analysis and reporting. Now, centralizing the data within a healthcare organization is a growing priority. New tools such as Hadoop can be leveraged as a data hub to augment your traditional data warehouse. With an Enterprise Data Hub (EDH), you can have a managed data reservoir of raw data, a data refinery to process and clean complex data, and a location for long-term storage of archived data, all in one centralized place. Let’s take a look at just a couple of the dozens of use cases.

Minimization of Data Duplication

It’s common practice for several healthcare providers to be treating a single patient at once. One individual could be receiving treatment from a primary care physician, a physical therapist, an rheumatologist, an orthopedic surgeon and more. In addition, this patient is interacting with pharmacies, other hospital staff and laboratories.

Considering these circumstances, it’s clear how data duplication could easily occur. Siloed data systems isolate each medical provider to his/her own data on the patient. This narrowed perspective can hinder the patient’s treatment, and can result in repeated laboratory testing or incorrect diagnoses, all of which means added cost to the provider and patient.

Hadoop, as an enterprise data hub, creates an ecosystem of data tools that all healthcare providers can access. When providers can work together, everyone benefits.

Pharmaceutical Testing and Drug Individualization

In the development of pharmaceuticals, researchers gather immense amounts of data. Take a step back for a moment and think about how much data drug companies have to process.

Initially, massive amount of data are gathered in the early development of a new drug. Oftentimes a drug’s effectiveness depends on the genome of the patient. Researchers need to analyze the genome data for patients involved in clinical trials. The analysis of a patient’s genome uses approximately 1.5 gigabytes of data3. Additionally, researchers compare the outcomes of the drug’s use with the genomes of the patients. This takes an immense amount of storage and computing power.

The FDA has put life-saving medications on hold because pharmaceutical companies are struggling to keep up with the data. Working with a big data solution such as Hadoop enables drug companies to create safer drugs faster for those who need them most.

Other Ways Hadoop Improves Healthcare

There are dozens more use cases for Hadoop in the medical sphere. Here are just a few:

  • Fraud Prevention – In January of this year, the Identity Theft Resource Center produced a survey showing that medical-related identity fraud accounted for a shocking 43 percent of all identity theft in the United States during 2013. Enterprise Hadoop distributions have several solutions to help healthcare companies manage sensitive patient data.

  • Remote Patient Monitoring – Sometimes patients need more than periodic doctor visits. Patients with chronic conditions often require daily monitoring. In the past, patients have had to monitor themselves and report back during periodic appointments. Hadoop enables doctors to use modern technology to monitor patients remotely. The data generated from these remote sources improves patient outcomes by providing doctors with more accurate and complete information.

  • Treatment of Chronic Illness – Unfortunately, some patients are subject to chronic illnesses. This means a lifetime of data for doctors to analyze. Storing this large amount of data for such an extended period of time is impossible for the traditional big data solutions. Providers running their enterprise data hub via Hadoop enjoy its cost effective scalability. This means that storage and processing power can easily be scaled up as needed.


Ultimately, we need to find more ways to curb the rising costs of healthcare in the United States. In addition to other reforms, we can no longer ignore the growing cost of storing and processing massive amounts of healthcare data. Hadoop as an enterprise data hub is a growing and powerful solution for many providers who want to leverage healthcare data for use in their clinical operations . And by drastically reducing their data costs, the patient saves, too.


1 http://en.wikipedia.org/wiki/List_of_countries_by_total_health_expenditure_(PPP)_per_capita

2 http://whqlibdoc.who.int/publications/2010/9789241500265_eng.pdf

3 http://bitesizebio.com/8378/how-much-information-is-stored-in-the-human-genome/

Related posts

How Blockchain Can Revolutionize Healthcare Systems?

5 Mins read
The buzz around Blockchain technology has penetrated many different industries. But the way Blockchain technology is playing a massive role in revolutionizing the healthcare…
Health careMedical EducationPublic Health

5 Compelling Reasons to See a Functional Medicine Doctor

2 Mins read
Conventional medicine has been viewed for some time now as limiting, impersonal, and too focused on symptoms rather than the root cause….
Health carePolicy & Law

Health Insurance Schemes: Find the Right One for Your Family by Doing This

3 Mins read
Besides being a major catastrophic failure, the COVID-19 pandemic has been a reality check for people worldwide. The overburdening costs of hospitalization,…