Unlocking the Potential of Electronic Health Records with Artificial Intelligence

By Max Meiser | Aug 09, 2019, 8:02 am EST

Digitization has had broad implications throughout the world, and Healthcare is no exception. Thanks to the digitization of health records, patient data is more accessible and flexible than ever before. Electronic Health Records, or EHRs, are the primary way patient data is stored digitally. Many Healthcare professionals have traded in clipboards and pens for computers, and data gathered from direct interaction between doctors and patients often makes its way directly into the patient’s EHR. Because they contain a patient’s medical history, including “diagnoses, medications, treatment plans, immunization dates, allergies, radiology images, and laboratory and test results” (1), EHRs are a cache of information that allows health care providers to serve their patients better with educated and informed decisions.

Of course, EHRs are not without flaws. David Blumenthal, M.D., characterizes EHRs as “clunky, poorly designed, hard to navigate, and cluttered with useless detail that colleagues have cut and pasted to meet documentation requirements” (2). Dr. Blumenthal blames the shortcomings of EHRs on their general development goal: generating clinical revenue. It comes as no shock that the quality of EHRs may suffer when the aim is to increase revenue at the lowest possible cost, but this does not change the fact that EHRs are a precious data mine for those across the healthcare industry.

As a result of the adoption of EHRs across healthcare, individuals have unprecedented access to their medical information. Test results, diagnoses, and more all flow into one’s EHR, putting a patient’s medical records at their fingertips, usually through portals, albeit clunky ones, maintained by hospitals and data hubs. Considering the number of people flowing through the Healthcare system, huge amounts of raw data exist in a variety of different formats.

According to The Office of the National Coordinator for Health Information Technology, 86% of office-based physicians have adopted EHRs (3), yet nearly every institution stores its EHRs in a different format. Because many EHR vendors exist and many hospitals tweak their data storage systems to better fit their needs, transferring data can be a serious logistical barrier to providing informed care for a patient. If data transfer between care providers became seamless, complex patient care involving multiple institutions could be elevated to an unprecedented level of effectiveness.

Oracle describes Big Data as “data that contains greater variety arriving in increasing volumes and with ever-higher velocity. This is known as the three Vs. Put simply, big data is larger, more complex data sets, especially from new data sources” (5).

Let’s describe the 3 Vs of healthcare data:

  1. The volume of healthcare data is obvious: considering that the majority of physicians use EHRs, it is no surprise that there is a massive amount of data in the health sector.
  2. Many types of variety exist within healthcare data, the most obvious being the non-standardized formats of EHRs. Because different systems hold the data, it varies in both format and subject matter. EHRs can contain everything from diagnoses and complex test results to simple notes made by a physician during a patient’s checkup.
  3. As physicians continually update their patients’ EHRs, new information is constantly being recorded. This constant stream of updates gives the data its velocity.

Problematically, the individualistic nature of medical data clashes with its classification as Big Data. While massive volumes of EHRs could be used to build powerful predictive models or gauge public health, at their core EHRs are built to serve individuals and their care providers by making the storage and transfer of sensitive data more efficient and effective. Big Data, by contrast, is normally analyzed in aggregate, with little need for the minute details of any single record.

The first step in making EHR data more portable despite its Big Data status would be to create an easily readable framework with the potential to hold the diverse types of data found in EHR reports across the country. In a blog post on the Google AI Blog, Google Deep Learning researchers explored the potential of Artificial Intelligence to make a variety of predictions based on a patient’s EHR, such as whether the patient’s hospital stay would be prolonged. While attempting to unify data in order to analyze it, the researchers ran into the same issues that affect medical institutions attempting to share data. To present the data in a standard format for training the Deep Learning system, the researchers identified the Fast Healthcare Interoperability Resources (FHIR) standard as a framework that addresses the majority of roadblocks while being built around “a solid yet extensible data-model” (6).

In Google’s research into Deep Learning-powered predictive models based on EHR data, engineers built on top of the FHIR framework in order to support machine learning analysis of the stored medical data. What if, instead of standardizing data for analysis, data were standardized to enable painless sharing between medical institutions?

Considering medical data’s Big Data status, it would be impossible to manually sort the data into an expanded FHIR-based format, and the variety of EHR formats would demand a program more complex and flexible than a simple sorting script. However, if a machine learning system were trained to identify the diverse data points contained in EHRs and sort them into a standard format, EHR data would become both more readable and more shareable than ever before.
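To make the target of such a system concrete, here is a minimal sketch of what "sorting into a standard format" could look like for a single patient. It is a hand-written mapping, not the machine learning system described above and not Google's pipeline. The two vendor layouts and their field names are invented for illustration. The output uses only a few fields of the FHIR Patient resource (resourceType, name, birthDate, gender).

```python
from datetime import datetime

# Hypothetical export layouts from two imaginary EHR vendors. The field names
# are invented for illustration and do not correspond to any real product.
VENDOR_A_RECORD = {"pt_last": "Doe", "pt_first": "Jane", "dob": "03/14/1985", "sex": "F"}
VENDOR_B_RECORD = {"fullName": "Doe, Jane", "birth": "1985-03-14", "gender": "female"}

# Map each vendor's gender codes onto the values FHIR allows.
GENDER_CODES = {"F": "female", "M": "male", "female": "female", "male": "male"}


def to_fhir_patient(family, given, birth_date, gender):
    """Build a minimal FHIR-style Patient resource as a plain dict."""
    return {
        "resourceType": "Patient",
        "name": [{"family": family, "given": [given]}],
        "birthDate": birth_date,  # FHIR dates are written YYYY-MM-DD
        "gender": GENDER_CODES.get(gender, "unknown"),
    }


def from_vendor_a(record):
    """Vendor A stores separate name fields and a US-style date."""
    birth = datetime.strptime(record["dob"], "%m/%d/%Y").date().isoformat()
    return to_fhir_patient(record["pt_last"], record["pt_first"], birth, record["sex"])


def from_vendor_b(record):
    """Vendor B stores 'Last, First' in one field and an ISO date."""
    family, given = [part.strip() for part in record["fullName"].split(",", 1)]
    return to_fhir_patient(family, given, record["birth"], record["gender"])


patient_a = from_vendor_a(VENDOR_A_RECORD)
patient_b = from_vendor_b(VENDOR_B_RECORD)

# Once normalized, the same patient looks identical regardless of the source.
print(patient_a == patient_b)
```

Each new vendor layout needs its own hand-written function here, which is exactly the scaling problem that motivates a trained system able to recognize fields across formats.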

Incredible opportunity lies in becoming the middleman for medical information, considering the diverse range of parties that would benefit from easily decipherable EHR data, such as hospitals, independent clinicians, insurance companies, specialist care providers, and more.
As data security and consumer privacy emerge as a primary focus for many, personal data control is more important than ever.

A data custodian working between different hospitals, clinicians, and other parties such as insurance companies has the opportunity to give an unprecedented level of security and control to the potentially data-paranoid data owner. According to The Office of the National Coordinator for Health Information Technology, while 84% of people surveyed were confident that safeguards were in place to stop unauthorized viewing of their medical records, 66% of people were concerned about unauthorized access to their records while sensitive information was shared between healthcare providers (7). If individuals were given specific control over their private health information, such as a feature to lock and conceal specific, sensitive data, their trust in the security of their information would rise, and they would be more willing both to provide accurate information and to share that information between healthcare providers.
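As a rough illustration of the "lock and conceal" idea, the sketch below shows how a custodian might hide patient-locked fields before passing a record along. The record layout, the share_record function, and the placeholder text are all hypothetical. A real system would also need authentication, audit logging, and clinical safety rules that are out of scope here.

```python
import copy

# Placeholder shown to the recipient in place of a locked value (hypothetical).
CONCEALED = "[concealed by patient]"


def share_record(record, locked_fields):
    """Return a copy of `record` with patient-locked fields concealed.

    The original record is left untouched so the custodian keeps the full data.
    """
    shared = copy.deepcopy(record)
    for field in locked_fields:
        if field in shared:
            shared[field] = CONCEALED
    return shared


# Illustrative record; the field names are assumptions, not a real schema.
record = {
    "name": "Jane Doe",
    "allergies": ["penicillin"],
    "medications": ["sertraline"],
    "immunizations": ["influenza 2018"],
}

# The patient has chosen to lock one category of data.
locked = {"medications"}

shared = share_record(record, locked)
print(shared)
```

Concealing a field rather than deleting it lets the receiving provider see that information exists and ask the patient for it, instead of assuming there is nothing to know.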

Given the rapid jumps in data technology, wearable tech, and the interconnectedness of everything technological, who knows what the future could hold for healthcare? Perhaps we can look towards a point where wearable tech and artificial intelligence privately monitor our vitals, detect life-threatening events like strokes or heart attacks, and notify emergency services. With rapidly expanding technology and motivated entrepreneurs and innovators, our world is sure to eventually be filled with powerful tools that can help everyone stay healthy.


  1. https://www.healthit.gov/faq/what-electronic-health-record-ehr
  2. https://www.commonwealthfund.org/blog/2018/electronic-health-record-problem
  3. https://dashboard.healthit.gov/quickstats/quickstats.php
  4. https://www.globenewswire.com/news-release/2019/04/24/1808641/0/en/Global-Electronic-Health-Records-EHRs-Market-to-Witness-a-CAGR-of-5-6-during-2019-2025.html
  5. https://www.oracle.com/big-data/guide/what-is-big-data.html
  6. https://ai.googleblog.com/2018/05/deep-learning-for-electronic-health.html
  7. https://dashboard.healthit.gov/quickstats/pages/consumers-privacy-security-medical-record-information-exchange.php

This blog was written by several interns in The Hive’s 2019 Summer Internship Program. To find out more about The Hive’s internship program please contact jobs@hivedata.com.