Wall Street Times

Data Architect Solution to Protect Healthcare Data Using AI and ML

This is an era in which vast amounts of data are being created, collected, collated and consumed. That is particularly true of the healthcare industry, where patient information is shared among healthcare providers, insurance companies and, often, different departments of the same organization for a variety of purposes.

Storage and use are just two of the issues to address when it comes to sensitive healthcare information; tightening regulatory and compliance requirements also come into play. Although compliance is a priority, healthcare organizations also want to make productive analytical use of the health information they collect. As a result, they require a holistic approach that both protects and makes use of the healthcare data stored within their systems.

The data protection environment

Data breaches have become a regular occurrence in the healthcare industry, with phishing and malware attacks a popular weapon of choice for cybercriminals seeking access to valuable personal health and medical information. With such a massive volume of Protected Health Information (PHI) and Personally Identifiable Information (PII) on hand, attackers have potential access to a target’s prescription medication and medical care records, sensitive information on physical or mental health conditions, the frequency of health care provision to the individual, as well as personal contact information, email passwords, biometric data, IP or Media Access Control (MAC) addresses, driver’s license numbers, and financial data.

In virtually any enterprise, it is common for copies of the production database to be made for use in non-production environments. On average, five such copies can be active in the network at any one time.

In the world of information management, a production database is the “golden” version: it is “live”, in constant use, and continually updated or modified. Non-production copies are used for development, testing, quality assurance, user acceptance testing, upgrade testing and other purposes. There can be many reasons to create such copies, but it is important to exercise extreme care whenever “live” production data is used in them.

Know the regulations

One of the first requirements in a holistic approach to data protection is for healthcare IT professionals to understand the guidelines for identifying sensitive data. The Health Insurance Portability and Accountability Act (HIPAA), the HIPAA Privacy Rule, the HIPAA Security Rule and the HITECH (Health Information Technology for Economic and Clinical Health) Act are the primary regulations that have established guidelines with respect to identifying and protecting data specifically within the context of the healthcare industry.

Under the Health Insurance Portability and Accountability Act (1996), the Privacy Rule protects all “individually identifiable health information” held or transmitted and refers to this information as “protected health information” (PHI). The Security Rule focuses specifically on protecting the electronic form (creation, reception, maintenance or transmission) of individually identifiable health information and refers to this information as “electronic protected health information” (e-PHI).

HIPAA defines individually identifiable health information as information that is a subset of health information, including demographic information collected from an individual, and that meets three conditions. It is created or received by a health care provider, health plan, employer, or health care clearinghouse. It relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual. And it either identifies the individual or provides a reasonable basis to believe that it can be used to identify the individual.

The Health Information Technology for Economic and Clinical Health Act, enacted on February 17, 2009, aligns with and strengthens HIPAA in protecting identifiable health information from misuse, particularly where technology is involved. Under HITECH, a breach is “the unauthorized acquisition, access, use, or disclosure of protected health information which compromises the security or privacy of such information.”

Data masking: the foundation of data protection

Beyond knowing and following healthcare data protection regulations, organizations can use other techniques to safeguard sensitive data. Data masking is one of the major ones. It is a process that replaces existing sensitive information with data that looks real but is of no use to anyone who might have illegally acquired it. Data in a non-production environment is the most vulnerable because it does not have the same level of security as it has in production, which leaves it open to hacking or theft. Masking removes that exposure while still allowing realistic data to be used in various business processes without any change to supporting applications or data storage facilities.

There are several basic rules when using the data masking approach. First and foremost, the process must be irreversible: once the data is masked, it should not be possible to recover its original value. At the same time, the process must maintain data integrity, in particular referential integrity, so that relationships between records are preserved and no business processes are broken. Any data that could be used to recreate sensitive data should also be masked, since the goal of the masking process is to make it impossible to tie the data back to a person or entity. A good masking process identifies and masks all such data elements and preserves the pattern of the source data so that the result can be used in any non-production activity such as development and testing. In addition, the masking process should be repeatable. A one-time application is ineffective, inefficient and expensive. Rather, masking should be performed routinely, so that each non-production environment is masked as soon as it is created from production data and before any application uses it.
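To make these rules concrete, here is a minimal Python sketch of one possible masking routine. It is an illustration under stated assumptions, not a description of any particular product: the key, the `mask_value` function, and the `patients` and `claims` tables are all hypothetical. It uses a keyed hash (HMAC-SHA256) so that the same input always produces the same masked output, which keeps joins between tables intact, and it swaps letters for letters and digits for digits so that the source pattern survives.

```python
import hashlib
import hmac
import string

# Hypothetical key for illustration only; a real key would live in a secrets manager.
MASKING_KEY = b"example-only-masking-key"


def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Replace each letter or digit with a keyed pseudo-random one of the same kind."""
    # HMAC-SHA256 is deterministic for a given key and input, so the same
    # source value always yields the same masked value (repeatable masking).
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # digest bytes are reused past 32 characters
        if ch.isdigit():
            masked.append(string.digits[b % 10])
        elif ch.isalpha():
            alphabet = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(alphabet[b % 26])
        else:
            masked.append(ch)  # keep separators such as "-" or " " to preserve the pattern
    return "".join(masked)


# Two hypothetical non-production tables that share a medical record number.
patients = [{"mrn": "MRN-004512", "name": "Jane Doe"}]
claims = [{"mrn": "MRN-004512", "amount": 120.50}]

masked_patients = [
    {**p, "mrn": mask_value(p["mrn"]), "name": mask_value(p["name"])} for p in patients
]
masked_claims = [{**c, "mrn": mask_value(c["mrn"])} for c in claims]

# The masked key still joins the two tables (referential integrity).
assert masked_patients[0]["mrn"] == masked_claims[0]["mrn"]
print(masked_patients[0], masked_claims[0])
```

Two caveats apply to this sketch. A keyed hash is only as irreversible as the key is secret: anyone holding the key could try candidate values for short fields until one matches, so the key should never reach the non-production environment. The routine also does not guarantee that two different source values receive different masked values, so a column that must stay unique would need an additional collision check.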

Masked data is also an effective way to protect privacy and support compliance initiatives, while, at the same time, supplying meaningful data for analysis and development.

High-level protection

Overall, healthcare information must be secured through an approach that transcends traditional firewalls; accommodates new technologies such as IoT, the cloud and big data; applies integrated security across the clinical workflow instead of a collection of point solutions; provides security solutions that can scale to potentially thousands of devices and offer endpoint protection for mobile devices; establishes flexible approaches for the acquisition of cybersecurity suites; and incorporates capabilities that decrease complexity and increase productivity.

Future attacks aimed at the sensitive data collected by healthcare organizations are a certainty; the question is not if, but when. Failure to implement such an approach to secure that data can only lead to the most dire consequences.

About Author:

Damodarrao Thakkalapelli, the distinguished author of this narrative, hails from the culturally rich city of Hyderabad in Telangana, India. With an impressive background and an illustrious career in the field of Data Engineering, Damodarrao T has demonstrated not only his deep expertise but also his dedication to the advancement of technology.

Damodarrao T commenced his academic journey at JNTU University in Hyderabad, where he pursued a bachelor’s degree in engineering, and later pursued an M.S. at Gannon University in the US. Currently, Damodarrao T holds a pivotal role as Data Solution Architect in the United States with an American multinational financial services corporation.

His journey from Hyderabad to the global stage is not only an inspiration but a testament to the impact and potential of those who dedicate their careers to the advancement of Data Science.



This article features branded content from a third party. Opinions in this article do not reflect the opinions and beliefs of The Wall Street Times.