Being able to collect web site interaction and traffic data is key to understanding what visiting patients or members are doing on the site. Adobe Analytics allows you to capture behavioral data on your site in real time.
That said, a few concerns need to be addressed when data is collected for the Healthcare industry. First of all, Healthcare is a regulated industry. The regulation that governs it is HIPAA, the "Health Insurance Portability and Accountability Act of 1996."
"The HIPAA Privacy Rule protects most “individually identifiable health information” held or transmitted by a covered entity or its business associate, in any form or medium, whether electronic, on paper, or oral." -https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#protected
This rule has a number of guidelines and specifications that put data collection under scrutiny. The main point of HIPAA that makes a data collection implementation more difficult than in other industries is the handling of PHI, Protected Health Information. PHI, similar to PII (Personally Identifiable Information), consists of data points that could be used to identify a patient or member and tie that person to any of their medical history.
Now that we have a bit of background on what HIPAA and PHI are, let's talk about what we cannot collect. There are 18 PHI data points that cannot come to rest in the data set. They are:
- Names
- Address: all geographic subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial 3 digits of a zip code if, according to the currently publicly available data from the Bureau of the Census: (a) the geographic unit formed by combining all zip codes with the same 3 initial digits contains more than 20,000 people; and (b) the initial 3 digits of a zip code for all such geographic units containing 20,000 or fewer people are changed to 000.
- All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
- Telephone numbers
- FAX numbers
- Electronic mail addresses (email address)
- Social security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate or license numbers
- Vehicle identifiers and serial numbers; license plate numbers
- Device identifiers and serial numbers
- Web Universal Resource Locators (URLs)
- Internet Protocol (IP) address numbers
- Biometric identifiers, including finger and voice prints
- Full face photographic images and any comparable images
- Any other unique identifying number, characteristic, or code, except a code to permit re-identification of the de-identified data by the Honest Broker.
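The zip code and age rules in the list above are the two items that allow a generalized value instead of outright exclusion. Here is a minimal sketch of how that generalization could look if you preprocess values before they reach your tags. The function names and the `RESTRICTED_ZIP3` set are hypothetical; the real list of low-population prefixes has to come from current Census data.

```python
# Placeholder set of 3-digit zip prefixes whose combined population is
# 20,000 or fewer. These values are an assumption for illustration only;
# build the real set from current Census data.
RESTRICTED_ZIP3 = frozenset({"036", "059", "102"})


def generalize_zip(zip_code: str, restricted=RESTRICTED_ZIP3) -> str:
    """Return the first 3 digits of a zip code, or "000" for restricted prefixes."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        raise ValueError("expected at least a 5-digit zip code")
    prefix = digits[:3]
    return "000" if prefix in restricted else prefix


def generalize_age(age: int) -> str:
    """Report every age over 89 as the single category "90+"."""
    if age < 0:
        raise ValueError("age cannot be negative")
    return "90+" if age > 89 else str(age)


if __name__ == "__main__":
    print(generalize_zip("90210"))       # keeps only the 3-digit prefix
    print(generalize_zip("03601"))       # prefix is in the placeholder restricted set
    print(generalize_age(92))            # aggregated into the 90+ category
```

Anything that cannot be generalized this way (email address, account number, and so on) should simply never be assigned to a variable in the tagging.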
Each of these 18 data points needs to be excluded from the data collection implementation. The one that causes the most trouble is IP address, because all of the other data points can be excluded during the tagging of the site. The IP address is the mechanism that allows computers to talk to each other on the internet, so an IP address is required to send data to the Adobe Data Collection Server. Adobe Analytics has a few ways of handling the collection of IP address info:
- Native - The IP address will be collected and stored in the data set. The IP address can also be used to do Geo Lookup.
- Replace the last octet of IP addresses with 0 - The last octet is replaced with a 0 before IP filtering is applied, so IP filtering sees the address with a 0 as its last octet.
- IP Obfuscation - Turns IP addresses into non-recognizable strings, essentially removing them from Adobe data stores. When IP Obfuscation is enabled, the original IP addresses are permanently lost. Note: when IP obfuscation is enabled, the IP addresses are obfuscated everywhere in Analytics, including Data Warehouse. The setting has three choices: Disabled leaves the IP address in the data. Obfuscate IP address changes the IP to a hashed value (e.g., 234abc6493872038). Remove IP address replaces the IP address with x.x.x.x in the data, after geo-lookup.
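To make the difference between these options concrete, here is a rough sketch of what each transformation does to an address. This is not Adobe's implementation. The function names are made up for illustration, and the salted SHA-256 hash truncated to 16 characters is an assumption, since the actual hashing method is not described here.

```python
import hashlib
import ipaddress


def zero_last_octet(ip: str) -> str:
    """Replace the last octet of an IPv4 address with 0."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    octets = str(addr).split(".")
    octets[-1] = "0"
    return ".".join(octets)


def obfuscate_ip(ip: str, salt: str) -> str:
    """Turn an IP address into a non-recognizable string (assumed: salted SHA-256)."""
    addr = ipaddress.ip_address(ip)  # validate before hashing
    digest = hashlib.sha256((salt + str(addr)).encode("utf-8")).hexdigest()
    return digest[:16]


def remove_ip(ip: str) -> str:
    """Replace the IP address with a fixed placeholder."""
    return "x.x.x.x"


if __name__ == "__main__":
    sample = "203.0.113.42"  # address from a documentation-only range
    print(zero_last_octet(sample))
    print(obfuscate_ip(sample, salt="example-salt"))
    print(remove_ip(sample))
```

Note that only the last two functions get rid of the address entirely. Zeroing the last octet still leaves most of the address in the data.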
Changing the IP Options is done in the Admin Console for the report suite. Here is the procedure to do it:
- Log in to Reports & Analytics.
- Go to Admin > Report Suites.
- Select the Report Suite you want to Modify.
- Click on Edit Settings > General Account Settings > Make the desired option changes.
These three options allow you to control the way that the IP address is handled. Now here is where we enter a gray area. According to HIPAA, this data cannot come to rest in the data repository of a non-covered entity, which is the category Adobe Analytics falls into. This stipulation means that Adobe Analytics cannot store the IP address info in the dataset, but the IP address is needed to make the connection with the data collection servers. When IP obfuscation is enabled, the incoming beacons are stripped of the IP info before coming to rest in the dataset. This allows Adobe Analytics to stay within the guidelines set forth by HIPAA, because the IP address never comes to "rest" in the data set.
GeoSegmentation is another topic that needs to be discussed. When a hit is collected in Analytics, it runs through the Geo Engine, which takes the incoming tag's IP address and does a lookup against the geo info. Based on the IP address, this lookup populates a few fields. A full description of what the lookup populates is available in the Adobe documentation.
One of the key values that gets populated is the Zip Code of the IP address. This is a potential issue with HIPAA and PHI. The recommendation for Healthcare is to disable Geo Segmentation. This has to be done by Client Care, who have to do it via "Dr. Teeth" for each Report Suite.
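If it helps to picture the "never comes to rest" idea, here is a conceptual sketch of a hit being prepared for storage. It is only an illustration of the ordering described above, not how Adobe's pipeline is built. The field names (`ip`, `geo_zip`, `geo_country`, `geo_region`), the allowlist, and the `geo_lookup` callable are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical geo fields at State level or coarser. Zip code and other
# finer-grained fields are deliberately left off this allowlist.
GEO_FIELDS_ALLOWED = frozenset({"geo_country", "geo_region"})


def prepare_hit_for_storage(
    hit: Dict[str, str],
    geo_lookup: Callable[[str], Dict[str, str]],
    geo_enabled: bool = False,
) -> Dict[str, str]:
    """Return a copy of the hit that is safe to store.

    The IP address is used in transit (and for the optional lookup) but is
    never copied into the stored record.
    """
    stored = {key: value for key, value in hit.items() if key != "ip"}
    if geo_enabled:
        geo = geo_lookup(hit["ip"])
        stored.update(
            {key: value for key, value in geo.items() if key in GEO_FIELDS_ALLOWED}
        )
    return stored


if __name__ == "__main__":
    def fake_lookup(ip: str) -> Dict[str, str]:
        # Stand-in for a real geo service; the values are made up.
        return {"geo_country": "usa", "geo_region": "ut", "geo_zip": "84043"}

    hit = {"ip": "203.0.113.42", "page": "home"}
    print(prepare_hit_for_storage(hit, fake_lookup))
    print(prepare_hit_for_storage(hit, fake_lookup, geo_enabled=True))
```

The default of `geo_enabled=False` mirrors the recommendation to turn Geo Segmentation off entirely for Healthcare report suites.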