The purpose of these articles is to provide insight into the less obvious legal risks that entrepreneurs are exposed to in the area of privacy law and PII, simply by owning a website or blog. PII in this context is an acronym for personally-identifiable information and it will be used throughout the rest of this article.
Notice the phrase personally-identifiable in the above definition. Other definitions simply use the term personal information, but personally-identifiable makes the distinction clearer: the information must identify a particular, unique person to invoke certain legal responsibilities.
In a future post, I will discuss data collection mechanisms like cookies and third-party plug-ins. This article, however, focuses on the definition of PII and the difference between PII, non-PII, and potential PII.
Definition of PII
The term PII has become widely accepted in recent years. Here are definitions from three very credible agencies of the federal government:
1) PII is any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information. (Source: Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), National Institute of Standards and Technology, page 2-1)
2) The Department of Homeland Security defines PII as any information that permits the identity of an individual to be directly or indirectly inferred, including any information which is linked or linkable to that individual regardless of whether the individual is a U.S. citizen, lawful permanent resident, visitor to the U.S., or employee or contractor to the Department. (Source: Handbook for Safeguarding Sensitive Personally Identifiable Information At The Department of Homeland Security, page 6)
3) PII is “information that can be used to locate or identify an individual, such as names, aliases, Social Security numbers, biometric records, and other personal information that is linked or linkable to an individual. Loss of such information may lead to identity theft or other fraudulent use of the information, resulting in substantial harm, embarrassment, and inconvenience to individuals.” (Source: Report to Congressional Requesters, Protecting Personally Identifiable Information, United States Government Accountability Office, page 1)
A key common denominator of the above definitions is the fact that the information must be linked or linkable to an individual. The best definition, however, in my opinion, and one that is most relevant to website and blog operators, can be found on the Wikipedia website:
Personally Identifiable Information (PII), as used in information security, refers to information that can be used to uniquely identify, contact, or locate a single person or can be used with other sources to uniquely identify a single individual. The abbreviation PII is widely accepted, but the phrase it abbreviates has four common variants based on personal, personally, identifiable, and identifying. Not all are equivalent, and for legal purposes the effective definitions vary depending on the jurisdiction and the purposes for which the term is being used.
Although the concept of PII is ancient, it has become much more important as information technology and the Internet have made it easier to collect PII, leading to a profitable market in collecting and reselling PII. PII can also be exploited by criminals to stalk or steal the identity of a person, or to plan a person’s murder or robbery, among other crimes. As a response to these threats, many web site privacy policies specifically address the collection of PII, and lawmakers have enacted a series of legislation to limit the distribution and accessibility of PII.
In short, PII is information that readily identifies a specific individual. During my research, however, when considering the billion-dollar industry of Internet advertising, I found that there needs to be a distinction between three types of information: actual PII, potential PII, and non-PII. This type of clarity will help website operators and advertisers stay within the confines of privacy laws.
I) Examples of PII
The following list was distilled from numerous state and federal laws that deal with PII and the Internet. Therefore, the list below does not match any particular standard that I know of. If anything, it is a bit more comprehensive. If the goal is to comply with Internet privacy laws of all states, it seems to me it’s best to err on the side of caution. These data elements clearly qualify as PII:
1. Name (full name or first initial and last name), maiden name
2. Email address or other online contact information such as instant messaging identifier
3. Home or other physical address
4. Telephone number
5. Credit card or debit card numbers
6. Bank account numbers
7. Social Security number
8. Driver’s license number or state issued ID card number
9. Passport number
10. Taxpayer identification number
11. Personal characteristics such as photographic images (especially of face or other identifying characteristic), fingerprints, or other biometric data (e.g. retina scan, voice signature, facial geometry)
Laws in some jurisdictions point out that their privacy rules do not include information that is lawfully obtained from publicly available records.
Sensitive Personally Identifiable Information
The Commercial Privacy Bill of Rights Act of 2011, introduced by Senators John Kerry and John McCain in April 2011, goes further to define certain types of PII as sensitive. According to section (6) of that proposed legislation, the term “sensitive personally-identifiable information” means:
(A) personally identifiable information which, if lost, compromised, or disclosed without authorization either alone or with other information, carries a significant risk of economic or physical harm; or
(B) information related to:
(i) a particular medical condition or a health record; or
(ii) the religious affiliation of an individual.
II) Examples of Potential PII
The following are examples of “potentially personally-identifiable information”. That is, the data elements by themselves cannot be linked to a specific person but when combined with other information (such as items 1 through 11, above), they can be.
12. A persistent identifier such as a generic customer/user value held in a “cookie”
13. IP (Internet Protocol) address or host name
14. Date of birth, age
15. Racial or ethnic background
16. Religious affiliation
18. Height, weight
19. Marital status
20. Employment information
21. Medical information
22. Financial information
23. Credit information
24. Student information
Depending on a site visitor’s browser settings, cookies (item 12), which are small text files, are stored on the visitor’s local drive and transmitted between their browser and the servers hosting the sites visited.
The point here is, as standalone information, these data elements are not PII. They have the potential to be PII. They become PII when they are combined with other more specific data which, in total, identifies a specific person.
For example, a full-blown credit report without a link to a specific individual is not PII. It’s simply anonymous credit information. However, even though a credit report might not have a person’s first and last name, if it includes enough information to identify a particular person (e.g. date of birth + gender + ethnicity + zip code + IP address), it fits the definition of PII.
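To make the combination idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a legal test: the field names are hypothetical labels for the data elements listed in this article, and the threshold of four combined elements is an arbitrary number I picked for the example. No statute defines such a number.

```python
# Hypothetical field labels for the "clearly PII" data elements listed above.
DIRECT_IDENTIFIERS = {
    "name", "email", "physical_address", "telephone", "card_number",
    "bank_account", "ssn", "drivers_license", "passport_number",
    "taxpayer_id", "biometric",
}

# Hypothetical field labels for the "potential PII" data elements listed above.
POTENTIAL_IDENTIFIERS = {
    "cookie_id", "ip_address", "date_of_birth", "ethnicity", "religion",
    "height_weight", "marital_status", "employment", "medical",
    "financial", "credit", "student",
}

# Assumption for illustration only: how many potential identifiers, taken
# together, are treated as enough to single out one person.
COMBINATION_THRESHOLD = 4


def classify_record(fields):
    """Classify a record by the names of the data elements it contains."""
    present = set(fields)
    # Any direct identifier links the whole record to a specific person.
    if present & DIRECT_IDENTIFIERS:
        return "PII"
    potential = present & POTENTIAL_IDENTIFIERS
    # Enough potential identifiers combined are treated as identifying.
    if len(potential) >= COMBINATION_THRESHOLD:
        return "PII"
    if potential:
        return "potential PII"
    return "non-PII"


print(classify_record(["credit"]))
print(classify_record(["credit", "name"]))
print(classify_record(["credit", "date_of_birth", "ethnicity", "ip_address"]))
```

Under these assumptions, the anonymous credit record alone is only potential PII, while the same record becomes PII once a name is attached or once enough other elements accompany it.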
III) Examples of Non-PII
- Browser type
- Browser plug-in details
- Local time zone
- Date and time of each visitor request (e.g. arrival at and exit from each web page)
- Language preference
- Referring site
- Device type (e.g. desktop, laptop, or smartphone)
- Screen size, screen color depth, and system fonts
This is the type of information that websites track when analytics programs are used, for example. It gets stored on a visitor’s hard drive in the form of cookies. Some sites also track shopping preferences in order to target advertisements. In general, non-PII is widely used across the web to track and report on aggregated statistics regarding user traffic.
The lists above are by no means exhaustive. As always, website and blog owners should use reasonable care when complying with any laws and seek professional assistance when necessary.
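As a rough aid, a site owner could audit the data a form or analytics script collects by grouping each field under one of the three categories. The Python sketch below assumes hypothetical field names and covers only a handful of entries from the lists above. Anything it does not recognize is reported as "unlisted" rather than assumed safe, since the lists are not exhaustive.

```python
from collections import defaultdict

# Partial, hypothetical mapping from field labels to the categories used in
# this article. A real audit would need a much fuller mapping.
CATEGORY_BY_FIELD = {
    "email": "PII",
    "telephone": "PII",
    "ssn": "PII",
    "ip_address": "potential PII",
    "date_of_birth": "potential PII",
    "marital_status": "potential PII",
    "browser_type": "non-PII",
    "time_zone": "non-PII",
    "referring_site": "non-PII",
}


def audit_fields(collected_fields):
    """Group collected field names by category, keeping unknowns separate."""
    groups = defaultdict(list)
    for field in collected_fields:
        category = CATEGORY_BY_FIELD.get(field, "unlisted")
        groups[category].append(field)
    return dict(groups)


report = audit_fields(["email", "ip_address", "browser_type", "favorite_color"])
for category, names in report.items():
    print(category, names)
```

Fields that land in the "unlisted" group are the ones worth a closer look, or a question for a professional.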
PII Data Breaches of Profound Proportions
The timing of this article is a bit uncanny relative to recent PII security events. During the past month, two monumental breaches of PII security took place. Both involved major companies that control the PII of millions of consumers. Given their magnitude and relevance to this topic I was compelled to mention them.
The Epsilon Episode
In early April 2011, Epsilon International, the self-proclaimed largest permission-based email marketer in the world, announced that email addresses and/or customer names were stolen from their email system. Some analysts estimate that tens of millions of consumer names and email addresses were exposed. According to media reports, Epsilon handles email marketing for 2,500 clients including Best Buy, Capital One, Citi, and JPMorgan Chase.
In a press release, Epsilon reported that the affected clients are approximately 2 percent of total clients for which Epsilon provides email services. Media coverage included headlines like “Data breach is the Exxon Valdez of privacy.”
The real significance of the story for this article is the fact that the company seems to be confused about the definition of PII. In this case, in my opinion, that’s mind-boggling. You be the judge. Here is an excerpt from an official statement released by the parent company. I highlighted the questionable claim:
Alliance Data Systems Corporation (NYSE: ADS), parent company of Epsilon, today reaffirmed Epsilon’s previous statement that the unauthorised entry into an Epsilon email system was limited to email addresses and/or customer names only. No personal identifiable information (PII) was compromised, such as social security numbers, credit card numbers or account information. (Source: Press release, Alliance Data Provides Statement Surrounding Unauthorised Entry Incident at Epsilon Subsidiary)
The Sony Saga
I use the term saga because this PII breach will be talked about for years to come. Just weeks after the Epsilon episode, hackers stole names, DOBs, credit card numbers, and other PII belonging to 100 million people who play online video games through Sony’s PlayStation network. Here is an excerpt from a Sony blog entry to its customers on the PlayStation.com website:
Although we are still investigating the details of this incident, we believe that an unauthorized person has obtained the following information that you provided: name, address (city, state, zip), country, email address, birthdate, PlayStation Network/Qriocity password and login, and handle/PSN online ID. It is also possible that your profile data, including purchase history and billing address (city, state, zip), and your PlayStation Network/Qriocity password security answers may have been obtained. If you have authorized a sub-account for your dependent, the same data with respect to your dependent may have been obtained. While there is no evidence at this time that credit card data was taken, we cannot rule out the possibility. If you have provided your credit card data through PlayStation Network or Qriocity, out of an abundance of caution we are advising you that your credit card number (excluding security code) and expiration date may have been obtained. (Source: Playstation Network, Consumer Alerts)
Numerous Internet sources have since reported that the FBI, DOJ, and EU, among other authorities, are investigating this massive privacy breach.
On a positive note, Sony subsequently confirmed that the credit card data was encrypted. While it’s small consolation to the millions of consumers affected, at least Sony was in compliance with industry standards regarding credit card info (PCI DSS: Payment Card Industry Data Security Standard). As of this writing, it remains to be seen whether the hackers have been able to decrypt and use the credit card numbers.
While Sony does not use the term PII in their public announcements, they do seem to be making a genuine effort to follow privacy rules to the letter of the law. Case in point: if you go to the PlayStation blog on this matter, you will see that they address Massachusetts customers in a separate blog entry from the rest of their U.S. customers. That’s because Massachusetts has enacted some of the toughest privacy laws in the nation (see previous post: Are Online Privacy Policies Required by Law?). Among other unique rights, in cases where PII security has been breached, Massachusetts law allows its residents to place a security freeze on their credit reports at no charge. Credit reporting agencies are prohibited from charging for it as long as a police report is submitted. At this time, laws of other states do not provide this right, so credit reporting agencies could potentially benefit greatly by charging a fee (e.g. $5.00) each time someone requests a security freeze or removal.
PII – Clarity is Critical
The bottom-line message here is “clarity is critical” when it comes to defining PII. The Epsilon episode is an excellent example of how confusion can prevail without it. We need a clear set of rules, ideally at the federal level, to ensure consistency across the 50 states. With that, website owners like you and me will truly know how to comply and be in a better position to do so.