On January 1, 2020, the California Consumer Privacy Act (CCPA) came into effect. The long-term effectiveness of the law is unclear. First, many companies have not complied with the law. Second, certain situations are exempted from the scope of the law. Finally, the law allows the sale of anonymous data, and recent research has shown that such data may not really hide the associated identity.
Scope of the Law
While the law applies only to residents of California, California is the most populous state in the United States. Its economy, if it were a separate country, would be the fifth largest in the world, bigger than that of India, the UK, or France. Hence, it would be difficult for most companies to treat non-California residents differently.
Currently, the United States federal government has no comprehensive privacy law. Investigations by the Federal Trade Commission have not resulted in any fines. Nonetheless, various agencies are trying to understand how to regulate the use of data. The Food and Drug Administration is considering how to regulate the use of machine learning in medical devices.
Residents of California have the following rights:
- the right to know what private information is being collected
- the right to request the personal information a business has about them
- the right to know how the information was collected and for what purpose it was collected
- the right to have information deleted upon request
The law forbids companies from charging users for removing their data. Nonetheless, the organization can continue to collect the data even after information has been deleted upon request.
In addition, the law applies only to companies that meet at least one of three conditions: their annual revenues exceed $25 million, they collect data on more than 50,000 users, or they make more than 50% of their revenue from selling data.
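The three conditions above are alternatives, so a company is covered as soon as any one of them holds. A minimal sketch of that check follows. The `Business` class, its field names, and the `is_covered` function are hypothetical names chosen for illustration, and the thresholds are simply the figures quoted in this article, not a rendering of the statutory definitions.

```python
from dataclasses import dataclass

# Thresholds as quoted in the article; the statute's own definitions carry more nuance.
REVENUE_THRESHOLD_USD = 25_000_000
USER_THRESHOLD = 50_000
DATA_SALES_SHARE_THRESHOLD = 0.50


@dataclass
class Business:
    """Hypothetical description of a company; field names are illustrative."""
    annual_revenue_usd: float
    users_with_data_collected: int
    revenue_share_from_selling_data: float  # fraction between 0 and 1


def is_covered(b: Business) -> bool:
    """Return True if the company exceeds at least one of the three thresholds."""
    return (
        b.annual_revenue_usd > REVENUE_THRESHOLD_USD
        or b.users_with_data_collected > USER_THRESHOLD
        or b.revenue_share_from_selling_data > DATA_SALES_SHARE_THRESHOLD
    )


# Made-up companies for illustration only.
small_shop = Business(2_000_000, 10_000, 0.0)
data_broker = Business(1_000_000, 5_000, 0.8)

print(is_covered(small_shop))   # False
print(is_covered(data_broker))  # True
```

Note that the small data broker is covered despite its low revenue, because most of its income comes from selling data.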
Varying Levels of Compliance
Microsoft, for example, has announced that it will apply the California regulations throughout the United States, just as it applied the European Union's GDPR even in countries outside the European Union.
Other companies, such as Bank of America and TD Bank, are already regulated at the state level. It is therefore easier for them to distinguish customers from different states, and they have applied the regulation only to California customers.
Some, such as Amazon, are ambiguous about whether they will apply the rules to non-California residents.
Other companies, such as Facebook, claim that their transfer of data to third parties is not a sale, and thus that the law does not apply to them. According to Mary Stone Ross, a co-author of the California Consumer Privacy Act, sharing is equivalent to a sale.
Oracle and T-Mobile have refused to discuss their compliance with the law.
Los Angeles is suing the Weather Channel for its use of user location data. The Weather Channel apparently claims that the data is used to improve its forecasts. The City of Los Angeles alleges that the data is used to determine a user's daily habits, shopping preferences, and identity.
For a partial list of companies that claim to have complied, you can consult a list maintained on GitHub. It includes a link to each company's privacy policy or a way to request your information. You can use GitHub to add a company to the list.
Some of these issues may be clarified in the middle of the current year when the office of Xavier Becerra, the attorney general of California, publishes the final rules. He said that "Businesses will have to treat that information more like information that belongs, is owned by and controlled by the consumer rather than data that, because it's in possession of the company, belongs to the company."
Anonymous Data
Under the California law, companies can still sell your data if it is anonymized. Recent research seems to have demonstrated that anonymous data is not really all that anonymous.
In a paper published in Nature Communications in July 2019, the authors estimated that 99.98% of the residents of the United States could be correctly re-identified in any data set using 15 demographic attributes. The authors suggest that anonymized data sets are not likely to satisfy the standards set by the European General Data Protection Regulation.
In one example, they were able to identify a specific individual with 77% accuracy using zip code, date of birth, and gender. With additional data such as the number of children, the accuracy rose to 99.8%. Such information is readily available in a medical record.
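To see why a handful of attributes can single people out, it helps to count how many records in a data set are unique on a given combination of fields. The sketch below does only that. It is not the statistical model used in the paper, which estimates re-identification likelihood for incomplete data sets; the `unique_fraction` function and the toy records are made up for illustration.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Fraction of records whose combination of `attributes` values appears exactly once."""
    if not records:
        return 0.0
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Made-up "anonymized" records: no names, only demographic attributes.
records = [
    {"zip": "94103", "dob": "1980-02-14", "gender": "F", "children": 2},
    {"zip": "94103", "dob": "1980-02-14", "gender": "F", "children": 0},
    {"zip": "94103", "dob": "1975-07-01", "gender": "M", "children": 1},
    {"zip": "90012", "dob": "1990-11-30", "gender": "F", "children": 0},
]

print(unique_fraction(records, ["zip", "dob", "gender"]))              # 0.5
print(unique_fraction(records, ["zip", "dob", "gender", "children"]))  # 1.0
```

In this toy data set, two people share the same zip code, date of birth, and gender, so only half the records are unique on those three fields. Adding the number of children separates them, and every record becomes unique.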
According to an article published in Science, anonymous credit card metadata can be used to identify 90% of the individuals with the dates and locations of four purchases. One purchase receipt, one Instagram photo, and one tweet of a purchase you made might be enough to identify 94% of people from their credit card records. There was no need to know your name, address, or credit card number.
Cell phone data is not that private either. Researchers published a paper in the Nature journal Scientific Reports that analyzed fifteen months of cell phone data for 1.5 million users. They found that four points of reference were enough to identify 95% of the users. A point of reference is the cellphone transmitter that handled a connection at a given time. Any time you use the phone to call, access a website, or post on Instagram or Twitter, you create a point of reference. Nonetheless, not all points of reference are equal. A phone call at 3 AM on a deserted street is more useful than an evening call in the center of a major city.
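The idea of identifying a user from a few points of reference can be sketched as follows: pick a few points from one user's trace and check whether any other user's trace also contains all of them. This is a simplified illustration, not the researchers' method or data. The `identified_fraction` function, the representation of a point as a (transmitter, hour) pair, and the toy traces are all assumptions made for this example.

```python
import random


def identified_fraction(traces, num_points, seed=0):
    """Fraction of users whose trace is the only one containing
    `num_points` points sampled at random from that trace."""
    rng = random.Random(seed)
    identified = 0
    eligible = 0
    for user, points in traces.items():
        if len(points) < num_points:
            continue  # not enough points to sample from this user
        eligible += 1
        known = set(rng.sample(sorted(points), num_points))
        # Users whose traces contain every known point.
        matches = [u for u, other in traces.items() if known <= other]
        if matches == [user]:
            identified += 1
    return identified / eligible if eligible else 0.0


# Made-up traces: each point is (transmitter id, hour of day).
traces = {
    "a": {("T1", 8), ("T2", 9), ("T3", 18), ("T9", 3)},
    "b": {("T1", 8), ("T2", 9), ("T3", 18), ("T4", 20)},
    "c": {("T1", 8), ("T5", 12), ("T3", 18), ("T6", 22)},
}

print(identified_fraction(traces, 4))  # 1.0
print(identified_fraction(traces, 1))  # depends on which points are sampled
```

With all four points known, every user in the toy data is unique. With a single point the result depends on the draw: a common point such as ("T1", 8) matches everyone, while the 3 AM point ("T9", 3) matches only one user, which mirrors the observation that not all points of reference are equal.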