What is ethical data product building?
Hey there,
Let’s talk about ethical analytics, shall we? I feel like this should be as important as data quality, scalability, infrastructure costs, etc. Yet we generally tend to avoid the topic, or only touch on it briefly when discussing risks.
Shouldn’t it be a cornerstone of our practice, though? That’s what our users and the general public expect. Let’s dig into why it’s important and what we mean by “ethical data practices”.
A history of data surveillance
Snowden leaked classified NSA documents 10 years ago. We already knew that the Five Eyes were engaged in surveillance activities, but nothing of this magnitude.
The U.S. and its allies had built a global surveillance apparatus that mined substantial amounts of communication data with the help of telecommunication companies, Internet service providers, online platforms, etc.
Snowden’s revelations shocked the public. They were a brutal awakening to the fact that all of their communications could easily be intercepted and used to reconstruct a very detailed profile of them, along with a timeline of their actions and communications.
From that moment on, data collection scandals piled up. In 2018, we learned that Cambridge Analytica had siphoned profile data from 87 million Facebook users. The events had occurred in 2013, and the data had then reportedly been used to try to influence the Brexit campaign as well as the 2016 U.S. elections.
The public’s perception was further tainted by a succession of data breaches, as well as by increasing alarm over the precision of targeted ads.
Data activism
As those stories came to light, public awareness increased and people demanded strict regulations around the accumulation of personally identifiable data. Some governments acted and passed legislation, such as the GDPR in Europe, the CCPA in California, the LGPD in Brazil and POPIA in South Africa.
In the US, legislation known as the American Data Privacy and Protection Act (ADPPA) has been proposed. Although the bill had bipartisan support, it was blocked for a variety of reasons, among them opposition to added regulation that, it was argued, would not amount to a substantial increase in user privacy.
One company that did step forward, though, is Apple. Not only has it resisted pressure from the US government to help unlock encrypted devices, but it has also kept reinforcing the privacy features of its devices. For example, Intelligent Tracking Prevention in Safari limits user tracking by enforcing stricter controls on the use of cookies.
But really, the biggest backlash came from the public. Those surveillance abuses made everyone aware of how public their digital lives were. And considering how our lives are so connected nowadays, there doesn’t seem to be any space for privacy.
Civil society has mobilized through consumer advocacy groups such as EPIC, privacy-focused non-profits such as EFF, grassroots initiatives such as Signal, or academic research groups such as the MIT Internet Policy Research Initiative.
An interesting side effect of all this has been the rise of open-source intelligence (OSINT). One example is Bellingcat, which has been surveilling the surveillers. It is on a crusade to gather publicly available data and bring to justice those who believe they are beyond its reach.
Our responsibility
So what’s our responsibility as data practitioners? Obviously, we’re all very excited about the innovative aspect of our profession, but I’m sure we all agree that we do not want to contribute to a dystopian future either. So what are some of the considerations we should always keep in mind to be ethical in how we build data products?
First off, there’s no ISO certification on ethical analytics. There are peripheral ISO certifications, on information security for example, but nothing that covers the ethical harvesting, storage, processing and dissemination of data.
That said, there are certification programs for individual practitioners who want to take into account ethical considerations in their practice. For example, there is the Data Ethics certification program at Cornell, the Data Science Ethics course on Coursera, the Data Ethics Professional program at ODI, and the Introduction to Data Ethics at the University of San Francisco.
At RepublicOfData.io, we believe that ethical data product building should combine sound privacy practices with attention to the social impact of your data products.
Those considerations are part of our approach, where ethical practices and social impact are not only fundamental concerns but also baked into daily development practices.
Conclusion
This is really only an introduction to the topic, as privacy and social impact concerns go deeper than bland value statements. They need to be deeply ingrained in data product building and management practices.
As a data practitioner yourself, how are you factoring privacy and social impact into the building and management of your data products?