PADS-Responsible Data Science


Privacy-enhancing Technologies (PET) vs. Non-disclosure Agreements (NDA)

May 21st, 2021 | by

Few would doubt that interest in data science is growing rapidly. More and more companies use data science and machine learning techniques to extend or accelerate their business. Below, we show two plots from two studies that illustrate this growth. Figure 1 comes from a study by Towards Data Science [1], which shows the general interest in data science and other data-centric techniques based on Google Trends search interest from 2011 to 2020.

Figure 1: Interest in data-centric techniques over time.

Figure 2 was taken from a study performed by the European Leadership University [2] and shows the interest in data science among software developers by analyzing the Stack Overflow survey data from 2011 to 2018.

Figure 2: Data science community growth among software developers.

While investments in data science and data-centric technologies grow rapidly, the responsible use of data becomes increasingly important and has the attention of consumers, citizens, and policymakers. Responsible Data Science (RDS) considers four main aspects of responsibility that need to be taken into account when analyzing data: fairness, accuracy, confidentiality, and transparency [3]. Here, we focus on the confidentiality (privacy) aspect of responsible data science and invite the community to discuss the challenges that hinder the widespread use of technical solutions, e.g., privacy-enhancing technologies (PETs), in practice.

Privacy-enhancing technologies protect data by eliminating or minimizing the use of unnecessary personal data without losing data utility or the functionality of an information system [4]. A non-disclosure agreement (NDA) is a legal contract between parties that outlines which confidential or private material belongs to each party and prohibits the parties from sharing such confidential information.
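To make the data-minimization idea concrete, here is a minimal sketch of one simple PET-style step: dropping fields an analysis does not need and replacing a direct identifier with a keyed pseudonym. The field names, the `SECRET_KEY`, and the helper functions `pseudonymize` and `minimize` are hypothetical and chosen only for illustration. Pseudonymization alone is not anonymization, so this is a sketch under stated assumptions rather than a complete privacy solution.

```python
import hashlib
import hmac

# Hypothetical secret key (assumption); in practice it would be kept apart from the data.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical set of fields the analysis actually needs (assumption).
NEEDED_FIELDS = {"activity", "timestamp"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256), truncated to 16 hex chars."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def minimize(record: dict) -> dict:
    """Keep only the needed fields and swap the person identifier for a pseudonym."""
    reduced = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    reduced["person"] = pseudonymize(record["person"])
    return reduced


# Made-up example records, not real data.
records = [
    {"person": "alice", "email": "alice@example.org", "activity": "register", "timestamp": "2021-05-01T09:00"},
    {"person": "bob", "email": "bob@example.org", "activity": "register", "timestamp": "2021-05-01T09:05"},
    {"person": "alice", "email": "alice@example.org", "activity": "pay", "timestamp": "2021-05-02T10:30"},
]

minimized = [minimize(r) for r in records]
```

The same person maps to the same pseudonym, so records can still be linked for analysis, while the unneeded `email` field never leaves the source system.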

PETs focus on technical solutions, typically grounded in general privacy regulations such as the GDPR [5]. Organizations can also have their own privacy or confidentiality concerns and may develop specific privacy-preservation techniques to address them. An NDA, in contrast, is purely a legal contract that relies on the threat of prosecution rather than on technical safeguards. Although PETs have seen many breakthroughs in recent years and strong technical solutions have been introduced, companies still prefer legal agreements over technical solutions. The question is why technical solutions are not widely used in practice despite the high demand and the significant breakthroughs in academia:

  • Are technical solutions still untrustworthy?
  • Are they expensive to develop?
  • Is there a lack of knowledge in companies to understand and develop technical solutions?
  • Is the problem the lack of interpretability of technical solutions?
  • Are there no solid tools to support technical solutions in practice?

Many other reasons could be listed. Even so, in many cases one can consider a hybrid approach: use technical solutions as preventive measures and legal contracts to cover their potential weaknesses, rather than relying solely on legal contracts, which provide no technical guarantees.

What do you think? Which approach is used more in practice? And what are the main reasons for not using PETs in practice? You are very welcome to share your thoughts in the comment box below.





[3] van der Aalst, W. M. P. (2017). Responsible Data Science: Using Event Data in a “People Friendly” Manner. Enterprise Information Systems, ICEIS 2016. Lecture Notes in Business Information Processing, vol. 291. Springer, Cham.

[4] van Blarkom, G. W., Borking, J. J., & Olk, J. E. (2003). Handbook of privacy and privacy-enhancing technologies. Privacy Incorporated Software Agent (PISA) Consortium, The Hague, 198, 14.




We encourage experts in (responsible) data/process science to share their thoughts, experiences, and concerns regarding the responsible use of data with the community. You just need to send us your text.


Twitter: @MajidRafiei4