EDPS/AEPD: 10 Misunderstandings related to Anonymisation



  • “Pseudonymisation is the same as anonymisation”
    • Fact: Pseudonymisation is not the same as anonymisation.

  • “Encryption is anonymisation”
    • Fact: Encryption is not an anonymisation technique, but it can be a powerful pseudonymisation tool.

  • “Anonymisation of data is always possible”
    • Fact: It is not always possible to lower the re-identification risk below a previously defined threshold whilst retaining a useful dataset for a specific processing.
      • citing: Rocher, L., Hendrickx, J. M., & De Montjoye, Y. A. (2019). Estimating the success of re-identifications in incomplete datasets using generative models. Nature Communications, 10(1), 1-9. https://doi.org/10.1038/s41467-019-10933-3

  • “Anonymisation is forever”
    • Fact: There is a risk that some anonymisation processes could be reverted in the future. Circumstances might change over time and new technical developments and the availability of additional information might compromise previous anonymisation processes.

  • “Anonymisation always reduces the probability of re-identification of a dataset to zero”

  • “Anonymisation is a binary concept that cannot be measured”

  • “Anonymisation can be fully automated”
    • Fact: Automated tools can be used during the anonymisation process, however, given the importance of the context in the overall process assessment, human expert intervention is needed.

  • “Anonymisation makes the data useless”
    • Fact: A proper anonymisation process keeps the data functional for a given purpose.

  • “Following an anonymisation process that others used successfully will lead our organisation to equivalent results”
    • Fact: Anonymisation processes need to be tailored to the nature, scope, context and purposes of processing as well as the risks of varying likelihood and severity for the rights and freedoms of natural persons.

  • “There is no risk and no interest in finding out to whom this data refers to”
    • Fact: Personal data has a value in itself, for the individuals themselves and for third parties. Re-identification of an individual could have a serious impact for his rights and freedoms.
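
Several of the facts above treat re-identification risk as something that can be measured against a threshold. As a minimal sketch, assuming k-anonymity as the metric (one possible measure, not one prescribed by the list above) and hypothetical toy records, the risk of a dataset could be estimated like this:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def unique_share(records, quasi_identifiers):
    """Return the fraction of records that are unique on the quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    unique = sum(1 for count in groups.values() if count == 1)
    return unique / len(records)

# Hypothetical toy data; the field names are assumptions for illustration.
records = [
    {"age_band": "30-39", "postcode": "101", "diagnosis": "A"},
    {"age_band": "30-39", "postcode": "101", "diagnosis": "B"},
    {"age_band": "40-49", "postcode": "102", "diagnosis": "A"},
    {"age_band": "40-49", "postcode": "102", "diagnosis": "C"},
    {"age_band": "50-59", "postcode": "103", "diagnosis": "B"},
]
QUASI_IDENTIFIERS = ["age_band", "postcode"]

k = k_anonymity(records, QUASI_IDENTIFIERS)        # smallest group size
share = unique_share(records, QUASI_IDENTIFIERS)   # share of singled-out records
```

A result of k = 1 means at least one person can be singled out on the chosen attributes alone. Which attributes count as quasi-identifiers, and which k is acceptable, depends on the context and still needs expert judgement.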

BSI TR-03161 Sicherheitsanforderungen an digitale Gesundheitsanwendungen

Germany: BSI – Security requirements for digital health applications

English version: https://www.bsi.bund.de/EN/Publications/TechnicalGuidelines/TR03161/TechnicalGuidelines_03161_node.html
with direct link: https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/TechGuidelines/TR03161/TR-03161.pdf?__blob=publicationFile&v=2

EDPB: Criteria for an acceptable DPIA

From Annex 2 of wp248 rev.01 Guidelines on Data Protection Impact Assessment (DPIA) and determining whether processing is “likely to result in a high risk” for the purposes of Regulation 2016/679 at https://ec.europa.eu/newsroom/article29/items/611236:

Annex 2 – Criteria for an acceptable DPIA
The WP29 proposes the following criteria which data controllers can use to assess whether or not a DPIA, or a methodology to carry out a DPIA, is sufficiently comprehensive to comply with the GDPR:

  • a systematic description of the processing is provided (Article 35(7)(a)):
    • nature, scope, context and purposes of the processing are taken into account (recital 90);
    • personal data, recipients and period for which the personal data will be stored are recorded;
    • a functional description of the processing operation is provided;
    • the assets on which personal data rely (hardware, software, networks, people, paper or paper transmission channels) are identified;
    • compliance with approved codes of conduct is taken into account (Article 35(8));
  • necessity and proportionality are assessed (Article 35(7)(b)):
    • measures envisaged to comply with the Regulation are determined (Article 35(7)(d) and recital 90), taking into account:
      • measures contributing to the proportionality and the necessity of the processing on the basis of:
        • specified, explicit and legitimate purpose(s) (Article 5(1)(b));
        • lawfulness of processing (Article 6);
        • adequate, relevant and limited to what is necessary data (Article 5(1)(c));
        • limited storage duration (Article 5(1)(e));
      • measures contributing to the rights of the data subjects:
        • information provided to the data subject (Articles 12, 13 and 14);
        • right of access and to data portability (Articles 15 and 20);
        • right to rectification and to erasure (Articles 16, 17 and 19);
        • right to object and to restriction of processing (Article 18, 19 and 21);
        • relationships with processors (Article 28);
        • safeguards surrounding international transfer(s) (Chapter V);
        • prior consultation (Article 36).
  • risks to the rights and freedoms of data subjects are managed (Article 35(7)(c)):
    • origin, nature, particularity and severity of the risks are appreciated (cf. recital 84) or, more specifically, for each risk (illegitimate access, undesired modification, and disappearance of data) from the perspective of the data subjects:
      • risks sources are taken into account (recital 90);
      • potential impacts to the rights and freedoms of data subjects are identified in case of events including illegitimate access, undesired modification and disappearance of data;
      • threats that could lead to illegitimate access, undesired modification and disappearance of data are identified;
      • likelihood and severity are estimated (recital 90);
    • measures envisaged to treat those risks are determined (Article 35(7)(d) and recital 90);
  • interested parties are involved:
    • the advice of the DPO is sought (Article 35(2));
    • the views of data subjects or their representatives are sought, where appropriate (Article 35(9)).
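
The risk-management criteria above can be turned into a simple completeness check. The following is only a sketch: the `RiskEntry` structure, its field names and the integer scale for likelihood and severity are assumptions made for illustration, since the guidelines do not prescribe a data model or a scoring scale.

```python
from dataclasses import dataclass, field
from typing import Optional

# The three events named in Annex 2.
EVENTS = ("illegitimate access", "undesired modification", "disappearance of data")

@dataclass
class RiskEntry:
    """One row of a hypothetical DPIA risk register."""
    event: str
    risk_sources: list = field(default_factory=list)
    potential_impacts: list = field(default_factory=list)
    threats: list = field(default_factory=list)
    likelihood: Optional[int] = None  # assumed ordinal scale, e.g. 1 (low) to 4 (high)
    severity: Optional[int] = None    # assumed ordinal scale, e.g. 1 (low) to 4 (high)
    measures: list = field(default_factory=list)

def missing_elements(entry):
    """List the Annex 2 elements that are not yet documented for an entry."""
    missing = []
    if not entry.risk_sources:
        missing.append("risk sources")
    if not entry.potential_impacts:
        missing.append("potential impacts")
    if not entry.threats:
        missing.append("threats")
    if entry.likelihood is None:
        missing.append("likelihood")
    if entry.severity is None:
        missing.append("severity")
    if not entry.measures:
        missing.append("measures")
    return missing

def register_gaps(register):
    """Map each event to its undocumented elements; complete events are omitted."""
    by_event = {entry.event: entry for entry in register}
    gaps = {}
    for event in EVENTS:
        entry = by_event.get(event)
        if entry is None:
            gaps[event] = ["no entry recorded"]
        elif missing_elements(entry):
            gaps[event] = missing_elements(entry)
    return gaps

# Hypothetical register contents.
register = [
    RiskEntry("illegitimate access", ["external attacker"], ["disclosure of health data"],
              ["stolen credentials"], likelihood=2, severity=4,
              measures=["two-factor authentication"]),
    RiskEntry("undesired modification", risk_sources=["employee error"], likelihood=1),
]
gaps = register_gaps(register)
```

Such a check only shows whether each element has been filled in. Whether the content is adequate remains a matter for the DPO and the other parties involved.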

Spain: AEPD publishes Privacy-by-Design/Privacy-by-Default Guideline

Link to AEPD’s English translation: https://www.aepd.es/sites/default/files/2020-10/guia-proteccion-datos-por-defecto-en.pdf

Press release (with links to files):


Excel sheet with measures

Quick overview of the measures in the Excel sheet
(Quick and dirty translation – please take it with a grain of salt!)

  • Amount of personal data
    • Anonymous mode operation.
    • Operation without the need to create a user account.
    • Operation with different user accounts on the same device for the same interested party.
    • Operation with different user accounts on different devices for the same interested party and processing.
    • Identification through privacy-enhancing tools and technologies such as attribute-based credentials, zero-knowledge proofs, …
    • Data aggregation: in time, in space, by groups …
    • Calibration of the granularity of the data: e.g. reduce the frequency of collection of location data, measurement data, etc.
    • Generalization of the data: use ranges for age, postal addresses for addresses.
    • Grading of the extent of the data collected based on the use cases
    • Alternatives and voluntariness in the contact information claimed from the user: e-mail, postal, telephone …
    • Processing monitoring techniques (cookies, pixel tag, fingerprint, etc.)
    • Configuration of unique identifiers (tracking IDs), the programming of their reinitialization and the warning of activation times.
    • Device metadata collected from the device (battery consumption, O.S., versions, languages, etc.).
    • Metadata included in the media processed or generated (in documents, photos, videos, etc.)
    • Information collected about the user’s internet connection (device with which it connects, IP address, device sensor data, application used, browsing and search log, date and time stamp of web page request, etc.) and information about elements near the device (Wi-Fi access points, mobile phone service antennas, bluetooth enabled devices, etc.).
    • Information collected about user activity on the device: power on, activation of applications, use of keyboard, mouse, etc.
    • Mechanisms for staggered collection of the information necessary for the processing. Delay data collection until the stage where it is necessary.
    • Type and volume of new data inferred from automated processes such as machine learning or other artificial intelligence techniques.
    • Data enrichment and linking to external data sets
    • Activation and deactivation at will of the data collection systems (cameras, microphones, GPS, bluetooth, wifi, movement, etc.).
    • Establish a time schedule for when sensors (e.g. cameras, microphones, etc.) can be operational.
    • Incorporation of obfuscation mechanisms to avoid the processing of biometric data in photos, video, keyboard, mouse, etc.
    • Physical blockers (such as tabs to cover camera lenses, speaker blockers, etc.).
    • Use of privacy masks or pixelation in video surveillance systems.
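
The generalization and granularity measures in this group can be illustrated with a few helper functions. This is a minimal sketch: the function names, the ten-year age bands, the two-decimal coordinate precision and the hourly time resolution are all assumptions chosen for the example, not values taken from the AEPD sheet.

```python
from datetime import datetime

def age_to_range(age, width=10):
    """Generalize an exact age to a band, e.g. 37 -> '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def coarsen_location(lat, lon, decimals=2):
    """Reduce coordinate precision (two decimals is roughly a 1 km grid)."""
    return (round(lat, decimals), round(lon, decimals))

def truncate_to_hour(timestamp):
    """Drop minutes, seconds and microseconds from a timestamp."""
    return timestamp.replace(minute=0, second=0, microsecond=0)

def downsample(readings, every=10):
    """Keep only every n-th reading to lower the collection frequency."""
    return readings[::every]

age_band = age_to_range(37)
position = coarsen_location(52.520008, 13.404954)
hour = truncate_to_hour(datetime(2020, 10, 5, 14, 37, 12))
kept = downsample(list(range(100)), every=25)
```

How coarse the values need to be depends on the use case. The right band width or precision is a design decision to be made per processing purpose.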

  • Extent of processing
    • Design of processing operations to minimize the number of temporary copies of data that are generated and to minimize retention periods, transfers and communications
    • Pseudonymization according to the processing operations that may exist in each phase or stage.
    • Local and isolated processing, including the possibility of local storage.
    • Additional processing of collected metadata – log files.
    • Exercise of the rights to object, to restriction of processing or to erasure.
    • Processing settings for profiling or automatic decisions (in the case of cookies)
    • Possibility of configuring all optional processing operations for non-essential purposes: for example, data processing to improve the service, analysis of use, personalization of ads, detection of usage patterns, etc.
    • Configuration of a secure deletion of temporary files, mainly those located outside the user’s device and outside the controller’s systems
    • Incorporation of an option to reinitialize user data to restart the relationship from scratch
    • Setting the data enrichment option
    • Consider mechanisms to audit the existence of Dark Patterns
    • Specific section for configuration options related to sensitive data
    • Help and transparency panel with examples of use and possible risks and consequences for the rights and freedoms of the user
    • Incorporation of a specific means (button or link) to return to the initial configuration with default values
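
Pseudonymization per phase or stage, as listed in this group, could be sketched with a keyed hash and a separate secret key for each phase. The phase names and the `PHASE_KEYS` structure below are hypothetical, and in practice the keys would come from a key management system rather than being generated in place.

```python
import hashlib
import hmac
import secrets

def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# Hypothetical: one secret key per processing phase, generated here only
# so that the example is self-contained.
PHASE_KEYS = {
    "billing": secrets.token_bytes(32),
    "analytics": secrets.token_bytes(32),
}

email = "jane.doe@example.org"
billing_id = pseudonymise(email, PHASE_KEYS["billing"])
analytics_id = pseudonymise(email, PHASE_KEYS["analytics"])
```

Because each phase uses its own key, the pseudonyms of the same person cannot be linked across phases without access to both keys. The result is still personal data: anyone holding a key can recompute the pseudonym for a known identifier, which is why this is pseudonymization and not anonymization.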

  • Configuration options grouped by type of media
    • Configuration of deletion of session data after its closure.
    • Configuration of maximum terms for logging out of the application or devices.
    • Retention periods for user profiles.
    • Configuration of temporary copy management.
    • Control of the deletion of temporary copies.
    • Elimination of the user’s trace in the service: “right to be forgotten”.
    • Identification, within the record of files of data collected from the sections, or data within sections, that can be anonymized
    • Programming of automatic locking and erasing mechanisms.
    • Programming of automatic mechanisms for deleting outputs to printing devices.
    • Configuration of retention periods for historical data in the service: e.g. on shopping sites, last articles, last consultations, etc.
    • Incorporation of generic anonymization mechanisms.
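
Automatic deletion based on configurable retention periods, as described in this group, might look like the following sketch. The categories and the periods in `RETENTION` are invented for illustration and are not recommended values.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per data category.
RETENTION = {
    "session": timedelta(hours=1),
    "temporary_copy": timedelta(days=1),
    "purchase_history": timedelta(days=365),
}

def split_expired(records, now):
    """Separate records that are still within their retention period
    from those that have exceeded it and are due for deletion."""
    kept, expired = [], []
    for record in records:
        limit = RETENTION[record["category"]]
        if now - record["created"] > limit:
            expired.append(record)
        else:
            kept.append(record)
    return kept, expired

now = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
records = [
    {"id": 1, "category": "session", "created": now - timedelta(hours=2)},
    {"id": 2, "category": "temporary_copy", "created": now - timedelta(hours=2)},
    {"id": 3, "category": "purchase_history", "created": now - timedelta(days=400)},
]
kept, expired = split_expired(records, now)
```

A scheduled job could call such a function and then securely delete or anonymize the expired records. The deletion step itself is left out here because it depends on the storage system.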

  • Data accessibility
    • Profile information of the interested party shown to the user and third parties: name, pseudonym, telephone number, etc.
    • Information of the interested party that is shown to third parties: e.g. selective disclosure of elements of the CV, medical history, etc.
    • Information on the status of the interested party accessible to third parties. E.g. in the messaging applications, information on availability, writing a message, receiving a message, reading a message, …
    • Classification and labeling of processing operations, sections of documents and / or data within sections, which can be managed through an access control policy.
    • Organization, classification and labeling of the application or service according to the sensitivity of data, sections or processing operations.
    • Possibility of defining and configuring access profiles and granular privilege assignment
    • Automatic session locks.
    • Assignment of data access profiles according to the roles of the users for each phase of the processing.
    • Design of the workspace (isolated interview areas, non-accessible physical files, non-transparent folders, screens not exposed to third parties or with privacy filters, telephone headsets, call centers, clean desk policies, etc.)
    • Information management parameters such as where the data is stored and processed, whether it is held in clear text or encrypted, the access control mechanisms implemented, and whether there are multiple copies of the data, including instances that were not securely deleted, which can be accessed by third parties.
    • Control of data storage encryption
    • Control of data communication encryption
    • Procedures for managing access to shared print / output devices where documents may be left behind by the user.
    • Where appropriate, prohibition of printing.
    • Print output deletion control
    • Portable storage device management procedures for periodic formatting
    • The retention or elimination of session information, in applications, shared systems, communications or systems provided to the employee or the end user.
    • The type and amount of metadata collected in the documentation generated by the system utilities (word processors, drawing tools, cameras and videos, etc.)
    • When sending messages, configure the incorporation of threads of the conversation, as well as configure the possibility of confirming the sending of multiple recipients.
    • Mechanisms to avoid indexing on the Internet
    • Organizational and technical measures for the review and filtering of information to be made public.
    • Systems of anonymization and / or pseudonymization of texts to be disseminated.
    • Management parameters of the connectivity elements of the devices (Wifi, Bluetooth, NFC, etc.).
    • Alerts about the connectivity status of the devices.
    • Controls to prevent the communication of the unique identifiers of the device (Advertising-ID, IP, MAC, serial number, IMSI, IMEI, etc.)
    • Access control mechanisms to passive systems (such as contactless cards) with the incorporation of terminal authentication protocols or with physical measures to prevent electromagnetic access.
    • Accessibility controls to user content on social networks.
    • Incorporation of controls to collect affirmative and clear confirmation actions before making personal data public, so that dissemination is blocked by default.
    • Configuration of notices and reminders to interested parties about what policies for the dissemination and communication of information are established.
    • Definition and configuration of access permissions on data sets (databases, file systems, image galleries, …) and elements for capturing information such as sensors (cameras, GPS, microphones, etc.) of the device and information on elements near the device (Wi-Fi access points, mobile phone service antennas, activated bluetooth devices, etc.).
    • Definition and configuration of data access permission policies between applications and libraries, as in the case of mobile phones.
    • Definition of access profiles based on privileges or other types of technological and procedural barriers that prevent the unauthorized linking of independent data sources.
    • Content registered in the logs (who, when, what, what action, for what purpose,… the data is accessed).
    • Definition of automatic alert systems for specific events.
    • Traceability of data communication between controllers, processors and sub-processors.
    • Configurable security options (apart from encryption options).
    • Allow different access settings based on different devices.
    • Configure alert systems for anomalous data access.
    • Configuration of some of the security parameters, in particular the keys, and how to balance the security / performance / functionality relationship based on the robustness desired by the user.
    • Control of the scope of distribution of the information that is distributed in the application environment (social networks, work networks, etc.).
    • Configuration of the reception of notifications when the information is being made accessible to third parties.
    • Control of the metadata incorporated in the information generated or distributed.
    • Mechanism of the “right to be forgotten” of information published on social networks or other systems.
    • Choice options regarding where personal data is stored, whether on local or remote devices and, in the latter case, other parameters such as processors or countries.
    • History of profiles and entities that have accessed your information.
    • Information about access to your data by authorized users
    • Information about the latest changes carried out and the profile that made the change
    • Access control configurability by functionalities provided.
    • Configurability of logical separation of data groups.
    • Configurability of physical separation of data groups.
    • Selective disablement or cancellation of functionalities.
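
Two measures from this group, access profiles assigned by role and processing phase, and logs recording who accessed what, when and for what purpose, can be combined in one small sketch. The roles, phases, field names and the in-memory `access_log` list are assumptions for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical access profiles: (role, processing phase) -> readable fields.
ACCESS_PROFILES = {
    ("nurse", "treatment"): {"medical_history", "contact"},
    ("billing_clerk", "invoicing"): {"contact", "insurance"},
}

access_log = []  # a real system would use tamper-resistant storage

def read_field(user, role, phase, field_name, purpose, record):
    """Return a field if the profile allows it; log every attempt."""
    allowed = field_name in ACCESS_PROFILES.get((role, phase), set())
    access_log.append({
        "who": user,
        "when": datetime.now(timezone.utc),
        "what": field_name,
        "action": "read",
        "purpose": purpose,
        "granted": allowed,
    })
    if not allowed:
        raise PermissionError(f"{role} may not read {field_name} during {phase}")
    return record[field_name]

record = {"medical_history": "...", "contact": "...", "insurance": "..."}
value = read_field("alice", "nurse", "treatment", "medical_history", "ward round", record)

denied = False
try:
    read_field("bob", "billing_clerk", "invoicing", "medical_history", "invoice check", record)
except PermissionError:
    denied = True
```

Access is denied by default: any role and phase combination not listed in `ACCESS_PROFILES` gets an empty set of fields. Denied attempts are logged as well, which is what alert systems for anomalous access would build on.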

  • General
    • In the event that the service is multi-device, possibility (not obligation) to apply general privacy criteria applicable to all of them and in a single action.
    • Reminders, icons and notices of all those actions that affect the privacy of information: configuration changes, access to data by third parties such as video capture, sound, position, etc.

CNIL guidance on data deletion and retention

In July 2020, the CNIL (DPA for France) published guidelines on data retention (Guide pratique – Les durées de conservation). https://www.cnil.fr/sites/default/files/atoms/files/guide_durees_de_conservation.pdf

These reflect early CNIL recommendations from 11-Oct-2005 on the archiving of personal data.
They aim to provide practical help to define the data retention rules and periods.
Similar to DIN-66398 (German industry standard on data retention/deletion) they don’t include guidance on specific data categories. https://din-66398.de/

However, CNIL does define data retention periods in separate documents (“Référentiel”). Up to now, two such Référentiels have been published for the health sector: