ICO (UK): Guidance on AI and data protection and toolkit (Excel)

https://ico.org.uk/for-organisations/guide-to-data-protection/key-data-protection-themes/guidance-on-ai-and-data-protection/about-this-guidance/

Toolkit https://ico.org.uk/media/for-organisations/documents/2620273/ico_ai-and-data-protection-risk-toolkit_beta.xlsx

Break-out of the “practical steps” from the toolkit (numbering by me):

  1. Assign technical and operational roles and responsibilities to ensure the effective management of AI systems, including a senior owner or senior process owner to drive accountability.
  2. Put in place operational procedures, guidance or manuals to support AI policies and provide direction to operational staff on the use of AI systems and the application of data protection law.
  3. Do an assessment (such as a DPIA) that looks at the risks to individuals’ interests and rights that your use of AI poses, including where there may be competing interests. Ensure the assessment also includes appropriate technical and organisational measures designed to mitigate or manage the risks you identify. Consult with different groups who may be affected by your use of AI to help you better understand the risks.
  4. Assess whether your project is likely to result in high risk to individuals. Use a screening checklist, which includes all the relevant considerations on the scope, type, and manner of the proposed processing, to aid in the consideration of whether a data protection impact assessment (DPIA) is required. Consider whether you can combine different types of assessments (eg a DPIA and an algorithm impact assessment).
  5. Complete a data flow mapping exercise to document the data that flows in, through, and out of an AI system to ensure a lawful basis / condition is selected for each process. Document each lawful basis (or bases), as well as the purposes for processing for every stage of the lifecycle where personal data will be processed, and the reasons why that lawful basis was determined. This may include an additional condition for processing when special category data or criminal offence data is used. Assess whether you are going to use solely automated decision-making with legal or similarly significant effects and ensure you have an appropriate exemption to do so.
  6. Map out the purposes for the system, including any decisions that will be made about individuals based on, or influenced by, the AI system, as well as the different outcomes and their effects on those individuals. Conduct an initial assessment of potential forms of statistical inaccuracies (including unfair bias and discrimination), including how you will meet your fairness requirements in relation to discrimination in your context. This should include your mitigation and management strategies. Ensure that risks are drawn from a wide range of stakeholders including policy, user research and design, computer science expertise, and data subjects (or their representatives). Ensure that your assessment is conducted by appropriately skilled personnel (this may require a cross-disciplinary approach, eg data scientists working with legal counsel and review boards). Your assessment should be understood and signed off by appropriately senior personnel.
  7. Document the minimum success criteria necessary to proceed to the next step of the lifecycle (eg minimum statistical accuracy achieved in the testing phase before proceeding to deployment, or a minimum level of fairness based on a specific fairness metric). You should consult with domain experts to inform you which metrics are contextually most appropriate for the model. You should initially focus on outcomes that are immediately experienced by individuals and whether they would reasonably expect the outcomes, or whether adverse effects could be justified. You should also consider the different impacts of false positive and false negative outcomes.
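As a concrete (non-ICO) illustration, a documented success-criteria check for a binary classifier might look like the Python sketch below; the threshold values and metric choices are illustrative assumptions, not part of the toolkit.

```python
# Minimal sketch: gate a binary classifier against documented minimum
# success criteria. Threshold values here are illustrative assumptions.
import numpy as np

MIN_ACCURACY = 0.90              # documented minimum statistical accuracy
MAX_FALSE_POSITIVE_RATE = 0.05
MAX_FALSE_NEGATIVE_RATE = 0.10

def success_criteria_met(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compare test results against the documented minimum success criteria."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    accuracy = (tp + tn) / len(y_true)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false positive rate
    fnr = fn / (fn + tp) if (fn + tp) else 0.0   # false negative rate

    return {
        "accuracy": accuracy,
        "false_positive_rate": fpr,
        "false_negative_rate": fnr,
        "meets_criteria": (accuracy >= MIN_ACCURACY
                           and fpr <= MAX_FALSE_POSITIVE_RATE
                           and fnr <= MAX_FALSE_NEGATIVE_RATE),
    }

# Example: record the result so it can be signed off before proceeding.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 0])
print(success_criteria_met(y_true, y_pred))
```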
  8. Ensure the team that will be responsible for building the AI system have an awareness of the assessment that has taken place and what their requirements are.
  9. Decide what type(s) of explanation you will provide. Consider domain or sector context and use case when deciding what explanations you will provide. This may involve you assessing people’s expectations of the content and scope of similar explanations previously offered, or researching sector-specific expectations as well as assessing your AI model’s potential impact to help you understand how comprehensive your explanation needs to be. You may also need to consider sector-specific standards for explanations (eg in medicine). Document information about the choices for your explanation type, why you have made them, how you will provide them and who is responsible for providing them at each stage of the lifecycle.
  10. Ensure that your policies, protocols and procedures are accessible and understandable to staff working on an AI project to support their work on making the system explainable.
  11. Plan what training will need to be provided for staff working directly with AI systems on privacy information and fair processing. Training should be planned for anyone involved in the decision-making pipeline where AI has a contributing role. Staff should be made aware of transparency requirements set out in the UK GDPR. Training should include how to label data appropriately and consistently, guidance around using unstructured or high-dimensional data, different types of explanations available, how to present them and how to test the effectiveness of them.
  12. Conduct an initial assessment of the security risks and the mitigants / controls to reduce the likelihood and impact of an attack or breach. As part of the assessment, consider the security risks associated with integrating an AI system with existing systems, which includes a description of what controls will be put in place as part of the design and build phase. The assessment should involve consultation with appropriately skilled technical experts about what the latest state-of-the-art is.
  13. Document security processes and make them freely available for all those involved in the building and deployment of AI systems. This should include processes to report security breaches, and who is responsible for handling and managing them as part of an AI incident response plan. An AI incident response plan should include guidance on how to quickly address any failures or attacks that occur, who responds when an incident occurs, and how they communicate the incident to other parts of the organisation.
  14. Plan appropriate security training so staff have the appropriate skills and knowledge to address security risks. Training should include information about the AI incident response plan, how to identify and assess the severity of different AI failures and AI attacks, containment strategies, how to eradicate AI failures / AI attacks, and how to recover AI systems following an attack or failure.
  15. Assess whether the data you plan to collect to train the AI system is relevant for your purpose, and ensure only that data is acquired. As part of your assessment, consult with a domain expert to ensure that the data you intend on collecting is appropriate and adequate.
  16. Assess different privacy-enhancing techniques to see if any would be appropriate in your context. Examples of privacy-enhancing techniques to consider include: federated learning, differential privacy and robust machine learning.
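As an illustration of one of the named techniques, below is a minimal sketch of the Laplace mechanism from differential privacy applied to a simple count query; the epsilon value is an illustrative assumption and a real deployment would need a proper privacy-budget analysis.

```python
# Minimal sketch of the Laplace mechanism from differential privacy,
# applied to a count query. Epsilon is an illustrative assumption.
import numpy as np

def private_count(values: np.ndarray, epsilon: float = 1.0) -> float:
    """Return a noisy count; the sensitivity of a count query is 1."""
    sensitivity = 1.0
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return len(values) + noise

ages = np.array([34, 45, 29, 61, 38])
print(private_count(ages, epsilon=0.5))  # noisy answer protects individuals
```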
  17. Design a retention schedule based on business need with reference to statutory requirements. This should include sufficient information for all records to be identified and disposal decisions put into effect, and should ensure that weeding activities are standardised, documented, and occur on an ongoing and regular basis.
  18. Design and implement a policy / process that defines how individual requests will be dealt with and by whom at each stage of the AI lifecycle where personal data is processed. This should include a specific person or team that is responsible for managing and responding to requests. You should also consider the various ways or options that individuals can make a request.
  19. Plan how you will index the personal data in your AI system so that it is easy to locate relevant data should a request be received. This could include building key ‘search’ words / common identifiers into the system design. You may need to consider the trade-off between data minimisation and security on the one hand, and responding to individual rights easily on the other. You may also need to assess whether the data you hold constitutes personal data and how that will impact individual rights.
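A hypothetical sketch of such an index, keyed on a common identifier and recording where each person’s data is held; the system names and fields are assumptions.

```python
# Hypothetical sketch of a personal-data index keyed on a common identifier,
# so data relating to one person can be located if a rights request arrives.
from collections import defaultdict

personal_data_index: dict[str, list[dict]] = defaultdict(list)

def register_record(subject_id: str, system: str, location: str, fields: list[str]) -> None:
    """Record where a data subject's personal data is held."""
    personal_data_index[subject_id].append(
        {"system": system, "location": location, "fields": fields}
    )

register_record("subject-0042", "training_data", "s3://training/v3.parquet",
                ["postcode_outward", "age_band"])
register_record("subject-0042", "inference_logs", "db.predictions",
                ["input_features", "score"])

# A rights request for subject-0042 can now be answered from one lookup.
print(personal_data_index["subject-0042"])
```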
  20. Decide who will be responsible for human reviews. Ensure they have the authority to challenge and override automated decision-making, and can work with independence to influence senior-level decision making.
  21. Ensure AI system developers understand the skills, experience and ability of human reviewers when designing the AI systems. Plan how you will ensure that human reviewers will have the appropriate technical understanding to comprehend the decision-making behind the algorithm(s) used.
  22. Review documented lawful bases to check that the relationship, the processing, and the purposes have not changed from how the personal data was originally collected. If the purpose has changed, assess whether it is compatible with your original purpose. If the purposes are incompatible, ensure you obtain consent, or that you have a clear obligation or function set out in law to use the personal data for this new purpose.
  23. Ensure that the data you are gathering is representative of the population or the different sets of data subjects that the AI system will be applied to, and that it is reliable, relevant, and up to date.
  24. Ensure, as far as possible, that the data you have collected does not reflect past discrimination, whether based explicitly on protected characteristics or possible proxies. This should include a thorough analysis of data about under / overrepresented groups. You may also want to consider technical approaches to mitigating possible bias, such as re-weighting, or removing the influence of protected characteristics and their proxies.
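Re-weighting, one of the technical approaches mentioned, could look like the sketch below, which weights each (group, outcome) combination inversely to its frequency; the column names are illustrative assumptions.

```python
# Minimal re-weighting sketch: give each (group, outcome) combination a weight
# inversely proportional to its frequency, so under-represented combinations
# are not swamped during training. Column names are illustrative assumptions.
import pandas as pd

df = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "A", "B", "A"],
    "outcome": [1,   0,   1,   0,   0,   1,   1,   0],
})

counts = df.groupby(["group", "outcome"]).size()
df["weight"] = df.apply(lambda row: len(df) / counts[(row["group"], row["outcome"])], axis=1)

# The 'weight' column can then be passed as sample_weight to most learners.
print(df)
```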
  25. Develop and maintain an index of data sources or features that should not be processed when making decisions about individuals because of the risks of direct or indirect discrimination. (Note that these data sources or features could still be processed to conduct bias analysis of your AI application, as bias analysis does not usually lead to making decisions about individuals.)
  26. Decide whether you will need data about protected characteristics to conduct bias analysis. If you do, assess whether you need to create labels for data you already hold or whether you need to collect more data.
  27. Have clear criteria and lines of accountability for the labelling of data involving protected characteristics / special category data. Consult with members of protected groups or their representatives to define the labelling criteria. When labelling data, create criteria that are easy to understand, include descriptions and examples of every possible label, and cover edge cases. Involve multiple human labellers to ensure consistency, which could include members of protected groups where there are edge cases.
  28. Label the data you collect to train your AI system with information including what it is, how it was collected, the purpose for which it was originally collected, and the reasons why you have collected it. Detect any duplicated data present in the data acquisition phase and delete where necessary.
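A hypothetical sketch of attaching that provenance information to a dataset and removing duplicates found during acquisition, using pandas; the metadata fields and values are assumptions.

```python
# Hypothetical sketch: attach provenance metadata to a training dataset and
# drop duplicate rows found during data acquisition. Field values are examples.
import pandas as pd

provenance = {
    "description": "customer transactions, 2023",
    "collection_method": "exported from CRM",
    "original_purpose": "billing",
    "reason_collected_for_ai": "training a payment-default model",
}

df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, 25.5, 25.5, 7.2],
})
df.attrs["provenance"] = provenance          # keep the label with the data

deduplicated = df.drop_duplicates()          # delete duplicated records
print(len(df) - len(deduplicated), "duplicate rows removed")
```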
  29. Consider what information you will provide to data subjects about how their personal data will be used to train an AI system. This should include: the purposes of the processing for which the personal data are intended as well as the legal basis for processing and the categories of personal data concerned.
  30. Ensure that where you are using unstructured or high-dimensional data, you are clear about why you are doing this and the impact of this on explainability. For example, you may justify using this data because it yields better statistical accuracy.
  31. Record and document all movement and storage of personal data from one location to another. Ensure there are clear audit trails that include who has handled the data, who has had authorisation to access the data, and where the data is stored.
  32. Delete any intermediate files containing personal data as soon as they are no longer required (eg compressed versions of files created to transfer data between systems).
  33. Apply de-identification techniques to training data before it is extracted from its source and shared internally or externally (eg by removing certain features from the data, or applying privacy-enhancing technologies, before sharing it with another organisation).
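A minimal sketch of one such step: dropping direct identifiers and replacing the record key with a salted hash before a training extract is shared. Note this is pseudonymisation rather than full anonymisation, and the column names are assumptions.

```python
# Minimal sketch: remove direct identifiers and pseudonymise the record key
# before a training extract leaves its source system. Hashing is
# pseudonymisation, not anonymisation; stronger PETs may still be needed.
import hashlib
import pandas as pd

SALT = "replace-with-a-secret-salt"  # illustrative; manage as a secret

source = pd.DataFrame({
    "name":     ["Alice Smith", "Bob Jones"],
    "email":    ["alice@example.com", "bob@example.com"],
    "postcode": ["SW1A 1AA", "M1 2AB"],
    "balance":  [1200.0, 310.5],
})

extract = source.drop(columns=["name", "email"])           # drop direct identifiers
extract["record_key"] = [
    hashlib.sha256((SALT + e).encode()).hexdigest()[:16]   # pseudonymous key
    for e in source["email"]
]
print(extract)
```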
  34. Label the data you collect to train your AI system with information including what it is, how it was collected, the purpose for which it was originally collected, and the reasons why you have collected it.
  35. Assess what features in the dataset will be relevant for your purpose(s), and delete any that are irrelevant. For example, you may only need the first part of a postcode to achieve the same outcome and therefore, decide to delete the second part as it is not relevant for your purpose(s).
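Following the postcode example in this step, a tiny sketch of keeping only the outward part of a UK postcode and dropping a feature judged irrelevant; the column names are assumptions.

```python
# Tiny sketch of the postcode example: keep only the outward code (the part
# before the space) and drop features judged irrelevant to the purpose.
import pandas as pd

df = pd.DataFrame({
    "postcode":  ["SW1A 1AA", "M1 2AB", "EH1 3QR"],
    "shoe_size": [9, 7, 8],            # assumed irrelevant to the purpose
    "income":    [32000, 41000, 27500],
})

df["postcode"] = df["postcode"].str.split().str[0]   # outward code only
df = df.drop(columns=["shoe_size"])                  # record removed features separately
print(df)
```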
  36. Index the personal data used in each phase of the AI system lifecycle as planned during the business requirements and design phase.
  37. Update the assessment that you did in P2 given what you now know following training and testing your AI system. Your update should include analysis of the balance between different rights and interests in your AI system, including pros and cons for prioritising each criterion and a final justification for why one criterion was prioritised over another. Document the methodology for identifying and assessing the trade-offs in scope, and the reasons for adopting or rejecting particular technical approaches (if relevant). Ensure the senior owner or senior process owner signs off the assessment before deployment.
  38. Review documented lawful bases to check that the relationship, the processing, and the purposes have not changed from how the personal data was originally collected. If the purpose has changed, assess whether it is compatible with your original purpose. If the purposes are incompatible, ensure you obtain consent, or that you have a clear obligation or function set out in law to use the personal data for this new purpose.
  39. Test whether your model meets or exceeds the minimum success criteria that you documented in P6 that is necessary to proceed to the deployment phase. You should consult with domain experts to inform you which metrics are contextually most appropriate for the model when conducting testing. Testing should initially focus on outcomes that are immediately experienced by individuals and whether they would reasonably expect the outcomes, or whether adverse effects could be justified. You should also consider false positive and false negative outcomes and how you will mitigate the negative effects of these outcomes during deployment. Document the methodology and results of your testing, including any tolerated errors. Testing methodologies could include model debugging, red teaming, or offering bug bounties. Ensure test results are appropriately signed off.
  40. Take additional measures to increase data quality and / or improve model performance where there are a disproportionately high number of errors for a protected group.
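One way to surface such disparities is to compare error rates per group; a small illustrative sketch, where the group labels and columns are assumptions.

```python
# Illustrative sketch: compare error rates per protected group so that
# disproportionately high error rates can be spotted and acted on.
import pandas as pd

results = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B"],
    "y_true": [1,   0,   1,   1,   0,   1],
    "y_pred": [1,   0,   0,   0,   1,   0],
})

results["error"] = (results["y_true"] != results["y_pred"]).astype(int)
error_rate_by_group = results.groupby("group")["error"].mean()
print(error_rate_by_group)   # a large gap between groups triggers further work
```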
  41. Record any limitations of the model in the context of statistical inaccuracies. Document and assess whether live incoming data with low quality can be handled appropriately by the model. You may want to consider the use of model cards to detail limitations, trade-offs and performance of your model.
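A model card can be as simple as a structured record kept alongside the model artefact; a hypothetical minimal example with illustrative content is sketched below.

```python
# Hypothetical minimal 'model card' recording known limitations, trade-offs
# and performance, stored alongside the model artefact as JSON.
import json

model_card = {
    "model": "credit-default-classifier",
    "version": "1.3.0",
    "intended_use": "support human reviewers assessing loan applications",
    "performance": {"accuracy": 0.91, "false_positive_rate": 0.04},
    "known_limitations": [
        "accuracy degrades on applicants under 21 (sparse training data)",
        "low-quality or missing income fields reduce confidence markedly",
    ],
    "trade_offs": "simpler, more explainable model chosen over a marginally more accurate one",
}

with open("model_card.json", "w") as f:
    json.dump(model_card, f, indent=2)
```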
  42. Put in place a policy / documented process that includes details of how the system will be tested (including details of the methodology used by a human reviewer) post-implementation. This should include carrying out all the relevant checks to identify any errors in data outputs; documenting tolerances for errors; documenting the results of the testing; obtaining management sign-off; documenting any retraining of the algorithm following training (eg by improving input data, using a different balance of false positives and negatives, or using different learning algorithms); and testing the AI system using new dataset(s) to confirm the same outcome is reached.
  43. Decide what model(s) you will use for testing. This should include a consideration of the specific type of application and the impact of the model on individuals. If you are considering using a ‘black box’ system, assess the risks and potential impacts of using it, determine that the use case and your organisational capacity both support the responsible design and implementation of these systems, and consider which additional controls, such as supplementary interpretability tools, are appropriate for your use case.
  44. Decide how you will present your explanations of the decisions made by your AI system. This should involve: assessing interpretability / transparency expectations and requirements in your sector or domain; considering the contextual factors, how they will impact the order in which you deliver the explanation types, and how this will affect your delivery method; deciding how to translate technical explanations into reasons that can be easily understood by the decision recipient; deciding what tools will be used to present information about the logic of the AI system’s output (eg textual clarification, visualisation media, graphical representations, summary tables, or a combination); deciding how you will layer your explanation and what types of explanation you will provide; and deciding how individuals can contact you if they would like to discuss the AI-assisted decision with a human being.
  45. Test the effectiveness of your explanations. Effectiveness should be measured by how well individuals can understand why the model made the decision it did, or how the model output contributed to the decision. Ideally, an understanding would include which features were most important in the decision, how statistical inferences from the AI system were incorporated into the final decision, and how the individual could improve in the eyes of the model. Consult with relevant stakeholders about how to improve explanations in a way that builds trust. Ensure test results are appropriately signed off.
  46. Separate the machine learning development environment from the rest of your IT infrastructure where possible. For example, by using ‘virtual machines’ or ‘containers’ where appropriate.
  47. Ensure that access to training data, training code, and deployment code is restricted to only those who require it.
  48. Keep an up-to-date inventory of all AI systems to allow you to have a baseline understanding of where potential incidents could occur.
  49. Test whether your AI system meets security requirements. This could involve: model debugging (either by someone internal, or an external security auditor); red teaming; conducting ‘white hat analysis’; bug bounties; and proactively monitoring the system and investigating any anomalies.
  50. Plan and document any detective and corrective controls to mitigate / manage security risks. This could include system vulnerability monitoring / testing tools or software, subscribing to security advisories to receive alerts of vulnerabilities, and ensuring a solid patching / updating process is in place so that available security fixes are applied in a timely manner.
  51. Assess whether your model is suffering from ‘overfitting’ to reduce the likelihood of privacy attacks. Remove features if there are too many or include more examples if there are not enough (or both).
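A simple sketch of one way to check for overfitting, comparing training and held-out accuracy; the acceptable gap and the dataset are illustrative assumptions.

```python
# Simple overfitting check: a large gap between training and held-out scores
# suggests the model may memorise individual records, which increases the
# risk of privacy attacks such as membership inference. Threshold is illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
gap = model.score(X_train, y_train) - model.score(X_test, y_test)

MAX_ACCEPTABLE_GAP = 0.05
if gap > MAX_ACCEPTABLE_GAP:
    print(f"Possible overfitting: train/test gap = {gap:.3f}")
```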
  52. Assess whether the information you provide as part of the output of your AI system has security implications. For example, whether you need to provide confidence information to end users when they are observing the output, which includes a consideration of reasons to provide confidence information and reasons to not provide it. Another example may be whether explanations of your AI model could make it easier to conduct privacy attacks.
  53. Carry out reviews during testing that include an assessment as to whether all the data is needed (for example whole address or just postcode will produce same result) and whether the same volume of data is required (or whether the same results can be achieved with less volume). Delete or remove any data that is not needed. Maintain a record of non-required features or data that were removed or deleted.
  54. Ensure human reviewers are adequately trained to interpret and challenge outputs made by the AI system. Human reviewers should have meaningful influence on the decision, including the authority and competence to go against the recommendation. Human reviewers should also take into account additional factors that were not included as part of the input data (eg local contextual factors).
  55. Periodically test whether your model continues to meet or exceed the minimum success criteria that you documented in P6.
  56. Run a traditional decision-making system and an AI system concurrently and investigate any significant difference in the type of decisions.
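A sketch of running the two systems side by side and flagging disagreements for investigation; the two decision functions below are placeholders for real systems.

```python
# Sketch of running a traditional decision process and an AI system in
# parallel and flagging cases where they disagree. The two decision
# functions below are placeholders, not real implementations.
def traditional_decision(case: dict) -> str:
    return "approve" if case["income"] > 30000 else "refer"

def ai_decision(case: dict) -> str:
    return "approve" if case["score"] > 0.7 else "refer"

cases = [
    {"id": 1, "income": 45000, "score": 0.65},
    {"id": 2, "income": 28000, "score": 0.80},
    {"id": 3, "income": 52000, "score": 0.90},
]

disagreements = [c["id"] for c in cases if traditional_decision(c) != ai_decision(c)]
print("Cases needing investigation:", disagreements)  # significant differences get reviewed
```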
  57. Periodically test the effectiveness of your explanations. Effectiveness should be measured by how well individuals can understand why the model made the decision it did, or how the model output contributed to the decision. Ideally, an understanding would include which features were most important in the decision, how statistical inferences from the AI system were incorporated into the final decision, and how the individual could improve in the eyes of the model. Consult with relevant stakeholders about how to improve explanations in a way that builds trust. Ensure test results are appropriately signed off.
  58. Test whether your AI system meets security requirements. This could involve: model debugging (either by someone internal, or an external security auditor); red teaming; conducting ‘white hat analysis’; bug bounties; and proactively monitoring the system and investigating any anomalies.
  59. Periodically assess whether your model is suffering from ‘overfitting’ to reduce the likelihood of privacy attacks. Remove features if there are too many or include more examples if there are not enough (or both).
  60. Introduce real-time monitoring techniques that can detect anomalies. These could include ‘rate limiting’ (restricting the number of queries a particular user can perform within a given time period), input anomaly detection ML techniques, or common-sense data integrity constraints. You could also run a model that is known and trusted alongside your deployed model, which may have become more complex and opaque, to detect any output anomalies when scoring new data. Record and maintain a list of user accounts that have been blocked or suspended from submitting queries.
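Rate limiting, as mentioned above, can be sketched as follows; the window size and query limit are illustrative assumptions.

```python
# Illustrative rate limiter: cap the number of queries a user may submit in a
# sliding time window, to make model-extraction and probing attacks harder.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_QUERIES_PER_WINDOW = 100

_query_log = defaultdict(deque)   # user_id -> timestamps of recent queries

def allow_query(user_id, now=None):
    """Return True if the user is still within their query allowance."""
    now = time.time() if now is None else now
    log = _query_log[user_id]
    while log and now - log[0] > WINDOW_SECONDS:   # drop queries outside the window
        log.popleft()
    if len(log) >= MAX_QUERIES_PER_WINDOW:
        return False                               # candidate for blocking or suspension
    log.append(now)
    return True

# Example: the 101st query inside one minute is refused.
allowed = [allow_query("user-1", now=i * 0.1) for i in range(101)]
print(allowed[-1])   # False
```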
  61. Consider denying anonymous use of your AI system to reduce the chances of bad actors attacking it. For example, using log-in credentials and/or multi-factor authentication to prove user identity.
  62. Assess whether the information you provide as part of the output of your AI system has security implications. For example, whether you need to provide confidence information to end users when they are observing the output, which includes a consideration of reasons to provide confidence information and reasons to not provide it. Another example may be whether explanations of your AI model could make it easier to conduct privacy attacks.
  63. Periodically assess whether the training data is still adequate and relevant for your purposes. For example, by assessing for concept / model drift. Retrain your model on new data where necessary.
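One simple drift check is to compare a feature’s distribution in recent live data against the training data, for example with a two-sample Kolmogorov–Smirnov test; the data and significance threshold below are illustrative assumptions.

```python
# Simple drift check: compare a feature's distribution in recent live data
# against the training data with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
live_feature = rng.normal(loc=0.4, scale=1.0, size=1000)   # distribution has shifted

statistic, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:    # illustrative threshold
    print(f"Possible drift detected (KS statistic = {statistic:.3f}); consider retraining")
```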
  64. Start and maintain a log of all complaints received that tracks the issue, the response, and the response date. Undertake and document analysis of complaints to determine trends, issues, and risks. Produce an action plan or risk register to track issues to resolution. Ensure that lessons learned feed back into AI system retraining or development.
  65. Carry out ‘mystery-shopping’ exercises, where a deliberately misleading decision made by the AI system is provided that the human should disagree with, to ensure their input is meaningful. This should include pre- and post-implementation testing, which includes: an assessment of human oversight to ensure it is meaningful; testing a sample of decisions to ensure the human is making the right decision; designing and implementing a process where decisions made by AI are monitored and compared to human decisions; and documenting any action taken because of performance which goes outside of defined tolerances.
  66. Map out all the organisations who will be involved in the AI project lifecycle. Assess the status of each organisation in respect of all the personal data and processing activities carried out, ensuring the assessment considers the roles and responsibilities in relation to the data processing activities and who is determining the purposes and the manner of each specific processing. Compliance with the accountability principle will rest on the controller / joint controllers. Read our guidance on controllers and processors to find out how to determine whether you are a controller, a joint controller, or a processor, as well as what each one means.
  67. Collaborate with the external supplier to carry out an assessment (eg a DPIA) to determine whether the proposed application is necessary, proportionate and adheres to data protection law. The assessment should also identify and assess risks to individuals.
  68. Document the minimum success criteria necessary to proceed to the next step of the procurement process (for example, minimum statistical accuracy or model explainability and interpretability requirements). Clearly define the desired performance for an acceptable model in terms of clear model and data metrics that are written from the data subject’s perspective. This should also include clear, narrow accuracy goals and metrics that manage the competing interests of statistical accuracy and explainability. Carry out due diligence that includes whether the AI system meets this minimum success criteria.
  69. Carry out due diligence checks of the AI system provider, which include checks of: risk assessments, privacy information, security testing, statistical accuracy (including bias and discrimination) testing, and retention periods. The checks should include the supplier providing their DPIA and could also involve the supplier providing their model cards, data sheets or other types of impact assessments. Check whether the supplier has carried out an internal audit of their AI system, and how often they carry out audits. Check whether the supplier has signed up to any codes or certification schemes that can provide assurances that their product complies with data protection legislation. Check what the process is behind the release of the AI system. There should be a clear written policy that details how a product is released, which includes information about how the product is tested and who signs it off before the final release. Document these checks.
  70. Agree responsibilities with third party suppliers. For example, who will be responsible for completing internal checks on the system to identify and address statistical inaccuracies, who will be responsible for responding to individual rights requests, who will be responsible for carrying out security testing and providing security patches. There should also be agreement about how the supplier will manage any changes in their product and how the supplier will communicate continual assurances that the product works and complies with data protection legislation.
  71. Ensure that a contract (or other legal act) between a supplier and a procurer of an AI system stipulates that the processor, at the choice of the controller, deletes or returns all the controller’s personal data to the controller after the end of the contract relating to the processing, and deletes existing copies.
  72. Ensure you have a fall-back option for cases where there is failure in the third party’s AI system, or where the third party stops providing updates/patches to their AI system. A fall-back option will be especially important where you intend to use an AI system as part of a vital function for your organisation.