3  Cybersecurity Aspects of Artificial Intelligence

Learning outcomes

By the end of this section, learners will be able to distinguish between three sources of cybersecurity risks from AI:

  • the amplification of existing risks;
  • new risks related to AI training and inputs; and
  • new risks related to AI algorithms

and combine these sources for a more comprehensive examination of AI applications.

This chapter discusses how cybersecurity concerns can emerge in contexts involving AI. By cybersecurity, we refer to the practices involved in protecting computer systems—in this case, AI systems and models—from deliberate interference by external actors. This interference can come from various sources and means of attack: disgruntled employees might want to leak a company’s trade secrets, hackers might want to steal citizen data from a government body to commercialize it, hostile states might want to infiltrate government networks and steal intellectual property from companies, and so on. Over the past few decades, a sophisticated body of knowledge has been developed around cybersecurity. Our goal here is to briefly discuss how that body of knowledge relates to AI, both when it comes to known challenges that continue to exist in AI technologies and to novel issues that appear from what is unique about AI.

Cybersecurity, in AI as elsewhere, follows some core principles that guide the protection of systems against unwanted interference. The core principles of confidentiality, integrity, and availability, commonly referred to as the CIA triad, form the foundation for protecting information systems and data:

  • Confidentiality means that only authorized users and processes can access data or systems.
  • Integrity means that data remains accurate and reliable, and is altered only through authorized changes.
  • Availability means that systems and data are accessible to authorized users when they are needed.

Other principles can be relevant in specific contexts. One such principle is non-repudiation, that is, the idea that a user should not be able to deny their involvement in an action or transaction. This principle is relevant, for example, in digital contracts or electronic payments, where it prevents individuals from falsely claiming that they did not sign a document or approve a transaction. The applicability of such principles is sometimes narrower than the CIA triad, but they might be no less important in their domains of use.

Together, these concepts illustrate a broader cybersecurity goal: the protection of data from unauthorized access, alteration, and disruption. While these principles are not new to data protection officers, their application within AI systems—where data flows, processing methods, and potential vulnerabilities are more complex—demands a nuanced understanding and an integration of both privacy and security frameworks.

To assist you with developing such an understanding, Section 3.1 offers a refresher on basic cybersecurity concepts and discusses how they become legal obligations under the AI Act. Section 3.2 then reviews general cybersecurity threats that can affect all forms of data processing, including the processing that happens in AI technologies. Finally, Section 3.3 focuses on cybersecurity issues that are specific to AI.

3.1 Core concepts and legal requirements for cybersecurity

Learning outcomes

By the end of this section, learners will be able to explain what cybersecurity entails for data processing and describe some of the most common risks to it.

Threats to the cybersecurity principles discussed in the introduction to this Unit can take various forms. Each of those principles might be affected to a different extent by different practices aimed at different goals. To facilitate discussion of these issues, cybersecurity professionals have developed a shared vocabulary, as well as resources for the spread of knowledge.

Some of the better-known resources on cybersecurity are offered to the general public by the US-based MITRE Corporation, such as ATLAS (a knowledge base of adversary tactics and techniques used to attack AI-enabled systems) and D3FEND (a knowledge graph of defensive cybersecurity measures). In Europe, ENISA (the EU agency for cybersecurity) offers a broad set of tools that companies can use, such as self-assessment checklists. It also publishes materials, such as guidelines, to spread awareness about best practices in cybersecurity as well as risks that have become salient in a European context. Data protection authorities are also active in the cybersecurity domain because, as we discuss below, security is an integral part of the protection of personal data.

In this section, we will discuss basic concepts that must be understood to make the best use of those resources. We will also cover the legal obligations that make cybersecurity a central requirement for legal compliance.

3.1.1 Approaches to cybersecurity

To pursue cybersecurity, organizations must take measures. Some of these measures are reactive, as they offer responses to security incidents after they occur. For example, if an organization discovers some of the personal data it stores has been stolen, it will often contact the affected individuals and offer them access to tools such as credit monitoring.

Reactive security involves activities like incident response, damage assessment, and remediation efforts to restore normal operations. While necessary, reactive measures are limited in their ability to prevent future attacks. For instance, a DPO might work with IT teams to address a data breach by securing affected systems and notifying regulators. Doing so can eliminate the known issues that led to the data breach, but future attacks might still be possible through vulnerabilities that have not yet been fixed.

Other measures are proactive, as they seek to anticipate and prevent security issues before they arise. Proactive security includes regular vulnerability assessments, threat intelligence gathering, penetration testing, and implementing robust security policies. Proactive measures are particularly relevant in AI systems, where pre-emptive assessments of model security can help mitigate risks associated with adversarial attacks or data leakage. By identifying potential threats early in the life cycle of an AI system, organizations can implement safeguards to reduce the likelihood of a successful attack.

Most organizations will rely on both reactive and proactive measures to address their security challenges. A popular approach for determining which measures are relevant in a given context is to draw up a threat model. Such a model offers a structured way to identify, evaluate, and prioritize potential security risks to a system, application, or data.

A threat model typically outlines:

  1. The attack surface, that is, the points where a system could be vulnerable to attack.
  2. Potential threats and threat actors, such as hackers, criminal organizations, or nation-state attackers.
  3. The likelihood and impact of these threats.

Sometimes that information must be procured from outside the organization, for example by tapping into the expertise of contractors or purchasing access to cybersecurity reports. In other cases, it is already available but dispersed among many actors. It might be the case, for instance, that nobody within a large organization has the full picture of how a particular AI system is designed and used. By articulating all this knowledge, a threat model supplies a starting point for thinking about cybersecurity risks and how to respond to them.

To produce a plausible threat model, an organization must have a deep knowledge of both its technical tools and the context in which those tools are used. Based on that knowledge, an organization can anticipate potential threats and propose measures that will eliminate them, or at least mitigate the likelihood or severity of any attacks.
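For readers who find it helpful to see these ideas in code, the sketch below shows one minimal way the elements of a threat model could be recorded and prioritized. It is purely illustrative: the `ThreatEntry` structure, the example entries, and the likelihood-times-impact scoring are assumptions made for this example, not a prescribed format.

```python
from dataclasses import dataclass

# Hypothetical, minimal representation of one entry in a threat model.
@dataclass
class ThreatEntry:
    attack_surface: str   # where the system could be attacked
    threat_actor: str     # who might attack (hacker, insider, nation state, ...)
    likelihood: int       # 1 (rare) to 5 (almost certain)
    impact: int           # 1 (negligible) to 5 (severe)

    def risk_score(self) -> int:
        # A common simplification: risk = likelihood x impact.
        return self.likelihood * self.impact

threats = [
    ThreatEntry("public prediction API", "criminal organization", likelihood=4, impact=3),
    ThreatEntry("training data pipeline", "disgruntled insider", likelihood=2, impact=5),
    ThreatEntry("cloud admin console", "nation-state attacker", likelihood=1, impact=5),
]

# Prioritize: highest-risk entries first, as a starting point for choosing controls.
for t in sorted(threats, key=lambda t: t.risk_score(), reverse=True):
    print(f"{t.attack_surface:25} {t.threat_actor:25} risk={t.risk_score()}")
```

In practice, a threat model is a living document rather than a script, but even this simple structure shows how the three elements listed above can be combined into an ordered list of risks.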

3.1.2 The attacker as the adversary

Cybersecurity, as mentioned above, refers to protection against deliberate efforts to affect a computer system. These deliberate efforts are made by an attacker, which is the term used to refer to any individual or entity attempting to exploit vulnerabilities to gain unauthorized access or cause harm. Thwarting the goals of attackers is necessary to ensure the cybersecurity principles discussed above. Following the cybersecurity principles, in turn, is valuable because it leads to other goals—such as the protection of the fundamental rights to privacy and data protection.

Identifying and classifying attackers can be complex, as their motivations, methods, and resources vary widely.

  1. Individual attackers might include hackers driven by curiosity, personal grievances, or financial gain. They often use publicly available tools and exploit common vulnerabilities. For instance, a disgruntled former employee might use their retained access credentials to leak sensitive data as an act of retaliation.
  2. Criminal organizations operate with more coordination and sophistication, often driven by profit motives. These groups may engage in activities like ransomware attacks, data theft, and fraud. In AI contexts, criminal organizations might target proprietary algorithms or large datasets used for training, aiming to steal valuable intellectual property or disrupt business operations.
  3. Nation-state attackers are state-sponsored entities conducting cyber espionage, sabotage, or warfare. These attackers are typically well-resourced and highly skilled, targeting critical infrastructure, government systems, or large corporations for strategic gains. For example, an AI-based facial recognition system used for border control could become a target for nation-state attackers aiming to discredit the country deploying the system or ensure that their operatives can freely access that country.

Classifying attackers is not always straightforward, as their methods can overlap, and motivations may change over time. Moreover, the use of anonymization techniques, such as VPNs and the dark web, makes it challenging to trace the origin of attacks, complicating attribution efforts. Still, any organization’s threat models need to consider the kinds of resources that might be available to whoever wants to attack it.

3.2 General threats to cybersecurity

Learning outcomes

By the end of this section, learners will be able to give examples of cybersecurity threats and their impact on the protection of personal data. They will also be able to exemplify best practices to reduce the risk from those threats.

This section provides an overview of the cybersecurity risks that affect computer systems in general. The concepts and practices discussed here go beyond AI, as they might affect all kinds of software. Still, AI systems and models remain vulnerable to them, as does the infrastructure covered in Section 2.3. Organizations developing or deploying AI technologies cannot ignore these threats just because they are not AI-specific. As such, it will be important to review general issues of cybersecurity before moving on, in the next section, to the unique AI-related risks to cybersecurity.

Cybersecurity, as defined in this unit, is concerned with deliberate practices. It might be threatened by attacks, in which an individual, group, or organization tries to breach the security of the information system, network, or digital device in question. In the previous section, we saw that attackers might have a variety of profiles, resources, and goals. In particular, they can target any of the aspects of the CIA triad: confidentiality, integrity, and availability.

A security vulnerability is a weakness or flaw in a system, software, or process that can be exploited by an attacker to gain unauthorized access, cause disruptions, or compromise data. Vulnerabilities can arise from coding errors, misconfigurations, outdated software, or even insecure design choices. For instance, a web application vulnerability like SQL injection could allow an attacker to manipulate a database and access confidential information, such as user credentials or payment data. In the context of AI systems, vulnerabilities may include poorly secured training data, biased algorithms, or exposure of sensitive data through model inversion attacks.
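The SQL injection example just mentioned can be made concrete with a small, hedged sketch in Python, using the built-in sqlite3 module and a toy table. The table, the payload, and the variable names are invented for illustration; the point is simply the contrast between string concatenation and a parameterized query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"  # a classic injection payload

# Vulnerable: user input is concatenated directly into the SQL string,
# so the payload rewrites the query and matches every row.
unsafe = f"SELECT * FROM users WHERE name = '{user_input}'"
print(conn.execute(unsafe).fetchall())   # leaks all rows

# Safer: a parameterized query treats the input as data, not as SQL code.
safe = "SELECT * FROM users WHERE name = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # returns nothing
```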

A zero-day vulnerability refers to a security flaw that is unknown to the software vendor and, therefore, unpatched. Attackers who discover a zero-day exploit have a significant advantage, as there is no immediate fix available to prevent exploitation. For example, a zero-day attack on a popular cloud service provider could enable attackers to infiltrate customer data before the vulnerability is detected and patched.

A security incident is any event that compromises the confidentiality, integrity, or availability of information or systems. Incidents can range from minor breaches, such as unauthorized access to a single user’s email account, to major data breaches affecting millions of individuals. The impact of a security incident can be severe, often requiring incident response measures, reports to regulators,1 and actions to prevent recurrence. For AI-driven systems, a security incident might involve unauthorized manipulation of model behaviour, such as an adversarial attack that misleads an image recognition system into misclassifying objects. We will now look at some of the approaches attackers use for creating security incidents.

3.2.1 Types of attacks

One can distinguish between two main types of attacks.

  • An active attack happens when an attacker directly interferes with the computer system in question. For example, they might manipulate the data that is used for training an AI system or use carefully designed prompts to “jailbreak” a large language model, that is, to bypass its safeguards and make it produce outputs it was designed to refuse.
  • Passive attacks, instead, do not engage directly with the system but monitor its operation. For example, an attacker might monitor all the requests that are sent to a given AI system in order to better understand how that system is used.

Attackers often rely on both approaches, which can be applied in various forms.

The most prevalent attack methods often exploit human behaviour, software vulnerabilities, and weaknesses in data transmission processes. As innovative technologies emerge, they might be vulnerable to new ways to carry out attacks. At the same time, cybersecurity practitioners might develop methods that eliminate or reduce the risk from certain attacks. Because of this arms race, it is difficult to keep track of the diversity of attacks used by malicious actors. Resources such as MITRE’s ATLAS knowledge base offer a shared repository of knowledge on the current state of the art. Based on that knowledge, one can group attacks into some classes that remain relatively stable over time, even if the details of their implementation vary wildly.

3.2.1.1 Social engineering

Social engineering techniques exploit human psychology rather than technical vulnerabilities. Attackers manipulate individuals into divulging confidential information, such as login credentials or sensitive personal data. Common forms of social engineering include phishing, where attackers send fraudulent emails that appear legitimate, tricking recipients into clicking malicious links or providing sensitive information.

A phishing email may masquerade as a message from a bank, asking the user to reset their password through a provided link. Once the user enters their credentials on a fake website, the attacker can use those credentials to gain access to their target system. In the context of AI, social engineering attacks might target employees with access to sensitive training data or AI model configurations, compromising the system from within.

3.2.1.2 Exploiting software vulnerabilities

Attackers can also proceed by exploiting known vulnerabilities. An exploit is a piece of software, script, or code designed to take advantage of a security vulnerability in a system or application. When attackers discover such a weakness, they can use an exploit to gain unauthorized access or execute malicious commands. For example, a buffer overflow exploit targets a vulnerability where a program fails to properly check the length of input data, allowing an attacker to overwrite memory and execute arbitrary code. In AI applications, exploits might focus on software libraries used for machine learning, compromising the integrity of the model, or extracting sensitive information from the system.

3.2.1.3 Man-in-the-middle attacks

A man-in-the-middle (MITM) attack occurs when an attacker secretly intercepts and alters the communication between two parties who believe they are directly communicating with each other. The attacker positions themselves between the sender and receiver, allowing them to eavesdrop on, modify, or inject malicious content into the data exchange.

In an unencrypted Wi-Fi network, for example, an attacker can intercept data sent between a user’s device and a web server, capturing sensitive information like login credentials or financial details. In AI systems, MITM attacks can disrupt the transmission of data used for model training or inference, potentially introducing false data inputs that lead to incorrect outputs or compromised decision-making.
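One standard counter-measure against in-transit tampering is to authenticate messages so that any alteration can be detected. The sketch below illustrates the idea with Python's standard hmac module; in practice this role is typically played by protocols such as TLS, and the pre-shared key shown here is an assumption made only to keep the example self-contained.

```python
import hmac
import hashlib

shared_key = b"pre-shared secret"   # assumed to be exchanged securely out of band

def sign(message: bytes) -> bytes:
    # The sender attaches an authentication tag computed over the message.
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # The receiver recomputes the tag; a mismatch reveals in-transit tampering.
    return hmac.compare_digest(sign(message), tag)

message = b"label=cat"
tag = sign(message)

tampered = b"label=dog"              # modified by a man in the middle
print(verify(message, tag))          # True
print(verify(tampered, tag))         # False: the alteration is detected
```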

3.2.1.4 Putting it all together

While these forms of attacks are distinct, they are often used in combination by attackers to increase their chances of success. For example, an attacker might use social engineering to gain initial access, exploit a software vulnerability to escalate privileges, and then carry out a man-in-the-middle attack to intercept and manipulate data. In AI environments, the complexity of interconnected systems and the reliance on large datasets can amplify these risks, as attackers may target weak points in the data pipeline or leverage adversarial inputs to compromise model integrity. This is why the AI-specific threats discussed in Section 3.3 cannot be separated from the more established attack vectors seen here.

3.2.2 Types of security controls

Security controls are measures designed to protect information systems from threats and reduce risks. These controls can be classified into distinct categories based on their primary function:

  1. Preventive controls are aimed at stopping security incidents before they occur. This includes measures like firewalls, encryption, access controls, and multifactor authentication. For example, encrypting data at rest and in transit ensures that even if an attacker gains access, the data remains unreadable without the decryption keys.
  2. Deterrent controls are intended to discourage potential attackers from attempting to exploit a system. These controls might involve visible security measures, such as warning banners, surveillance cameras, or legal disclaimers about monitoring and prosecution. In the context of AI, deterrence might include transparent declarations of robust model validation processes, signalling to potential attackers that their efforts are likely to be detected.
  3. Detection controls focus on identifying security incidents as they happen. Examples include intrusion detection systems (IDS), anomaly detection algorithms, and security information and event management (SIEM) tools. For AI systems, detection controls might involve monitoring inputs for unusual patterns or adversarial attacks designed to manipulate model outputs.
  4. Deflection controls aim to divert attacks away from critical systems, often by misleading attackers. This can involve the use of honeypots—decoy systems designed to attract and trap attackers, giving security teams time to respond. For instance, setting up a fake server that mimics a valuable database can lure attackers away from the real system.
  5. Mitigation controls seek to limit the damage caused by a security incident. These include measures like data backups, network segmentation, and incident response plans. In AI systems, mitigation might involve reverting to a safe fallback model if anomalous behaviour is detected, reducing the impact of compromised algorithms.
  6. Recovery controls help organizations return to normal operations after a security incident. These measures include data restoration, system reboots, and process reviews to prevent future occurrences. Effective recovery controls are essential for minimizing downtime and ensuring business continuity, especially in AI applications that support critical functions like financial transactions or healthcare diagnostics.

Various kinds of controls are often used together. The concept of security in depth refers to a multi-layered approach to cybersecurity, where multiple, overlapping controls work together to protect systems and data. This strategy recognizes that no single control is foolproof; instead, various measures complement each other to create a more robust defence. For example, an organization might use a combination of firewalls, intrusion detection systems, data encryption, and user access controls to secure its infrastructure. One way to illustrate the combination of controls is the Swiss cheese model.

In the Swiss cheese model, each layer of defence is depicted as a slice of cheese with holes (representing vulnerabilities). While a single layer may have weaknesses, the holes rarely align perfectly across multiple layers. Thus, if one control fails, another layer can still block the attack. For instance, even if an attacker bypasses the firewall, they may still be detected by the intrusion detection system. This layered defence strategy is crucial for AI systems, which feature many potential points of failure, such as model vulnerabilities and data privacy risks. No single defence can cover all of those, so an organization developing or deploying AI will require diverse and adaptive controls.
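As a rough illustration of security in depth, the sketch below passes a request through several independent checks, so that a request slipping past one layer can still be caught by another. The layer functions, thresholds, and request format are all assumptions made for this example, not a recommended configuration.

```python
# A minimal sketch of "security in depth": a request must pass several
# independent checks, so a hole in one layer can be caught by another.

def authenticated(request) -> bool:
    return request.get("api_key") in {"key-123"}            # preventive layer

def input_well_formed(request) -> bool:
    data = request.get("payload", "")
    return isinstance(data, str) and len(data) < 1_000      # preventive layer

def looks_anomalous(request) -> bool:
    return request.get("requests_last_minute", 0) > 100     # detection layer

def handle(request):
    if not authenticated(request):
        return "rejected: authentication failed"
    if not input_well_formed(request):
        return "rejected: malformed input"
    if looks_anomalous(request):
        return "flagged for review: unusual request volume"
    return "accepted"

print(handle({"api_key": "key-123", "payload": "hello", "requests_last_minute": 3}))
print(handle({"api_key": "key-123", "payload": "x" * 5000}))   # passes one layer, blocked by the next
```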

3.2.3 The thin line between security practices and cyberattacks

Sometimes, it can be difficult to distinguish between attack practices and the cybersecurity practices used to address them. One example comes from the recent growth in fuzzing methods. Fuzzing is a technique used to find software vulnerabilities by providing random, unexpected, or invalid data inputs to a program and observing its behaviour. The goal of fuzzing is to identify weaknesses that can be exploited by attackers, such as crashes, memory leaks, or unexpected behaviour that indicates poor input handling.

For example, a fuzzing tool might send a series of malformed inputs to a web application in an attempt to trigger a vulnerability like a buffer overflow or input validation error. In AI systems, fuzzing can be used to test the robustness of machine learning models, identifying scenarios where the model fails or produces unreliable results due to unanticipated input patterns.
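A minimal fuzzer can be sketched in a few lines of Python. The fragile `parse_age` function below is a hypothetical stand-in for the system under test; real fuzzers are far more sophisticated about generating inputs and triaging failures.

```python
import random
import string

def parse_age(raw: str) -> int:
    # A deliberately fragile parser standing in for the system under test.
    return int(raw.strip())

def random_input(max_len: int = 12) -> str:
    alphabet = string.printable
    return "".join(random.choice(alphabet) for _ in range(random.randint(0, max_len)))

# A very small fuzzer: throw random inputs at the parser and record crashes.
crashes = []
for _ in range(1_000):
    candidate = random_input()
    try:
        parse_age(candidate)
    except Exception as exc:          # a real fuzzer would triage by exception type
        crashes.append((candidate, type(exc).__name__))

print(f"{len(crashes)} inputs caused unhandled exceptions")
print(crashes[:5])
```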

Fuzzing is usually discussed as a cybersecurity tool, which organizations use to anticipate attacks that might be directed against them. In this context, AI technologies can be used to boost cybersecurity by allowing cybersecurity experts to create and test a larger number of scenarios. However, the same techniques for detecting vulnerabilities might be used by an attacker who wants to figure out how to actively attack a system. If that happens, the use of AI systems increases the capabilities of attackers. This means that AI technologies do not end the arms race between attackers and defenders but continue to feed it.

3.3 AI-specific risks to cybersecurity

Learning outcomes

By the end of this section, learners will be able to indicate how the use of AI creates unique risks from a cybersecurity perspective.

AI systems and models are complex objects, as we have seen in Chapter 2. This means that an attacker has a wealth of points they can probe for potential vulnerabilities. An attack might target the data used to create an AI system, its training process, the infrastructure used to support its execution, or its context of use. At each juncture, various methods can be used to identify and exploit vulnerabilities. In this section, we focus on attacks that are specifically tailored for AI systems and models.

Given the current predominance of machine learning models, this section will mostly deal with attacks directed at machine learning technologies. Our goal here is not to discuss the intricacies of those attacks, as many of them rely on technical elements that require some expertise.2 Instead, we will focus on presenting the general features of those attacks, so that data protection professionals can collaborate with technical experts in raising awareness about them and designing organizational responses.

One thing that must be kept in mind, however, is that cybersecurity in AI is a relatively novel domain. As such, attackers are often in a more advantageous position in comparison with defenders. They only need one successful exploit of a vulnerability, whereas a defender needs to clear all risk vectors. However, because the AI techniques themselves are novel, sometimes there are no known ways to fully eliminate the risk. Therefore, organizations will sometimes be forced to evaluate whether existing measures for mitigation can reduce risk to a legally acceptable level. Otherwise, they might be forced to abandon the use of AI for that specific purpose.

3.3.1 Attacks on the AI training process

AI models, particularly machine learning systems, can be subject to cybersecurity threats during their training stage. Those threats might impact various desirable properties of AI systems. Consider the CIA triad:

  1. Confidentiality is relevant at the training stage, as organizations might want to preserve the expertise codified in their models and training practices, and they remain subject to data protection rules that require them to control access to any personal data used in training.3
  2. When it comes to integrity, AI systems and models rely heavily on large datasets to learn patterns and generate their outputs. This is often summarized in the maxim “garbage in, garbage out”: if one starts from bad training data, the ensuing model is likely to be inaccurate or even misleading in important ways. As a result, the integrity of training data becomes crucial for model performance and reliability.
  3. Availability issues are a bit less salient at the training stage, but they might still occur, for example, when a model continues to learn after it is deployed.

As discussed throughout the unit, those goals can be affected in many ways. We should now consider attack vectors that are specific to AI.

One major risk in the training phase is data poisoning, where attackers intentionally manipulate the training data to influence the behaviour of the model. For example, a hacker might introduce mislabelled exams into InnovaHospital’s databases. If trained on those mislabelled exams, an automated diagnosis algorithm might clear patients who are in fact sick or flag healthy patients with false positives. Data poisoning is particularly concerning in scenarios where the training data comes from external or crowdsourced sources, as these datasets are more susceptible to tampering.
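To see why even simple data poisoning matters, the sketch below flips a fraction of the labels in a synthetic training set and measures the effect on a classifier's accuracy. It assumes scikit-learn is available; the dataset and model are generic stand-ins, not a representation of InnovaHospital's actual systems.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# A synthetic stand-in for a diagnostic dataset (features -> sick / healthy).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_flipped_labels(flip_fraction: float) -> float:
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    idx = rng.choice(len(y_poisoned), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]          # mislabel a fraction of the training examples
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return model.score(X_test, y_test)             # accuracy on clean test data

for fraction in (0.0, 0.1, 0.3, 0.45):
    acc = accuracy_with_flipped_labels(fraction)
    print(f"{fraction:.0%} of training labels flipped -> test accuracy {acc:.2f}")
```

Running the sketch typically shows accuracy degrading as the proportion of poisoned labels grows, which is the core concern behind this attack.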

A variant of data poisoning is the so-called backdoor attack, in which the model training is sabotaged to ensure that a model produces an incorrect output when it identifies a certain element in the input. Consider a scenario where UNw decides to adopt an AI system for automatically grading undergraduate exams. A malicious student, knowing about this, hacks into the system’s training data and inserts data that falsely labels any exams taken by them or their friends as receiving the highest grades. This would allow these students to perform well regardless of their actual effort.

Attackers can also tamper with an AI system through environmental attacks. In this kind of attack, the system itself is not altered; instead, the attacker directs their attention to the environment in which the system will operate. For example, a malicious competitor of DigiToys might compromise software libraries used by this company, with a view to making its AI systems malfunction or to introducing backdoors for exfiltrating information. As we discussed in Chapter 2, the training of AI models takes place in a complex environment. Hence, attackers have many opportunities to exploit vulnerabilities in different pieces of the infrastructure supporting an AI system.

3.3.2 Attacks on deployed AI systems

Once an AI model has been trained and deployed, it remains vulnerable to a diverse set of attacks that target its predictions and outputs. As more AI models are used in a variety of real-world applications, attackers can identify new vulnerabilities to exploit. Such vulnerabilities can be exploited in many ways, most of which rely on interactions with the AI system.

One common attack against deployed models is the adversarial attack. In this approach, attackers carefully craft input data designed to deceive the AI model. For example, an adversarial image might appear normal to a human observer but contain subtle perturbations that cause a computer vision model to misclassify it. This type of attack could be used to trick facial recognition systems into misidentifying individuals or to manipulate AI models used in autonomous vehicles, potentially leading to dangerous consequences.
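The following toy sketch conveys the intuition behind gradient-based adversarial perturbations, in the spirit of the fast gradient sign method, using a hand-built linear classifier in NumPy. The model, the input, and the epsilon value are invented for illustration; real attacks target far more complex models, such as image classifiers, where the perturbations can be imperceptible to humans.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy linear "model" standing in for a trained classifier (illustrative only).
w = rng.normal(size=20)

def predict_proba(x: np.ndarray) -> float:
    """Probability assigned to class 1 by the toy model."""
    return 1.0 / (1.0 + np.exp(-(w @ x)))

# A clean input that the model confidently assigns to class 1.
x_clean = 0.25 * w
print("clean input, P(class 1) =", round(float(predict_proba(x_clean)), 3))

# FGSM-style perturbation: nudge every feature by at most epsilon in the
# direction that increases the loss for the true label (here, label 1).
epsilon = 0.5
grad_wrt_x = (predict_proba(x_clean) - 1.0) * w   # gradient of the cross-entropy loss w.r.t. the input
x_adv = x_clean + epsilon * np.sign(grad_wrt_x)

print("adversarial input, P(class 1) =", round(float(predict_proba(x_adv)), 3))
print("largest change to any single feature:", epsilon)
```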

Deployed AI systems might also be vulnerable to model extraction attacks. In this kind of attack, the malicious party attempts to replicate a deployed AI model by querying it extensively and gathering information about its outputs. Through repeated interactions, the attacker can approximate the decision-making process of the original model. For example, they might obtain information about which safeguards have been implemented in the model and which values have been given to certain key parameters.
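A hedged sketch of the idea, assuming scikit-learn is available: the attacker can only call a `query_api` function, yet by recording enough input-output pairs they can train a surrogate that closely mimics the victim model. The victim model, the probe distribution, and the choice of surrogate are all assumptions made for this toy example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)

# The "victim": a black-box model the attacker can only query.
X_secret = rng.normal(size=(2000, 10))
y_secret = (X_secret[:, 0] + 0.5 * X_secret[:, 1] > 0).astype(int)
victim = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X_secret, y_secret)

def query_api(x: np.ndarray) -> np.ndarray:
    # Stand-in for a public prediction endpoint: inputs in, labels out.
    return victim.predict(x)

# The attacker sends many probe inputs and records the returned labels...
probes = rng.normal(size=(5000, 10))
stolen_labels = query_api(probes)

# ...then trains a surrogate that approximates the victim's decision-making.
surrogate = LogisticRegression(max_iter=1000).fit(probes, stolen_labels)

# Agreement between surrogate and victim on fresh data.
test = rng.normal(size=(1000, 10))
agreement = (surrogate.predict(test) == victim.predict(test)).mean()
print(f"surrogate matches the victim on {agreement:.0%} of unseen inputs")
```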

Model extraction attacks are particularly problematic for proprietary AI models that represent significant investments in research and development. The stolen model can then be used by competitors or malicious actors, undermining the original creator’s competitive advantage. More generally, however, a model extraction attack can be a starting point for further exploitation. An attacker might simply want to duplicate the extracted model for their own purposes, or they might be interested in carrying out further attacks. In the latter case, access to an extracted model will allow them to identify other vulnerabilities that can be used for follow-up attacks.

To conclude this necessarily incomplete overview of attacks against deployed AI systems, we must talk about another type of attack: model inversion. Just like model extraction attacks, model inversion operates by repetition. The attacker makes various queries to the AI system and uses the system’s outputs to extract information from it. In this case, however, the goal is not to extract the model itself, but the data used during its training process. For instance, if an AI model is trained on a medical dataset, model inversion techniques could potentially reveal private details about individual patients. This means model inversion attacks can directly affect the level of protection afforded to personal data used for training AI.

3.3.3 Challenges in addressing AI-specific cybersecurity risks

The fast pace at which AI technologies evolve creates a tough cybersecurity challenge. Innovative technologies give rise to evolving threats, while few measures have been proven effective in mitigating or eliminating the resulting risks. Deploying those measures sometimes requires advanced expertise of a different kind from that used for developing and deploying AI systems. Furthermore, AI technologies themselves can be leveraged for detecting and exploiting vulnerabilities. Even so, there are various actions that data controllers can take when it comes to AI-related risks.

One of the fundamental challenges in AI cybersecurity is the evolving nature of the threats and the limited availability of proven mitigation measures. Some AI systems continuously learn and adapt, which introduces new attack surfaces. Furthermore, the complexity and opacity of many AI models make it difficult to understand their vulnerabilities fully. This lack of transparency, often referred to as the “black box” problem and further discussed in Section 4.3, complicates efforts to identify potential weaknesses and implement robust defences.

Mitigating these attacks is complex, as many standard defences are not fully effective against the attacks outlined in this section. Techniques like input validation, adversarial training (where the model is exposed to adversarial examples during training), and rate-limiting of model queries can help, but they are often insufficient. Adversarial attacks, in particular, highlight the fragility of AI models, as even small, imperceptible changes to input data can lead to incorrect outputs. Additionally, the lack of effective counter-measures against model extraction creates a significant risk for public-facing applications of AI, especially those that require sensitive data to work.
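As one example of the partial defences mentioned above, the sketch below implements a simple sliding-window rate limiter for model queries. The window length and query budget are illustrative values rather than recommendations, and a production system would combine this with authentication, logging, and anomaly detection.

```python
import time
from collections import defaultdict, deque

# Illustrative limits: at most 100 queries per client per 60-second window.
WINDOW_SECONDS = 60
MAX_QUERIES_PER_WINDOW = 100

_history = defaultdict(deque)   # client id -> timestamps of recent queries

def allow_query(client_id, now=None):
    """Return True if the client may send another query, False otherwise."""
    now = time.monotonic() if now is None else now
    window = _history[client_id]
    # Discard timestamps that have fallen outside the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_QUERIES_PER_WINDOW:
        return False   # the caller should reject or throttle this request
    window.append(now)
    return True

# Simulate one client firing 150 queries within about 15 seconds.
allowed = sum(allow_query("client-42", now=i * 0.1) for i in range(150))
print(f"{allowed} of 150 rapid queries were allowed")
```

Rate limiting does not stop model extraction outright, but it raises the cost of the large number of queries such attacks require, which is why it is listed among the partial mitigations above.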

In the absence of AI-specific solutions for AI-specific issues, data controllers need to rely on established cybersecurity approaches. Defence-in-depth approaches—which rely on multiple, overlapping measures to diminish risk and safeguards to deal with harm—can compensate for the shortcomings of individual techniques. So, organizations need to look at measures directed at different components of an AI system or model, taking effect throughout its entire life cycle (see Part II). Even so, certain risks may remain unaddressed due to the novelty and complexity of AI-specific attacks. Therefore, defence in depth is not a silver bullet for AI cybersecurity.

In some cases, the risks associated with deploying an AI system may outweigh the potential benefits. This is particularly likely to be the case when the system handles sensitive data or is used in critical decision-making processes. For instance, using AI in healthcare diagnostics or for criminal investigations may introduce unacceptable risks if the models are vulnerable to adversarial attacks that could lead to incorrect or harmful outcomes. In such situations, organizations might consider alternative approaches, such as relying on simpler, rule-based systems or employing a hybrid approach where AI decisions are supplemented by human oversight.

3.4 Conclusion

The unique cybersecurity risks posed by AI technologies require a careful balance between fostering innovation and ensuring robust risk management. As the field of AI security is still in its initial stages, there are limited standardized solutions for many of the emerging threats. This creates a challenging environment for organizations, which must navigate the complexities of AI risk to comply with their data protection obligations. The safe deployment of AI technologies thus requires collaborative efforts between data protection experts, AI developers, and cybersecurity professionals.

Based on the previous discussions, data protection professionals would do well to keep some points in mind during their assessments:

  • Having clear threat models for a given AI application or model can help in the diagnosis of potential risks.
  • AI systems remain vulnerable to many attacks that affect software systems in general. Therefore, AI cybersecurity needs to attend both to the AI models and to the non-AI components that allow their use.
  • AI technologies can be used both by organizations in identifying and responding to cybersecurity vulnerabilities and by attackers in exploiting those vulnerabilities.
  • Currently, the novelty of AI technologies favours attackers rather than defenders. There are no known measures to respond to certain risk vectors.
  • Organizations need therefore to consider whether existing measures and safeguards can reduce risks to an acceptable level.
  • If risk can be reduced, a defence in depth approach might help overcome the limitations of individual AI cybersecurity techniques.
  • Otherwise, an organization might need to consider whether it can lawfully deploy AI at all if it cannot ensure a minimum level of cybersecurity.

Ultimately, the integration of AI into data processing and decision-making processes requires a shift in the traditional approach to cybersecurity. Data protection professionals must adopt a proactive stance, focusing not only on compliance but also on the broader implications of AI risks. Effective data protection in the age of AI will therefore require close attention to a technical landscape that is both extraordinarily complex and fast-moving. This, in itself, is not different from usual data protection practices. But the specific technical arrangements of AI can make a significant difference to whether and how problems can be addressed.

Exercises

Exercise 1. What is the main purpose of a threat model?

  • a. To describe cybersecurity principles.
  • b. To train employees on data protection.
  • c. To flag sensitive data that must be encrypted.
  • d. To build an AI system’s infrastructure.
  • e. To identify and prioritize security risks.

Exercise 2. Which alternative correctly completes the following sentence?

The principle of confidentiality in the CIA triad aims to ensure that…

  • a. Only authorized users can access data.
  • b. Data remains accurate and reliable.
  • c. Systems are accessible when required.
  • d. All actions are traceable to the actor responsible.
  • e. The system is protected against malware.

Exercise 3. Suppose an organization adopts a safeguard that directs hackers towards an outdated version of the AI system, rather than allowing them to access the version currently in use. In this case, which type of control has the organization used?

  • a. Preventive control.
  • b. Detection control.
  • c. Mitigation control.
  • d. Deflection control.
  • e. Recovery control.

Exercise 4. Why are adversarial attacks challenging to mitigate in AI systems?

  • a. They depend on outdated software libraries.
  • b. They exploit human vulnerabilities, not system flaws.
  • c. Small input changes can drastically mislead models.
  • d. They require insider knowledge to execute.
  • e. They only affect unsupervised learning algorithms.

Exercise 5. What is the expected outcome of a successful model inversion attack?

  • a. Compromising AI system availability.
  • b. Revealing sensitive data from the training set.
  • c. Duplicating the functionality of the model.
  • d. Disabling model functions temporarily.
  • e. Injecting malicious code into AI outputs.

3.4.1 Prompt for reflection

Based on DigiToys’ focus on reputation and compliance, what proactive cybersecurity measures could they prioritize to mitigate AI-specific risks in their products?

3.4.2 Answer sheet

Exercise 1. Alternative E is correct. Threat models focus on risks, not principles, and they do not focus on the specific design of infrastructure or the selection of controls such as encryption or training measures.

Exercise 2. Alternative A is correct. Alternatives B to D describe other cybersecurity principles, while alternative E describes a practice that might contribute to confidentiality but is not its focus.

Exercise 3. Alternative D is correct. A deflection control operates by diverting attacks from critical systems.

Exercise 4. Alternative C is correct. Adversarial attacks turn an AI system’s learning process against itself. They do not necessarily involve insider knowledge or target humans, and they are not restricted to outdated software or unsupervised learning models.

Exercise 5. Alternative B is correct.

References

Ross Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems (3rd edn, Wiley 2020).

Federica Casarosa, ‘Cybersecurity of Internet of Things in the Health Sector: Understanding the Applicable Legal Framework’ (2024) 53 Computer Law & Security Review 105982.

Markus Christensen and others (eds), The Ethics of Cybersecurity (Springer 2020).

Henrik Junklewitz and others, Cybersecurity of Artificial Intelligence in the AI Act: Guiding Principles to Address the Cybersecurity Requirement for High Risk AI Systems (Publications Office of the European Union 2023).

Andrei Kucharavy and others (eds), Large Language Models in Cybersecurity: Threats, Exposure and Mitigation (Springer 2024).

Taner Kuru, ‘Lawfulness of the Mass Processing of Publicly Accessible Online Data to Train Large Language Models’ (2024) International Data Privacy Law.

MITRE ATLAS.

MITRE D3FEND Matrix.

Kaspar Rosager Ludvigsen, ‘The Role of Cybersecurity in Medical Devices Regulation: Future Considerations and Solutions’ (2023) 5 Law, Technology and Humans 59.


  1. As required, for example, by the GDPR.

  2. Learners who want a bit more technical detail would do well to consult other materials, such as the Elements of Secure AI Systems training module for ICT professionals.

  3. As we will discuss in Chapter 6 of this book.