How Are Artificial Intelligence Systems Attacked?

Oct 17, 2024

MITRE ATLAS and AI: How Are Artificial Intelligence Systems Attacked?

Public bodies responsible for safeguarding the cybersecurity of citizens and the productive sector, along with companies specializing in cybersecurity, have raised concerns about the potential for Artificial Intelligence to escalate both the frequency and impact of cyberattacks.

However, we as one of the leading web development company in Kolkata with an ISO/IEC 27001:2022 certification, which is an international standard for information security management that helps organizations protect their data, the focus should not only be on the malicious use of AI systems but also on ensuring the security of machine learning algorithms and large language models (LLMs).

To enhance the security of AI systems, the non-profit organization MITRE has introduced MITRE ATLAS, a framework that categorizes and outlines the tactics and techniques that adversaries may use to plan and carry out attacks on large language models.

In the next section, we will explore the core features of MITRE ATLAS and how it helps in understanding and predicting the tactics, techniques, and procedures that adversaries could employ to target AI systems.

1. MITRE ATT&CK: A Crucial Framework for Understanding Adversarial Tactics

The MITRE ATLAS (MITRE Adversarial Threat Landscape for Artificial-Intelligence Systems) framework is rooted in MITRE ATT&CK (MITRE Adversarial Tactics, Techniques, and Common Knowledge), a foundational tool that has become essential for cybersecurity professionals around the world.

Since its introduction in 2014, MITRE ATT&CK has revolutionized enterprise cybersecurity by shifting the focus to the perspective of malicious actors (this refers to entities or agents that interact with, influence, or are influenced by AI systems. These can be human users, organizations, or even automated systems that play various roles in the AI ecosystem), rather than just the defenses of organizations.

Over the years, new variants have been added to the original technology domain, expanding its focus on tactics and techniques used to target corporate networks.

As a result, MITRE ATT&CK now encompasses three primary technology domains:

Enterprise. This framework categorizes how cybercriminals target operating systems such as Windows, macOS, and Linux, as well as cloud-based environments widely used by businesses, including Office and Google Workspace.

Mobile. It outlines specific tactics and techniques used to compromise Android and iOS mobile devices.

ICS. This focuses on the tactics, techniques, and procedures (TTPs) involved in attacks on industrial control systems, a critical technology for numerous industries.

The rapid advancement of AI systems and their increasing integration into various industries has driven the development of MITRE ATLAS.

This framework consolidates and organizes global knowledge on cyberattacks targeting AI systems.

ATLAS, which stands for Adversarial Threat Landscape for Artificial-Intelligence Systems, is designed to map the adversarial threat landscape for AI.

Like MITRE ATT&CK, it uses a matrix to connect the tactics employed by adversaries with the techniques required for those tactics to be successful.

2. Specific Tactics in Cyberattacks Against AI Systems

When it comes to the tactics in MITRE ATLAS, they closely resemble those of its parent framework, MITRE ATT&CK.

However, two tactics commonly found in ATT&CK are notably absent:

Lateral movement
Command and control

In contrast, MITRE ATLAS introduces two tactics specifically aimed at compromising AI systems, particularly focused on weakening the underlying machine learning models:

Machine Learning (ML) model access
Machine Learning attack stage

As a result, the MITRE ATLAS matrix consists of 14 tactics, covering everything from the early preparation stages of an attack to achieving malicious objectives and impacting the AI system.

Reconnaissance
Resource development
Initial access
Machine Learning model access
Execution
Persistence
Privilege escalation
Defense evasion
Credential access
Discovery/Harvesting
Collection
Machine Learning attack stage
Exfiltration
Impact

Now, let's take a closer look at the two unique tactics MITRE ATLAS introduces compared to ATT&CK.

2.1. Access to the Machine Learning Model

Using this tactic, adversaries aim to gain access to the Machine Learning model of the targeted system.

By doing so, they can acquire detailed information about how the model and its components function at the highest level of access.

However, as noted by MITRE ATLAS, attackers may exploit varying access levels throughout different stages of the attack.

To gain access to a Machine Learning model, adversaries may:

Enter the system where the model is hosted, such as via an API.
Gain access to the physical environment where data collection feeding the model occurs.
Indirectly access the model by interacting with a service that utilizes it in its processes.

The objectives of accessing a Machine Learning model include:

Obtaining detailed information about the model.
Developing targeted attacks against it.
Introducing manipulated data to compromise or disrupt the model's performance.

2.2. Machine Learning Attack Stage

While accessing the Machine Learning model is critical in the early stages of an attack, this tactic plays a key role in the later stages.

At this point, adversaries leverage their knowledge of the model and their access to the AI system to tailor their attacks and achieve specific objectives.

Four key techniques can be employed during this stage:

Model Replication: Attackers can create a proxy model by training new models, using pre-trained ones, or replicating models via the target system’s inference APIs. This allows them to simulate access to the model offline.

Backdoor Implementation: Inserting a backdoor into the Machine Learning model enables attackers to maintain persistence within the system and manipulate the model’s operation as needed.

Effectiveness Verification: Attackers can verify the success of their attack by using an inference API or testing against an offline copy of the model. This technique confirms that the attack is well-prepared and can be executed effectively.

Adversarial Data Creation: Adversaries can inject adversarial data into the model, altering its behavior to achieve specific outcomes.

3. MITRE ATLAS Maps Techniques for Undermining Large Language Models

In MITRE ATLAS, if tactics are the framework’s structural beams, then techniques serve as its supporting columns. For each tactic, a variety of techniques are provided, detailing how adversaries might successfully execute attacks.

MITRE ATLAS identifies and defines 56 techniques, a notable reduction compared to the 196 techniques found in the MITRE ATT&CK Enterprise matrix.

These 56 techniques offer a comprehensive and accurate view of how attacks on AI systems can be developed and carried out.

While many of the tactics overlap with those in the original MITRE framework, the techniques are uniquely tailored to Artificial Intelligence.

For instance, under the discovery tactic, four distinct techniques are specified.

Discover the underlying ontology of the Machine Learning model being targeted.
Identify the specific family of Machine Learning models used by the target system.
Locate the Machine Learning artifacts present in the system intended for attack.
Gain access to the meta prompt or initial instructions of a large language model (LLM), enabling attackers to steal intellectual property by reverse-engineering prompts.

Additionally, many techniques are further broken down into sub-techniques, providing a more granular view of the methods adversaries use to achieve their objectives.

For instance, three of the four techniques in the Machine Learning attack stage include several sub-techniques that detail these processes more thoroughly.

4. How Can Hostile Actor Techniques Be Prevented According to MITRE ATLAS?

In addition to categorizing and defining the tactics and techniques attackers may use against AI systems, MITRE ATLAS offers two valuable elements for preventing such attacks:

Case studies that provide deeper insights into how attacks unfold and their potential impact on AI systems. MITRE ATLAS includes numerous case studies that cover a variety of attack scenarios, such as:

Types of attacks: model poisoning, model replication, and others.
Actors capable of executing these attacks.
Specific characteristics of AI systems and their models, including attacks on machine learning as a service (MLaaS), and models hosted either on-premises or in the cloud.
Use cases of AI systems, ranging from highly sensitive applications, like cybersecurity, to less critical areas, such as customer service chatbots.

Procedures to Mitigate Malicious Techniques and Prevent Security Incidents

MITRE ATLAS outlines up to 20 security concepts and technologies designed to counteract adversarial techniques. These procedures include strategies like restricting public access to system information, closely monitoring who has access to machine learning models and their data during the production phase, and implementing essential measures such as training Machine Learning model developers in cybersecurity best practices. This includes secure coding techniques and conducting continuous vulnerability assessments to identify and address weaknesses before they can be exploited by hostile actors.

5. MITRE ATLAS
A Valuable Tool For Cyber Threat Hunting Services And Services That Simulate An External Attack That Poses A Threat To A Targeted Organization And Then Improve Its Security

Similar to MITRE ATT&CK, MITRE ATLAS is an invaluable resource for professionals responsible for two critical cybersecurity functions aimed at enhancing the resilience of AI systems and safeguarding the organizations that develop or utilize them regularly: cyber threat hunting services and services that simulate an external attack that poses a threat to a targeted organization and then improve its security.

5.1. Cyber Threat Hunting Services

Cyber threat hunting services continuously investigate potential compromise scenarios that have yet to be detected, allowing them to take a proactive approach to threat detection.

They leverage telemetry from EDR/XDR technologies to uncover malicious activity and gain insights into the tactics, techniques, and procedures (TTPs) used by adversaries to target AI systems.

MITRE ATLAS serves as a valuable guide for standardizing these TTPs globally, specifically for cyberattacks aimed at AI systems.

Cyber threat hunting services play a crucial role in:

Enhancing threat detection capabilities.
Identifying malicious tactics and techniques in the early stages of an attack.
Anticipating adversaries and preventing them from achieving their objectives.

5.2. Services That Simulate An External Attack That Poses A Threat To A Targeted Organization And Then Improve Its Security

The insights gained from cyber threat hunting services are vital for crafting and executing services that simulate an external attack that poses a threat to a targeted organization and then improve its security scenarios designed to evaluate how a company developing or utilizing AI systems would respond to an attack.

MITRE ATLAS proves invaluable in planning these scenarios, helping to define the type of adversary to simulate, the attack vector to use, and the specific targets.

A well-executed service that simulates an external attack that poses a threat to a targeted organization and then improves its security enhances an organization's resilience against attacks on its own or third-party AI systems, trains defensive teams to counter malicious techniques targeting AI, and optimizes detection and response capabilities.

As the AI revolution continues and research advances rapidly, the threat landscape for AI systems is expected to evolve significantly in the coming years.

MITRE ATLAS offers cybersecurity professionals a unified framework for understanding adversarial tactics, techniques, and procedures, as well as strategies for mitigating them.

Over time, as practitioners gain more experience, we as one of the leading web development companies in Kolkata with an ISO/IEC 27001:2022 certification, which is an international standard for information security management that helps organizations protect their data do acknowledge that this framework will be continuously refined to include newly developed TTPs as they emerge and are deployed in the years to come.

Our Office

USA

Seattle

2515 4th Avenue, Centennial Tower Seattle 98121
United States Of America

ratnesh.singh@sbinfowaves.com

+1-7543-335-140

Australia

Sydney

Rubix Alliance Pty Ltd Suite 305/30 Kingsway, Cronulla NSW 2230

dhiraj.accounts@sbinfowaves.com

+6-1480-023-313

India

Kolkata

Adventz Infinity, Office No - 1509 BN - 5, Street Number -18 Bidhannagar, Kolkata - 700091 West Bengal

dibyendu.mondal@sbinfowaves.com

+91-8240-823-048

India

Bengaluru

KEONICS, #29/A (E), 27th Main, 7th Cross Rd, 1st Sector, HSR Layout, Bengaluru, Karnataka 560102

pradipta@sbinfowaves.com

+91-9804-360-617

Blog

MITRE ATLAS and AI: How Are Artificial Intelligence Systems Attacked?

Leave a Reply Cancel reply

Our Office

USA

Seattle

Australia

Sydney

India

Kolkata

India

Bengaluru

Unleash the Sales Beast Within and Watch Your Revenue Soar!

GET A FREE ANALYSIS OF YOUR WEBSITE NOW!