Generative AI security: best practices for the LLM and agent era

Generative AI security is the practice of protecting large language models and autonomous agents across their full lifecycle, not classic application security with a model bolted on. LLMs and agents introduce risks that traditional controls miss, from prompt injection to excessive agency. This guide maps the threat landscape, the lifecycle controls that contain it, and the frameworks from OWASP, NIST, and MITRE that anchor a credible program. To turn it into a shipped system, see our generative AI development services.

By Kanika Mathur, Head of Service Delivery

Reviewed by Resourcifi engineeringPublished Apr 21, 2026Updated Apr 21, 202612 min read

Key takeaways

The short version

AI security now covers the whole AI system, extending beyond the surrounding app. The reference list of risks is the OWASP Top 10 for LLM Applications (2025), with prompt injection ranked number one.
The strongest programs treat three frameworks as complementary: OWASP tells you what goes wrong, the NIST AI Risk Management Framework tells you how to govern it, and MITRE ATLAS is the attacker playbook you red-team against.
Best practices follow the lifecycle: secure the data (provenance, poisoning defense, PII minimization), the model (trusted sources, signing, access control, evaluation), the application (input and output handling, least-privilege tools, guardrails), and operations (monitoring, red-teaming, incident response).
Agentic AI is the 2026 frontier. Because agents act on their decisions, a prompt-injection bug becomes a harmful action. OWASP published a dedicated Top 10 for Agentic Applications in December 2025, and excessive agency is its own OWASP LLM category.
On governance, the EU AI Act phases in: prohibited practices applied in February 2025, general-purpose AI rules in August 2025, and high-risk obligations and enforcement begin in August 2026. It is not fully in force today. NIST AI RMF stays voluntary, and ISO/IEC 42001 offers a certifiable management system.

The generative AI security threat landscape: OWASP Top 10 for LLM Applications

The reference list for generative AI security and large-language-model risk is the OWASP Top 10 for LLM Applications (2025), maintained by the OWASP Gen AI Security Project.¹ Prompt injection sits at number one. The list spans the model, the data pipeline, and the surrounding application, which is why securing an LLM product means more than hardening the API in front of it.

Two definitions matter most. Prompt injection (LLM01) is crafted input that makes the model ignore its instructions and follow an attacker’s instead. The harder variant is indirect prompt injection, where the malicious instruction is hidden in content the model later reads, such as a fetched web page, an email, or a calendar invite, so the victim never typed it. Improper output handling (LLM05) is the mirror image: model output is untrusted input, and passing it unsanitized into a browser, database, shell, or downstream API opens the door to cross-site scripting, injection, or remote code execution.

The OWASP Top 10 for LLM Applications (2025)

The industry reference list of the most critical security risks in LLM applications, maintained by the OWASP Gen AI Security Project.

OWASP Top 10 for LLM Applications, 2025 edition
ID	Risk	What it is
LLM01	Prompt Injection	Crafted input alters the model’s intended behavior, whether direct from the user or indirect from content the model ingests.
LLM02	Sensitive Information Disclosure	The model exposes PII, secrets, or proprietary content through outputs, context, or training-data leakage.
LLM03	Supply Chain	Weaknesses in third-party models, datasets, libraries, or fine-tuning artifacts compromise integrity or licensing.
LLM04	Data and Model Poisoning	Manipulated training, fine-tuning, or embedding data, or backdoored weights, inject malicious or biased behavior.
LLM05	Improper Output Handling	Unvalidated model output flows downstream and enables XSS, SSRF, SQL injection, or remote code execution.
LLM06	Excessive Agency	Too much functionality, permission, or autonomy lets manipulated or hallucinated output trigger damaging actions.
LLM07	System Prompt Leakage	Hidden system instructions are disclosed, revealing logic, credentials, or guardrails an attacker can then bypass.
LLM08	Vector and Embedding Weaknesses	RAG and embedding flaws: poisoned vectors, embedding inversion, and cross-tenant retrieval leakage.
LLM09	Misinformation	Confidently wrong or fabricated output that downstream users or systems trust as fact.
LLM10	Unbounded Consumption	Uncontrolled compute and cost: denial-of-wallet, model extraction, and resource-exhaustion attacks.

Source: OWASP Gen AI Security Project, OWASP Top 10 for LLM Applications (2025).

Read the list as a checklist, where no single rank deserves more dread than another. A RAG product weighs vector and embedding weaknesses (LLM08) and sensitive information disclosure (LLM02) heavily; a tool-using assistant lives or dies on excessive agency (LLM06) and improper output handling (LLM05). The point of defense-in-depth is that no single control stops every item, so you layer them.

Lifecycle best practices: data, model, application, operations

AI security best practices follow the lifecycle in four stages. Secure the data with provenance, poisoning defense, and PII minimization; secure the model with trusted sources, integrity signing, access control, and evaluation; secure the application with input and output handling, least-privilege tools, guardrails, and rate limiting; and secure operations with monitoring, red-teaming, and incident response. Each stage maps directly to OWASP risk categories.

Secure the data

Provenance and validation. Track dataset lineage and vet training, fine-tuning, and RAG sources before they enter the pipeline (LLM04).
Poisoning defense. Run anomaly detection on incoming data and screen for outliers and backdoors continuously, not as a one-time check (LLM04).
PII and minimization. Strip and scope sensitive data before training and before it enters the context window; classify, label, and enforce retention limits (LLM02).
RAG and vector hygiene. Enforce per-tenant and per-user access control on the vector store to prevent cross-tenant retrieval leakage and embedding inversion (LLM08).

Secure the model

Trusted sources and integrity checks. Use models from verifiable sources and confirm them with file hashes and cryptographic signing, since provenance assurances for published weights remain weak (LLM03).
AI bill of materials. Keep a signed inventory of models, datasets, and dependencies, an AI BOM, so a flawed component can be traced and replaced (LLM03).
Access control at the model layer. Authenticate and authorize inference endpoints, rate-limit to blunt model extraction, and protect the system prompt (LLM07 and LLM10).
Evaluation and testing. Benchmark for safety, robustness, and bias before deployment; pre-deployment testing is one of the NIST Generative AI Profile focus areas.³

Secure the application

Input handling. Apply layered prompt-injection defenses: content filtering, data marking to isolate untrusted content, and treating all retrieved or external content as untrusted (LLM01).
Output handling. Validate, encode, and sanitize model output before it reaches a browser, database, shell, or API (LLM05).
Least-privilege tools. Minimize the tools and extensions an assistant can call, remove unneeded functionality, and run in the user’s scoped security context instead of a broad service account (LLM06).
Guardrails and quotas. Add policy layers for input and output moderation, and cap requests, tokens, and cost to prevent unbounded consumption (LLM10).

Secure operations

Monitoring and logging. Log prompts, tool calls, and outputs, and watch for plan drift and anomalous tool sequences.
Red-teaming. Run continuous adversarial testing mapped to MITRE ATLAS techniques and the OWASP Top 10, with pre-deployment campaigns.⁴
Incident response. Maintain AI-specific runbooks and disclosure processes, which NIST names as a Generative AI Profile focus area.³
Continuous improvement. Feed findings back into governance and re-evaluate whenever the model or data changes.

This whole lifecycle is where education turns into work. Rolling it out, building the guardrails, and hardening the deployment is exactly what our AI deployment team does secure-by-design.

Securing agentic AI: excessive agency and tool permissions

Agentic AI is harder to secure because agents act on their decisions. They call tools, chain steps, hold memory, and delegate to other agents, which turns a prompt-injection bug from a wrong answer into a wrong action such as data exfiltration or an unauthorized transaction. OWASP published a dedicated Top 10 for Agentic Applications in December 2025, and excessive agency is already its own category in the LLM Top 10.⁵

OWASP traces excessive agency (LLM06) to three root causes: excessive functionality (the agent can call tools it never needs), excessive permissions (those tools run with more access than the task requires), and excessive autonomy (the agent acts on high-impact decisions without a human check). The Top 10 for Agentic Applications adds categories such as goal hijacking through prompt injection, tool misuse, identity and privilege abuse in delegation chains, and memory or context poisoning.⁵

The controls are concrete. Give an agent only the tools it strictly needs, and scope those tools to the user’s identity instead of a god-mode account. Require human-in-the-loop approval for high-impact actions, mediate every tool call, sandbox any code execution, and monitor for plan drift and anomalous tool chains. Where agents use the Model Context Protocol, treat loosely scoped server permissions as a known failure mode and enforce least privilege with scope expiry and regular access reviews.⁶ Operationalizing this, including red-team campaigns and a risk assessment, is the kind of program work we handle through AI consulting.

Frameworks that anchor a program: NIST AI RMF and MITRE ATLAS

Three frameworks anchor a serious AI security program, and they complement each other instead of competing. The NIST AI Risk Management Framework gives you the governance backbone through four functions, Govern, Map, Measure, and Manage. MITRE ATLAS gives you the attacker playbook to red-team against. The OWASP Top 10 gives you the prioritized list of what goes wrong.

The NIST AI RMF (AI RMF 1.0, January 2023) is a voluntary, sector-agnostic framework for trustworthy AI, with a Generative AI Profile (NIST AI 600-1, July 2024) that extends it to generative-AI-specific risks with more than 200 suggested actions.²³ Its four functions are the spine of most enterprise programs.

The NIST AI Risk Management Framework core functions

Four functions that structure a trustworthy-AI program. Govern is cross-cutting; Map, Measure, and Manage run iteratively across the lifecycle.

NIST AI RMF 1.0 core functions
Function	Purpose	What it covers
Govern	Cultivate a culture of AI risk management across the organization.	Policies, accountability and roles, workforce, and third-party oversight.
Map	Establish the context that frames risk for a given system.	Intended use, capabilities and limits, risk tolerance, and the go or no-go decision.
Measure	Analyze, benchmark, and monitor AI risk and impact.	Testing and evaluation, trustworthiness metrics, and drift monitoring.
Manage	Allocate resources to treat the risks that Map and Measure surface.	Risk prioritization and treatment, incident and recovery planning, and continuous improvement.

Source: NIST AI Risk Management Framework (AI RMF 1.0, NIST AI 100-1, 2023).

MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a living knowledge base of adversary tactics and techniques against AI systems, modeled on MITRE ATT&CK and built from real-world attacks and red-team demonstrations.⁷ It catalogs techniques such as model poisoning, evasion through adversarial examples, model extraction, and, in recent generative-AI releases, RAG poisoning and prompt crafting, each with case studies and mitigations. Use it as the source of truth for what you simulate during red-teaming.

Governance and compliance: the EU AI Act, NIST, and ISO 42001

Governance ties the technical controls to legal and organizational accountability, but the regulatory picture is still phasing in. The EU AI Act applies in stages and is not fully in force today: prohibited practices applied in February 2025, general-purpose AI rules in August 2025, and high-risk obligations and enforcement begin in August 2026. NIST AI RMF stays voluntary, and ISO/IEC 42001 offers a certifiable AI management system.

EU AI Act phased timeline

The Act entered into force on 1 August 2024 and applies in phases. Most obligations are not yet in effect, so describe it as phasing in rather than fully in force.

EU AI Act key application dates
Date	What applies
Feb 2025	Ban on prohibited AI practices plus AI-literacy obligations.
Aug 2025	Governance rules and obligations for general-purpose AI (GPAI) models.
Aug 2026	High-risk system obligations, transparency rules, and the start of enforcement.
Aug 2027	Obligations for high-risk AI embedded in regulated products, and compliance deadline for earlier GPAI models.

Source: European Commission, regulatory framework on AI (2024 to 2027 phased dates).

Position the three pillars by what they give you. NIST AI RMF gives you the practices, ISO/IEC 42001 gives you a certifiable management system, published in December 2023 as the AI analog of ISO 27001 for information security, and the EU AI Act sets phased legal obligations for the EU market.⁸⁹ Most US teams start with the NIST RMF plus OWASP and MITRE technical controls, then layer ISO 42001 or EU AI Act readiness as their footprint demands. The accuracy guardrail is worth repeating: high-risk obligations and enforcement begin in August 2026, so do not describe the whole Act as already in effect.

Frequently asked

AI security questions

What are AI security best practices?

AI security best practices secure the full AI lifecycle: the data through provenance, poisoning defense and PII minimization; the model through trusted sources, signing, access control and evaluation; the application through input and output handling, least-privilege tools, guardrails and rate limiting; and operations through monitoring, red-teaming and incident response. Most organizations anchor this on the OWASP Top 10 for LLM Applications, the NIST AI Risk Management Framework, and MITRE ATLAS. The goal is defense-in-depth, because no single control stops every attack.

What is prompt injection?

Prompt injection is an attack where crafted input causes a large language model to ignore its instructions and follow the attacker’s instead. It is ranked the number one risk in the OWASP Top 10 for LLM Applications (2025). The most dangerous form is indirect prompt injection, where the malicious instruction is hidden in content the model later reads, such as a web page, email, or document, so the victim user never typed it.

What is the OWASP Top 10 for LLMs?

The OWASP Top 10 for LLM Applications is the industry-standard list of the most critical security risks in large-language-model applications, maintained by the OWASP Gen AI Security Project. The 2025 edition covers prompt injection, sensitive information disclosure, supply chain, data and model poisoning, improper output handling, excessive agency, system prompt leakage, vector and embedding weaknesses, misinformation, and unbounded consumption. It is the most widely used checklist for LLM application security.

How do you secure an AI agent?

Secure an AI agent by enforcing least privilege: give it only the tools and permissions it strictly needs, run it in the user’s scoped security context instead of a broad service account, and require human approval for high-impact actions. Add guardrails on inputs and outputs, mediate tool calls, and monitor for plan drift and anomalous tool chains. OWASP published a dedicated Top 10 for Agentic Applications in December 2025, because autonomous agents turn a prompt-injection bug into a harmful action and not just a wrong answer.

What is the NIST AI Risk Management Framework?

The NIST AI Risk Management Framework (AI RMF 1.0, January 2023) is a voluntary US framework for managing AI risk through four functions: Govern, Map, Measure, and Manage. Its Generative AI Profile (NIST AI 600-1, July 2024) extends those functions to generative-AI-specific risks with more than 200 suggested actions. It is widely adopted as the backbone of enterprise AI governance programs, though it carries no legal enforcement.

Kanika Mathur

Head of Service Delivery, Resourcifi

Kanika Mathur is Head of Service Delivery at Resourcifi, where her engineering pods red-team their own LLM applications, RAG pipelines, and agentic workflows before a client ever sees them. Her teams treat the OWASP Top 10 and the NIST AI RMF as a delivery standard rather than a compliance afterthought, which is the secure-by-design discipline this guide describes.

Resourcifi on LinkedIn →