Generative AI security: best practices for the LLM and agent era
Generative AI security is the practice of protecting large language models and autonomous agents across their full lifecycle, not classic application security with a model bolted on. LLMs and agents introduce risks that traditional controls miss, from prompt injection to excessive agency. This guide maps the threat landscape, the lifecycle controls that contain it, and the frameworks from OWASP, NIST, and MITRE that anchor a credible program. To turn it into a shipped system, see our generative AI development services.

The short version
- AI security now covers the whole AI system, extending beyond the surrounding app. The reference list of risks is the OWASP Top 10 for LLM Applications (2025), with prompt injection ranked number one.
- The strongest programs treat three frameworks as complementary: OWASP tells you what goes wrong, the NIST AI Risk Management Framework tells you how to govern it, and MITRE ATLAS is the attacker playbook you red-team against.
- Best practices follow the lifecycle: secure the data (provenance, poisoning defense, PII minimization), the model (trusted sources, signing, access control, evaluation), the application (input and output handling, least-privilege tools, guardrails), and operations (monitoring, red-teaming, incident response).
- Agentic AI is the 2026 frontier. Because agents act on their decisions, a prompt-injection bug becomes a harmful action. OWASP published a dedicated Top 10 for Agentic Applications in December 2025, and excessive agency is its own OWASP LLM category.
- On governance, the EU AI Act phases in: prohibited practices applied in February 2025, general-purpose AI rules in August 2025, and high-risk obligations and enforcement begin in August 2026. It is not fully in force today. NIST AI RMF stays voluntary, and ISO/IEC 42001 offers a certifiable management system.
The generative AI security threat landscape: OWASP Top 10 for LLM Applications
The reference list for generative AI security and large-language-model risk is the OWASP Top 10 for LLM Applications (2025), maintained by the OWASP Gen AI Security Project.1 Prompt injection sits at number one. The list spans the model, the data pipeline, and the surrounding application, which is why securing an LLM product means more than hardening the API in front of it.
Two definitions matter most. Prompt injection (LLM01) is crafted input that makes the model ignore its instructions and follow an attacker’s instead. The harder variant is indirect prompt injection, where the malicious instruction is hidden in content the model later reads, such as a fetched web page, an email, or a calendar invite, so the victim never typed it. Improper output handling (LLM05) is the mirror image: model output is untrusted input, and passing it unsanitized into a browser, database, shell, or downstream API opens the door to cross-site scripting, injection, or remote code execution.
| ID | Risk | What it is |
|---|---|---|
| LLM01 | Prompt Injection | Crafted input alters the model’s intended behavior, whether direct from the user or indirect from content the model ingests. |
| LLM02 | Sensitive Information Disclosure | The model exposes PII, secrets, or proprietary content through outputs, context, or training-data leakage. |
| LLM03 | Supply Chain | Weaknesses in third-party models, datasets, libraries, or fine-tuning artifacts compromise integrity or licensing. |
| LLM04 | Data and Model Poisoning | Manipulated training, fine-tuning, or embedding data, or backdoored weights, inject malicious or biased behavior. |
| LLM05 | Improper Output Handling | Unvalidated model output flows downstream and enables XSS, SSRF, SQL injection, or remote code execution. |
| LLM06 | Excessive Agency | Too much functionality, permission, or autonomy lets manipulated or hallucinated output trigger damaging actions. |
| LLM07 | System Prompt Leakage | Hidden system instructions are disclosed, revealing logic, credentials, or guardrails an attacker can then bypass. |
| LLM08 | Vector and Embedding Weaknesses | RAG and embedding flaws: poisoned vectors, embedding inversion, and cross-tenant retrieval leakage. |
| LLM09 | Misinformation | Confidently wrong or fabricated output that downstream users or systems trust as fact. |
| LLM10 | Unbounded Consumption | Uncontrolled compute and cost: denial-of-wallet, model extraction, and resource-exhaustion attacks. |
Read the list as a checklist, where no single rank deserves more dread than another. A RAG product weighs vector and embedding weaknesses (LLM08) and sensitive information disclosure (LLM02) heavily; a tool-using assistant lives or dies on excessive agency (LLM06) and improper output handling (LLM05). The point of defense-in-depth is that no single control stops every item, so you layer them.
Lifecycle best practices: data, model, application, operations
AI security best practices follow the lifecycle in four stages. Secure the data with provenance, poisoning defense, and PII minimization; secure the model with trusted sources, integrity signing, access control, and evaluation; secure the application with input and output handling, least-privilege tools, guardrails, and rate limiting; and secure operations with monitoring, red-teaming, and incident response. Each stage maps directly to OWASP risk categories.
Secure the data
- Provenance and validation. Track dataset lineage and vet training, fine-tuning, and RAG sources before they enter the pipeline (LLM04).
- Poisoning defense. Run anomaly detection on incoming data and screen for outliers and backdoors continuously, not as a one-time check (LLM04).
- PII and minimization. Strip and scope sensitive data before training and before it enters the context window; classify, label, and enforce retention limits (LLM02).
- RAG and vector hygiene. Enforce per-tenant and per-user access control on the vector store to prevent cross-tenant retrieval leakage and embedding inversion (LLM08).
Secure the model
- Trusted sources and integrity checks. Use models from verifiable sources and confirm them with file hashes and cryptographic signing, since provenance assurances for published weights remain weak (LLM03).
- AI bill of materials. Keep a signed inventory of models, datasets, and dependencies, an AI BOM, so a flawed component can be traced and replaced (LLM03).
- Access control at the model layer. Authenticate and authorize inference endpoints, rate-limit to blunt model extraction, and protect the system prompt (LLM07 and LLM10).
- Evaluation and testing. Benchmark for safety, robustness, and bias before deployment; pre-deployment testing is one of the NIST Generative AI Profile focus areas.3
Secure the application
- Input handling. Apply layered prompt-injection defenses: content filtering, data marking to isolate untrusted content, and treating all retrieved or external content as untrusted (LLM01).
- Output handling. Validate, encode, and sanitize model output before it reaches a browser, database, shell, or API (LLM05).
- Least-privilege tools. Minimize the tools and extensions an assistant can call, remove unneeded functionality, and run in the user’s scoped security context instead of a broad service account (LLM06).
- Guardrails and quotas. Add policy layers for input and output moderation, and cap requests, tokens, and cost to prevent unbounded consumption (LLM10).
Secure operations
- Monitoring and logging. Log prompts, tool calls, and outputs, and watch for plan drift and anomalous tool sequences.
- Red-teaming. Run continuous adversarial testing mapped to MITRE ATLAS techniques and the OWASP Top 10, with pre-deployment campaigns.4
- Incident response. Maintain AI-specific runbooks and disclosure processes, which NIST names as a Generative AI Profile focus area.3
- Continuous improvement. Feed findings back into governance and re-evaluate whenever the model or data changes.
This whole lifecycle is where education turns into work. Rolling it out, building the guardrails, and hardening the deployment is exactly what our AI deployment team does secure-by-design.
Securing agentic AI: excessive agency and tool permissions
Agentic AI is harder to secure because agents act on their decisions. They call tools, chain steps, hold memory, and delegate to other agents, which turns a prompt-injection bug from a wrong answer into a wrong action such as data exfiltration or an unauthorized transaction. OWASP published a dedicated Top 10 for Agentic Applications in December 2025, and excessive agency is already its own category in the LLM Top 10.5
OWASP traces excessive agency (LLM06) to three root causes: excessive functionality (the agent can call tools it never needs), excessive permissions (those tools run with more access than the task requires), and excessive autonomy (the agent acts on high-impact decisions without a human check). The Top 10 for Agentic Applications adds categories such as goal hijacking through prompt injection, tool misuse, identity and privilege abuse in delegation chains, and memory or context poisoning.5
The controls are concrete. Give an agent only the tools it strictly needs, and scope those tools to the user’s identity instead of a god-mode account. Require human-in-the-loop approval for high-impact actions, mediate every tool call, sandbox any code execution, and monitor for plan drift and anomalous tool chains. Where agents use the Model Context Protocol, treat loosely scoped server permissions as a known failure mode and enforce least privilege with scope expiry and regular access reviews.6 Operationalizing this, including red-team campaigns and a risk assessment, is the kind of program work we handle through AI consulting.
Frameworks that anchor a program: NIST AI RMF and MITRE ATLAS
Three frameworks anchor a serious AI security program, and they complement each other instead of competing. The NIST AI Risk Management Framework gives you the governance backbone through four functions, Govern, Map, Measure, and Manage. MITRE ATLAS gives you the attacker playbook to red-team against. The OWASP Top 10 gives you the prioritized list of what goes wrong.
The NIST AI RMF (AI RMF 1.0, January 2023) is a voluntary, sector-agnostic framework for trustworthy AI, with a Generative AI Profile (NIST AI 600-1, July 2024) that extends it to generative-AI-specific risks with more than 200 suggested actions.23 Its four functions are the spine of most enterprise programs.
| Function | Purpose | What it covers |
|---|---|---|
| Govern | Cultivate a culture of AI risk management across the organization. | Policies, accountability and roles, workforce, and third-party oversight. |
| Map | Establish the context that frames risk for a given system. | Intended use, capabilities and limits, risk tolerance, and the go or no-go decision. |
| Measure | Analyze, benchmark, and monitor AI risk and impact. | Testing and evaluation, trustworthiness metrics, and drift monitoring. |
| Manage | Allocate resources to treat the risks that Map and Measure surface. | Risk prioritization and treatment, incident and recovery planning, and continuous improvement. |
MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a living knowledge base of adversary tactics and techniques against AI systems, modeled on MITRE ATT&CK and built from real-world attacks and red-team demonstrations.7 It catalogs techniques such as model poisoning, evasion through adversarial examples, model extraction, and, in recent generative-AI releases, RAG poisoning and prompt crafting, each with case studies and mitigations. Use it as the source of truth for what you simulate during red-teaming.
Governance and compliance: the EU AI Act, NIST, and ISO 42001
Governance ties the technical controls to legal and organizational accountability, but the regulatory picture is still phasing in. The EU AI Act applies in stages and is not fully in force today: prohibited practices applied in February 2025, general-purpose AI rules in August 2025, and high-risk obligations and enforcement begin in August 2026. NIST AI RMF stays voluntary, and ISO/IEC 42001 offers a certifiable AI management system.
| Date | What applies |
|---|---|
| Feb 2025 | Ban on prohibited AI practices plus AI-literacy obligations. |
| Aug 2025 | Governance rules and obligations for general-purpose AI (GPAI) models. |
| Aug 2026 | High-risk system obligations, transparency rules, and the start of enforcement. |
| Aug 2027 | Obligations for high-risk AI embedded in regulated products, and compliance deadline for earlier GPAI models. |
Position the three pillars by what they give you. NIST AI RMF gives you the practices, ISO/IEC 42001 gives you a certifiable management system, published in December 2023 as the AI analog of ISO 27001 for information security, and the EU AI Act sets phased legal obligations for the EU market.89 Most US teams start with the NIST RMF plus OWASP and MITRE technical controls, then layer ISO 42001 or EU AI Act readiness as their footprint demands. The accuracy guardrail is worth repeating: high-risk obligations and enforcement begin in August 2026, so do not describe the whole Act as already in effect.
AI security questions
What are AI security best practices?
What is prompt injection?
What is the OWASP Top 10 for LLMs?
How do you secure an AI agent?
What is the NIST AI Risk Management Framework?
Sources
- OWASP Gen AI Security Project, OWASP Top 10 for LLM Applications (2025).
- NIST, AI Risk Management Framework (AI RMF 1.0, NIST AI 100-1) (2023).
- NIST, Artificial Intelligence Risk Management Framework: Generative AI Profile (NIST AI 600-1) (2024).
- MITRE, MITRE ATLAS adversarial technique knowledge base (2025).
- OWASP Gen AI Security Project, Top 10 Risks and Mitigations for Agentic AI Security (2025).
- OWASP, OWASP MCP Top 10 (Model Context Protocol) (2025).
- MITRE, MITRE ATLAS Fact Sheet (2025).
- ISO, ISO/IEC 42001:2023, AI management systems (2023).
- European Commission, Regulatory framework on artificial intelligence (2024).
Strategy, architecture & ops
AI Architecture Patterns
Agentic design patterns explained: reflection, tool use, planning, and multi-agent collaboration, with a framework to pic...
Read guide →
Strategy, architecture & ops
AI Architecture Patterns for SaaS: A Technical Guide
Generative AI architecture for SaaS: layered design, multi-tenant isolation, LLM gateway, RAG, and security. Built by Res...
Read guide →
Strategy, architecture & ops
AI Cost Optimization
A senior-engineer guide to AI cost optimization: where LLM spend comes from, the levers ranked by payoff, the five number...
Read guide →
Strategy, architecture & ops
AI Deployment Checklist: 9 Gates Before You Ship
How to deploy AI models to production: a 9-gate pre-launch checklist anchored to the OWASP LLM Top 10 (2025), NIST AI RMF...
Read guide →
Strategy, architecture & ops
AI Evaluation and Evals
LLM evaluation and AI evals, explained: the eval taxonomy, how to build an eval suite, LLM-as-a-judge bias, offline vs pr...
Read guide →
Strategy, architecture & ops
AI Features SaaS Customers Actually Want
What AI powered SaaS customers actually want: the time-savers and answers they value, the automation they distrust, and h...
Read guide →
Agents & RAG
Agentic RAG: When to Use It and How to Build It
Agentic RAG explained: how it differs from naive and advanced RAG, the key patterns like corrective RAG and self-RAG, the...
Read guide →
Agents & RAG
AI Agent for Fintech: Risk, Compliance, Ops, Customer
AI agents in finance: fraud, AML, KYC and servicing use cases, how to build with money-movement guardrails and human appr...
Read guide →
Agents & RAG
AI Agent for Healthcare: Use Cases, Governance & Implementation
AI agents in healthcare: the use cases that pay off first, how to build one HIPAA-safe on FHIR with clinician review, and...
Read guide →
