AI Prompt Engineer
Design the prompts, eval suites and guardrails that decide whether an LLM feature is good enough to ship.
About the role
Prompting is engineering here, not trial and error. You will own the prompt and evaluation layer of real AI products: agents, retrieval pipelines and in-product copilots that run against live traffic. The job is to make model behaviour measurable and repeatable, so the team can change a model or a prompt and know, with numbers, whether the system got better or worse.
What you will do
- Design, version and test prompts for production agents, RAG systems and copilots, and document why each change moved the metric.
- Build evaluation suites that catch regressions before users do, with graded test sets and explicit pass/fail thresholds.
- Tune retrieval and context assembly so the model gets the right information, not just more of it.
- Write guardrails and fallbacks for the cases that break a demo: refusals, hallucinations, jailbreaks and long-tail inputs.
- Work with the engineers who serve the model on latency, cost per call and token budgets: the numbers Production-First AI locks in first.
- Turn what you learn into reusable prompt patterns and eval templates the wider AI practice can pick up.
What we are looking for
- 2 to 5 years working with LLMs in a hands-on role, with shipped or near-shipped features you can talk through.
- Strong Python, and comfort reading and writing code around an API rather than only using a playground.
- A real understanding of evaluation: you can explain how you would prove a prompt change is an improvement, not a vibe.
- Familiarity with RAG, embeddings and vector search, and the failure modes each one brings.
- Precise written reasoning, since most of this job is making model behaviour legible to other engineers.
Nice to have
- Experience with eval tooling such as Ragas or Braintrust, and tracing tools for agent runs.
- Hands-on time with LangGraph or a similar orchestration framework.
- A background in linguistics, applied ML or data science.
What we offer
- Real production problems against live traffic, not prompt demos that never deploy.
- A method built around evaluation, so your work is measured by numbers rather than opinion.
- Senior AI engineers to build with and a bench that has carried AI to production since the shift in 2026.
- A hybrid seat at our Noida delivery center.
Apply: AI Prompt Engineer
Tell us a little about yourself and attach your resume. Our team reviews every application and replies if there is a fit.
- Every application is read by the hiring lead for that pod.
- We reply within about one business week.
- Your resume is stored privately and never shared outside Resourcifi.