[Remote] Senior AI Agent & Evaluations Engineer

Work from home · Full-time role · Hiring

Note: This is a remote role open to candidates in the USA. Vacatia is building the future of vacation ownership, with a focus on transforming the industry through AI. The company is seeking a Senior AI Agent & Evaluations Engineer to design and improve AI agents that directly affect customer experiences and operational efficiency, while owning the intelligence layer behind these systems.

Responsibilities

  • Design, refine, and optimize prompts, tool definitions, routing logic, and decision-making behavior across Vacatia's AI agent ecosystem
  • Build and maintain evaluation frameworks, golden datasets, grading systems, and regression testing pipelines that measure agent quality and reliability
  • Develop guardrails and safe-failure mechanisms that ensure agents operate responsibly in customer-facing and financially sensitive workflows
  • Monitor production performance, investigate failures, identify edge cases, and continuously improve agent outcomes through data-driven iteration
  • Partner with business stakeholders to translate policies, operational requirements, and domain expertise into measurable agent behavior
  • Collaborate with engineering teams to define context requirements, tool contracts, and integration specifications that support agent success
  • Create scalable frameworks and reusable patterns for deploying AI agents across new business workflows and use cases
  • Establish best practices for prompt engineering, evaluation methodologies, observability, and agent operations

Skills

  • Proven experience shipping and owning production AI agents or LLM-powered systems beyond proof-of-concept environments
  • Deep expertise in prompt engineering, including system prompts, tool usage, context management, output constraints, and agent behavior design
  • Hands-on experience building evaluation frameworks using golden datasets, scoring rubrics, LLM-as-judge methodologies, and regression testing
  • Strong familiarity with modern AI development tools such as Claude Code, Codex, or similar coding agents
  • Experience with agent observability and evaluation platforms such as LangSmith, Langfuse, Arize, Galileo, or comparable solutions
  • Ability to distinguish prompt issues from data, tooling, model, or evaluation failures and systematically improve agent performance
  • Strong written and verbal communication skills with the ability to work effectively across engineering and business teams
  • Demonstrated ownership mindset with a passion for building reliable, measurable, and continuously improving AI systems
  • Experience building agents that process communication-based workflows including emails, support tickets, chat interactions, or transcripts
  • Experience with multiple agent frameworks and a practical understanding of their tradeoffs
  • Familiarity with the evolving LLM landscape and model selection strategies
  • Experience designing and implementing end-to-end evaluation pipelines and agent operations workflows
  • Production experience with online evaluation systems and automated scoring of live traffic
  • Experience integrating AI systems with Salesforce, AWS Connect, or customer engagement platforms
  • Background in customer-facing industries where accuracy, compliance, and communication quality are critical
  • Contributions to open-source projects, technical writing, or public thought leadership in AI, prompt engineering, or agent development

Company Overview

  • Vacatia is the resort marketplace for vacationing families, whose mission is to make family vacations better. It was founded in 2013 and is headquartered in Mill Valley, California, USA, with a workforce of 1001-5000 employees. Its website is https://vacatia.com.

Company H1B Sponsorship

  • Vacatia has a track record of offering H1B sponsorships, with 2 in 2025 and 1 in 2022. Please note that this does not guarantee sponsorship for this specific role.