
Senior AI Engineer
Job Description
Posted on: July 15, 2025
About ScultureAI
ScultureAI is a B2B SaaS startup developing groundbreaking coaching solutions for shaping organisational culture using large language models and other cutting-edge AI technologies. An organisation’s culture is a key driver of employee wellbeing and company success, and we are driven by the mission to improve the lives and performance of employees and companies all over the world. ScultureAI has been named one of Europe’s hottest startups at The Europas and one of the leading UK AI startups by Generative Group. Having raised over £1.5m to date, we are currently onboarding our first major enterprise clients, and this is just the beginning!
Our team operates in a dynamic, supportive atmosphere where everyone's voice is heard and every good idea is valued. We want to build a workplace community where passionate individuals can thrive, grow, and contribute to groundbreaking AI coaching solutions that transform organisational culture and have a positive impact on the world. Join us on this exciting journey and shape the future of coaching and workplace culture.
About The Role
We are looking for a talented Senior Prompt Engineer to help us push the boundaries of what’s possible with LLMs. This role has two core aspects. First, you’ll design, build and optimise complex multi-agent prompt pipelines that directly power our product and customer outcomes. Second, you'll build and scale rigorous evaluation systems, both automated and human-in-the-loop, for these constantly changing pipelines, so that we can continuously assess and improve the quality, performance, reliability, and cost-efficiency of our prompt architectures. You’ll need to be deeply immersed in the latest techniques across prompting, LLM behaviour, multi-agent orchestration and agent design, and be excited to apply that knowledge in a fast-paced, high-impact startup environment.
Requirements
- Architect, design, and implement robust multi-agent pipelines leveraging a diverse range of LLMs
- Systematically decompose complex problems into structured, scalable prompt-driven solutions
- Use advanced prompt engineering techniques to drive desired results
- Build and maintain a library of atomic evaluation prompts to measure coaching output quality across several dimensions
- Develop and automate evaluation systems that can benchmark, identify regressions, and measure the stability of new prompts, models, and model versions
- Use statistical techniques and domain knowledge to define quality thresholds, analyse variance, and surface outliers or black swan failures
- Participate in designing and running human-in-the-loop processes to validate and improve evaluator prompts
- Contribute to internal tooling for prompt observability, output sampling, evaluator scoring, and user feedback
- Optimise the trade-offs between latency, quality and operational cost
- Implement safety and security best practices
- Build and manage modular, version-controlled prompt libraries with support for templating and reuse
- Collaborate with full-stack engineers to design and implement complex automated systems that evaluate the quality and consistency of outputs across pipeline stages and agents
- Collaborate with other technical colleagues to develop tools and systems for prompt observability, such as usage tracking, output variance monitoring, and feedback loop integration
Necessary to have
- 18+ months of hands-on experience in LLM prompt engineering, ideally with multi-prompt or agentic architectures
- Demonstrated experience designing or contributing to LLM evaluation systems, whether for QA, R&D, or production monitoring
- Strong understanding of LLM behaviour, capabilities, and failure modes
- Comfortable working with prompt evaluation tools or libraries (e.g., OpenAI Evals, DeepEval, Promptfoo, TruLens, LangChain eval)
- Familiarity with advanced evaluation metrics and an ability to interpret results
- A Master's degree with distinction in a relevant subject
- Appreciation of coaching, behaviour change and organisational culture principles
- Experience designing few-shot evaluator prompts and LLM-as-a-judge pipelines
Personal Characteristics
- Great communicator who builds strong relationships with colleagues
- Self-starter and fast learner, able to operate in a fast-paced environment
- Creative problem solver with a can-do attitude
- Accountable and reliable, with high attention to detail
- Passionate about our vision to reimagine coaching and corporate culture
Benefits
- Competitive salary and equity options.
- Flexible working hours and a remote-first environment.
- Opportunity to work on groundbreaking AI technology.
- Learning and development budget to support your career growth.
- A supportive, inclusive team culture where your contributions make a real impact.