Jonatan von Martens

AI Safety Engineering

ElevenLabs

Jonatan von Martens is an AI safety engineer at ElevenLabs, working at the intersection of model behavior, reliability, and responsible deployment. His work focuses on identifying and mitigating failure modes in production AI systems, helping ensure advanced generative models are safe, robust, and fit for real-world use.

15 April 2025 17:30 - 18:00

Panel | Evaluating autonomous agents: Closing the gap between tests and real-world behaviour

Evaluating autonomous agents is fundamentally harder than evaluating static models or prompt-based systems. Behavior unfolds over sequences of actions, interacts with tools and environments, and changes under real traffic in ways that are difficult to capture with offline tests alone. In this panel, engineers and system builders compare how they evaluate agent behavior in practice. The discussion will explore where traditional testing breaks down, how teams reason about trajectories rather than single outputs, and what signals matter most once agents are operating in dynamic, real-world environments. Expect candid perspectives on what works, what doesn’t, and where evaluation remains an open problem for agentic systems.