Our Research
We study the conditions that make AI systems reliable, useful, and honest. Our research is applied: it begins with real deployment problems and ends with published findings that inform what we build.
What We Study
Applied research requires choosing problems carefully. Each area below pairs a gap between what AI systems can do and what they reliably do with a real deployment context where that gap has consequences.
Model Behavior & Reliability
We study how AI models behave under conditions they were not optimized for: distribution shift, ambiguous instructions, adversarial inputs. Reliability is not a property of benchmarks. It is a property of deployment.
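As a rough illustration of what this looks like in practice, the sketch below measures how stable a model's prediction stays under surface-level perturbations that should not change the answer. The perturb and stability functions, the perturbations themselves, and the toy stand-in model are illustrative assumptions, not our tooling; a real probe would call the deployed model on perturbations drawn from its actual input distribution.

from collections import Counter

def perturb(text: str) -> list[str]:
    """Cheap surface-level perturbations that should not change the answer."""
    return [
        text,
        text.upper(),
        text + "  ",                 # trailing whitespace
        "Please note: " + text,      # benign distractor prefix
    ]

def stability(classify, text: str) -> float:
    """Fraction of perturbed inputs whose prediction matches the modal prediction."""
    preds = [classify(t) for t in perturb(text)]
    _, count = Counter(preds).most_common(1)[0]
    return count / len(preds)

# Toy stand-in model; a real probe would call the deployed model instead.
toy = lambda t: "positive" if "good" in t.lower() else "negative"
print(stability(toy, "This product is good"))  # 1.0 -> stable on this input

A stability score well below 1.0 on inputs like these is an early signal that the model's behavior depends on features it should ignore.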
Reasoning Under Constraint
Real-world AI operates under resource limits, incomplete information, and conflicting objectives. We study how reasoning degrades and the conditions under which it can be made more robust.
Alignment in Applied Settings
Alignment is not only a frontier problem. Even today's models exhibit misalignment in narrow, high-stakes applications. We study how alignment failures manifest in practical systems and how they can be detected early.
Evaluation Methodology
Most benchmarks measure capability in isolation, not reliability in context. We research evaluation frameworks that track the properties that matter for real deployment: robustness, calibration, behavior under covariate shift, and failure transparency.
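To make one of these properties concrete, the sketch below computes expected calibration error (ECE), a standard measure of how well a model's stated confidence matches its observed accuracy. It is one illustrative metric, not our evaluation framework; the confidences and labels are made up, and a real evaluation would use outputs collected under deployment-like conditions.

import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """Average |accuracy - confidence| over equal-width confidence bins,
    weighted by the fraction of predictions falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)

# Made-up predictions from an overconfident model.
conf = [0.95, 0.90, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65]
hit  = [1,    1,    0,    1,    0,    1,    0,    1]
print(round(expected_calibration_error(conf, hit), 3))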
Human-AI Collaboration
The most common deployment of AI is alongside humans, not in place of them. We study the interaction patterns, failure modes, and design principles that make human-AI collaboration more reliable and less prone to compounding errors.
AI Systems & Deployment Ecology
A model does not exist in isolation. We study the system-level effects of deploying AI: how models interact with data pipelines, feedback loops, and organizational incentives, and how those interactions produce long-term behavioral drift.
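One lightweight way to watch for that drift is to compare the model's recent output distribution against a frozen baseline. The sketch below uses the population stability index (PSI) with the common 0.2 rule-of-thumb threshold; the metric choice, the threshold, and the category names are illustrative assumptions, not a description of our monitoring stack.

from collections import Counter
from math import log

def distribution(labels, categories):
    """Empirical category frequencies, floored to avoid log(0) on unseen categories."""
    counts = Counter(labels)
    total = len(labels)
    return {c: max(counts[c] / total, 1e-6) for c in categories}

def psi(baseline, recent, categories):
    """Population stability index between two categorical output distributions."""
    p = distribution(baseline, categories)
    q = distribution(recent, categories)
    return sum((q[c] - p[c]) * log(q[c] / p[c]) for c in categories)

categories = ["approve", "review", "reject"]
baseline = ["approve"] * 70 + ["review"] * 20 + ["reject"] * 10   # frozen reference window
recent   = ["approve"] * 50 + ["review"] * 35 + ["reject"] * 15   # latest production window

score = psi(baseline, recent, categories)
print(f"PSI = {score:.3f}", "-> investigate" if score > 0.2 else "-> within tolerance")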
Publications
Reasoning Under Distribution Shift: A Behavioral Study of Instruction-Following Models
We examine how instruction-following degrades as input distributions shift away from training conditions, and propose lightweight behavioral probes for early detection.
On the Limits of Benchmark-Driven AI Evaluation: Toward Deployment-Grounded Metrics
Standard benchmarks conflate capability with reliability. We argue for evaluation frameworks grounded in deployment conditions, and demonstrate the divergence with three case studies.
Narrow Misalignment in High-Stakes Applications: A Taxonomy and Detection Framework
Alignment failure is not exclusive to frontier systems. We identify twelve recurring misalignment patterns in narrow-application deployments and propose a detection checklist for practitioners.
Feedback Loop Dynamics in Deployed AI Systems: How Models Drift and Why It Matters
Deployed models interact with the data pipelines they influence. We study the conditions under which this creates behavioral drift and propose intervention thresholds for production monitoring.
Research without application is incomplete.
Every paper we publish is connected to a product decision, a design constraint, or a deployment question. See how our research becomes what we build.