Designing a Self-Serve Tool for Detecting Data Anomalies

Amplitude’s users rely on the platform to monitor core metrics and understand their trends. However, the existing methods for evaluating metric fluctuations are often imprecise, leading teams to waste time and resources investigating noise rather than uncovering meaningful insights.

To solve this, we developed an anomaly detection tool that helps customers distinguish significant metric changes from statistical noise. By identifying "what’s unusual in the data," users can focus on actionable insights and make informed decisions with confidence.

completed UI screen for anomalies

Overview

How the Growth Team Pioneered Anomaly Detection

At Amplitude, I was part of the Growth team, a dedicated group focused on rapidly testing and validating concepts before committing them to the product roadmap. By exploring the risks and rewards of potential features, we helped product teams make more informed decisions and prioritize impactful opportunities.

In early March 2020, the Growth team inherited the anomalies and forecasting project from the analytics roadmap. Amid the uncertainty of COVID-19 and shifting product strategies, our headcount was paused, making it critical to validate and build efficiently. Our team developed the initial concept for anomaly detection and forecasting, working iteratively with customers to create a strong foundation for the analytics team to scale in future phases.

The Challenge of Finding Meaning in Noisy Data

How might we help users quickly identify and understand meaningful changes in their data while minimizing noise?

Monitoring core metrics was a noisy, inefficient process. Users relied on manual comparisons and guesswork, using raw numbers and percentages to determine whether changes in their data were meaningful. This approach not only wasted time but also created dependencies on data analysts before actionable insights could surface.

To solve this, we needed a self-serve tool that highlighted statistically significant deviations and provided forecasts, empowering users to uncover insights independently and in real time.

A One-Click Solution for Meaningful Insights

Our solution was designed to provide value to all of Amplitude's user personas while optimizing for our target audience. Anomaly Detection allows users to uncover statistically significant changes in their data with just one click, reducing noise and highlighting meaningful trends.
The feature offers three customizable modes to suit different use cases:
  • Agile Mode: Optimized for quick insights on recent data.
  • Robust Mode: Accounts for seasonality by analyzing longer historical intervals.
  • Custom Mode: Lets users set their own confidence intervals and analysis periods for tailored insights.
By making anomaly detection intuitive and self-serve, we eliminated dependencies on data specialists and improved workflows for teams across Amplitude's customer base.

Driving Adoption Across Amplitude’s Customer Base

The feature was beta-tested with 900 customer accounts, and 17% engaged with in-app guides prompting them to try the tool. Today, one-third (800) of Amplitude's 2,400 paying customers actively use Anomaly Detection in their workflows. This adoption reflects the tool’s success in solving a critical pain point and driving meaningful impact across our customer base.

Objectives and Outcomes

Key Objective

Detecting Anomalies

Anomaly Detection empowers users to identify statistically significant changes in their data across any time series chart. This feature enables immediate recognition of unexpected fluctuations, accelerating data-driven decision-making.
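
Conceptually, this boils down to computing an expected value plus a confidence band and flagging the points that escape it. Below is a minimal Python sketch of that idea; the rolling-mean baseline, the window size, and the z = 1.96 band are illustrative assumptions, not Amplitude's production model.

```python
import numpy as np
import pandas as pd

def flag_anomalies(series: pd.Series, window: int = 28, z: float = 1.96) -> pd.DataFrame:
    """Flag points outside a rolling confidence band.

    Illustrative only: the baseline is a rolling mean and the band is
    +/- z standard deviations (z = 1.96 approximates a 95% confidence
    interval under a normality assumption). Amplitude's production
    model is not described in this case study.
    """
    baseline = series.rolling(window, min_periods=window // 2).mean()
    spread = series.rolling(window, min_periods=window // 2).std()
    lower = baseline - z * spread
    upper = baseline + z * spread
    return pd.DataFrame({
        "value": series,
        "expected": baseline,
        "lower": lower,
        "upper": upper,
        "is_anomaly": (series < lower) | (series > upper),
    })

# Example: a synthetic daily metric with an injected spike on day 90.
days = pd.date_range("2020-03-01", periods=120, freq="D")
values = 1000 + np.random.default_rng(0).normal(0, 20, 120)
dau = pd.Series(values, index=days)
dau.iloc[90] += 300  # the kind of jump the tool should surface
print(flag_anomalies(dau).query("is_anomaly").index)  # dates flagged as anomalous
```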

Key Objective

Forecasting Anomalies

By analyzing historical data, forecasting visualizes expected future trends. This insight helps users predict changes in core metrics, set realistic goals, and make strategic decisions—empowering them to anticipate trends rather than merely react.
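
As a rough illustration of the mechanic, the sketch below fits a seasonal model to historical data and projects the next two weeks. Holt-Winters exponential smoothing and the weekly seasonality are stand-in assumptions; the case study does not disclose the actual forecasting model.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Historical daily metric with weekly seasonality (synthetic stand-in).
days = pd.date_range("2020-01-01", periods=180, freq="D")
history = pd.Series(
    1000
    + 80 * np.sin(2 * np.pi * np.arange(180) / 7)
    + np.random.default_rng(1).normal(0, 15, 180),
    index=days,
)

# Holt-Winters stands in for whatever model runs in production.
model = ExponentialSmoothing(
    history, trend="add", seasonal="add", seasonal_periods=7
).fit()

forecast = model.forecast(14)  # expected values for the next two weeks
print(forecast.round(0))
```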

Goal

Establishing a Scalable Infrastructure

We aimed to build a robust platform that allows the Analytics team to efficiently scale both anomaly detection and forecasting. Completing phases 1-3 maximizes customer value while minimizing resource expenditure, ensuring the feature’s scalability for future enhancements without overextending the team.

Goal

Engagement Through Actionable Insights

Recognizing that sharing insights is a core use case in Amplitude, we hypothesized that enhanced anomaly detection would encourage users to share meaningful discoveries with colleagues. This increased collaboration is expected to boost overall product engagement.

KPI

User Adoption & Engagement

We tracked adoption by measuring the percentage of weekly active users interacting with the anomaly detection feature. Specifically, we monitored “chart compute” events fired while the anomaly feature was enabled, giving us a clear view of user engagement.
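
For illustration, the sketch below computes that adoption rate from a hypothetical event log; the schema and field names are assumptions, not Amplitude's actual event taxonomy.

```python
import pandas as pd

# Hypothetical event log: one row per "chart compute", flagged with
# whether anomaly detection was enabled for that computation.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 4],
    "week": ["2020-W20"] * 6,
    "anomaly_enabled": [True, False, False, True, True, False],
})

weekly_active = events.groupby("week")["user_id"].nunique()
feature_users = (
    events[events["anomaly_enabled"]].groupby("week")["user_id"].nunique()
)
adoption = (feature_users / weekly_active).fillna(0)
print(adoption)  # share of weekly active users who used the feature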

KPI

User Stickiness

Retention was measured by the percentage of users returning to use the feature in their second week. By benchmarking against our “compare to past” feature—which achieved a 24% two-week retention rate—we evaluated the effectiveness of anomaly detection in driving sustained engagement.
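
The same measurement can be sketched in a few lines. The usage log below is hypothetical, but the computation mirrors the week-2 retention definition above.

```python
import pandas as pd

# Hypothetical usage log keyed by weeks since each user's first use of
# the feature (0 = first week, 1 = second week).
usage = pd.DataFrame({
    "user_id":  [1, 1, 2, 3, 3, 4, 5],
    "week_num": [0, 1, 0, 0, 1, 0, 0],
})

cohort = usage.loc[usage["week_num"] == 0, "user_id"].nunique()
returning = usage.loc[usage["week_num"] == 1, "user_id"].nunique()
print(f"week-2 retention: {returning / cohort:.0%}")  # benchmark: 24%
```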

Process Overview: How We Approached the Challenge

01

Deep Diving Into Personas and Problem Areas

We conducted a comprehensive exploration of the problem space, focusing on Amplitude’s core personas:

1) Aspiring Pioneers: Growth marketers with limited data proficiency
2) Pioneers: Product managers with moderate data expertise
3) Data Scientists: Experts with advanced proficiency in interpreting data

By analyzing each persona’s job functions and data proficiency, we uncovered the unique needs of each group. This groundwork helped us identify our target persona—Pioneers—later in the process, ensuring the solution was tailored to those who would benefit most.

02

Listening to Users Through Discovery and Testing

We conducted 30+ customer sessions, blending discovery interviews and usability testing, across 17 organizations of varying sizes and verticals. Our target cohorts were identified by analyzing users who had engaged with the “compare to past” feature in the last 90 days. Additionally, our customer success managers (CSMs) provided contacts to ensure we gathered insights from a broad spectrum of users. These sessions offered critical input to shape our feature development.

03

Identifying Core Themes Through Prototyping

Testing with early prototypes revealed three key feature requests that emerged as critical for delivering value. Users wanted the feature to align with their specific roles and responsibilities, demanding insights tailored to their needs. Transparency was another major theme, as users sought clarity on how the model worked to ensure they could trust its outputs. Lastly, configurability was essential, allowing users to tailor the feature to fit their individual workflows and priorities. By delving into the root causes behind these requests, we ensured our designs addressed the core problems effectively.

04

Focusing on Pioneers as the Target Persona

Through our research, we determined that Pioneers (product managers) were the ideal target persona. While data scientists could already spot outliers due to their expertise, and aspiring pioneers’ roles didn’t focus on data fluctuations, product managers faced significant challenges in detecting anomalies efficiently. Targeting this group allowed us to deliver maximum value by enabling them to identify statistically significant changes with fewer resources.

05

Building Trust Through Usability Testing

Users consistently raised concerns about trusting what they were seeing in the tool. Questions about parameters, clarity between future and past data, and the distinction between partial data and expected values highlighted areas for improvement. Early iterations revealed confusion in these aspects, which made the feature feel less reliable to users. We refined the design iteratively to address these gaps, ensuring the tool was both intuitive and trustworthy, and ultimately instilling confidence in its outputs.

06

Designing for Simplicity, Trust, and Automation

The final design prioritized simplicity and transparency to ensure adoption and usability. We created an intuitive interface that made it easy for users to understand the tool’s functionality and trust the insights it provided. Automation features, such as smart defaults and customizable modes, reduced complexity while enabling users to tailor the tool to their specific needs. This combination of clarity, trust, and adaptability resulted in a solution that empowered users to uncover actionable insights with confidence.

Research Insights

Testing early iterations with customers allowed us to gather valuable insights that informed both design and product decisions. During these sessions, we explored how customers currently find this information, how frequently they use the "compare to past" feature, and their expectations for an anomaly detection tool. Customers consistently emphasized the importance of ease of use, trust in the underlying model driving the computations, and confidence in selecting parameters to configure their output settings effectively. These insights shaped our approach to building a tool that feels intuitive and reliable.

  • Intuitive Understanding: The tool must be easy to grasp at a high level, supported by UX copy, tooltips, and clear affordances that guide the user seamlessly through the experience.
  • Transparency: Users need to trust the outputs by understanding how the computations are made. Providing visibility into the underlying processes ensures confidence in the tool’s accuracy and reliability.
  • Clear Anomaly Detection: The interface should make anomalies stand out effortlessly through the use of distinct colors, easily distinguishable data, and intuitive signifiers that direct attention where it matters most.
  • Guided Configuration: Users should feel confident selecting parameter settings through a “keep it simple” approach, including educational elements and smart defaults that reduce decision fatigue while enhancing usability.

Our Key Insight: Trust Is the Foundation for Adoption

A critical pain point for users was understanding the "model" applied to their charts and how results were calculated. While users already trusted Amplitude to perform analyses, they often felt uncertain about selecting the "right" parameters for their results.

To address this, we proposed leveraging Amplitude’s existing trustworthiness by introducing a set of predefined “modes.” These modes would represent industry-standard parameter configurations tailored to different use cases, removing the guesswork for users. This approach offered several benefits:

  • It provided value across varying levels of data proficiency, making the tool accessible to all personas.
  • It eliminated doubt about parameter selection, allowing users to focus on insights rather than technical details.
  • It supported user growth by creating a gradual learning curve, enabling less experienced users to build confidence while advanced users could still customize their experience.

By simplifying parameter selection and creating intuitive defaults, we bridged the gap between trust, usability, and growth, ensuring that users of all skill levels could gain equal value from the tool.

Screens used in testing

screen example from testing: UI shows forecasting hovered
screen example from testing: UI shows settings opened

The Solution

Drum roll... empowering users with smart defaults and visual clarity.

Throughout our research, one insight became clear: users appreciated smart defaults. These defaults acted as predefined parameters applied to the chart via the associated “modes,” helping users feel confident in the selections being made without the need for extensive configuration. For forecasting, the default starts in an empty state but allows users to layer in complexity when needed, preserving the feature’s discoverability without overwhelming the interface.

The modes—Agile, Robust, and Custom—were designed to cater to different use cases. Agile mode responds quickly to recent trends by using a 95% confidence interval and 120 days of training data prior to the chart’s date range. Robust mode works best for stable metrics, incorporating a full year of additional data to better account for seasonality. Custom mode offers flexibility, allowing users to define their own confidence intervals and training durations to meet specific needs. This approach ensures accessibility for all user levels while enabling advanced customization when required.
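
The sketch below captures those mode definitions as a simple configuration object. The field names are hypothetical, Robust's confidence interval is assumed to match Agile's, and "a full year of additional data" is interpreted as a year on top of the 120-day base window.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnomalyMode:
    """Parameter bundle behind a mode. Field names are illustrative,
    not Amplitude's actual schema."""
    name: str
    confidence_interval: float  # e.g. 0.95 for a 95% CI
    training_days: int          # data window before the chart's date range
    models_seasonality: bool

AGILE = AnomalyMode("Agile", confidence_interval=0.95,
                    training_days=120, models_seasonality=False)

# Robust adds a full year of data to capture seasonality; its
# confidence interval is not stated in the case study, so 0.95 is
# an assumption here.
ROBUST = AnomalyMode("Robust", confidence_interval=0.95,
                     training_days=120 + 365, models_seasonality=True)

def custom_mode(confidence_interval: float, training_days: int,
                models_seasonality: bool = True) -> AnomalyMode:
    """Custom mode: the user supplies their own parameters."""
    return AnomalyMode("Custom", confidence_interval,
                       training_days, models_seasonality)
```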

To address visual noise, we introduced a hover effect for chart segments, allowing users to analyze anomalies across multiple related metrics without cluttering the interface. By default, up to 10 segments can be displayed at once—enough to provide meaningful insights without compromising the clarity of the design. Hovering allowed us to include more contextual details, such as confidence bands and forecasting parameters, while still letting users isolate and investigate specific anomalies. To further reduce confusion, we reserved a single color from the design system exclusively for displaying anomalous data points, minimizing visual noise amidst Amplitude's "blue sea" of features.

For interaction design, we implemented a button toggle to activate the feature rather than a traditional on/off UI toggle. Early explorations of an on/off design pattern felt inconsistent with Amplitude’s established interface and risked creating confusion. Instead, the button’s state signifies activation, automatically applying the smart defaults associated with the selected mode. Color and tag affordances clearly indicate when the feature is active, while users retain the ability to adjust global settings or modify parameters directly through the button or the tags themselves.

Finalized UI

ui shows finalized screen for anomaly landing
ui shows finalized screen for anomaly defaults applied
ui shows finalized screen for anomaly settings opened
ui shows finalized screen for settings tooltips
ui shows finalized screen for forecast added
ui shows finalized screen for forecast hovered

Results

800

Active paying customers

17%

First touch engagement

33%

Paying customers using Anomaly Detection

100%

Of the rocketship built