Betterment closes visibility gaps in chatbot performance

How Betterment used granular QA insights to improve chatbot accuracy, compliance, and the customer experience at scale.

Industry: Fintech
Company Size: ~1000 Employees

CHALLENGE

Limited Insight Into Chatbot Accuracy and Risk

“Generative models can hallucinate and in a regulated industry, that’s a serious risk.”

Natalie Langdale, Chatbot Strategy Manager, Betterment

Betterment’s chatbot vendors provided only high-level, black-box metrics. The team could see outcomes but never the reasons behind them: it had no visibility into missed intents, flawed answers, or whether the bot included required disclosures.

To fill these gaps, the team ran manual, multi-iteration sprints to analyze bot performance, but each cycle was time-consuming and narrow in scope, offering little confidence in outcomes.

Without a transparent, scalable way to evaluate accuracy, detect hallucinations, or understand where models broke down, leaders couldn’t assess chatbot readiness or ensure customer safety in a regulated environment.

SOLUTION

A Transparent Framework to Analyze Chatbot Models

Betterment used Maestro to establish a clear, consistent framework for evaluating how its scripted and generative chatbots performed. The team gained visibility into accuracy gaps, disclosure issues, hallucination risk, and how reliably each model followed required guardrails.

With this insight, Betterment could pinpoint where responses broke down, understand what needed refinement, and improve models far more efficiently.

Maestro provided a unified, scalable way to measure model behavior, giving the team confidence in its decisions about chatbot readiness and improvement.

“Maestro is the most efficient way to monitor the output of a generative bot and make improvements quickly.”

Natalie Langdale, Chatbot Strategy Manager, Betterment

5%+ improvement in BSAT

2%+ improvement in containment rate and automated resolution rate

4x faster bot model evaluation cycle vs. manual multi-sprint testing

“Approval from Legal and Risk hinged on one thing: proving we could systematically review and improve bot outputs. Maestro made that possible.”

Natalie Langdale, Chatbot Strategy Manager, Betterment

About Betterment

Betterment is a digital financial services company offering automated investing, retirement planning, and cash-management products. Operating across regulated financial lines requires highly accurate compliance oversight.

IMPACT

Confidence and Clarity in Chatbot Performance

Regulatory confidence and risk mitigation

Betterment can now evaluate generative output transparently, catching hallucinations, verifying disclosures, and mitigating risk before customers are exposed. The result is a 5%+ lift in BSAT.

Faster, data-backed decisions

A slow, multi-sprint workflow was replaced with a single, structured evaluation cycle, making model-readiness decisions 4× faster and reducing operational drag.

Clear visibility into model performance gaps

Betterment can now identify where models fall short, from accuracy gaps to disclosure misses, driving a 2%+ improvement in containment and automated resolution rates.

Strategic direction for future AI investment

With objective clarity on model fit, Betterment confidently pivoted from Ada’s generative system to a hybrid vendor aligned with its product complexity and compliance needs.