This article was published in 2026 and draws on interviews and reporting from 2019, included here for context and accuracy.
Tension: Marketing teams need multiple predictive models to answer strategic questions, but building effective model stacks requires addressing foundational data issues first.
Noise: Platform vendors promise turnkey solutions while consultants debate model architectures, obscuring the critical prerequisite that determines whether any model delivers value.
Direct Message: The question isn’t which models to build or how to stack them, but whether your data infrastructure can support models worth building at all.
To learn more about our editorial approach, explore The Direct Message methodology.
How many customers can you upsell in your next campaign? How many cancellations should you expect? What lifetime value can you forecast for each new customer acquisition?
In 2019, these questions drove marketers toward predictive modeling with names that telegraphed their functions: “Predictive Response+LTV for Targeted Customer Acquisition” or “Predictive Churn to pre-empt cancellations.”
The concept sounded elegant. The execution revealed complications that persist today.
Starting from strategic clarity rather than analytical complexity
Jeff Tomlin, CMO at Vendasta, identified the core challenge in 2019: “We’re awash in data, but we are thirsty for knowledge.” Marketers succumb to paralysis by analysis or fail to ask the right questions. The solution starts from the top by understanding the high-level metrics that drive business decisions rather than building models hoping they’ll reveal insights.
This strategic discipline matters more in 2026 than it did seven years ago. Organizations gather data and form hypotheses, then cross-reference insights across stakeholders to establish forecasts they’re trying to hit, explained Jason Katz, founder of Growth Marketing Advisors.
That foundational strategy precedes model development, which addresses specific data gaps after basic business objectives are clear.
A firm might identify twelve different models to capture an opportunity. “But you don’t need to run all 12,” Katz noted. “If you get one or two right, it’s a big win.” This remains accurate today, though what constitutes “getting it right” has evolved.
Companies using predictive analytics across channels report 15-20% improvements in marketing ROI, but achieving those results requires infrastructure that many organizations still lack.
The collaboration between consultants and clients shapes model accuracy. “The clients can help to add business sense and insights that make the model more accurate,” said Yohai Sabag, Chief Data Scientist at Optimove in 2019.
Consultants bring best practices for choosing appropriate models, while clients contribute domain expertise that grounds predictions in operational reality. This partnership dynamic hasn’t changed, but the technical requirements have become more demanding.
Why data quality determines everything downstream
Tomlin’s observation about being “awash in data” understated the challenge.
The first step in building any model is preparing and understanding the data, with both consultants and clients responsible for input quality. Yet even this obvious foundation gets ignored or deprioritized under pressure to deliver insights quickly.
“Almost always, the data is terrible,” Katz observed in 2019. Larger, mature companies maintain some datasets in good order, but smaller organizations operate under a “grow now, test to see if it works, cleanup afterwards” mentality that treats marketing operations and data hygiene as afterthoughts.
Seven years later, this pattern has intensified rather than improved.
Research from Adverity found that CMOs estimate 45% of the data their teams use is incomplete, inaccurate, or outdated.
The challenge goes beyond individual data quality to structural problems: lack of uniformity across systems, inconsistent taxonomy, incorrect nomenclature that makes personalization impossible.
“You can’t unlock the power of your own data,” Katz explained, when there’s no standardization across the datasets feeding your models.
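Katz’s point about standardization is concrete enough to sketch. The snippet below (Python, with a hypothetical alias map and column name, not anyone’s production taxonomy) shows the kind of normalization that has to happen before datasets from different systems can feed a shared model:

```python
import pandas as pd

# Illustrative alias map; a real taxonomy would be maintained centrally.
CHANNEL_MAP = {
    "fb": "facebook", "facebook ads": "facebook",
    "adwords": "google_ads", "google": "google_ads",
    "e-mail": "email", "mail": "email",
}

def standardize_channels(df: pd.DataFrame, col: str = "channel") -> pd.DataFrame:
    """Lowercase, trim, and map known aliases to one canonical label."""
    cleaned = df[col].astype(str).str.strip().str.lower()
    df[col] = cleaned.map(CHANNEL_MAP).fillna(cleaned)
    return df
```

Trivial as it looks, this is the step that makes “facebook ads” in one system and “fb” in another count as the same channel when a model sums spend.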
The consequences compound when AI enters the picture. MarTech research revealed that 76% of respondents reported less than 50% of their CRM data was accurate or complete, while 56% indicated poor quality data hindered AI implementations.
Getting people to understand data, how to use it, and think strategically about it remains a fundamental challenge that limits what models can accomplish regardless of their sophistication.
Building model stacks on reliable foundations
When data quality is addressed, the question of model architecture becomes tractable. Vendasta developed a platform-centric approach focused on sales efficiency across demand generation, product metrics, sales metrics, retention, expansion sales, and scale.
These priorities emerged from surveying hundreds of clients about their greatest growth challenges, then building models that address those specific needs.
The platform enables a two-step view: first examining the macro picture, then drilling into details. The fundamental goal is understanding customer acquisition cost compared to lifetime value, which guides resource allocation across channels and campaigns.
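As an illustration of that macro-level check, the figures below are invented, but the arithmetic is the standard CAC-to-LTV comparison:

```python
# Illustrative figures only: compare customer acquisition cost (CAC)
# to lifetime value (LTV) per channel before allocating budget.
channels = {
    # channel: (total spend, customers acquired, average LTV)
    "paid_search": (50_000, 400, 310.0),
    "social":      (30_000, 150, 275.0),
}

for name, (spend, customers, ltv) in channels.items():
    cac = spend / customers
    print(f"{name}: CAC=${cac:,.0f}, LTV/CAC={ltv / cac:.2f}")
# paid_search: CAC=$125, LTV/CAC=2.48
# social:      CAC=$200, LTV/CAC=1.38
```

The model stack exists to estimate the inputs to this ratio more accurately; the ratio itself is what guides where the next dollar goes.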
This framework remains sound in 2026, though the technical implementation has evolved significantly.
“I’ve never seen a nice stack of models,” Katz added. Large companies typically model by channel or use the same model across channels, with differentiation emerging when figuring out cross-sell and up-sell approaches. “You get a probable lifetime value for each path. Highest LTV wins.”
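A minimal sketch of that “highest LTV wins” logic, with hypothetical offers, acceptance probabilities, and values:

```python
# Score each candidate cross-sell path by probability-weighted LTV,
# then pick the best. All offers and figures are invented.
paths = [
    {"offer": "upgrade_plan",  "p_accept": 0.12, "ltv_if_accepted": 900.0},
    {"offer": "add_seat",      "p_accept": 0.30, "ltv_if_accepted": 250.0},
    {"offer": "annual_prepay", "p_accept": 0.08, "ltv_if_accepted": 1200.0},
]

best = max(paths, key=lambda p: p["p_accept"] * p["ltv_if_accepted"])
print(best["offer"])  # upgrade_plan: 0.12 * 900 = 108 expected, the highest
```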
The number of models often increases when resources are constrained, forcing teams to model optimal approaches before committing limited email-list access or restricted ad budgets.
The architectural principle holds: each model can use conclusions from previous models as inputs, creating interconnected systems without a single prescribed arrangement. “Even if there’s more than one model in the background, there should be a mechanism that aggregates their results into one figure, to provide a single answer,” Sabag explained. Sometimes higher accuracy requires an ensemble of models, but all results must roll up into a unified recommendation.
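Sabag doesn’t prescribe a particular aggregation mechanism; a weighted average is one common choice. A sketch, with invented model names and weights:

```python
# Combine per-model churn probabilities into the single figure
# Sabag describes. Model names and weights are assumptions.
def aggregate_scores(scores: dict, weights: dict) -> float:
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total

scores = {"logistic": 0.42, "gbm": 0.55, "survival": 0.48}
weights = {"logistic": 1.0, "gbm": 2.0, "survival": 1.0}
print(f"{aggregate_scores(scores, weights):.2f}")  # one answer: 0.50
```

Three models run in the background; the campaign manager sees one number to act on.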
“There is no sweet spot here, the tradeoff is accuracy vs simplicity,” Sabag noted. That observation has proven durable, though in 2026 the scales tip differently.
Simpler models built on clean data consistently outperform complex architectures trained on corrupted information, which changed how organizations approach the accuracy-simplicity balance.
The implementation sequence that actually works
The practical path forward starts with infrastructure rather than algorithms.
Organizations that succeed with predictive modeling in 2026 validate data quality before selecting models. They establish governance frameworks, implement validation rules, create automated cleansing processes, and monitor continuously rather than periodically.
This reverses traditional procurement sequences. Instead of buying platforms hoping data quality improves, successful organizations audit their information foundations first.
They identify completeness gaps, consistency failures, accuracy problems, and timeliness issues. Only after establishing baseline reliability do they layer predictive capabilities on top.
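A hedged sketch of such an audit, one rough indicator per dimension just listed; the column names (spend, updated_at) and the 90-day staleness threshold are assumptions, not a standard:

```python
import pandas as pd

def audit(df: pd.DataFrame, ts_col: str = "updated_at") -> dict:
    """Report one rough indicator per data-quality dimension."""
    now = pd.Timestamp.now()
    return {
        # completeness: overall share of non-null cells
        "completeness": float(df.notna().mean().mean()),
        # consistency: fully duplicated rows suggest conflicting feeds
        "duplicate_rows": int(df.duplicated().sum()),
        # accuracy proxy: spend can never legitimately be negative
        "negative_spend": int((df["spend"] < 0).sum()),
        # timeliness: rows untouched for more than 90 days
        "stale_rows": int((now - pd.to_datetime(df[ts_col])
                           > pd.Timedelta(days=90)).sum()),
    }
```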
Organizations that maintain data quality for fewer, more focused models often achieve better results than those managing sprawling analytical architectures built on questionable foundations.
The data preparation phase remains the most time-consuming part of building predictive models, which means every additional model increases the infrastructure burden.
Model selection becomes straightforward once data quality is verified. A simple regression model on clean data outperforms ensemble approaches trained on corrupted inputs. The marketing mix model that helps allocate limited ad dollars works only when the underlying spend and conversion data accurately reflects reality rather than organizational dysfunction.
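The claim is easy to demonstrate on synthetic data. The toy below is not a benchmark; it simply scores the same linear model on clean versus partially corrupted copies of invented spend and conversion figures:

```python
# Toy demonstration on synthetic spend/conversion data, not a benchmark.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
spend = rng.uniform(1_000, 10_000, size=200).reshape(-1, 1)
conversions = 0.05 * spend.ravel() + rng.normal(0, 20, size=200)

corrupted = spend.copy()
corrupted[rng.choice(200, size=40, replace=False)] = 0  # lost tracking

for label, X in [("clean", spend), ("corrupted", corrupted)]:
    r2 = LinearRegression().fit(X, conversions).score(X, conversions)
    print(label, round(r2, 3))
# The identical model degrades sharply once 20% of inputs are zeroed.
```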
The question Tomlin posed about avoiding “paralysis by analysis” takes on new meaning. Paralysis doesn’t come from too much data or too many questions. It comes from building analytical complexity on infrastructural weakness, then wondering why insights don’t translate to outcomes.
Understanding high-level business drivers matters, but only when the data measuring those drivers accurately reflects operational reality.
Managing model stacks as systems
When models are properly architected on reliable data, they function as integrated systems rather than competing analytical approaches.
Sabag’s observation that each model uses conclusions from previous models as inputs requires that every link in the chain meets quality thresholds. Break the chain at any point and the entire stack produces unreliable outputs.
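One link in such a chain can be sketched simply: a churn model’s conclusion becomes an input to a lifetime-value estimate. The formula below is the textbook geometric-series simplification, not a production LTV definition, and the figures are invented:

```python
# One link in a model chain: the upstream churn model's output feeds
# the LTV estimate. Figures and the simplified formula are assumptions.
def expected_ltv(monthly_margin: float, churn_prob: float,
                 discount_rate: float = 0.01) -> float:
    """Geometric-series LTV: margin / (churn + discount)."""
    return monthly_margin / (churn_prob + discount_rate)

churn_score = 0.04  # conclusion from the upstream churn model
print(round(expected_ltv(42.0, churn_score), 2))  # 840.0
```

If bad data had pushed that churn score to 0.08, the same formula would return roughly 467, cutting the LTV estimate nearly in half. That is how one weak link corrupts everything downstream.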
Resource constraints drive model proliferation in predictable ways. Email campaigns with limited list access windows force optimization models. Restricted ad budgets require marketing mix models for channel allocation. Cross-sell opportunities demand probability calculations across customer paths.
Each constraint justifies another analytical layer, but only when the data supporting that layer warrants the complexity.
The coordination challenge intensifies as model counts increase. Companies pursuing global enterprise accounts compromise their efforts through fragmented territory views when data lacks uniformity. Databases become useless through inconsistent taxonomy and nomenclature, which is a complicated way of saying that users enter information in the wrong order under the wrong labels.
These operational failures undermine even sophisticated models.
Platform-centric solutions that integrate data collection, transformation, and modeling can reduce this friction, but only when they enforce quality standards from the beginning rather than attempting to clean data retroactively.
The two-step view that Vendasta emphasizes (macro picture first, then detailed drill-down) works only when both levels of analysis draw from consistent, accurate information.
The tradeoff between accuracy and simplicity that Sabag identified hasn’t disappeared. What changed is recognition that simplicity often delivers better accuracy when it maintains data quality more effectively.
Organizations running twelve models to identify opportunities might achieve better results running two well-maintained models on verified data than running all twelve on information they can’t trust.
Seven years after the original conversations about model stacks, the fundamental questions remain: How many models do you need? How do you manage them? The answers depend less on analytical sophistication than on whether your organization treats data infrastructure with the discipline that predictive modeling demands.
Models work exactly as designed, processing whatever inputs they receive and generating outputs that reflect input quality with mathematical precision.
Start from the top by understanding what drives your business. Ensure the data measuring those drivers meets quality standards. Then build models that address specific strategic needs rather than hoping analytical complexity compensates for foundational weakness.
That sequence, more than any particular model architecture, determines whether predictive analytics delivers the business value that justified the investment.