Measurement Maturity Map: Most Teams Report Like Level 3 & Operate at Level 1


Most teams sit at Level 1. Their reports look like Level 3. The 4-level ladder (Ad Hoc, Operational, Analytical, Leader) isn't about adding tools. It's about how many methods you let argue with each other. Level 3 starts the moment you run MTA and MMM in parallel and trust the one that disagrees with your favourite channel.

That gap is the source of a lot of bad budget decisions.

Australia's digital ad market hit $18.4 billion in 2025, up 11.5% year on year (IAB Australia/PwC, 2026). Globally, only 39% of buy-side users employ attribution, MMM, and incrementality testing together (IAB State of Data 2026). The majority rely on a single method, usually the numbers each ad platform reports about itself.

A peer-reviewed IEEE paper from Dropbox's data science team and Avinash Kaushik's Modern Analytics Maturity Model (MAMM) both landed in March 2026 with the same conclusion: marketing measurement sits on a maturity ladder, and most companies are far lower than they think. Kaushik's MAMM frames analytics organisations across 6 levels and 10 dimensions. The map below zooms in on one question: can you trust the numbers you're using to allocate budget?

The Measurement Maturity Map

Where does your marketing measurement actually sit?

LEVEL 1 (MOST COMPANIES)

Ad Hoc

"Meta says Meta is great"

Each platform grades its own homework. Attributed conversions exceed actual conversions by 1.5–3x.

  • GA4 + platform dashboards
  • Last-click attribution
  • Client-side tracking (30–40% data loss)
  • No cross-channel view
  • "Last year + 10%" forecasting

Spend floor: $0
Question answered: What does the platform say happened?

LEVEL 2

Operational

"One version of the truth"

A neutral attribution layer collects touchpoint data independently. Server-side tracking captures 30–40% more than client-side.

  • Independent multi-touch attribution
  • Server-side data capture
  • Multiple models compared side by side
  • Cross-channel deduplication
  • Marginal CPA tracked, not just average

Spend floor: ~$60K+/yr
Question answered: What does our own data say happened?

LEVEL 3

Analytical

"Cross-validating"

You triangulate. MTA says one thing, MMM says another — the gap tells you where to investigate. Geo-holdouts start.

  • MTA + MMM blended and compared
  • Geo-holdout experiments
  • Scenario-based budget modelling
  • Offline channels included (TV, OOH)
  • Privacy-resilient (MMM works without cookies)

Spend floor: ~$500K+/yr
Question answered: Do different methods agree?

LEVEL 4

Leader

"Proving and predicting"

Continuous incrementality testing. Attribution is calibrated by causal proof. Budget reallocates based on marginal returns.

  • Continuous incrementality testing
  • Incrementality-calibrated MTA
  • Causal forecasting
  • Clean rooms for user-level data
  • Closed-loop test → learn → reallocate

Spend floor: ~$2M+/yr
Question answered: Can we prove what works and predict what to do?

Each level adds a tool. None replaces the previous one. You still do Level 1 reporting at Level 4 — you just don't trust it alone. When two methods disagree, that gap is the most useful signal you have.

Sources: Dropbox/IEEE Access (2026) · eBay/Econometrica (2015) · Gordon et al. (2019) · IAB State of Data (2026)

Each level is a tool. None of them is a winner.

The principle that runs through every level: climbing the ladder doesn't mean discarding what's below it. You still do Level 1 platform reporting at Level 4. You just don't trust any single number alone.

What changes as you climb is how you reconcile disagreement.

Every measurement method carries a known systematic bias. Platform self-reports overstate by 1.5-3x because each platform claims credit for the same conversion. Click-based attribution overstates by 2-10x compared to causal incrementality testing. MMM is privacy-resilient but lossy on short-term granularity. No method is reliable on its own.

A serious team runs two or more methods in parallel and treats the gap between them as a diagnostic signal. When MTA and MMM agree on a channel, confidence rises. When they disagree, you know which channel warrants a closer look. We unpack the mechanics in MTA, MMM & Lift Studies: The Triangulation Approach.

What we see across mbuzz customers: every Level 3 team has a moment where MTA and MMM disagreed by 30%+ on one channel, and that disagreement was where the budget moved. The discipline isn't running both methods. It's resisting the urge to declare one of them "right" when they fight.

Triangulation is what separates Level 3 from Level 2. It is the single capability with the highest return on budget-decision quality, and it is reachable by any team running ~$500K+/year in media.

Level 1: Ad Hoc

"Meta says Meta is great."

You check conversion numbers inside each ad platform's dashboard. Google reports 500 conversions. Meta reports 400. TikTok reports 300. Added up: 1,200 attributed conversions. Closer to 600 actual.
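The arithmetic above can be sketched in a few lines of Python. The platform counts and the 600 figure come from the example; the variable names are illustrative, and `actual_conversions` is assumed to come from your own system of record (orders, CRM, billing) rather than any ad platform.

```python
# Conversions each platform claims for itself (figures from the example above).
platform_reported = {"Google": 500, "Meta": 400, "TikTok": 300}

# Conversions recorded in your own system of record (assumed source of truth).
actual_conversions = 600

# Each platform counts independently, so the same sale can be claimed more than once.
attributed_total = sum(platform_reported.values())
inflation_ratio = attributed_total / actual_conversions

print(f"Attributed: {attributed_total}, actual: {actual_conversions}")
print(f"Over-attribution: {inflation_ratio:.1f}x")
```

With these numbers the ratio is 2.0x, inside the 1.5-3x range quoted for platform self-reports.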

This isn't speculation. Gordon et al. (2019), in 15 large-scale Facebook field experiments covering 500 million user-experiment observations and 1.6 billion impressions, found that observational measurement methods produce systematically different results from true randomised experiments, even when controlling for thousands of behavioural variables. The gap between platform self-reports and incremental reality isn't accidental. It's built into the methodology.

Kaushik puts it bluntly: "No one shows up and buys immediately." Last-click, first-click, or any single-platform attribution window is a simplification that flatters the platform running it.

Most companies sit here. If the only attribution data your team reviews comes from inside the platforms you're buying media on, this is Level 1. Dashboard sophistication doesn't change that.

Level 2: Operational

A neutral measurement layer sits between you and the platforms. It collects touchpoint data independently (clicks, sessions, form submissions) and applies attribution models without any platform marking its own homework. (What is multi-touch attribution?)

The stronger implementations use server-side tracking, which captures 30-40% more data than client-side JavaScript. With roughly 35% of Australians using ad blockers (IAB Australia), Safari's Intelligent Tracking Prevention capping JavaScript-set cookies at 7 days, and iOS App Tracking Transparency reducing mobile signal, client-side tracking has become unreliable in ways most marketers haven't fully accounted for.

The limitation: multi-touch attribution still measures correlation, not causation. The Dropbox team's research showed click-based attribution can overstate channel performance by 2-10x compared to causal testing. Better than Level 1, but still carries a systematic upward bias.

Who's here: companies with a dedicated marketing ops or analytics function, typically spending $60K+/year on digital media. Entry-level plans for independent MTA tools now run roughly $29-300/month, where five years ago this layer cost five figures and required a custom build.

Find your level in 3 minutes

The free Measurement Maturity Assessment scores you across six dimensions: reporting, attribution, experimentation, forecasting, channels, infrastructure. 10 questions. No login.

Take the Assessment

Level 3: Analytical

You don't trust any single method. You triangulate.

Touchpoint-level MTA tells you which touchpoints preceded conversion. Macro-level MMM tells you whether spending more on Channel X moved total outcomes. MMM uses aggregate spend and outcome data to estimate channel contribution without relying on individual user tracking, making it privacy-resilient in ways MTA is not.

When MTA and MMM agree on a channel, your confidence rises. When they disagree, you know which channel warrants investigation. The disagreement itself is the data point.
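One way to make that comparison routine is a small check that flags channels where the two methods diverge. This is a minimal sketch, not a prescribed method: the function name, the channel shares, and the 30% threshold are all assumptions chosen for illustration, and the gap is measured relative to the larger of the two estimates.

```python
def flag_disagreements(mta_share, mmm_share, threshold=0.30):
    """Return channels where MTA and MMM differ by more than `threshold`,
    measured relative to the larger of the two estimates."""
    flagged = {}
    # Only compare channels that both methods report on.
    for channel in mta_share.keys() & mmm_share.keys():
        mta, mmm = mta_share[channel], mmm_share[channel]
        larger = max(mta, mmm)
        if larger == 0:
            continue  # neither method credits this channel
        gap = abs(mta - mmm) / larger
        if gap > threshold:
            flagged[channel] = round(gap, 2)
    return flagged

# Hypothetical share of conversions each method credits to each channel.
mta_share = {"paid_social": 0.40, "paid_search": 0.30, "email": 0.20, "tv": 0.10}
mmm_share = {"paid_social": 0.20, "paid_search": 0.32, "email": 0.18, "tv": 0.30}

print(flag_disagreements(mta_share, mmm_share))
```

With these made-up shares, paid social and TV are flagged while paid search and email are not. The output says where to investigate, not which method is right.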

Google's Modern Measurement Playbook calls this the "measurement tripod": attribution for granular optimisation, MMM for strategic allocation, incrementality for causal validation. The tools exist. Google's open-source Meridian and Meta's open-source Robyn have made MMM accessible to teams without a dedicated data science function.

This level adds a real experimentation program. Geo-holdouts begin. Scenario-based budget modelling ("what if we cut SEM 20%?") replaces gut-feel reallocation. Offline channels (TV, OOH, events) get folded into models alongside digital.

Who's here: companies spending roughly $500K+/year on media with at least one analyst or data scientist. Requires both touchpoint data infrastructure (for MTA) and 2+ years of historical spend/outcome data (for MMM).

Level 4: Leader

Continuous causal experiments rather than periodic ones. Attribution at this level is calibrated by incrementality results: MTA outputs adjusted to match what holdout tests prove is real. Budget reallocates based on marginal returns the team can defend in front of a CFO.

Dropbox ran month-long channel blackouts across US geographic regions and used three independent statistical methods (Difference-in-Differences, Bayesian Structural Time Series, and Augmented Synthetic Control) to measure causal impact. They reallocated approximately $25 million in annual ad spend, improved marketing efficiency by 81%, and lifted incrementality-adjusted LTV:CAC by 53% (Chivukula, Jin & Zhan, 2026, IEEE Access).
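To show the simplest of those three methods, here is a minimal Difference-in-Differences sketch. It is not Dropbox's implementation: the weekly figures are invented, the function name is hypothetical, and the estimate is only valid if blackout and control regions would have trended in parallel without the intervention.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-Differences on average outcomes per period.

    Subtracting the control group's change removes shifts (seasonality,
    market trends) that hit both groups equally.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical weekly sign-ups. "Blackout" regions had the channel switched off
# in the post period; control regions kept it running throughout.
blackout_pre = [1000, 1020, 980, 1000]
blackout_post = [930, 950, 940, 940]
control_pre = [800, 810, 790, 800]
control_post = [820, 830, 810, 820]

effect = diff_in_diff(blackout_pre, blackout_post, control_pre, control_post)
print(f"Estimated weekly change caused by the blackout: {effect:+.0f}")
```

Here the blackout regions fell by 60 sign-ups a week while control regions rose by 20, so the estimated effect of switching the channel off is -80 a week. A real analysis would add uncertainty intervals and check the pre-period trends before trusting the number.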

eBay reached the same conclusion a decade earlier. When they paused branded paid search, 99.5% of the traffic arrived through organic search instead (Blake, Nosko & Tadelis, 2015, Econometrica). They were paying for clicks they already owned.

Kaushik, who advises incrementality platform Measured, describes this level as moving from "smart" (MTA) to "super smart." His distinction: "Attribution might show 100 conversions for $500. Incrementality testing reveals only 10 of those were incremental." Same data, very different budget implications.
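Kaushik's numbers work out as follows. The spend and conversion counts are from his quote; the `calibrate` helper is a hypothetical illustration of incrementality-calibrated MTA, assuming a single flat multiplier per channel, which is simpler than what a production setup would likely use.

```python
spend = 500.0
attributed_conversions = 100   # what attribution reports
incremental_conversions = 10   # what the incrementality test shows

cost_per_attributed = spend / attributed_conversions
cost_per_incremental = spend / incremental_conversions

# Share of attributed conversions that were actually caused by the ads.
incrementality_factor = incremental_conversions / attributed_conversions

def calibrate(mta_conversions, factor):
    """Scale MTA-attributed conversions by a factor measured in a holdout test."""
    return mta_conversions * factor

print(f"Cost per attributed conversion:  ${cost_per_attributed:.2f}")
print(f"Cost per incremental conversion: ${cost_per_incremental:.2f}")
print(f"Incrementality factor: {incrementality_factor:.0%}")
```

The channel looks like $5 per conversion under attribution and $50 per conversion under incrementality, a tenfold difference from the same spend.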

Continuous measurement (always-on incrementality testing in privacy-safe clean rooms) is where Dropbox says its next phase is heading. The infrastructure is emerging: AWS Clean Rooms, Google Ads Data Hub, and the clean room market consolidation of 2024-25 (WPP acquired InfoSum, LiveRamp acquired Habu, Amazon made its clean room free for Sponsored Ads advertisers). Roughly 66% of organisations report using clean rooms in some capacity (Skai, 2025), but most use them for audience matching, not continuous measurement.

The access barrier is steep. This requires a dedicated data science function, millions in annual ad spend (per-geo spending sufficient for statistical significance and enough geographic markets for valid treatment/control groups), and C-suite tolerance for turning off revenue-generating channels for a month or more. Haus's 640 Meta experiments showed platform-reported and incremental metrics diverge significantly (Haus, 2025). The platform itself charges $132-170K annually before any internal cost.

Practical floor: roughly $2M+ annual digital media spend.

When MTA and MMM disagree: the paid social pattern

The most common Level 3 setup we see: MTA shows paid social driving 40% of attributed conversions. MMM shows paid social with a much smaller marginal contribution. The diagnosis isn't "one method is wrong." The diagnosis is: paid social is taking credit for conversions that would have happened anyway. Click attribution loves remarketing. MMM doesn't.

The Level 1 number says paid social is your best channel. The triangulated reading says paid social is over-attributed because much of its measured volume is zero-incrementality retargeting of people already in the funnel.

The difference between Level 1 and Level 3 isn't better tools. It's whether you have a second method to argue with the first one.

This is the pattern Holly saw first-hand. Running weekly client optimisation sessions at Forebrite and watching forecasts hit or miss, she found that the decision that moved the most budget was always on the channel where two methods disagreed and the team picked the lower number. Every time. The reflex to trust the higher number is the most expensive habit in marketing measurement.

Where to start

Pick a channel you're confident about. Run its numbers through a second model. If first-touch and last-touch tell wildly different stories, that gap is diagnostic information you weren't using before.
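As a rough sketch of that first comparison, the snippet below credits the same converting journeys under first-touch and last-touch rules and prints the gap per channel. The journeys are invented for illustration; in practice they would come from your own touchpoint data.

```python
from collections import Counter

# Hypothetical converting journeys, ordered from first to last touchpoint.
journeys = [
    ["paid_social", "email", "paid_search"],
    ["paid_social", "paid_search"],
    ["organic", "paid_search"],
    ["paid_social", "organic", "email"],
    ["paid_search"],
]

# First-touch credits the channel that opened the journey,
# last-touch credits the channel that closed it.
first_touch = Counter(journey[0] for journey in journeys)
last_touch = Counter(journey[-1] for journey in journeys)

for channel in sorted(set(first_touch) | set(last_touch)):
    ft, lt = first_touch[channel], last_touch[channel]
    print(f"{channel:12} first-touch={ft} last-touch={lt} gap={ft - lt:+d}")
```

In this toy data paid social opens three journeys and closes none, while paid search closes four and opens one. Neither model is correct. The size of the gap is the signal worth investigating.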

You don't need MMM or incrementality testing to start. You need a second method to argue with the first one. The full mechanics live in our triangulation guide.

Kaushik describes two KPIs every CMO should be able to answer for the CFO: "marketing-driven incremental sales" and "cost per incremental sale." Most can't answer either with confidence. The maturity map tells you why, and what it would take.

Sources

  1. IAB Australia/PwC (2026). Internet Advertising Revenue Report: Full Year 2025.
  2. IAB & BWG Global (2026). State of Data 2026: The AI-Powered Measurement Transformation.
  3. Chivukula, S., Jin, Y. & Zhan, J. (2026). From Attribution to Causality in Digital Advertising. IEEE Access. DOI: 10.1109/ACCESS.2026.3670337
  4. Gordon, B. R., Zettelmeyer, F., Bhargava, N. & Chapsky, D. (2019). A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook. Marketing Science, 38(2), 193-225.
  5. Blake, T., Nosko, C. & Tadelis, S. (2015). Consumer Heterogeneity and Paid Search Effectiveness: A Large-Scale Field Experiment. Econometrica, 83(1), 155-174.
  6. Brodersen, K. H., Gallusser, F., Koehler, J., Remy, N. & Scott, S. L. (2015). Inferring Causal Impact Using Bayesian Structural Time-Series Models. Annals of Applied Statistics, 9(1), 247-274.
  7. Runge, J., Skokan, I., Zhou, G. & Pauwels, K. (2024). Packaging Up Media Mix Modeling: An Introduction to Robyn's Open-Source Approach. arXiv:2403.14674.
  8. Haus (2025). The Meta Report: Lessons from 640 Haus Incrementality Experiments.
  9. Kaushik, A. Marketing Analytics: Attribution Is Not Incrementality.
  10. Google. Modern Measurement Playbook.
  11. Skai (2025). The 2025 State of Data Clean Rooms in Retail Media.
Holly Henderson

Co-Founder, mbuzz

Holly Henderson is Co-Founder of mbuzz. With 10+ years in marketing including roles at Westpac, Avon, and Forebrite, she's obsessed with making measurement actually useful.

