Post-Hoc Classification Visual

The Reactive Transformation Pipeline

"fb"

Date: 2023-01-15

"facebook"

Date: 2023-04-10

"Facebook.com"

Data is inconsistent and messy at the source.

ENGINE

CASE WHEN source IN ('fb', 'facebook', 'Facebook.com') THEN 'Facebook' END

CASE WHEN MONTH(date) IN (1,2,3) THEN 'Q1' WHEN MONTH(date) IN (4,5,6) THEN 'Q2' END

Rules applied in BI Tool, Warehouse, or Analytics Platform.

Source: Facebook (CLEAN)

Fiscal Quarter: Q1 (DERIVED)

Source: Facebook (CLEAN)

Fiscal Quarter: Q2 (DERIVED)

Data is now structured, consistent, and ready for reporting.
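
The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the names SOURCE_ALIASES, clean_source, and fiscal_quarter are hypothetical, the pairing of each raw source value with a date is assumed from the diagram, and the fiscal year is assumed to start in January so that quarters follow calendar months.

```python
from datetime import date

# Hypothetical rule table mirroring the CASE WHEN rule for source names.
SOURCE_ALIASES = {
    "fb": "Facebook",
    "facebook": "Facebook",
    "Facebook.com": "Facebook",
}


def clean_source(raw):
    """Return the canonical source name, or the raw value if no rule matches."""
    return SOURCE_ALIASES.get(raw, raw)


def fiscal_quarter(d):
    """Derive a quarter label, assuming the fiscal year starts in January."""
    return f"Q{(d.month - 1) // 3 + 1}"


# Sample rows; the value-to-date pairing is an assumption for illustration.
rows = [
    {"source": "fb", "date": date(2023, 1, 15)},
    {"source": "facebook", "date": date(2023, 4, 10)},
]

# Raw rows are left untouched; the clean and derived fields are computed on read.
transformed = [
    {
        "source": clean_source(r["source"]),
        "fiscal_quarter": fiscal_quarter(r["date"]),
    }
    for r in rows
]

for row in transformed:
    print(row)
```

Because the rules live outside the raw data, editing SOURCE_ALIASES and re-running the transformation reclassifies historical rows without touching collection upstream.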

Flexibility: Rules can be created, modified, and applied to historical data.
Non-disruptive: No changes needed to upstream data collection or marketing behavior.
Empowering: Analysts can create custom, business-specific views of the data.

Computationally Intensive: Can slow down reporting queries.
Analyst Burden: Requires specialized skills (SQL, Python, DAX) to build and maintain rules.
Risk: Raw data remains messy. "Garbage in, gospel out" if rules are flawed.
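
One way to reduce the "gospel out" risk is to audit which raw values no rule covers, rather than letting them pass through silently. The sketch below is illustrative only: KNOWN_ALIASES and audit_sources are hypothetical names, and the unmatched sample values ("FB " with a trailing space, "face book") are invented for the example.

```python
from collections import Counter

# Hypothetical rule table: the raw values the classification rule recognizes.
KNOWN_ALIASES = {
    "fb": "Facebook",
    "facebook": "Facebook",
    "Facebook.com": "Facebook",
}


def audit_sources(raw_values):
    """Count raw source values that no rule covers."""
    return Counter(v for v in raw_values if v not in KNOWN_ALIASES)


# Invented sample data; the last three distinct spellings slip past the rule.
sample = ["fb", "facebook", "FB ", "Facebook.com", "face book", "FB "]

unmatched = audit_sources(sample)
for value, count in unmatched.most_common():
    print(repr(value), count)
```

A report like this gives analysts a list of values to add to the rules, or to push back to the team that owns data collection.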

Clean Data Source -> Simple Analysis (Democratized)

Front-loads investment in tools & governance. Data is trustworthy for all users.

Messy Data Source -> Complex Logic (Analyst Team) -> Analysis (Centralized)

Back-loads investment onto the analytics team. Creates potential bottlenecks.

Mature organizations blend both: Proactive governance with Reactive flexibility for exceptions.