Octave Olivetti, Europe/Paris

Luxury beauty CDP on GCP: 18M+ contacts, 32 countries

CDP on GCP for an international luxury beauty house: 18M+ contacts, 32 countries, Adobe Campaign enrichment, advanced segmentation, and multi-country governance.

In large organizations, the challenge isn't finding yet another tool. It's making systems work together. Systems that have accumulated over time, each with its own local logic, legacy usage patterns, and trade-offs that have become invisible. Customer data exists in abundance, but it often remains fragmented across channels, markets, and teams, without a foundation coherent enough to activate quickly and steer effectively.

It's in this context that I help structure a global CDP platform for an international luxury beauty house. The scope covers more than 18 million contacts, 32 countries, over €500M in e-commerce, and marketing use cases tied to more than €1B in global sales.

The goal isn't to rip and replace the existing stack overnight. It's to make customer data actionable, accelerate marketing activation, bring data flows under control, and establish governance that holds at this scale. The guiding principle throughout: keep what works on the business side, and add the data depth and industrial rigor that's missing.

#The challenge: a legacy marketing repository, heterogeneous flows, and no room for approximation

A common challenge in omnichannel luxury organizations is that a single customer can exist multiple times depending on their entry channel: e-commerce, in-store, marketing forms, event-driven collection, or any number of local touchpoints. On top of that come partial histories, consent managed differently across countries, and data quality that varies from one channel and market to the next.

From one country to another, the legal bases for collection, opt-in mechanisms, and retention rules diverge. Forms were designed at different times, with uneven levels of granularity. Harmonizing these practices without breaking the existing setup or slowing down local teams requires deep collaboration with legal and DPO teams, particularly on the multi-country opt-in strategy.

On the quality side, we don't pretend we can fix everything at the source. We also implement controls within the platform itself: volume consistency checks, anomaly detection on key attributes, and alerting when drift occurs. This automated data quality monitoring secures downstream usage without depending on a uniform level of maturity across all markets.
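As an illustration of the volume consistency checks mentioned above, here is a minimal sketch that compares today's row count per market with its recent history. The function name, the z-score threshold, and the sample figures are assumptions made for the example, not the platform's actual controls.

```python
from statistics import mean, stdev

def volume_alerts(history, today, z_threshold=3.0):
    """Return (market, observed_count) pairs whose volume drifts from history.

    history: dict market -> list of recent daily row counts (hypothetical shape)
    today:   dict market -> today's row count
    """
    alerts = []
    for market, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to estimate a spread
        mu, sigma = mean(counts), stdev(counts)
        observed = today.get(market, 0)  # a missing market counts as zero rows
        if sigma == 0:
            drifted = observed != mu  # flat history: any change is suspicious
        else:
            drifted = abs(observed - mu) / sigma > z_threshold
        if drifted:
            alerts.append((market, observed))
    return alerts

# Illustrative figures only.
history = {"FR": [1000, 1020, 980, 1010], "JP": [500, 510, 495, 505]}
today = {"FR": 1005, "JP": 120}
print(volume_alerts(history, today))
```

In a real setup the alert list would presumably feed a notification channel rather than a `print`, and the threshold would be tuned per flow.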

So this isn't just about centralization. We have to work with a single customer repository already in place within Adobe Campaign, a repository that remains the operational backbone of marketing activation. Meanwhile, business expectations have shifted. Teams no longer just want to execute campaigns. They want to segment, enrich, personalize, measure, and do it fast.

We have neither a truly homogeneous global CIAM nor a sufficiently robust data quality framework at the source across the full scope. The platform must therefore absorb some of this complexity, without pretending it doesn't exist.

#The architecture decision: augment the existing repository rather than spin a false disruption narrative

The key trade-off on the project is to acknowledge Adobe Campaign for what it truly is: the legacy customer repository and the central marketing activation engine.

Google Cloud Platform isn't meant to replace that foundation. Its role is to augment it.

GCP is where we structure the processing chain, centralize flows, enrich profiles, produce calculated attributes, prepare analytical data, and feed back into Adobe Campaign whatever needs to be activated on the marketing side. This approach preserves the existing business setup while adding the computing power, analytical depth, and industrialization discipline that Adobe Campaign can't carry alone.

Adobe Campaign stays central for orchestration and activation. Asking it to also carry advanced consolidation logic, massive enrichment, analytical computation, and cross-channel structuring at this scale hits its architectural limits fast. The right call is to let it keep its role on the activation side, while offloading preparation, computation, and enrichment to GCP.

#An industrial foundation to absorb scale

On the technical side, the platform runs on Google Cloud Platform, with BigQuery as the analytical backbone, Airflow for orchestration, Python and SQL for processing, and a group-level framework capable of auto-generating DAGs from configuration files.

At this scale, every need treated as a bespoke pipeline quickly becomes maintenance debt. Standardized DAG generation delivers consistency, faster execution, and the ability to scale. On a platform processing over 10 million events per day across 32 markets, a broken pipeline on Friday evening can block a campaign on Monday morning.
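To make the idea of configuration-driven DAGs concrete, here is a simplified sketch. It does not use the group framework or Airflow itself: it only shows how a declarative config can be validated and turned into an ordered task list with the standard library. The config keys, task names, and schedule are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical config, as it might be loaded from a YAML or JSON file.
PIPELINE_CONFIG = {
    "dag_id": "crm_contacts_fr",
    "schedule": "0 3 * * *",
    "tasks": [
        {"id": "load_raw", "depends_on": []},
        {"id": "build_ods", "depends_on": ["load_raw"]},
        {"id": "build_dwh", "depends_on": ["build_ods"]},
        {"id": "export_datamart", "depends_on": ["build_dwh"]},
    ],
}

def build_dag_spec(config):
    """Validate a pipeline config and return its tasks in execution order."""
    task_ids = {task["id"] for task in config["tasks"]}
    graph = {}
    for task in config["tasks"]:
        unknown = set(task["depends_on"]) - task_ids
        if unknown:
            raise ValueError(
                f"{task['id']} depends on unknown tasks: {sorted(unknown)}"
            )
        graph[task["id"]] = set(task["depends_on"])
    # static_order() raises graphlib.CycleError if the config contains a cycle.
    order = list(TopologicalSorter(graph).static_order())
    return {
        "dag_id": config["dag_id"],
        "schedule": config["schedule"],
        "order": order,
    }

print(build_dag_spec(PIPELINE_CONFIG)["order"])
```

The point of the pattern is that a new flow becomes a config change reviewed against one generator, rather than a new hand-written pipeline.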

The architecture follows a simple, readable logic. The RAW layer absorbs incoming flows without trying to clean them up too early. It guarantees traceability, historization, and the ability to replay processing. The ODS layer normalizes the operational layer: formats, naming conventions, consent fields, identifiers, and data structures that vary across markets. The Data Warehouse layer handles consolidation, analytical preparation, and enrichments. This is where data becomes truly actionable, because it gains stability, readability, and practical value.
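The ODS normalization step can be pictured with a small sketch like the one below. The source field names (`EMAIL`, `optin_email`, `consent_mail`) and the accepted consent values are invented for the example. Real market feeds would need their own mapping, agreed with legal and DPO teams.

```python
# Hypothetical set of raw values that count as an explicit opt-in.
TRUTHY = {"y", "yes", "true", "1", "oui", "opt-in"}

def normalize_contact(raw, market):
    """Map one raw market record to a common ODS shape (illustrative only)."""
    email = (raw.get("email") or raw.get("EMAIL") or "").strip().lower()
    consent_raw = raw.get("optin_email", raw.get("consent_mail", ""))
    return {
        "market": market.upper(),
        "email": email or None,
        # Anything not explicitly recognized as an opt-in is treated as False.
        "email_opt_in": str(consent_raw).strip().lower() in TRUTHY,
    }

print(normalize_contact({"EMAIL": " Anna@Example.com ", "consent_mail": "Oui"}, "fr"))
```

Defaulting unknown consent values to `False` is a deliberately conservative choice in this sketch: an unrecognized value never becomes a marketing permission.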

Finally, the datamarts expose use-case-oriented views, particularly for BI, marketing, and certain analytical use cases. This prevents business tools from plugging directly into overly complex technical layers and limits the proliferation of ad hoc exports, always charming until the day no one knows where the numbers come from. As tables, calculated attributes, and segments multiply, discoverability becomes a central concern. Implementing a data catalogue with descriptions, owners, lineage, and certification levels allows marketing users to find and understand available data without systematically going through the data team.
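A catalogue entry with the four properties listed above (description, owner, lineage, certification level) could be modelled roughly as follows. This is a hypothetical in-memory sketch, not the catalogue tool used on the project. The table names, owners, and certification labels are made up.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    description: str
    owner: str
    certification: str  # hypothetical levels: "certified" or "draft"
    upstream: list = field(default_factory=list)  # lineage: source tables

def search(catalog, keyword, certified_only=True):
    """Return the names of entries whose description mentions the keyword."""
    keyword = keyword.lower()
    return [
        entry.name
        for entry in catalog
        if keyword in entry.description.lower()
        and (not certified_only or entry.certification == "certified")
    ]

catalog = [
    CatalogEntry("dm_marketing_active_customers", "Active customers per market",
                 "crm-data-team", "certified", ["dwh_crm_customers"]),
    CatalogEntry("dm_marketing_active_test", "Experimental active customers view",
                 "data-science", "draft"),
]
print(search(catalog, "active customers"))
```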

#Collected data, calculated data, activable data

GCP primarily enables a clear separation between source data and derived data. BigQuery provides the computing power to produce enrichments at scale without burdening the operational repository. Airflow ensures reproducibility and traceability of processing. In practice, this creates a distinction between what we collect and what we compute.

A date of birth is collected data. Age is calculated data. A purchase history is source data. A VIC segmentation and a time-to-repurchase indicator are derived objects. Consent is collected and regulated information. A score or a behavioral segmentation results from analytical processing.

Collected data
– Email · date of birth
– Consents · transactions
Derived data
– Age · RFM · VIC segments
– Time to repurchase
Predicted data
– Churn score · purchase propensity
– Predictive LTV

This distinction changes a lot. On the governance side, it prevents mixing raw attributes, enrichments, scores, and activable segments. On the business side, it makes clear what comes from collection, what is calculated, and what is ready to be activated.
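A small sketch of that separation: derived attributes are computed from collected data and returned as a separate object, so the collected record is never overwritten. The field names and sample values are assumptions for the example. In the platform itself this kind of logic would more likely live in SQL on BigQuery.

```python
from datetime import date

def age_on(birth_date, today):
    """Age in full years on a given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday not reached yet this year
    return years

def derive_attributes(profile, transactions, today):
    """Compute derived attributes without modifying the collected record."""
    dates = [t["date"] for t in transactions]
    return {
        "age": age_on(profile["birth_date"], today),
        "recency_days": (today - max(dates)).days if dates else None,
        "frequency": len(transactions),
        "monetary": sum(t["amount"] for t in transactions),
    }

# Illustrative collected data.
profile = {"email": "anna@example.com", "birth_date": date(1990, 6, 15)}
transactions = [
    {"date": date(2024, 1, 10), "amount": 120.0},
    {"date": date(2024, 3, 1), "amount": 80.0},
]
print(derive_attributes(profile, transactions, date(2024, 3, 31)))
```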

Most importantly, it enables a clean enrichment model for the existing repository: Adobe Campaign keeps its operational role, while GCP produces the calculated attributes, enrichments, and new segmentations that are then fed back into Adobe Campaign for use in campaigns.
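One plausible shape for that feedback loop is a flat file of calculated attributes, keyed on the customer identifier. The sketch below assumes a CSV exchange and invents the column names. It says nothing about how Adobe Campaign is actually configured on the project.

```python
import csv
import io

# Hypothetical export contract: only these columns leave the platform.
EXPORT_COLUMNS = ["customer_id", "vic_segment", "churn_score"]

def to_export_csv(rows):
    """Serialize calculated attributes, dropping rows without a customer id."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=EXPORT_COLUMNS,
        extrasaction="ignore",  # silently drop columns outside the contract
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        if row.get("customer_id"):
            writer.writerow(row)
    return buffer.getvalue()

rows = [
    {"customer_id": "C1", "vic_segment": "gold", "churn_score": 0.12, "debug": "x"},
    {"customer_id": "", "vic_segment": "silver", "churn_score": 0.40},
]
print(to_export_csv(rows))
```

Fixing the column list in one place keeps the export contract explicit: internal working columns cannot leak into the activation engine by accident.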

#From source data to marketing activation

Collected data
– Profile · email · date of birth
– Transactions, consents, events
Calculated data
– Identity resolution · RFM · VIC
– Time to Purchase · LTV scoring
Activable data
– 100+ segments · personalization
– Triggers · multichannel campaigns
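Identity resolution, listed among the calculated data above, can be illustrated with a deliberately simple rule: two records belong to the same customer if they share an email or a phone number. The sketch uses a union-find structure. The matching rule, record ids, and sample values are assumptions, and production matching is usually more careful than exact-value equality.

```python
def resolve_identities(records):
    """Group record ids that share at least one identifier (email or phone)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # deterministic root choice

    seen = {}  # (kind, normalized value) -> first record id carrying it
    for r in records:
        for kind in ("email", "phone"):
            value = r.get(kind)
            if not value:
                continue
            key = (kind, value.strip().lower())
            if key in seen:
                union(r["id"], seen[key])
            else:
                seen[key] = r["id"]

    groups = {}
    for r in records:
        groups.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(groups.values())

records = [
    {"id": "web-1", "email": "a@example.com"},
    {"id": "store-7", "email": "A@example.com", "phone": "+33100000001"},
    {"id": "event-3", "phone": "+33100000001"},
    {"id": "web-2", "email": "b@example.com"},
]
print(resolve_identities(records))
```

Here the web, store, and event records collapse into one customer because the store record bridges the email and the phone number.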

#Advanced segmentation, data science, and activation

The platform also paves the way for more advanced segmentation and data science use cases, particularly through Dataiku.

The goal isn't to produce models that feed committee slide decks. It's to create activable objects. The platform enables the calculation of new segmentations, the production of derived attributes, the generation of propensity scores and indicators, and the injection of these results back into Adobe Campaign for targeting, personalization, and orchestration.

This is the context in which we work on enrichments such as VIC segmentations, time-to-purchase logic, time-to-send, and time-to-repurchase. Computation sophistication matters, but only if the output actually reaches the marketing engine. An enriched data point isn't worth much if it sits in a corner of the platform gathering dust. It gains value when it flows back into Adobe Campaign in a directly usable form.
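As one example of such an enrichment, a time-to-repurchase indicator can be approximated by the median number of days between consecutive purchases. This is a sketch of one possible definition, not the formula used on the project. The choice of the median is an assumption.

```python
from datetime import date
from statistics import median

def time_to_repurchase(purchase_dates):
    """Median gap in days between consecutive purchases, or None if undefined."""
    ordered = sorted(purchase_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None

# Illustrative purchase history: gaps of 30 and 60 days.
purchases = [date(2024, 3, 31), date(2024, 1, 1), date(2024, 1, 31)]
print(time_to_repurchase(purchases))
```

A customer with a single purchase has no gap to measure, so the indicator stays empty rather than defaulting to a misleading zero.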

#BI, steering, and overall consistency

The platform isn't designed solely for marketing. The datamarts exposed in GCP also feed Power BI, with a more stable approach to reporting and steering.

In practice, marketing teams, market directors, and the executive committee can view dashboards built on the same definitions used in campaigns and segmentations. An "active customer" in Power BI corresponds to the same scope as the "active customer" targeted in Adobe Campaign. The datamarts cover several dimensions: performance by market, cohort analysis, opt-in rate tracking by channel and country, and customer lifecycle management.
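The principle of a shared definition can be sketched as a single predicate that both the reporting count and the campaign target list go through. The 365-day window and the field names are hypothetical. The actual definition of an "active customer" is a business decision not given here.

```python
from datetime import date, timedelta

ACTIVE_WINDOW_DAYS = 365  # hypothetical business rule

def is_active(customer, as_of):
    """Single source of truth for 'active customer' in this sketch."""
    last = customer.get("last_purchase")
    return last is not None and (as_of - last) <= timedelta(days=ACTIVE_WINDOW_DAYS)

def bi_active_count(customers, as_of):
    """Figure a dashboard would display."""
    return sum(is_active(c, as_of) for c in customers)

def campaign_targets(customers, as_of):
    """Audience a campaign would use: active customers who opted in."""
    return [
        c["id"] for c in customers
        if is_active(c, as_of) and c.get("email_opt_in")
    ]

customers = [
    {"id": "C1", "last_purchase": date(2024, 2, 1), "email_opt_in": True},
    {"id": "C2", "last_purchase": date(2022, 5, 1), "email_opt_in": True},
    {"id": "C3", "last_purchase": date(2024, 1, 15), "email_opt_in": False},
]
as_of = date(2024, 3, 31)
print(bi_active_count(customers, as_of), campaign_targets(customers, as_of))
```

The campaign audience can be narrower than the BI figure because of consent, but it can never include a customer the dashboard would not count as active.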

In many organizations, BI, marketing automation, and analytical use cases evolve in parallel, with different datasets and sometimes diverging definitions. Steering meetings then turn into debates about the numbers rather than decisions about actions. One of the project's key contributions is precisely to connect these components so that the figures in BI, in campaigns, and in analytics tell the same story.

#Governance and compliance: making the platform sustainable

The project's value doesn't rest solely on the stack. It also depends on the framework built around it.

A significant part of my role involves structuring the platform's governance: documentation, naming conventions, construction rules, reusable delivery patterns, and a more industrial approach to evolving data flows. Before this, many processes were handled on an ad hoc basis, sometimes manually, with practices varying across teams and needs. This way of working can survive for a while. It doesn't hold up across 32 countries.
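Naming conventions only hold at scale if they are checked automatically. The sketch below validates table names against an invented `<layer>_<domain>_<entity>` pattern. The layer prefixes and the pattern itself are assumptions for illustration, not the project's actual convention.

```python
import re

# Hypothetical convention: <layer>_<domain>_<entity>, all lowercase.
NAME_PATTERN = re.compile(r"(raw|ods|dwh|dm)_[a-z]+_[a-z0-9_]+")

def check_table_names(names):
    """Return the names that do not follow the convention."""
    return [name for name in names if not NAME_PATTERN.fullmatch(name)]

names = [
    "ods_crm_contacts",
    "dwh_sales_orders_daily",
    "Contacts_FR_final_v2",
    "dm_marketing",
]
print(check_table_names(names))
```

Run as part of a delivery pipeline, a check like this turns a documentation rule into something a pull request can fail on.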

Compliance follows the same logic. I work with DPO, legal, and business teams to evolve collection forms, prepare staff, and clarify data usage rules according to local contexts, as part of a privacy program designed as an operational discipline rather than a one-off exercise.

In this type of project, well-designed compliance isn't a hindrance. It's primarily a way to avoid industrializing chaos. And, if you know the right levers and local specificities, compliance can even become an accelerator.

#Results

The project established a robust foundation for:

– 18M+ contacts, activated across 32 countries
– 10M+ events/day, ingested and processed daily
– 50+ marketing users, creating 100+ segments/month
– < 24h time-to-market, down from 3–7 days

The most visible change is in time-to-market. Previously, launching a new segmented campaign went through the data team: extraction, preparation, qualification, then manual transfer to Adobe Campaign. That took 3 to 7 days. Now, segments are directly available in the activation engine and business teams create them themselves.

But beyond the KPIs, the real change lies elsewhere. The platform is no longer a stack of ad hoc flows and one-off processes. It's a documented and governed system that marketing, BI, and analytics rely on daily. In short, it no longer just exists. It delivers.

If you're facing a similar challenge, such as structuring a customer data platform, industrializing marketing flows, or implementing governance that holds at scale, feel free to get in touch.