How Marketers Can Benefit from a Warehouse-First Approach to Data

Last year may well have been the year of the cloud data warehouse (CDW). Snowflake had the biggest software IPO ever (at the time) and a blistering growth rate to go with it. It became a household name in IT and tech. This year looks set to continue that trend, with Databricks having already raised $1B in funding at a $28B valuation. CDWs have become a huge, growing business.

A cohort of applications built on top of CDWs has risen from this huge, growing business. This trend was included as one of the most impactful ideas of 2020 in Kleiner Perkins’ A 2020 Perspective. Applications such as Observe use Snowflake to process all of their data and as their central data store, driving down the cost of storage to “little more than the cost of Amazon S3.”

In part two of this two-part blog series (find part one here), we’ll outline why the Marketing organization should even care about a warehouse-first architecture and dive deep on the martech benefits. Next week, we’ll recap our findings in a joint webinar.

What’s a “warehouse-first approach”?

[Figure: a scattered data ecosystem with no centralized source of data.]

Basically, the idea is that your messaging strategy and tools should all revolve around your data warehouse — hopefully, where all your customer data will eventually reside, if it doesn’t already. The data warehouse is the star at the center of your strategic galaxy, whose gravity everything else succumbs to.

Not your ESP’s marketing cloud. Not your CDP. The data warehouse isn’t a peripheral tool that helps feed your message-building products. Those products — ESP, CDP, CRM, whatever you might use — should revolve around and plug directly into your data warehouse, allowing you to use the data where it lives rather than copying it and shipping it elsewhere. That’s putting your warehouse first.

[Figure: a data ecosystem built on a warehouse-first approach.]

How does doing this help the marketing team?

Our friends at RudderStack did a great job of identifying the chief data-centric benefits of being warehouse-first. The big follow-up question, then, is how these help the marketing team:

Improved data control

“The data lake that warehouse-first applications build and operate on top of is stored in their customers’ data warehouses,” wrote RudderStack Product Marketer Gavin Johnson. “So, if you use warehouse-first tools, you don’t have to rely on the vendor to protect your sensitive data. It’s in your data warehouse, and you have control of its security and privacy.”

This is key. If your organization is conscious of data security (and it should be), marketers need to understand the implications of shipping copies of your data to your ESP’s marketing cloud or your CDP. Doing so means the data leaves the safety, security, and control that exist behind your company’s firewall and goes out into someone else’s world. And, secure as those vendors may say they are, you can’t control their mistakes.

In many cases, IT shuts the marketing team out from using much of the data it has collected because that data is considered personally identifiable information (PII) and is therefore too sensitive to expose to a potential data breach at a vendor.

But once your data warehouse is at the center of your strategy, and your vendors are plugging directly into it, now you’re able to put that data to use while it’s still safe and secure. The cloud is still used for getting those messages out the door at enterprise scale, but not for storage or campaign building. That allows you to use all your data, and maintain it in whatever schema makes sense for your team.

Increased flexibility with no duplicated data

“Many customers want the flexibility to use the data from data-intensive applications for analysis and activation in other tools,” Johnson wrote. “This flexibility isn’t possible with traditional vendors because most don’t allow direct access to their data lake. So if you want to perform analysis on your data or enrich it and use it for activation in another tool, you have to export it to your data warehouse first. This data duplication is expensive, inefficient, and unnecessary.”

Every time you keep a copy of your data in another system, you’re paying extra money just to keep data in two places. The warehouse-first design assumes that, once your data is housed in your data warehouse, marketers should duplicate and move it as little as possible while performing the tasks needed to use it effectively.

Data-enrichment tools can be very useful when it comes to identity resolution, for instance, and that’s often an important part of the process of creating personalized campaigns. But if you have to copy your data and send it all over the place just to analyze and enrich it, you’re wasting time and energy for no benefit.

Tools that connect directly to your data warehouse enable these data-intensive applications to access the data where it lives, eliminating unnecessary infrastructure and the upkeep of connections that are inefficient even when they work and can always break.
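For readers who want to see what "accessing the data where it lives" might look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular vendor works: an in-memory SQLite database stands in for the cloud data warehouse, and the `customers` table, its columns, and the `lapsed_audience` function are all hypothetical names invented for this example. The point is simply that the audience is selected by a query that runs against the warehouse, so only the rows needed for the send ever leave it.

```python
import sqlite3


def build_warehouse():
    """Create a stand-in 'warehouse' (in-memory SQLite) with sample rows.

    The table name, columns, and rows are hypothetical and for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, email TEXT, "
        "last_purchase_days INTEGER, opted_in INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        [
            (1, "a@example.com", 12, 1),
            (2, "b@example.com", 95, 1),
            (3, "c@example.com", 120, 0),
            (4, "d@example.com", 200, 1),
        ],
    )
    conn.commit()
    return conn


def lapsed_audience(conn, min_days):
    """Select opted-in customers who have not purchased in at least min_days.

    The filtering happens inside the warehouse; no full copy of the
    customers table is exported to another system.
    """
    return conn.execute(
        "SELECT id, email FROM customers "
        "WHERE opted_in = 1 AND last_purchase_days >= ? "
        "ORDER BY id",
        (min_days,),
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    for customer_id, email in lapsed_audience(warehouse, 90):
        print(customer_id, email)
```

In a real deployment the connection would point at your actual warehouse and the query would reflect your own schema; the shape of the idea stays the same.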

Lower costs

“Since warehouse-first applications don’t store their customer’s data, they can’t charge for it,” Johnson wrote. “This has resulted in significantly lower pricing compared to traditional vendors.”

We probably don’t have to sell marketers on the benefits of lower costs, but let’s dive in a little more deeply on exactly why — and how much — your team might be able to save on its budget with a warehouse-first approach.

Data storage is one of the most expensive parts of any Super Sender’s marketing strategy. When you get a bill from your ESP or CDP, that’s likely where a large chunk of that money is going. But the thing is, the data you’re storing there isn’t fresh, live, and original. It’s a copy that quickly becomes outdated. It’s adjusted to fit someone else’s schema. And, most importantly, it’s the exact same data you’re already paying to store in your data warehouse.

Why would you pay to store the same data twice? And, with the massive size of datasets at enterprise organizations, all those costs of data duplication and storage start to seriously add up. RudderStack says that customers report savings of up to 66% compared to Segment.io, and we’ve seen similar results with our customers.

When you think about your ESP as the center of your messaging universe, it’s easy to think, “Well, of course the data needs to go there so I can work with it.” But once you shift your mindset to the warehouse being at the center of everything, hopefully you start to realize how absurd it is that the data would be copied and shipped elsewhere. Put your warehouse first, and marketing will see the benefits add up quickly.

About the Author

Jeff Haws

As MessageGears’ Senior Marketing Manager, Jeff is focused on producing engaging and thoughtful content that resonates with enterprise marketers, helping them to better understand how MessageGears makes their jobs easier. He’s passionate about understanding the way data impacts messaging, and he’s also hopelessly obsessed with baseball.