When your business has a mountain of data to deal with, it can be hard to manage and structure every source to gather valuable insights and drive decision-making.
With data coming from different places, the key is to collect your business data and load it into a service that's easy for any user in the business to access.
The solution quite often lies within a data warehouse. Let’s look at why.
Here’s a short definition of a data warehouse:
A data warehouse is a unified data structure that aggregates data from multiple sources into a single database location for business intelligence and decision-making, based on the needs of an organization.
Organizations have been using data warehouses for their data storage since the 1990s. Once unwieldy projects, they’ve evolved into more agile models that provide multiple benefits. Let’s take a look at them in turn.
Businesses that need fast and accurate insights into core processes such as inventories and KPIs can benefit from a data warehouse. They’re particularly useful in marketing research, financial management and sales.
That said, data warehouses aren’t for everyone. Let’s look at why.
The specificity required for the analytics within a data warehouse is both a blessing and a curse. Here are a couple of factors that might swing your decision in another direction.
While a data warehouse can provide fast and accurate answers to specific questions, it takes a lot of time, effort, and capital to enable this level of sophistication. As such, another model of enterprise data architecture might be appropriate for you.
Data warehouse expert Ralph Kimball stated that a data warehouse is nothing more than a union of all data marts. Bill Inmon, ‘the father of data warehousing’, responded with this:
“You can catch all the minnows in the ocean and stack them together — they still do not make a whale”. - Bill Inmon
Inmon means that you should feed data into the data warehouse right after the ETL process. Kimball’s method, however, involves loading the data into data marts to make up the structure of the warehouse.
But before we get into the ins and outs of Kimball vs Inmon, here’s a short definition of a data mart:
A data mart is an outlet in a data warehouse for a specific department or subject area. Data marts often structure data in a different way to a data warehouse, making it a better fit for department‑specific questions.
And yet, there’s more to each method than their approach to data marts. Let’s examine each in turn.
The Kimball method, or the bottom-up method as it’s also known, focuses on designing data marts first. These become the foundation of the data warehouse. Labels such as people, places and products allow you to connect to a particular department or specific function within your business.
What are the benefits?
What are the disadvantages?
The Kimball method is perfect for businesses that don’t need to connect with data from other departments in order to perform well. A marketing firm is a good example of this, as is the customer relationship management department within a financial institution.
Also known as the top-down method, the Inmon method involves building the data warehouse first and then categorizing it into different data marts. It’s considered to be the data-driven approach, providing more accuracy of data for the entire enterprise.
What are the benefits?
What are the disadvantages?
While the Inmon method provides more accurate and far-ranging insights, it’s expensive and requires advanced skills even after set-up. That’s why it’s perfect for mature organizations that can afford the investment and will benefit from analytics that span the whole enterprise.
An insurance firm would be a good example of this, as it needs access to a range of data to form an accurate picture of the policyholder. Similarly, manufacturing companies need to access data across all processes to ensure everything runs smoothly.
Whether you choose the Kimball method or the Inmon method depends on your business. You’ll also need to determine what your budget is, what talent you can access, and how much time you have to set up your data warehouse.
A data warehouse can involve:
At CloverDX, we usually follow the ETL process, although we can (and have) utilized ELT.
With ETL, you extract data from your sources (marketing systems, CRM, ERP, etc.), transform it into a format and load it into the data warehouse. This allows any user to pull the necessary data and respond to queries quickly.
The source data is restructured to match the dimensions of a data warehouse. This requires some upfront planning, such as thinking about the type of data that needs to be stored. However, you can reduce your manual effort by building a repeatable, visual approach to the data workflow.
How do you do this? By automating the data transfer process through a schedule, and/or specific event triggers (e.g. a new file arriving at a location) or APIs.
You can build and automate reliable data pipelines, removing any bottlenecks. And the automation doesn’t stop there. In addition to the ETL process, you can automate the monitoring and error handling around it to make it robust and fully capable of running on autopilot.
Data Warehousing with CloverDXHere’s a successful data warehouse case study from one of our customers.
EE, the UK’s largest mobile network with more than 28 million, has a lot of data in multiple systems. This makes consolidating data difficult for small departments.
The ‘digital insights’ team within the company needed to find a new and easier way to access data. They decided to adopt an agile approach to growing a data warehouse as opposed to previously failed, large-scale and slow-paced traditional projects.
They wanted to demonstrate value quickly to secure further budget and generate buy-in from other departments, which would encourage access to more data and make their projects more attractive.
To achieve this quickly with limited budget and resources, they needed to onboard new data with full ETL automation.
A combination of CloverDX and Amazon Redshift fit the bill perfectly. The Amazon Redshift data warehouse gave them an easy place to store the data without too much upfront commitment, while CloverDX provided the level of productivity, flexibility and enterprise features they needed to ensure the project was a success.
If your business handles a lot of data from multiple sources, consider a data warehouse. After all, it aggregates the data in a centralized location, providing you with accurate insights that inform better business decisions.
That said, there’s a lot to think about before you take the plunge. Firstly, will the benefits offset the cost? And if a data warehouse is the right choice for you, what type do you need?
If you’d like to learn more about data warehouses and other enterprise data architecture options, read through our guide.