Over the past 20 years, the data architecture landscape has changed tremendously, from the traditional on-premises BI/DW (Business Intelligence and Data Warehouse) architecture to the distributed big data architecture (Hadoop) that emerged around 2010. Later, with the rise of cloud computing, the landscape evolved toward cloud-native architecture. Today the market mainly promotes an emerging data architecture that differs from the previous two generations. This architecture, which many call the modern data stack, usually revolves around cloud data warehouses (Snowflake, Amazon Redshift, and Google BigQuery) and cloud data lakes (Databricks), or their cousin, the cloud data lakehouse.
In the first two generations, traditional data warehouses and distributed big data architectures such as Hadoop played a critical role in proving that people can extract real value from massive data. Still, their overall technical complexity ultimately limited their adoption to a small group of enterprises.
Because they store large amounts of data cost-effectively, do not require technical experts to maintain, and offer consumption-based (pay-as-you-go) pricing, cloud data warehouses and data lakes (or lakehouses) have become a fundamental need for every company that wants to become a data company.
The modern data stack has opened up an entire ecosystem around itself. New market segments have emerged to fit the new data landscape and the needs of the modern company: reverse ETL, the metrics store, and the data catalog, to name a few. (You can learn more from our blog, 7 Must-Know Data Buzzwords in 2022.)
A metrics store is, in the simplest terms, a middle layer between upstream data warehouses or data sources and downstream business applications. It may be called the Metrics Platform, Headless BI, the Metrics Layer, or the Metrics Store; these are ultimately the same thing.
Unlike traditional BI reporting, a metrics store decouples metric definitions from BI reports and visualizations. The teams that own the metrics define them once in the metrics store, forming a single source of truth, and can then reuse them consistently across BI, automation tools, business workflows, and even advanced analytics.
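To make the "define once, reuse everywhere" idea concrete, here is a minimal sketch in Python. The `Metric` and `MetricsStore` names, the `orders` table, and the `revenue` metric are all hypothetical and exist only for illustration; they do not describe any particular product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A metric defined once, independent of any BI tool (hypothetical schema)."""
    name: str
    table: str
    expression: str  # SQL aggregate expression, e.g. "SUM(amount)"


class MetricsStore:
    """Toy in-memory registry; a real metrics store does far more than this."""

    def __init__(self):
        self._metrics = {}

    def define(self, metric):
        # Refuse duplicate names so each metric has exactly one definition.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def to_sql(self, name, group_by=()):
        """Render the single shared definition as SQL for any consumer."""
        m = self._metrics[name]
        dims = ", ".join(group_by)
        select = f"{dims}, " if dims else ""
        sql = f"SELECT {select}{m.expression} AS {m.name} FROM {m.table}"
        if dims:
            sql += f" GROUP BY {dims}"
        return sql


store = MetricsStore()
store.define(Metric("revenue", "orders", "SUM(amount)"))

# Two different consumers reuse the same definition.
dashboard_sql = store.to_sql("revenue", group_by=("region",))
api_sql = store.to_sql("revenue")
```

Both queries are generated from one definition, so a dashboard and an API endpoint cannot drift apart on what "revenue" means.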
Benn Stancil from Mode, in his blog post “The missing piece of the modern data stack,” has a nice graph that clearly illustrates the state of metrics reporting today. Without a centralized metrics store, metric logic is defined repeatedly in different tools, causing inconsistencies and discrepancies.
Some investors and practitioners have sensed the market opportunity for the metrics store and have written about their perspectives.
In the same article above, Benn Stancil considers the metrics layer (aka metrics store) the missing piece of the modern data stack.
Ankur Goyal and Alana Anderson, in their article “Headless Business Intelligence,” argue that a truly scalable “Headless BI” (aka metrics store) is a massive open opportunity.
In the past, metrics were usually defined in data warehouses or BI applications, but this causes growing pain for enterprises as data volume and complexity increase. The rise of the metrics store is essentially an attempt to solve the following challenges that enterprises encounter:
Inconsistent definitions of key metrics across business units, causing discrepancies in decision-making: different teams get entirely different numbers for very simple business questions. To make matters worse, no one knows exactly which number is correct.
Inability to reuse defined metrics in business applications beyond BI dashboards: for example, to reduce user churn, the product growth team wants timely information about users who have been inactive in the past 30 days so that it can adopt activation strategies, such as giving those users a free renewal. Defining and analyzing metrics only in BI cannot meet such scenarios, which involve feeding metrics to business applications such as CRM systems.
The difficulty for business users of defining metrics with SQL: as Ankur Goyal and Alana Anderson put it in their article “Headless Business Intelligence”:
"Simple tasks like user sessionization, funnel analysis, and data deduplication often require 1,000+ line SQL queries which must be written by expert data engineers or generated programmatically."
The high complexity of data architecture and pipelines, resulting in inefficient data analytics: materializing metrics in the data warehouse layer is a commonly used solution today. The data warehouse supports defining metrics in views and then letting other tools query those views.
Many companies I’ve worked with currently use views to solve last-mile queries. The problem with views is that each one can only be materialized for specific query requirements. When requirements are numerous, the data engineering team needs to prepare a large number of views. As a result, development and maintenance costs are extremely high. What’s worse, the data pipeline becomes complicated and error-prone.
The concept advocated by the metrics store is that “metrics can be defined once and then reused anywhere.” That means the metrics store can be used flexibly across BI visualizations, SaaS integrations, and APIs, opening up many new use cases that were not previously possible with BI reporting.
In current solutions, the tight coupling between the metrics layer and the BI system that consumes it limits the value of metrics in other application scenarios. If the metrics layer is decoupled from BI to create a standalone metrics store that acts as the single source of truth, metrics stay consistent because every downstream system consumes the same unified definitions.
Metrics are designed for the business rather than for the data or engineering function. So the metrics store must enable EVERYONE to become a data analyst, regardless of age, data literacy, or technical skill. To make that happen, the metrics store needs to provide an extremely intuitive user interface that allows non-technical business users to define and analyze their business metrics. We have seen this design pattern in one of our most successful customer cases, which you can read about in the customer story below.
The ideal metrics store no longer serves only canned BI dashboards as in the old days. Instead, it completes the foundation on which operational BI and exploratory data science both live. As operational, data science, and business self-service use cases kick in, users add up, and query volume grows exponentially, the metrics store needs a strong computation engine behind it to deliver the scalability and concurrency that metrics consumption requires.
In December 2019, a top commercial bank successfully rolled out Pandora, a self-service metrics platform, to democratize data across the bank.
Previously, as illustrated, it typically took 12 workdays to deliver a data product embedded with 50 metrics. Like the traditional dashboard delivery process, it has five phases: requirement clarification, data sourcing, pipeline implementation, dashboard creation, and UAT. The most frustrating part of the entire workflow is that IT engineers have to communicate back and forth to align various business units and get each data owner’s approval to access and collect data. In addition, they are inundated with tedious, one-off projects and repetitive work since the outdated BI architecture was not designed for metrics reuse.
Since Pandora went live, the end-to-end delivery time has been reduced to 5 workdays, because 30 of the 50 metrics are already available in the repository and ready to ship, and another 15 can be derived from existing ones by applying simple filters or mathematical transformations. BI engineers therefore only need to create the 5 new metrics instead of implementing all 50. The improved efficiency also comes from applying the concept of Universal Design (designing for everyone) in Pandora.
Pandora has an extremely intuitive user interface that allows non-technical users to drag and drop ready-to-use metrics to assemble dashboards while IT experts engineer the 5 new metrics for them. Creating dashboards is now “delegated” to business end-users; as a result, IT departments are free to drive new value.
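The breakdown above can be checked with a few lines of Python, followed by a rough sketch of what "deriving a metric by filtering or a mathematical transformation" might look like. The counts come from the case study; the `derive_filtered` and `derive_ratio` helpers and the sample rows are hypothetical and are not taken from Pandora.

```python
# Breakdown reported in the case study for a 50-metric data product.
total_metrics = 50
reused = 30    # already available in the repository
derived = 15   # obtained from existing metrics by filters or simple math
new = total_metrics - reused - derived  # metrics that still must be built

share_needing_engineering = new / total_metrics


def derive_filtered(rows, predicate, value_key):
    """Derive a metric by summing `value_key` over rows that pass a filter."""
    return sum(r[value_key] for r in rows if predicate(r))


def derive_ratio(numerator, denominator):
    """Derive a metric as a ratio of two existing metrics."""
    return numerator / denominator if denominator else 0.0


# Hypothetical transaction rows.
rows = [
    {"channel": "mobile", "amount": 120.0},
    {"channel": "branch", "amount": 80.0},
    {"channel": "mobile", "amount": 50.0},
]

total_amount = derive_filtered(rows, lambda r: True, "amount")
mobile_amount = derive_filtered(rows, lambda r: r["channel"] == "mobile", "amount")
mobile_share = derive_ratio(mobile_amount, total_amount)
```

Under these numbers only 5 of the 50 metrics, or 10%, need new engineering work; the rest are reused directly or composed from what already exists.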
Bonus: the 15 derived metrics and 5 new metrics, 20 in total, will be added to the metrics repository as assets, and other business users can reuse them out of the box in the future.
Note: The Pandora metrics platform blog series was originally written by Lori Lu; you are welcome to read her original series, Enterprise Metric Platform in Action.
The metrics store brings significant value to the enterprise. Once a data analysis is complete, a BI report may reach the end of its lifecycle. In contrast, enterprises will stick with a metrics store that integrates tightly with their business workflows and can be reused anywhere, generating more possibilities.
I am personally bullish on the future of the metrics store, as we have witnessed the tremendous value it has brought to our customers. In the example above, our customer chose to build its metrics store in-house and use Kyligence as the metrics computation engine, and we are confident that in the near future there will be more mature “off the shelf” metrics stores that serve the needs of every business.
Ⓒ 2022 Kyligence, Inc. All rights reserved.