Meet Your AI Copilot fot Data Learn More

What Is OLAP (Online Analytical Processing)?

Author
Tony Zhang
Product Evangelist
Apr. 05, 2023
 

What is OLAP?

 

OLAP stands for Online Analytical Processing. OLAP technology is an approach designed to analyze business data from different points of view.

 

This process combines large and often separate datasets into a structure known as an OLAP Cube. This cube is designed to make analyzing data easier. It allows for quick and efficient examination of data from various perspectives, offering a more effective way to query data and understand the information.

OLAP and OLAP Cubes have been crucial in the field of business intelligence, particularly with big data. They group and pre-calculate data, which is useful. This approach helps avoid long processing times and slow responses in modern BI tools and easily handles a large volume of data, making everything much more efficient.

 

OLAP vs OLTP

In the business world, handling data can be understood in two main ways: OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing).

 

OLTP database is built for handling transactional data.

Think of OLTP as the process you experience at an ATM. When you deposit money, your account balance updates immediately. This is OLTP in action - it's about quick, specific transactions, recording what happened, who was involved, and when it occurred.

It's the backbone of daily business activities, ensuring that every transaction, like sales or customer interactions, is recorded in real-time.

This is crucial for systems that manage day-to-day operations, such as Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), and Office Automation (OA) systems, which must be accurate and up-to-date at all times.

 

On the other side, OLAP technology is an approach designed to analyze business data.

OLAP is like analyzing a classroom's test scores over a semester to understand trends and performance. It involves digging into data and examining it from various angles to make strategic decisions.

This process is vital in data warehouses where complex analysis is performed to provide clear insights. For example, a business might use OLAP to look at sales data from different regions, times, or customer groups to decide on marketing strategies or product development.

 

Since OLTP and OLAP serve different purposes in data processing, the technical systems behind them – databases for OLTP and data warehouses for OLAP – are designed differently to best meet these needs.

OLTP systems are built for speed and accuracy in recording transactions, while OLAP systems are structured to enable complex analysis and decision-making.

 

OLAP Architecture

An OLAP (Online Analytical Processing) structure is designed for fast data analysis across multiple dimensions. Think of an OLAP database as a 'cube', a helpful way to visualize multi-dimensional data.

 

Dimension

The term 'Dimension' in OLAP refers to different viewpoints from which data can be analyzed. Dimensions are high-level categories that often have complex hierarchical relationships. By viewing several key attributes of data as different dimensions, users can compare and understand data more effectively. Therefore, OLAP can be seen as a set of tools for analyzing data in multiple dimensions.

 

OLAP Cube

What is an OLAP Cube?

OLAP Cube is a data structure where pre-calculated and aggregated data is stored to speed up analysis. For example, if you want to summarize sales data by product, time period, and color, an OLAP cube allows you to do this efficiently and quickly gather insights.

 

 For instance, you would like to summarize sales data by product, time period, and color to compare and quickly gather insights on sales results. The cube (in theory) may look something like this: 

 
olap cube
 

OLAP Operation

 
OLAP operations include actions like drill up, drill down, slice, dice, and pivot
 

OLAP operations include actions like drill up, drill down, slice, dice, and pivot.

  • Drilling is changing the level of dimension and changing the granularity of analysis. It includes rolling up and drilling down.
  • Roll-up is summarizing detailed data into a more general form within a dimension or reducing the number of dimensions. On the other hand, 'drill down' is the opposite process. It involves going from general summary data to more detailed data for closer examination or introducing new dimensions for analysis.
  • Slicing and dicing refers to analyzing the distribution of data across different dimensions after selecting specific values in some of them. A 'slice' happens when this analysis is done using only two dimensions.
  • Pivot is to transform the orientation of dimensions, i.e., to rearrange the placement of dimensions in a table (e.g., row swap).
 

OLAP Implementations Approaches

The core concept of OLAP (Online Analytical Processing) is straightforward: it uses technology to enhance data analytics.

However, handling big data isn't without challenges, and there are several methods besides OLAP databases:

 
  • In-Memory: This method stores interim results in the computer's memory, hoping to speed up future queries. It's based on the idea that memory is fast and that a cache system can use previous results for similar future queries.
  • Massively Parallel Processing (MPP) Databases: This approach works on dividing data and processing it in parallel. While effective and used by systems like Hadoop, it faces challenges with data movement and concurrent usage by multiple users.
  • Data Virtualization: A promising idea, but often unsuccessful in practice. The challenge lies in efficiently combining small aggregated data sets from different sources. 
  • Cloud Data Warehouses: These involve moving data from data lakes to RDBMS-based warehouses for analysis. The process can be costly and complex, especially when data is moved back and forth between systems, leading to operational difficulties.

In summary, while OLAP offers a robust way to analyze data, other methods also exist, each with its strengths and challenges, especially as the amount of data increases. 

 

OLAP Use Cases

 
  • Finance: Think of OLAP Cubes as a financial toolkit. They align your financial data with operational, marketing, and sales information. This alignment helps users understand what affects their profits and losses. For instance, OLAP Cubes can store and quickly analyze essential calculations like daily and monthly sales, offering valuable insights for decision-making.
  • Marketing: In marketing, BI tools act like a digital nerve center, tracking campaign metrics. These tools offer real-time tracking of campaigns, evaluate the performance of each marketing effort, and aid in planning ongoing campaigns. The data provides a clear view of overall performance and helps marketing analysts create visual reports to share with stakeholders. OLAP Cube-based BI reports swiftly pull together data from various marketing channels, making it easier to understand the big picture.
  • Sales: For sales teams, BI dashboards and key performance indicators are like high-powered telescopes into complex information. They help in analyzing data such as customer buying patterns, profitability, and long-term value. Sales managers use these dashboards to track revenue goals, evaluate sales staff performance, and monitor the sales pipeline. OLAP Cubes are essential here for their speed in delivering insights and identifying business opportunities.
  • Operations: In operations, time and resource efficiency is key. Operation managers use BI to examine supply chain metrics and discover optimization opportunities. It's about ensuring service levels are met and improving distribution routes. The operations team employs OLAP for automating routine tasks and regularly updating historical data, which helps streamline processes and boost productivity.
 

Types of OLAP Systems

 

OLAP systems come in three main types: ROLAP (Relational OLAP), MOLAP (Multidimensional OLAP), and HOLAP (Hybrid OLAP).

  • ROLAP: This system uses a relational database to store multidimensional data for analysis. It creates certain 'views' or tables in the database as needed for the application. Not every SQL query needs to be stored as a view; only those that are frequently used and computationally intensive are defined as such.
  • MOLAP: In MOLAP, multidimensional data is physically stored as an array, forming a 'cube' structure. The values of dimensions are turned into array subscripts, and summary data are stored in the array cells.
  • HOLAP: HOLAP combines the features of both ROLAP and MOLAP. Since each has its strengths and weaknesses, HOLAP is designed to leverage the benefits of both systems.
 

Is OLAP Still Relevant? Options for OLAP Tools

 

OLAP analytics remain important in modern architecture for a reason. Like SQL, it's based on well-proven theory that stands the test of time and can handle today's more complex requirements.

 

The criticism often heard is not about OLAP itself but about older OLAP technologies, such as Cognos and SSAS. These criticisms have merit. The older systems have issues like rigid manual modeling that needs a lot of upkeep, limited data cube sizes, and a scale-up architecture that's been inadequate for a while.

 

Open-source OLAP tools like Apache Kylin, and its commercial OLAP software counterpart Kyligence, combine proven classic precomputation theory with new big data and AI technologies to create Augmented OLAP. This modern, extreme OLAP engine is still modeled but created and maintained by an intelligent, AI-augmented engine.

Apache Kylin OLAP Architecture

Apache Kylin OLAP Architecture

What's more, it provides a cloud-native architecture with storage and compute separation capabilities to fit for modern enterprise's need. 

 

Where OLAP is Headed?

 

Some data management approaches, like data virtualization, MPP databases, and Cloud Data Warehouses, are like a pay-per-use plan. You're charged based on how much data you query. In contrast, Augmented OLAP, like Apache Kylin or Kyligence, operates more like a flat-rate, unlimited plan. Especially for companies dealing with medium to large data volumes or with more than a dozen users, this flat-rate approach often ends up being more cost-effective.

 

Why is this? It's all about how OLAP handles data. OLAP creates pre-computed results, storing them in multi-dimensional cubes. This means that when multiple analysts run the same data-intensive operation, it doesn't overburden the primary data system. Think of it as cooking a large meal once and then serving it to many people, rather than cooking the same meal repeatedly for each person. In OLAP systems like Apache Kylin or Kyligence, a complex operation is done once, and the results are reused efficiently. This isn’t just marketing talk; it's simple math.

 

With data volumes constantly growing, a pay-per-query approach can become increasingly expensive. This is due to rising usage costs, hidden expenses like the price of inaccurate data or delayed decisions, and the small print in service agreements that might limit performance.

 

That said, it's not always best to stick with older, established technologies in the fast-evolving world of big data. Exploring new solutions is essential for innovation and improvement. However, it's important to remember that time-tested technologies like OLAP, in the right context, can take on a new life and offer significant advantages.

 

OLAP on the Cloud

Despite their popularity, data lakes still fall short of enterprise expectations when it comes to production-level delivery (concurrency, latency, workload management, etc.) when built on a relational database. 

 

The way to tackle these issues of the data lake is to add an analytics query accelerator. OLAP on the Cloud such as Kyligence provides distributed OLAP that provides production-level query latency and concurrency and works directly on your data lake. It features:

 
  • Cloud-Native SQL on Data Lake: This is a cloud-native architecture optimized for cloud object storage platforms such as AWS S3, Azure Data Lake Storage Gen 2, and Google Cloud Storage.
  • Elastic and Distributed OLAP: Elastic distributed pre-computing offers the perks of traditional OLAP while benefiting from distributed computing's scalability.
  • Unified Data Services: Kyligence Smart Pushdown™ enables OLAP Engine to push down queries that are not answered by cube or table indexes to the SQL engines where the source data is hosted. The query results can be obtained through an instant detailed data query to meet the needs of exploratory analysis.
  • Seamless BI Integration: Kyligence provides an SQL interface and seamless BI integration that allows business users to perform self-service interactive analysis on the BI tool of their choice.
 

Azure OLAP

On Azure, users can find several options for OLAP, including option provided by Microsoft such as Azure Analysis Services. User can also find third-party OLAP tools such as Kyligence that support Azure.

Data held in OLTP systems such as Azure SQL Database or Data Lakes such as Azure Data Lake Storage can be copied into the OLAP system Data exploration and visualization tools like Power BI, Excel and provide users with highly interactive and customizable reports.

AWS OLAP

Amazon Web Services (AWS) offers a proprietary cloud data warehouse, Amazon Redshift for analytical purposes. Users may also consider cloud-native Online Analytical Processing (OLAP) solutions on AWS, such as Kyligence, which are tailored for faster data analysis.

 

Kyligence Enterprise is a comprehensive, cloud-based OLAP solution that works with Azure, AWS, and Google Cloud Platforms. It streamlines analytics in the cloud, delivering high-speed and simultaneous OLAP processing for cloud data.

This solution uses cloud-native computing and storage, compatible with Azure Blob Storage and AWS S3. It offers versatility, allowing users to operate it either as a standalone system or alongside cloud-based Hadoop deployments.

With Kyligence's unified semantic layer, you can efficiently manage data and connect with key BI tools and Excel. This feature enables business analysts to easily perform self-service data analysis.

 
OLAP on Data Lake

Next Action

 

If you are interested in Kyligence‘s OLAP on Data lake solutions, please download our solution one-pager to learn more.