With AWS re:Invent coming to a close, cloud computing is front and center for many in the technology space. I have been focused on the Cloud for the last decade of my career, and specifically on the ways in which the Cloud enables companies to get value out of their data.
The Cloud offers an inexpensive way to store petabytes of data, along with scalable computing resources that make it possible to analyze those volumes of data. This data can be leveraged to truly understand what is happening in any organization, and its value is only growing as new technologies and algorithms are developed to analyze it. A Cloud Data Lake is an effective way to organize the volume and diversity of data that many organizations are dealing with, driving innovation through Machine Learning and other techniques for advanced data analysis.
Many of the hottest startups and growth stage companies in Silicon Valley today are built around data and analytics, and the vast majority are building in the Cloud. On top of that, the primary cloud vendors (AWS, Azure, Google, and even Alibaba) have all invested heavily in data and analytics offerings.
As a result, organizations have a wide variety of powerful tools and platforms readily available to analyze their data. Migrating to the cloud opens up a host of possibilities, but it does require a shift in the way organizations of all sizes think about the challenges they face in managing and analyzing their data. Sometimes, though, the best solution is one that is tried and tested.
Alphabet Soup: OLAP, OLTP, ROLAP, and SSAS
Online Analytical Processing (OLAP) is a term coined to differentiate data technologies focused on analyzing data from those focused on collecting data (OLTP or Online Transaction Processing). Over time, OLAP came to specifically represent a set of data technologies focused on making multidimensional data available in an easily consumable format with extremely low latency.
OLAP Cubes are the most popular implementation of OLAP technology, employed by powerful solutions such as Essbase (now owned by Oracle), Microsoft SQL Server Analysis Services (SSAS), and Cognos. OLAP Cubes make it simple for non-technical users to slice and dice the data they need to understand what is happening with their business.
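To make "slice and dice" concrete, here is a minimal Python sketch. It is not tied to any of the products named above: the sample sales rows, the dimension names (region, product, quarter), and the helper functions slice_cube, dice_cube, and roll_up are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, revenue).
SALES = [
    ("EMEA", "widget", "Q1", 100),
    ("EMEA", "gadget", "Q1", 150),
    ("EMEA", "widget", "Q2", 120),
    ("APAC", "widget", "Q1", 80),
    ("APAC", "gadget", "Q2", 200),
]
DIMENSIONS = ("region", "product", "quarter")


def slice_cube(rows, dimension, value):
    """Slice: fix one dimension to a single value."""
    idx = DIMENSIONS.index(dimension)
    return [row for row in rows if row[idx] == value]


def dice_cube(rows, **allowed):
    """Dice: keep rows whose dimensions fall within the allowed value sets."""
    def keep(row):
        return all(
            row[DIMENSIONS.index(dim)] in values
            for dim, values in allowed.items()
        )
    return [row for row in rows if keep(row)]


def roll_up(rows, dimension):
    """Roll up: total revenue grouped by one dimension."""
    idx = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for row in rows:
        totals[row[idx]] += row[-1]
    return dict(totals)
```

A business question such as "what was Q1 revenue by product?" then becomes roll_up(slice_cube(SALES, "quarter", "Q1"), "product"). A real cube engine would answer this from precomputed aggregates rather than by filtering raw rows, but the operations the user sees are the same.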
ROLAP (Relational OLAP) was developed as an alternative to Cubes, providing additional flexibility at the cost of added complexity and reduced query performance. The key difference between ROLAP and an OLAP Cube comes down to how queries are handled.
OLAP Cubes run in their own compute environment, separate from the data warehouse or Data Lake. This can be a significant advantage over ROLAP, which relies on the compute provided by those systems and creates resource contention when a large number of users are interacting with the data.
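The difference in query handling can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product is implemented: the fact rows, the rolap_query, build_cube, and cube_query names, and the dictionary-based cube are all hypothetical. The ROLAP-style function scans the source rows on every query, while the cube-style path pays the aggregation cost once and then answers by lookup.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, revenue).
FACTS = [
    ("EMEA", "widget", 100),
    ("EMEA", "gadget", 150),
    ("APAC", "widget", 80),
    ("APAC", "gadget", 200),
]
DIMENSIONS = ("region", "product")


def rolap_query(rows, group_by):
    """ROLAP-style: aggregate by scanning the source rows at query time."""
    idxs = [DIMENSIONS.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idxs)] += row[-1]
    return dict(totals)


def build_cube(rows):
    """Cube-style: precompute totals for every combination of dimensions."""
    cube = {}
    for size in range(len(DIMENSIONS) + 1):
        for group_by in combinations(DIMENSIONS, size):
            cube[group_by] = rolap_query(rows, group_by)
    return cube


CUBE = build_cube(FACTS)  # built once, ahead of time


def cube_query(cube, group_by):
    """Answer from precomputed aggregates without touching the source rows."""
    # Normalize to DIMENSIONS order so the key matches what build_cube stored.
    key = tuple(d for d in DIMENSIONS if d in group_by)
    return cube[key]
```

Both paths return the same totals, but cube_query never reads FACTS, so many concurrent users do not compete for the warehouse's compute. The trade-off is that the number of precomputed combinations grows quickly as dimensions are added, which is the cube management problem legacy systems ran into.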
As data volumes began to explode at the start of the decade, and as the shift to cloud storage accelerated in subsequent years, legacy OLAP solutions struggled to keep up. Cube management went from a difficult problem to a nearly intractable one, and often these systems became unusable. Some companies created complex ecosystems to feed their OLAP solution, but many simply abandoned OLAP altogether.
A New Way Forward
One company that struggled with this situation was eBay. Their Data Lake provided a single location for all their data, but access for business users was cumbersome and slow. Fortunately, they had a brilliant technical team that created a new way to deliver OLAP Cube functionality on top of their Data Lake.
Their solution evolved into the Kylin project, which became a top-level Apache project in 2015. One year later, the team that built Kylin formed Kyligence to deliver an enterprise version of this adaptive OLAP engine for Data Lakes.
Kylin included a new structure for OLAP cubes that made them much more scalable than legacy solutions, enabling sub-second query performance against billions of rows of data. Kyligence added intelligent cube building powered by machine learning to optimize the cubes based on the queries users run within the system.
Kyligence made it possible for organizations to get the query performance that they had come to expect from OLAP and also scale well beyond what was previously possible. Even better, the solution required very little effort to configure or maintain and was able to handle hundreds of concurrent queries. With Kyligence, there was finally a way for those who had been forced to abandon their legacy OLAP solutions due to scalability concerns to replicate that functionality at a greater scale with reduced maintenance.
Shifting to the Cloud
While organizations with on-premises Data Lakes began to discover the benefits of Kyligence, companies across every industry were accelerating their migration to the Cloud. The separation of storage and compute created a perfect environment for the Data Lake. On top of that, the widespread adoption of Spark and other cloud-friendly compute frameworks made it much easier to process all of this data.
The team at Kyligence saw this development and created Kyligence Cloud to allow organizations to quickly and easily deploy in the Cloud. As an added bonus, Kyligence supports all major Cloud vendors to give customers the option to deploy on the Cloud wherever their data is today.
If your organization has a Cloud Data Lake and is struggling with query performance for business users, you need to take a look at Kyligence Cloud. In less than a day, and with very little setup, you can achieve sub-second query performance across datasets of any size.
Once you’ve connected your preferred business intelligence (BI) tools, you’ll see firsthand why OLAP that’s built for Data Lakes can’t be beat when it comes to accelerating your analytics.