Big Data OLAP Analytics Solutions: Apache Kylin vs. Vertica

Tyler Wishnoff
Manager, Kyligence Marketing
Mar. 13, 2019

Are you drowning in Big Data? If so, you’re not alone.

In today’s digital world, everything that can be tracked will be tracked. Data-driven technologies like artificial intelligence and machine learning promise major economic benefits to those who can exploit them. This has left businesses and their data science teams struggling to manage the massive datasets they’ve acquired.

In the data collection arms race, everyone has ended up losing. The explosion of proprietary data has placed intense resource requirements on data infrastructure and degraded business intelligence (BI) tool performance.

This creates a conundrum. You need fast insights, but those insights are only useful when they have lots of data to back them up. More data, in turn, makes insight generation slower as your queries struggle to sift through it all.

Ultimately, it doesn’t matter how much data you have: that data is worthless if you can’t process it efficiently.

OLAP Solutions for Modern Big Data Analytics Needs

For organizations looking to improve their big data processing, OLAP engines offer an easy path to success.

OLAP? Isn’t that technology kind of old?

It’s true that OLAP has been around for a while, but it’s just as relevant as ever. In fact, its maturity is an asset. It’s well understood, well developed, and still outperforms newer approaches on the market today. If it can power the analytics of Global Fortune 100 companies, it’ll work for you, too.

When it comes to selecting the right solution, results can vary depending on IT infrastructure, preferred BI tools, and company size. Recently, Stratebi evaluated two popular Big Data OLAP products, one proprietary and one open source: Vertica and Apache Kylin, respectively. Both solutions performed well, but Stratebi’s results highlight the performance divides common with OLAP solutions on massive (petabyte) datasets.

Benchmarking OLAP Solutions: Apache Kylin vs. Vertica

So, what were Stratebi’s final conclusions about Apache Kylin and Vertica? For all the details, we recommend reading the full report, but here are the highlights:

  • When it comes to query latency, Apache Kylin comes out ahead of Vertica. On massive datasets, Kylin not only surpassed Vertica; Vertica’s performance also degraded substantially.
  • Loading data was another story, however. For billions of rows, Vertica took less time to load, but when adding or reloading past data, Kylin came out ahead.
  • For smaller datasets (several billion rows), performance was similar for Kylin and Vertica.

While Apache Kylin proved to be the overall winner in performance, performance isn’t everything. Stratebi suggested that other factors, such as installation time and hardware requirements, should also be considered when conducting a thorough evaluation. With that in mind, Vertica can be a reasonable choice for smaller operations with major resource constraints.

For smaller teams with budget constraints but a bit more time, the open source nature of Kylin can be a differentiator. If you’re looking for an OLAP processing engine that delivers stunning results on enterprise-scale datasets, Kylin leads the way.

But your search doesn’t have to end there. Kylin and Vertica are only two popular options. The early contributors to Kylin have also come together to offer Kyligence. Kyligence is powered by the same core technologies that make Kylin so powerful, but adds a suite of features that make it ideal for enterprise-scale Big Data work.

Features such as cell-level security, multi-tenancy, and BI vendor-proprietary connectors make Kyligence a strong match for the complex IT needs that data-driven enterprises face today. It also replaces the HBase storage that Apache Kylin uses with a rewritten storage layer, which simplifies maintenance and ensures more stable, high query performance.

Supercharge Your Big Data Technology Stack

Companies that are able to capitalize on their data will have a tremendous advantage over the competition. This year, billions of dollars will be spent upgrading data warehouses and software licenses for new BI tools. Unfortunately, many of those investments will be squandered because of poor data management.

It doesn’t need to be this way. Don’t let Big Data turn into ‘Big Disappointment’. A Big Data OLAP engine may be the jumpstart your analytics operations need to extract more valuable insights.

…Before your competition does.

To get all the details of Stratebi’s benchmark research, you can view the full report here.

Also, if you hadn’t heard of Apache Kylin before this post, you can learn more about it here.