Opportunities abound for AI and machine learning endeavors across every industry. By 2021, it’s predicted that business investment in AI will exceed $57 billion*. This has generated a global race amongst organizations to capture, analyze, and act on every byte of data they can get. In response, analytics teams are turning to solutions like Kyligence and MapR to meet the demands of this evolving Big Data technology.
For the uninitiated, Kyligence is an extreme OLAP engine for big data platforms, built by the team behind the open source software Apache Kylin. MapR is a high-performance, highly scalable big data platform. In this article, we look at how Kyligence Enterprise and MapR can enhance each other to provide a strong big data analytics solution.
Kyligence can be deployed onto, or on the edge nodes of, a Hadoop cluster. Users can define and load (pre-calculate) multi-dimensional models using Apache Spark or MapReduce jobs, with the OLAP data structures being stored on Hadoop file systems.
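To make "pre-calculating a multi-dimensional model" concrete, here is a minimal single-machine sketch of the general idea: aggregate a measure for every combination of dimensions ahead of time. It is an illustration only, not Kyligence's implementation, which runs as Spark or MapReduce jobs. The rows, the dimension names, and the `build_cube` function are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales).
ROWS = [
    ("east", "laptop", 2018, 100),
    ("east", "phone", 2018, 50),
    ("west", "laptop", 2018, 70),
    ("west", "laptop", 2019, 30),
]
DIMENSIONS = ("region", "product", "year")


def build_cube(rows, dimensions):
    """Pre-aggregate SUM(sales) for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), size):
            totals = defaultdict(int)
            for row in rows:
                # Group by the values of the chosen dimensions only.
                key = tuple(row[i] for i in dims)
                totals[key] += row[-1]
            names = tuple(dimensions[i] for i in dims)
            cube[names] = dict(totals)
    return cube


cube = build_cube(ROWS, DIMENSIONS)
print(cube[("region",)])  # {('east',): 150, ('west',): 100}
print(cube[()])           # {(): 250}
```

Once the aggregates exist, a query such as "total sales by region" becomes a dictionary lookup instead of a scan over the raw rows, which is the trade-off pre-calculation makes: more build time and storage in exchange for faster queries.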
MapR is an advanced distributed file system and converged data platform that supports Hadoop Distributed File System (HDFS), HBase, a document database, and stream processing (using the Kafka API). MapR provides several features that distinguish it from other Hadoop distributions. Combined with Kyligence Enterprise, MapR users can not only extend the capabilities of the MapR platform, but also enhance their experience.
MapR-FS was developed from the ground up as a distributed read-write file system that is compatible with HDFS interfaces. It is written in C++ and installed on disks directly (see diagram below). MapR-FS does not have to deal with the overhead of the JVM and the Linux file system.
When Kyligence engineers tested MapR-FS against generic Hadoop systems with similar hardware configurations, they got some very interesting results. The design of MapR-FS delivers a clear performance advantage over HDFS. As illustrated in the following diagrams, both cube building and cube querying jobs run significantly faster on the MapR platform.
Today, many companies are choosing to build data lakes to address their data storage needs. Typical data lake setups focus on either storing historical data or large amounts of raw data (such as log data). Querying this data can be time-intensive. Query engines can accelerate the process in certain scenarios, but you’ll usually need to push the data to outside data marts to reduce latency.
With Kyligence Enterprise, aggregated results are pre-calculated and stored in the MapR file system. There is no need to aggregate data in another data warehouse or data mart. This greatly simplifies your data lake architecture and removes traditional data warehouse dependencies that clash with Big Data technologies.
If a user needs answers from the detailed records (not aggregated), they can still send the query to Kyligence. In this case, the query is routed to the data store in MapR-FS or MapR-DB. Now, your data lake can serve both detailed and aggregated queries with superb performance.
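The routing behavior described above can be sketched as a simple decision: answer from a pre-calculated cuboid when one matches, otherwise go to the detailed records. This is a toy model under stated assumptions, not Kyligence's query engine. `CUBE`, `DETAIL_ROWS`, `route`, and `run_query` are hypothetical names, and aggregating on the fly when no cuboid matches is an assumption made to keep the example complete.

```python
# Hypothetical pre-calculated cuboid: SUM(sales) grouped by region.
CUBE = {("region",): {("east",): 150, ("west",): 100}}

# Hypothetical detailed records kept in the underlying data store.
DETAIL_ROWS = [
    {"region": "east", "product": "laptop", "sales": 100},
    {"region": "east", "product": "phone", "sales": 50},
    {"region": "west", "product": "laptop", "sales": 100},
]


def route(group_by):
    """Decide which store should answer the query."""
    if group_by and tuple(group_by) in CUBE:
        return "cube"
    return "detail_store"


def run_query(group_by=()):
    """Return (target, result) for an aggregate or detail query."""
    target = route(group_by)
    if target == "cube":
        # Aggregated answer is a direct lookup in the cuboid.
        return target, CUBE[tuple(group_by)]
    if not group_by:
        # Detail query: return the raw records as they are.
        return target, DETAIL_ROWS
    # Assumption: no matching cuboid, so aggregate the detail rows on the fly.
    totals = {}
    for row in DETAIL_ROWS:
        key = tuple(row[d] for d in group_by)
        totals[key] = totals.get(key, 0) + row["sales"]
    return target, totals


print(run_query(("region",)))   # served from the cube
print(run_query()[0])           # detail_store
```

The caller sends every query to the same entry point and never needs to know which store produced the answer.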
Kyligence runs two types of workloads. The first is cube building, which can take place before a query is served or during a cube update. The second is cube querying, where thousands of users and applications read data from the cubes. For Hadoop installations, Kyligence recommends separate cube building and query clusters to isolate the workloads for the best query performance.
MapR supports the concepts of node topology and volume topology. With these configurations, you can place data and jobs specifically on certain nodes in your cluster. You can also separate two types of workloads on two sets of nodes while changing the configurations on demand (e.g. add more nodes to support cube building at year end). Setup and management of two clusters is no longer required. Now, you have the flexibility of allocating resources within the same cluster.
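As a rough mental model of topology-based workload separation, the sketch below assigns nodes to topology paths, pins each workload to one path, and moves a node between paths on demand. It is plain Python for illustration and does not call any MapR API. In a real cluster these assignments are made through MapR's administrative tooling. The topology paths, node names, and function names are hypothetical.

```python
# Hypothetical topology paths and node names, chosen only for illustration.
topology = {
    "/data/build": {"node1", "node2"},
    "/data/query": {"node3", "node4", "node5"},
}

# Each workload is pinned to one topology path.
WORKLOAD_TOPOLOGY = {"cube_build": "/data/build", "cube_query": "/data/query"}


def nodes_for(workload):
    """Return the nodes a workload may use, based on its topology path."""
    return sorted(topology[WORKLOAD_TOPOLOGY[workload]])


def move_node(node, target_path):
    """Reassign a node to another topology path within the same cluster."""
    for members in topology.values():
        members.discard(node)
    topology[target_path].add(node)


print(nodes_for("cube_build"))   # ['node1', 'node2']
move_node("node5", "/data/build")  # e.g. extra capacity for a year-end build
print(nodes_for("cube_build"))   # ['node1', 'node2', 'node5']
print(nodes_for("cube_query"))   # ['node3', 'node4']
```

Both workloads stay in one cluster, and shifting capacity between them is a reassignment rather than a second cluster to set up and manage.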
Detailed records stored in the cluster may contain sensitive data such as Personally Identifiable Information (PII). In some instances, regulations (like GDPR) may restrict you from moving this data out of the country. This presents a challenge for companies using Hadoop distributions or data virtualization tools that offer no aggregation.
MapR’s mirroring capability solves this problem by keeping data synced across clusters in different locations. Instead of mirroring raw data, MapR mirrors the pre-calculated cubes from the remote cluster. This allows access to aggregated results from the cubes at HQ while keeping the PII in the original cluster.
You can see by now why these features and capabilities make MapR an ideal platform for Kyligence to run on. This joint solution enables businesses to accelerate their analytics on petabytes of data at the speed of thought while releasing IT from tedious administrative work.
If you’re ready to supercharge your MapR experience with augmented OLAP analytics or just want to know more, you can visit www.kyligence.io and www.mapr.com and get started today for free. Also, if you're evaluating Apache Kylin and would like to know how it compares to Kyligence as an OLAP solution, we recommend you check out our Apache Kylin Comparison page.