Apache Kylin is an open source, distributed big data analytics engine. It constructs data models on top of huge datasets, builds pre-calculated OLAP cubes to support multi-dimensional analysis, and exposes a SQL query interface on top of Hadoop through standard ODBC, JDBC, and RESTful APIs. Kylin's unique pre-calculation capability lets it answer queries over extremely large datasets with sub-second response times. Its key features include:
1. Mature, Hadoop-based computing engines (MapReduce and Spark) that provide strong pre-calculation capability on very large datasets and can be deployed out of the box on any mainstream Hadoop platform.
2. ANSI SQL support, so users can run data analysis with SQL directly.
3. Sub-second, low-latency query response times.
4. Common OLAP star/snowflake schema data modeling.
5. A rich set of OLAP functions, including Sum, Count Distinct, Top N, Percentile, etc.
6. Intelligent cuboid pruning that reduces storage and compute consumption.
7. Support for both batch loading of very large historical datasets and micro-batch ingestion of data streams.
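To make the SQL interface concrete, here is a minimal Python sketch of submitting an ANSI SQL aggregate query through Kylin's RESTful query endpoint. The host, project, table, and the demo credentials are assumptions for illustration, not values from this article:

```python
import base64
import json
from urllib.request import Request  # urlopen(req) would actually send it

KYLIN_HOST = "http://kylin-host:7070"  # hypothetical Kylin server address
PROJECT = "learn_kylin"                # hypothetical project name

def build_query_request(sql: str, project: str) -> Request:
    """Build a POST request for Kylin's /kylin/api/query endpoint."""
    payload = json.dumps({"sql": sql, "project": project}).encode("utf-8")
    # Kylin's demo deployments use ADMIN:KYLIN; real clusters differ.
    token = base64.b64encode(b"ADMIN:KYLIN").decode()
    return Request(
        f"{KYLIN_HOST}/kylin/api/query",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )

# A typical star-schema aggregate query that a pre-calculated cube can
# answer without scanning the raw fact table.
sql = (
    "SELECT part_dt, SUM(price) AS total_sales, "
    "COUNT(DISTINCT seller_id) AS sellers "
    "FROM kylin_sales GROUP BY part_dt ORDER BY part_dt"
)
req = build_query_request(sql, PROJECT)
# response = urlopen(req)  # would return the query result as JSON
```

Because the cube pre-aggregates measures such as `SUM` and `COUNT DISTINCT` along the model's dimensions, a query like this is served from the cube rather than recomputed from raw data.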
Druid, created in 2012, is an open source distributed data store. Its core design combines concepts from analytical databases, time-series databases, and search systems, and it supports data ingestion and analytics on fairly large datasets. Druid is licensed under Apache 2.0 and is developed under the Apache Software Foundation.
Apache Druid Architecture
From a deployment perspective, Druid's processes fall into three server categories based on their roles: Data (Historical, MiddleManager, Peon), Query (Broker), and Master (Coordinator, Overlord).
The Historical process loads segments (committed, immutable data) and serves queries on historical data.
The MiddleManager handles data ingestion and segment commits; each ingestion task runs in a separate JVM.
A Peon executes a single task and is managed and monitored by its MiddleManager.
The Broker receives query requests, determines which segments hold the relevant data, distributes sub-queries, and merges the results.
The Coordinator monitors Historical processes, assigns segments to them, and balances the workload.
The Overlord monitors MiddleManagers, dispatches tasks to them, and assists with publishing segments.
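Since every query enters the cluster through the Broker, a short Python sketch of how a client might submit a SQL query to the Broker's SQL endpoint may help; the Broker address and the `wikipedia` datasource (from Druid's quickstart tutorial) are assumptions here:

```python
import json
from urllib.request import Request  # urlopen(req) would actually send it

# Hypothetical Broker address; 8888 is the quickstart router/broker port.
BROKER_URL = "http://druid-broker:8888/druid/v2/sql"

def build_sql_request(query: str) -> Request:
    """Build a POST request for the Broker's SQL endpoint.

    The Broker determines which segments hold the data, fans sub-queries
    out to Historical and MiddleManager processes, and merges the results
    before returning them to the client.
    """
    payload = json.dumps(
        {"query": query, "resultFormat": "object"}
    ).encode("utf-8")
    return Request(
        BROKER_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_sql_request(
    "SELECT channel, COUNT(*) AS edits FROM wikipedia "
    "GROUP BY channel ORDER BY edits DESC LIMIT 5"
)
# rows = json.loads(urlopen(req).read())  # list of {"channel": ..., "edits": ...}
```

The client never talks to Historical or MiddleManager processes directly; the scatter-gather across segments is entirely the Broker's job.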
Druid also relies on three replaceable external dependencies.
Deep Storage, which Druid uses to transfer data files between processes and as durable segment storage.
Metadata Storage, which holds metadata such as segment locations and task outputs.
ZooKeeper (ZK), which Druid uses to keep the cluster state consistent.
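For illustration, a minimal `common.runtime.properties` fragment wiring up these three dependencies for a single-machine deployment; the values follow Druid's quickstart defaults and would differ in production (e.g. S3 or HDFS for deep storage, MySQL or PostgreSQL for metadata):

```properties
# ZooKeeper — cluster state coordination
druid.zk.service.host=localhost
druid.zk.paths.base=/druid

# Metadata storage — segment locations, task outputs (Derby for local testing)
druid.metadata.storage.type=derby
druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true

# Deep storage — durable segment files (local disk here; S3/HDFS in production)
druid.storage.type=local
druid.storage.storageDirectory=var/druid/segments
```

Because each dependency is pluggable, swapping storage backends is a configuration change rather than a code change.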
Ⓒ 2023 Kyligence, Inc. All rights reserved.