Whether you’ve just deployed Apache Kylin or have been using it for a while, there’s a lot Kylin can do, or ways it can be further optimized, that you might not know about.
The good news is that there’s no shortage of online resources for discovering the tips, tricks, and features that can greatly enhance the impact Kylin has on Big Data work in your organization.
In this blog post we are going to share some of the most useful Kylin resources out there so you can be sure you’re getting maximum value out of your Kylin deployment.
Some users may be concerned that Apache Kylin uses HBase as its storage engine for cubes. HBase is a high-performance wide-column database in the Hadoop ecosystem, and while HBase is great for processing large amounts of data in a distributed, scalable architecture, it is also known for being hard to manage and maintain.
For an overview of how Kylin uses HBase, you should check out this presentation on Apache Kylin and HBase by Kylin PMC Shaofeng Shi.
When purging, dropping, or merging cubes, some HBase tables may be left in HBase and will no longer be queried. This section on storage cleanup for HDFS and HBase in the Apache Kylin documentation is extremely helpful if you’re searching for how to clean up storage.
Sometimes, you may want to export metadata and cubes stored in HBase for migration or backup. If that’s the case, you’ll want to refer to this post regarding the cleanup and backup of HBase tables.
Kylin requires an HBase coprocessor. To update it, please refer to this command for updating the HBase coprocessor.
Lastly, if HBase requirements are an issue for you, you may want to take a look at Kyligence, which is built on top of Kylin but does NOT require HBase. You can find more information about Kyligence here or just keep reading to the end of this article.
Kylin has evolved from a batch-based historical data analytics engine into one that supports both batch and real-time data. A great resource for learning more about this evolution is the 2017 talk Apache Kylin PMC Yang Li gave examining the real-time processing capabilities of Kylin.
The most recent Apache Kylin release has improved real-time processing by reducing delays to a few seconds. This is enough to support most soft real-time requirements, where a couple of seconds of delay (from an event happening to its data being available in the cube) is acceptable.
If you’re in need of an overview, this documentation on Real-Time OLAP in Kylin 3.0 should have everything you’re looking for. Two additional resources you’ll find useful are these high-level real-time designs and a recent post that offers an even deeper look into real-time processing with Kylin.
If you’d like to better understand how Kylin developers are benchmarking the software, the best resource will be this guide to star schema benchmarking for Apache Kylin.
You can also go beyond benchmarking and improve performance further with this great documentation covering optimized OLAP cube design.
If you find the documentation above useful, eBay has also published some insightful supplementary content on how to build OLAP cubes efficiently and the visualization of cuboids. You may also be interested in this post outlining how to improve Spark Cubing.
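To see why cube design matters, it helps to count cuboids. A full cube over N dimensions has 2^N dimension combinations, so every dimension you add doubles the work. The sketch below is only an illustration of that arithmetic, not Kylin code: the function name `enumerate_cuboids` and the dimension names are hypothetical, and the `mandatory` argument loosely mirrors the idea of a mandatory dimension from Kylin's cube design options.

```python
from itertools import combinations


def enumerate_cuboids(dimensions, mandatory=()):
    """List every dimension combination that contains all mandatory dimensions.

    Illustrative only: a real Kylin cube is configured in its cube descriptor,
    not through a function like this one.
    """
    optional = [d for d in dimensions if d not in mandatory]
    cuboids = []
    for size in range(len(optional) + 1):
        for combo in combinations(optional, size):
            # Mandatory dimensions appear in every cuboid that is kept.
            cuboids.append(tuple(mandatory) + combo)
    return cuboids


# Hypothetical dimensions for a sales cube.
dims = ["date", "country", "category", "seller"]

full = enumerate_cuboids(dims)                        # 2**4 combinations
pruned = enumerate_cuboids(dims, mandatory=("date",))  # 2**3 combinations

print(len(full), len(pruned))
```

Marking a single dimension as always present halves the number of combinations here, which is the kind of saving the cube design documentation is aiming for on a much larger scale.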
Only Kylin makes it possible to implement popular advanced functions on large datasets. One such example is the powerful CountDistinct function. Exact distinct counts are extremely difficult to compute on large datasets, which is why most other systems choose to estimate them with a HyperLogLog algorithm.
Kylin, however, implements both an approximate and precise CountDistinct. This makes it extremely valuable for use cases such as user behavior analysis. You can get started using CountDistinct with Kylin with this introduction on CountDistinct in Kylin.
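The trade-off between the two approaches can be sketched in a few lines of Python. This is a generic, textbook-style illustration and not Kylin's implementation: `exact_count_distinct` remembers every value it has seen, while `hll_count_distinct` is a basic HyperLogLog that keeps a fixed number of small registers. The function names, the SHA-1 hash, and the register count `p=12` are all assumptions made for the example.

```python
import hashlib
import math


def exact_count_distinct(values):
    """Precise count: memory grows with the number of distinct values."""
    return len(set(values))


def hll_count_distinct(values, p=12):
    """Approximate count using a basic HyperLogLog with 2**p registers."""
    m = 1 << p
    registers = [0] * m
    for value in values:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")   # use 64 bits of the hash
        index = h >> (64 - p)                   # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)        # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[index] = max(registers[index], rank)
    alpha = 0.7213 / (1 + 1.079 / m)            # bias constant, valid for m >= 128
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:
        # Small-range correction (linear counting) when many registers are empty.
        estimate = m * math.log(m / zeros)
    return round(estimate)


# Hypothetical user IDs: 100,000 events from 50,000 distinct users.
events = [f"user-{i % 50000}" for i in range(100000)]
print(exact_count_distinct(events), hll_count_distinct(events))
```

The exact version always returns the true answer but has to hold every distinct user ID, whereas the estimate uses a fixed 4,096 registers and is typically within a few percent. For use cases like user behavior analysis, where the exact number matters, that difference is the reason a precise option is valuable.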
Another useful advanced function is Top-N calculation. This is especially true in data mining where finding Top N entities within a dataset is a common requirement. Kylin performs impressively when it comes to implementing Top-N functionality efficiently in a Big Data environment.
If Top-N calculations are important to you, you can easily implement them in your Kylin deployment with this introduction to Kylin and Top-N calculations.
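To give a feel for why Top-N is harder at scale than it looks, here is a minimal Python sketch comparing an exact count with the Space-Saving algorithm, a well-known streaming technique that keeps only a bounded number of counters. It is a generic illustration rather than Kylin's code; the function names, the `capacity` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def exact_top_n(items, n):
    """Exact Top-N: one counter per distinct item, so memory grows with cardinality."""
    return Counter(items).most_common(n)


def space_saving_top_n(items, n, capacity):
    """Approximate Top-N that never holds more than `capacity` counters."""
    counters = {}
    for item in items:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its count plus one,
            # so reported counts can overestimate but never underestimate.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    ranked = sorted(counters.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical stream: three popular sellers followed by a long tail of rare ones.
stream = ["a"] * 50 + ["b"] * 30 + ["c"] * 20 + [f"x{i}" for i in range(40)]

print(exact_top_n(stream, 3))
print(space_saving_top_n(stream, 3, capacity=10))
```

With only 10 counters for 43 distinct items, the bounded version still recovers the same three leaders as the exact count on this stream. The cost is that counts for items near the cutoff can be inflated, which is why the number of counters has to be chosen with the data's skew in mind.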
If you’re reading this, you likely already know how great Kylin is for improving the performance and scalability of your team’s Big Data work. But depending on your organization’s size, the scale of your datasets, and your IT and data governance rules, Kylin may be falling short.
The links and advice above can help a lot with ensuring your Kylin deployment can keep up with your technical and business-related needs, but there may come a time when you feel it isn’t enough.
Fortunately, you don’t have to abandon OLAP and the performance it delivers to your Big Data work when that happens. Kyligence offers a suite of solutions that take a similar OLAP-based approach to Big Data, but with a focus on serving enterprise-level needs.
Available on-prem, in the cloud, or in hybrid environments, Kyligence offers the same powerful features as Kylin and then some. It’s also built to enable a higher level of performance on truly massive datasets, along with high concurrency capabilities that make securely scaling and expanding your user base very easy.
Is Kyligence right for you? It is possible Apache Kylin is sufficient for your business's unique needs, but if you’ve been running into roadblocks with it or find that it’s missing critical features and integrations you wish it had, it could be time to see if an upgrade to Kyligence makes sense.
If you’re curious, we recommend you take a look at our Kylin comparison page and download our detailed Kylin vs. Kyligence comparison guide.
And if you've got some time, this recent presentation by Kylin PMC (and Kyligence CTO) Yang Li is another great introduction to Kyligence, its connection to Kylin, and how it might be the best choice for the work you're trying to do.
Kyligence was developed by the same founding team behind the Apache Kylin project, and we’re always here to help if you have additional questions about Kylin and Kyligence. Feel free to make use of our experience to help troubleshoot any future Kylin issues you may run into. Just contact us.
Also, be sure to follow us on LinkedIn and Twitter where we’re always sharing more updates about Kylin and OLAP technology.