Business intelligence and analytics tools aren’t just for data analysts and IT teams. Today, Business Intelligence (BI) is also used by executives, partners, and customers. With the rise of self-service business models, personalized marketing, and other market shifts, secure, reliable, and fast BI is increasingly important in every industry.
Whether an enterprise is running hyper-targeted marketing campaigns, tracking unique buying patterns in regional markets, or analyzing user behavior to detect fraud, its BI users must have immediate access to the most recent company data. BI is only effective if the data is correct, complete, and relevant. In this article, I’ll share the top three reasons why your BI applications might be frustratingly slow, impacting both the completeness and relevance of reporting.
First, let’s define what BI means in this article. BI is the technology-driven process of analyzing data and presenting actionable insights to decision makers. It helps managers and executives make informed decisions, which can result in a higher return on investment, reduced risk, and a competitive edge. BI is not simply an application, a data analytics tool, or an organization within a company; in this article, however, I am specifically addressing why your BI applications might be slow.
Making an update to a visualization often takes several seconds, and some updates take tens of minutes to process before they are displayed. In the worst cases, large amounts of data simply aren’t available to the analyst.
Imagine delivering a report about the total sales in North America, and later learning that a new product bundle hadn’t been included in the data model – a new product bundle that marketing and sales had been pushing for the past three months. Without that data in the report, not only could the executive team risk making an uninformed decision, but they also wouldn’t be very happy to hear that the report was incomplete, period.
This scenario isn’t uncommon. Analysts rushing to meet a reporting deadline, for a quarterly business review or an all-hands meeting, often face a tough decision. If the data model hasn’t been updated to include new business data, the analyst can either miss the deadline and request that the data model be updated, or deliver an incomplete report with an explanation of the missing data.
Sometimes developing and deploying the new data model, and processing the changes, simply takes too long for the deadline to remain feasible. To prevent this difficult decision from ever arising, the enterprise first needs to understand why its BI applications may be slow in the first place. Below are three major causes.
The number one reason BI applications run slowly is the sheer amount of data that needs to be processed. If the total amount of data fits comfortably into a spreadsheet (fewer than 20,000 rows and 20 columns), that report can be ready in “no time.” With today’s big data requirements, though, exceeding a billion rows and hundreds of columns can result in mind-numbing wait times, assuming your big data analytics tools and applications can even handle that much data.
Now, imagine an analyst making a small, human mistake while building that massive report by accidentally selecting the wrong dimension. Not only was that time wasted, but the report has to run all over again, and costly memory resources are devoted to a junk report. Time to correct the mistake and wait for the updated report, all over again.
Most enterprises have large-scale data sets like this that can cause analysts to wait over 10 minutes for a report to update, and reupdate, and reupdate… And it’s not just traditional architectures that are challenged by the volume of data. Distributed systems also have limitations, especially for interactive experiences.
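To see why data volume alone produces these wait times, consider a rough back-of-envelope calculation. The row counts, value size, and disk throughput below are illustrative assumptions, not benchmarks of any particular system:

```python
# Back-of-envelope estimate of how long a full scan of a table takes.
# All figures are illustrative assumptions, not measurements.

def scan_time_seconds(rows, columns, bytes_per_value, throughput_bytes_per_sec):
    """Time to read the whole table once at a given sustained throughput."""
    total_bytes = rows * columns * bytes_per_value
    return total_bytes / throughput_bytes_per_sec

# Spreadsheet-sized data: 20,000 rows x 20 columns, 8 bytes per value.
small = scan_time_seconds(20_000, 20, 8, 500e6)

# A billion-row, 200-column table on a single 500 MB/s disk.
big = scan_time_seconds(1_000_000_000, 200, 8, 500e6)

print(f"small: {small * 1000:.1f} ms")   # a few milliseconds
print(f"big:   {big / 60:.1f} min")      # tens of minutes for one pass
```

Even this optimistic model, which ignores joins, aggregation, and network hops, puts a single full scan of a billion-row table at nearly an hour on one machine, which is why distributed processing becomes unavoidable at this scale.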
Massive volumes of data generally don’t reside on simple systems. The technology required for BI to handle massive data volumes is complex. Billions of rows of structured and unstructured raw data simply won’t fit into a normal spreadsheet for analysis, or into a simple database on a desktop computer.
Normally, the data starts out stored in multiple databases. Then the data from each database is extracted, transformed, and loaded into a data warehouse. The data warehouse is often built on cloud and datacenter technology, distributing the processing and storage work across specialized equipment.
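As a rough sketch of that extract-transform-load (ETL) flow, here is a toy example using in-memory SQLite databases as stand-ins for a source system and the warehouse; the table and column names are my own illustrative assumptions:

```python
# Toy sketch of extract-transform-load (ETL): pull rows from a source
# database, aggregate them, and load them into a warehouse table.
import sqlite3

# "Source" operational database (e.g. a regional sales system).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("NA", 120.0), ("NA", 80.0), ("EU", 50.0)])

# Extract: read the raw rows out of the source.
rows = source.execute("SELECT region, amount FROM sales").fetchall()

# Transform: aggregate to the grain the warehouse stores.
totals = {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount

# Load: write the transformed rows into the warehouse table BI tools query.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_sales (region TEXT, total REAL)")
warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?)", totals.items())

result = warehouse.execute(
    "SELECT region, total FROM fact_sales ORDER BY region").fetchall()
print(result)  # [('EU', 50.0), ('NA', 200.0)]
```

In a real deployment each of these three steps is its own distributed subsystem, which is exactly why there are so many places for a failure or misconfiguration to hide.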
If any part of this technology stack is misconfigured or fails, BI will certainly suffer. That means the software that manages each type of data, and every process that touches the data, must operate effectively. When something goes wrong with the software or any process, the report either fails completely or takes a very long time to return.
While many enterprises have adopted new technology to keep up with market demands and competition, their IT organizations often haven’t migrated every business unit to a unified Big Data analytics platform, haven’t developed their technology with BI in mind, or haven’t implemented technology that supports OLAP Big Data analysis. This is a very common challenge that results in slow BI.
Even with the perfect system built, and massive sets of data moving across it, there is one more major cause of slow BI: many users accessing data and resources at the same time. With many users sharing the same resources, the IT infrastructure has to split processing power and bandwidth among them. In short, reporting speed suffers with every additional user. Enterprises often have thousands of employees who need access to the data, and many times partners and customers need access to some of the same data.
Potentially, hundreds of thousands of concurrent users could be accessing the same data and resources at the same time. Imagine using only one one-thousandth of your computer’s processing power to run an analysis on billions of rows of data; good luck meeting those tight deadlines. Enterprises often implement Impala or PrestoDB to improve reporting speed, but those tools can’t handle high concurrency. Even with just 10 users, BI will slow down due to the nature of the architecture.
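The resource-splitting effect can be illustrated with a simple fair-sharing model; the capacity figures are assumptions for illustration, not measurements of any particular system:

```python
# Illustrative model of per-user capacity as concurrency grows, assuming
# the cluster's total capacity is shared evenly among active users.

def per_user_share(total_capacity, concurrent_users):
    """Fraction of cluster capacity each user gets under fair sharing."""
    return total_capacity / concurrent_users

capacity = 1.0  # normalized cluster capacity
for users in (1, 10, 1_000, 100_000):
    share = per_user_share(capacity, users)
    print(f"{users:>7} users -> {share:.6f} of capacity each")
```

At 100,000 concurrent users, each one effectively gets a hundred-thousandth of the cluster, which is why architectures not designed for concurrency stall at even modest user counts.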
I’d like to say that overcoming these Big Data analytics challenges is simple and that every forward-thinking enterprise has overcome them, but that wouldn’t be true. Many enterprises deal with other challenges that present roadblocks to success, such as shadow IT or buggy, automated scripts. While I can’t deliver the “magic bullet” to resolve all BI performance headaches, there are three things to think about when evaluating potential solutions.
Firstly, make sure the solution can accept a variety of data sources from distributed locations. This will prepare you to not only analyze any current distributed data, but also any likely data from future sources.
Secondly, make sure the solution can consistently process petabytes of data in seconds, rather than minutes. If the system can’t demonstrate stability, that’s an obvious long-term issue. Data is only expected to grow, so don’t set yourself up for only handling “small data” in the future.
Thirdly, make sure it handles thousands, or hundreds of thousands, of concurrent users. As new Big Data platforms are developed, the number of users and potential users is rising incredibly fast. Data won’t be shrinking anytime soon, and neither will user counts. To enable new services and product innovation, make sure your system is scalable enough to provide stable service to thousands of concurrent users, not just ten or one hundred.
Using BI applications shouldn’t be a frustrating experience, and thankfully there are solutions available to ease analyst pain (other than tradeshow stress balls). The best solutions were built for scale from the very start and make use of artificial intelligence to eliminate IT workload.
You don't have to waste another minute trying to troubleshoot your slow BI tools. Kyligence Enterprise and Kyligence Cloud's OLAP on Hadoop BI solutions are recognized across the Big Data industry for their unmatched speed and massive scalability. Want to see why? This video on augmented OLAP analytics explains it all.
If you think you're ready to evaluate Kyligence now, you can request a free demo of Kyligence, here. And if you're considering Apache Kylin as a solution, you can see how its features and capabilities match up with Kyligence on our Apache Kylin Comparison page.