Scale Your OLAP to Petabytes on Google Cloud Platform

08 - 14 - 2018

Kyligence, the company behind Apache Kylin, has announced that Kyligence Enterprise, its enterprise OLAP (online analytical processing) engine on Hadoop, can now run on Google Cloud Platform. Tightly integrated with Google Cloud Storage and Cloud Dataproc, Kyligence scales your OLAP to petabytes of data and makes your BI applications work on your data lakes.

Kyligence Enterprise can achieve sub-second query latency on petabyte-scale datasets on Hadoop by building pre-aggregated cubes. Its drag-and-drop user interface and intelligent data modeling assistant greatly simplify the OLAP cube building process.
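To make the idea of pre-aggregation concrete, here is a toy sketch in plain Python. It is not how Kyligence Enterprise or Apache Kylin is implemented; the function names, the sample rows and the in-memory dictionary layout are all assumptions made for illustration. It only shows why a query against a pre-built cube can be a single lookup instead of a scan over the raw rows.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of the dimensions.

    Each subset of dimensions is one "cuboid"; the cube maps a cuboid
    (a sorted tuple of dimension names) to its aggregated totals.
    """
    cube = {}
    ordered = sorted(dimensions)
    for size in range(len(ordered) + 1):
        for dims in combinations(ordered, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query_sum(cube, **filters):
    """Answer SUM(measure) for equality filters with one dictionary lookup."""
    dims = tuple(sorted(filters))
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0.0)


# Hypothetical sample data, for illustration only.
rows = [
    {"region": "east", "year": 2017, "sales": 10.0},
    {"region": "east", "year": 2018, "sales": 20.0},
    {"region": "west", "year": 2018, "sales": 5.0},
]
cube = build_cube(rows, ["region", "year"], "sales")
print(query_sum(cube, region="east"))  # total sales for the east region
print(query_sum(cube, year=2018))      # total sales for 2018
print(query_sum(cube))                 # grand total
```

The build step does all the scanning once, up front; the trade-off is that the number of cuboids grows with the number of dimension combinations, which is why cube design matters in practice.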

By deploying on Google Cloud Platform, Kyligence Enterprise helps you match dynamic computing and analytics requirements, reduce operating costs, and accelerate business development in the cloud.

This article is a step-by-step guide to running Kyligence on Google Cloud Platform. It has three sections:

  1. Introduction to Kyligence
  2. Prepare your Google account
  3. Deploy and run Kyligence on Google Cloud Platform

About Google Cloud Platform

Google Cloud Platform, offered by Google, is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products. Alongside a set of management tools, it provides a series of modular cloud services including computing, data storage, data analytics, and machine learning.

About Kyligence

Kyligence is a data technology company focused on big data analytics, founded by the team that created Apache Kylin, the first top-level project at the Apache Software Foundation (ASF) to originate from China. Powered by Apache Kylin, Kyligence offers Kyligence Enterprise, an intelligent enterprise big data analytics platform for the on-premises market, and Kyligence Cloud for cloud deployments. Its customers span many industries and include Huawei, China Unicom, OPPO, SAIC, Pacific Insurance Group, China UnionPay, and Guotai Junan Securities.

Prerequisites

  1. Sign up for a Google account
  2. Apply for a free trial of Kyligence at https://cloud.kyligence.io/#/cloudapply and select Google Cloud Platform as the platform to run your analytic workloads

Prepare Your Google Account

Create related resources on Google Cloud Platform

Running Kyligence on Google Cloud Platform relies on the following Google Cloud resources and services. Prepare your Google account and the related resources beforehand.

  • Dataproc
  • Compute Engine
  • Storage
  • VPC
  • Cloud SQL

  1. Create a Project

In the Google Cloud Platform console, click the current project name in the upper left corner.

Select NEW PROJECT in the pop-up window, enter a new project name, and click CREATE.

  2. Enable the APIs needed to access the Project

Please ensure the following APIs are enabled:

  • Stackdriver Logging API
  • Compute Engine API
  • Cloud Dataproc Control API
  • Cloud Dataproc API
  • Cloud SQL Admin API

To see API usage in the current project, select APIs & Services in the menu, then select Dashboard.

To find and enable an API, select Library under APIs & Services, search for the API in the API library, select it, and click Enable.
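If you prefer the command line to the console, the same APIs can be enabled with the gcloud CLI. The sketch below only builds the command so you can review it before running it. The mapping from the display names above to service identifiers is our assumption, and the project ID is a placeholder; verify both against your own project.

```python
import shlex

# Assumed service identifiers for the APIs listed above; check them in the
# API library of your project before relying on this mapping.
REQUIRED_APIS = {
    "Stackdriver Logging API": "logging.googleapis.com",
    "Compute Engine API": "compute.googleapis.com",
    "Cloud Dataproc Control API": "dataproc-control.googleapis.com",
    "Cloud Dataproc API": "dataproc.googleapis.com",
    "Cloud SQL Admin API": "sqladmin.googleapis.com",
}


def enable_apis_command(project_id):
    """Build a `gcloud services enable` command for all required APIs."""
    return [
        "gcloud", "services", "enable",
        *REQUIRED_APIS.values(),
        "--project", project_id,
    ]


# "my-kyligence-project" is a placeholder project ID.
print(shlex.join(enable_apis_command("my-kyligence-project")))
```

The printed line can be pasted into a shell where gcloud is installed and authenticated, or passed to `subprocess.run` as a list.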

  3. Create a Service Account and authorize it

Select IAM & admin in the menu, then select Service accounts.

At the top of the page, select CREATE SERVICE ACCOUNT and fill in the service account name on the right. For the project role, select Owner.

Check the Furnish a new private key box below, select JSON as the key type, and click Save.
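The service account steps above have a rough gcloud equivalent, sketched here as a function that returns the commands without executing them. The account name, project ID and key path are placeholders we made up, and the Owner role is expressed as `roles/owner`; treat this as a sketch rather than the exact procedure Kyligence Cloud expects.

```python
def service_account_commands(project_id, account_name, key_path="key.json"):
    """Build gcloud commands to create a service account, grant it the
    Owner role on the project, and download a JSON private key."""
    email = f"{account_name}@{project_id}.iam.gserviceaccount.com"
    return [
        # Create the service account in the project.
        ["gcloud", "iam", "service-accounts", "create", account_name,
         "--project", project_id],
        # Grant the Owner project role to the service account.
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         "--member", f"serviceAccount:{email}", "--role", "roles/owner"],
        # Create a private key (JSON is the default key format) at key_path.
        ["gcloud", "iam", "service-accounts", "keys", "create", key_path,
         "--iam-account", email],
    ]


# Placeholder names, for illustration only.
for command in service_account_commands("my-kyligence-project", "kyligence-sa"):
    print(" ".join(command))
```

Owner is a very broad role. We follow the guide here, but you may want to review whether a narrower role is sufficient once the deployment works.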

  4. Create Storage

Select Storage, then Browser, then CREATE BUCKET.

Name the new bucket, select Multi-Regional as the storage class, and finally click Create.

[Figure: differences between the storage types]

  5. Create VPC

In the left-hand menu of the console, select VPC network, then click CREATE VPC NETWORK.

Fill in the VPC network name, select Automatic as the subnet creation mode, then click Create.
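The bucket and VPC network can also be created from the command line. The sketch below builds one `gsutil mb` command and one `gcloud compute networks create` command. The bucket name, network name and the `US` multi-region location are placeholder assumptions; the guide itself does not prescribe a location.

```python
def storage_and_network_commands(project_id, bucket_name, network_name,
                                 location="US"):
    """Build commands for a Multi-Regional bucket and an auto-mode VPC."""
    return [
        # -c sets the storage class, -l the (assumed) multi-region location.
        ["gsutil", "mb", "-p", project_id, "-c", "multi_regional",
         "-l", location, f"gs://{bucket_name}"],
        # Auto subnet mode matches the "automatic" choice in the console.
        ["gcloud", "compute", "networks", "create", network_name,
         "--subnet-mode=auto", "--project", project_id],
    ]


# Placeholder names, for illustration only.
for command in storage_and_network_commands(
        "my-kyligence-project", "my-kyligence-bucket", "kyligence-vpc"):
    print(" ".join(command))
```

Bucket names are global across Google Cloud Storage, so the placeholder bucket name will need to be replaced with one that is unique.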

Deploy Kyligence on Google Cloud Platform

Kyligence offers an online service, Kyligence Cloud, to ease the deployment of Kyligence Enterprise on Google Cloud Platform. With a few clicks, you can deploy both Hadoop and the Kyligence service within 30 minutes.

  1. Log in to https://cloud.kyligence.io with your own account and click Create Cluster.

  2. Fill in the cluster name.

  3. In the cluster topology section, specify your cluster size by entering the number of worker nodes.

Note: The edge node is where the Kyligence service runs.

  4. Click +Account. Enter your Google account in the first blank. For the second blank, open the saved private key file in a plain text editor, copy its entire contents, and paste them in. Then click Submit.

  5. After selecting the Google account you entered, select the region, VPC, subnet, and storage space.

Note: Kyligence Cloud reads the list of VPCs, subnets, and storage accounts in your Google account. If the lists cannot be loaded, check that the Google account and private key you entered are correct.
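When the private key is the suspect, a quick local sanity check of the JSON key file can rule out copy-and-paste damage before you retry. The sketch below checks for fields that Google service account key files normally contain. It does not contact Google or Kyligence Cloud, so a key that passes may still be revoked or belong to the wrong project. The function name and the sample key are our own.

```python
import json

# Fields normally present in a Google service account JSON key file.
REQUIRED_KEY_FIELDS = (
    "type", "project_id", "private_key_id", "private_key", "client_email",
)


def check_service_account_key(text):
    """Return a list of problems found in the key file text (empty if none)."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_KEY_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    if key.get("private_key") and "BEGIN PRIVATE KEY" not in key["private_key"]:
        problems.append("private_key does not look like a PEM block")
    return problems


# A made-up key with placeholder values, for illustration only.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-kyligence-project",
    "private_key_id": "0000",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "kyligence-sa@my-kyligence-project.iam.gserviceaccount.com",
})
print(check_service_account_key(sample))
```

To check a real key, read the saved file and pass its contents, for example `check_service_account_key(open("key.json").read())`, where `key.json` is wherever you saved the key.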

  6. Choose the version of Kyligence Enterprise you want to deploy. You can also choose to install KyAnalyzer and enable email notification. Then click Submit.

 

  7. Start the cluster. On the cluster page of the Kyligence Cloud portal, click the start button and wait until the cluster status changes to RUNNING.

Note: Starting a new cluster takes about 20 minutes.
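In the portal you simply watch the status column. If you want to script the wait, the usual pattern is a polling loop with a timeout, sketched below. This article does not describe a programmatic Kyligence Cloud API, so `get_status` is a hypothetical callable that you would have to supply yourself; only the RUNNING status name comes from the portal, and the 30-minute timeout is our choice given the roughly 20-minute startup.

```python
import time


def wait_until_running(get_status, timeout_s=30 * 60, interval_s=30,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "RUNNING" or the timeout expires.

    get_status is a hypothetical callable returning the cluster status as a
    string. Returns True if RUNNING was seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "RUNNING":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated status source, for illustration only ("STARTING" is made up).
statuses = iter(["STARTING", "STARTING", "RUNNING"])
print(wait_until_running(lambda: next(statuses), sleep=lambda _: None))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting, as in the simulated run above.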

  8. After the cluster is successfully started, you can launch Kyligence Enterprise for OLAP modeling and analysis. For more details about how to use Kyligence Enterprise, please visit HERE.

Summary

Through seamless integration with Google Cloud Platform, Kyligence Enterprise scales OLAP to petabytes of data and makes your BI applications work on your data lakes. It helps you match dynamic computing and analytics requirements, reduce operating costs, and accelerate business development in the cloud.
