Kyligence Enterprise 3: Quick Start Guide

This guide will explain how to quickly install Kyligence Enterprise on a single node.

Download and Install Kyligence Enterprise

0. Before proceeding, please make sure the prerequisites described in the Prerequisite section are met.

1. Get the Kyligence Enterprise software package. You can visit the Kyligence Download Center and choose a version according to your environment.

2. Decide the installation location and the Linux account to run Kyligence Enterprise. All the examples below are based on the following assumptions:

  • The installation location is /usr/local/.
  • The Linux account used to run Kyligence Enterprise is root (referred to as the Linux account hereafter).

Note: Replace the above with your real installation location and Linux account in all the steps in this guide. For example, the default user for the CDH sandbox should be cloudera rather than root.

3. Copy the Kyligence Enterprise software package to your server or VM, and unpack it.

cd /usr/local
tar -zxvf Kyligence-Enterprise-{version}.tar.gz

4. Set the environment variable KYLIN_HOME to the folder path where Kyligence Enterprise is unpacked:

export KYLIN_HOME=/usr/local/Kyligence-Enterprise-{version}
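
If you want KYLIN_HOME to persist across login sessions, you can also append it to the shell profile of the Linux account. This is an optional sketch; ~/.bashrc is assumed here and may differ depending on your shell:

echo 'export KYLIN_HOME=/usr/local/Kyligence-Enterprise-{version}' >> ~/.bashrc
source ~/.bashrc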

5. Create a working directory for Kyligence Enterprise on HDFS and grant the Linux account read/write access to it. The default working directory is /kylin. Also ensure the Linux account has access to its home directory on HDFS.

hdfs dfs -mkdir /kylin
hdfs dfs -chown root /kylin
hdfs dfs -mkdir /user/root
hdfs dfs -chown root /user/root

If necessary, you can modify the path of the Kyligence Enterprise working directory in $KYLIN_HOME/conf/kylin.properties.
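
For reference, the working directory is controlled by the following property in kylin.properties (shown here with the default value):

kylin.env.hdfs-working-dir=/kylin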

Note: If you do not have permission to execute the above commands, you can su to the HDFS administrator account and try again.
su hdfs
# then retry the above hdfs dfs commands

 

Quick Configuration for Kyligence Enterprise

Under $KYLIN_HOME/conf/, there are two sets of configuration ready for use: profile_prod and profile_min. The former is the default configuration and is recommended for production environments. The latter uses minimal resources and is suitable for sandboxes and other single-node environments with limited resources. Run the following commands to switch to profile_min if your environment has limited resources:

rm -f $KYLIN_HOME/conf/profile
ln -sfn $KYLIN_HOME/conf/profile_min $KYLIN_HOME/conf/profile
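
You can confirm which profile is active by listing the symlink; after the switch above it should point to profile_min:

ls -l $KYLIN_HOME/conf/profile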

 

Start Kyligence Enterprise

Run the following command to start Kyligence Enterprise:

$KYLIN_HOME/bin/kylin.sh start

Note: If you want to observe the detailed startup progress, run:
tail -f $KYLIN_HOME/logs/kylin.log

Once the startup is complete, you will see a confirmation message in the console. Run the command below to check the Kyligence Enterprise process at any time.

ps -ef | grep kylin
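
You can also check that the web GUI is responding. The command below assumes the default port 7070 and that it is run on the Kyligence Enterprise node itself:

curl -I http://localhost:7070/kylin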

 

Use Kyligence Enterprise

After Kyligence Enterprise is started, open the web GUI at http://{host}:7070/kylin. Please replace {host} with your host name, IP address, or domain name. The default port is 7070. The default username and password are ADMIN and KYLIN. After the first login, please reset the administrator password according to the password rules:

  • At least 8 characters.
  • Contains at least one number, one letter, and one special character (~!@#$%^&*(){}|:"<>?[];',./`).

Now, you can verify the installation by building a sample cube. Please continue to Install Validation.

 

Stop Kyligence Enterprise

Run the following command to stop Kyligence Enterprise:

$KYLIN_HOME/bin/kylin.sh stop

You can run the following command to check if the Kyligence Enterprise process has stopped.

ps -ef | grep kylin

 

Import Sample Dataset

Kyligence Enterprise supports Hive as the default data source. You can import the Kyligence Enterprise built-in sample data into Hive using the sample.sh script, located in the bin directory under $KYLIN_HOME. For more details, please refer to Sample Dataset.

$KYLIN_HOME/bin/sample.sh

Note: After sample.sh is executed, you must choose Reload Metadata on the System page. Otherwise, there will be errors in data modeling.
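
To confirm that the sample data has been loaded, you can list the sample tables in Hive. This sketch assumes the tables were created in the default Hive database with the KYLIN_ prefix (for example KYLIN_SALES, used by kylin_sales_cube):

hive -e "show tables like 'kylin_*';"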

 

Build Cube

After importing the sample data, please access the learn_kylin project and build kylin_sales_cube.


Verify SQL

After the cube is built successfully, you can query it on the Insight page.

Note:

1. Only SELECT queries are supported.

2. When query pushdown is enabled, queries that cannot be served by a cube are routed to the pushdown engine for execution. In this case, the query will take longer to return.

After the query returns successfully, you can find the name of the answering cube in the Query Engine field.

select PART_DT,COUNT(*)
from KYLIN_SALES
group by PART_DT
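
If you prefer to verify the query from the command line, you can also try the REST query API. This is a sketch that assumes Kyligence Enterprise 3 exposes the same /kylin/api/query endpoint and default credentials as Apache Kylin; adjust the host, port, and credentials to your environment:

curl -X POST http://localhost:7070/kylin/api/query \
  -H "Content-Type: application/json" \
  -u ADMIN:KYLIN \
  -d '{"sql": "select PART_DT, count(*) from KYLIN_SALES group by PART_DT", "project": "learn_kylin"}'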

 

FAQ

Q: How to change the default port?

You can use the command below to change the port by setting an offset relative to the default port 7070.

$KYLIN_HOME/bin/kylin-port-replace-util.sh set PORT_OFFSET
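
For example, assuming the offset is added to the default port, the following command would move the web GUI from port 7070 to 7170:

$KYLIN_HOME/bin/kylin-port-replace-util.sh set 100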

Q: How to use Beeline to connect to Hive?

Please refer to Use Beeline to Connect Hive.

Q: If my cluster is based on JDK 7, how can I run Kyligence Enterprise?

Please follow the steps in Run Kyligence Enterprise on JDK 7.

Q: Does Kyligence Enterprise support integration with Kerberos?

Yes. If your cluster enables Kerberos security, the Spark instance embedded in Kyligence Enterprise needs proper configuration to access your cluster resources securely. For more information, please refer to Integrate with Kerberos.

Prerequisite

 

For better system performance and stability, we recommend that you run Kyligence Enterprise on a dedicated Hadoop cluster. Each server in the cluster should have HDFS, YARN, MapReduce, Hive, Kafka, and other services configured. Among them, HDFS, YARN, MapReduce, Hive, and ZooKeeper are mandatory components.

The following sections introduce the prerequisites for installing Kyligence Enterprise.

Java Environment

Kyligence Enterprise, and the Hadoop nodes that it runs on top of, require:

  • JDK 8 (64 bit) or above

If your Hadoop cluster is based on JDK 7, please refer to Run Kyligence Enterprise on JDK 7.

Account Authority

The Linux account running Kyligence Enterprise must have the required access permissions to the Hadoop cluster. These permissions include:

  • Read/Write permission of HDFS
  • Create/Read/Write permission of Hive table
  • Create/Read/Write permission of HBase table
  • Execution permission of MapReduce job

Verify that the account has access to the Hadoop cluster. The steps below assume the account is KyAdmin:

  1. Verify whether the user has HDFS read and write permissions. Assuming the HDFS storage path for cube data is /kylin, the setting in conf/kylin.properties is:
    kylin.env.hdfs-working-dir=/kylin
    The storage folder must be created and granted the proper permissions. You may have to switch to the HDFS administrator, usually the hdfs user, to do this:
    su hdfs
    hdfs dfs -mkdir /kylin
    hdfs dfs -chown KyAdmin /kylin
    hdfs dfs -mkdir /user/KyAdmin
    hdfs dfs -chown KyAdmin /user/KyAdmin

    Then verify that the KyAdmin user has read and write permissions, for example by uploading a local test file:
    echo test > test.txt
    hdfs dfs -put test.txt /kylin
    hdfs dfs -put test.txt /user/KyAdmin
  2. Verify whether the KyAdmin user has Hive read and write permissions. Assuming that the Hive database storing the intermediate tables is kylinDB, you need to manually create a Hive database named kylinDB and define it in conf/kylin.properties:
    kylin.source.hive.database-for-flat-table=kylinDB
    Then verify the Hive permissions:
    # hive
    hive> show databases;
    hive> create database kylinDB location "/kylin";
    hive> use kylinDB;
    hive> create table t1(id string);
    hive> drop table t1;
    In Hive, the current user needs to be authorized to access the Kyligence Enterprise HDFS working directory, which is /kylin in this case:
    hive> grant all on URI "/kylin" to role KyAdmin;
  3. If you use HBase as the metastore, please verify whether the KyAdmin user has HBase read and write permissions. Assume that the HBase table for storing metadata is XXX_instance (the unique identifier of the Kyligence cluster) and the HBase namespace is XXX_NS. The setting in conf/kylin.properties is:
    kylin.metadata.url=XXX_NS:XXX_instance@hbase
    Verify:
    # hbase shell
    hbase(main)> list
    hbase(main)> create 't1', {NAME => 'f1', VERSIONS => 2}
    hbase(main)> disable 't1'
    hbase(main)> drop 't1'
    If the user does not have permission, ask the administrator for authorization:
    hbase(main)> grant 'KyAdmin', 'RWXCA'

Supported Hadoop Distributions

The following Hadoop distributions have been verified to run Kyligence Enterprise. The versions in bold are the major test versions.

  • Cloudera CDH 5.7 / 5.8 / 5.11 ~ 5.13 / 6.0 / 6.1
  • Hortonworks HDP 2.4 / 2.6
  • MapR 6.0.1
  • Huawei FusionInsight C60 / C70
  • Azure HDInsight 3.6
  • AWS EMR 5.14 ~ 5.16 / 5.23

The following Hadoop distributions were previously verified, but their tests are no longer maintained:

    • Hortonworks HDP 2.2
    • MapR 5.2.1

Hadoop Cluster Resource Allocation

To enable Kyligence Enterprise to complete tasks efficiently, please ensure that the Hadoop cluster configuration satisfies the following conditions (a quick way to check these values is sketched after this list):

  • yarn.nodemanager.resource.memory-mb is greater than 8192 MB
  • yarn.scheduler.maximum-allocation-mb is greater than 4096 MB
  • mapreduce.reduce.memory.mb is greater than 700 MB
  • the heap size set in mapreduce.reduce.java.opts is greater than 512 MB
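
If you are not sure what your cluster currently uses, one way to check is to look up these keys in the Hadoop configuration files. The paths below assume a typical installation; adjust them to your distribution:

grep -A1 "yarn.nodemanager.resource.memory-mb" /etc/hadoop/conf/yarn-site.xml
grep -A1 "yarn.scheduler.maximum-allocation-mb" /etc/hadoop/conf/yarn-site.xml
grep -A1 "mapreduce.reduce.memory.mb" /etc/hadoop/conf/mapred-site.xml
grep -A1 "mapreduce.reduce.java.opts" /etc/hadoop/conf/mapred-site.xml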

If you need to run Kyligence Enterprise in a sandbox or other virtual machine environment, please make sure that the environment provides the following resources (a quick check is sketched after this list):

  • No fewer than 4 processors
  • No less than 10 GB of memory
  • The value of the configuration item yarn.nodemanager.resource.cpu-vcores is no less than 8
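
A quick way to check whether a sandbox or VM meets these requirements (the commands assume a standard Linux environment and the default Hadoop configuration path):

nproc     # number of processors, should be no less than 4
free -g   # total memory in GB, should be no less than 10
grep -A1 "yarn.nodemanager.resource.cpu-vcores" /etc/hadoop/conf/yarn-site.xml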

We recommend that you use the following hardware configuration to install Kyligence Enterprise:

  • 32 vCore, 128 GB memory
  • At least one 1TB SAS HDD (3.5 inches), 7200RPM, RAID1
  • At least two 1GbE Ethernet ports

We recommend that you use one of the following Linux operating system versions:

  • Red Hat Enterprise Linux 6.4+ or 7.x
  • CentOS 6.4+ or 7.x

For the client machine used to access the web GUI, we recommend:

  • CPU: 2.5 GHz Intel Core i7
  • Operating System: macOS / Windows 7 / Windows 10
  • RAM: 8 GB or above
  • Browser version:
    • Firefox 60.0.2 or above
    • Chrome 67.0.3396 or above

Read More

 

For more detailed information about how to use Kyligence Enterprise, please refer to Kyligence Enterprise 3 Manual.

If you’re curious about Apache Kylin and would like to know how it stacks up against Kyligence, visit our Apache Kylin Comparison page.

Also, be sure to follow us on LinkedIn and Twitter for the latest Kyligence product updates and Augmented Analytics announcements.
