Kyligence Cloud: Azure Marketplace Quick Start Guide

Get going with Kyligence Cloud in the Azure Marketplace with this step-by-step quick start guide.

Kyligence Cloud on Azure Marketplace

Prerequisites:

In order to create your Kyligence Cloud application, you will need to request an Azure account and a resource group from your cloud platform system administrator.

The role in this resource group needs to have at least a Contributor.

Create Azure AD account

You can create your Azure AD account by Create Account document.

Create resource group

You can add a resource group by clik Add button in Azure resource groups page.In this case it is named Kylin.

Create app

Click on Azure Active Directory in the Azure portal, click on Select App registrations -> New registration, in this case we use the name kyligence-cloud.

Select the application type as Web and the login URL can be empty. Then click on the newly created application Kyligence-cloud , and record the Application ID and Directory ID.

Click Certificate & secrets under the management page, and click +New client secret button,then select the key validity period and click Save. The page will display the key value, please save this key value properly.

Assign resource group with Contributor role

We only need to assign this newly created resource group with Contributor role , and the later action taken from Kyligence Cloud will be limited to this resource group.You can refer to Azure Access Control document.

Make sure your Azure account has at least the participant role permissions for the relevant resource group or subscription.

Go to Azure Resource groups - Access control page, select the Kylin Resource Group, click the +Add button and select Add role assignment.Choose the application Kyligence-cloud you just created, and give it Contributor or higher role permissions :

assign Contributor to your app

Please search for Kyligence Cloud in Azure Marketplace and click create button, Please fill in the following information:

resource group:use created Kylin before

Identifier:name the created resource

Applicationid:Application ID

Applicationkey:Please enter the key value

Tenantid:Please enter Object ID

Vm User Name:Create a username to log in to the virtual machine

Vm Password:Create a password to log in to the virtual machine

Vminstance Vm Size:Select the size of the virtual machine

After the information is completed, please check the agreement of terms, and then click purchase. Kyligence Cloud will automatically create the required resources and services in the selected resource group. After the creation is successful, you can view the created virtual machine in the deployed resource group.The Kyligence Cloud service to start takes
about 20 minutes, depending on the network service environment.

After the service is started, use the virtual machine IP:8079 to enter the Kyligence Cloud main page. The first time you enter the main page, you will be prompted to enter the Kyligence Cloud license, and then use the user/password: ADMIN/KYLIN to log in.

Create Datasource & Project

Kyligence Cloud provides administrators with a globally unified datasource and project management page where administrators can manage all data sources and view/exit all projects created by the current Kyligence Cloud.

  • Go to the Management-Datasource page. The Kyligence Cloud has a built-in file data source named Default. Click Edit to enter the file data source edit page. On the edit page, you can access files in your Azure Blob Storage directly using Kyligence Cloud.

Note: If you have not used Azure Blob Storage yet, you can refer to the Azure Blob Storage Quickstart documentation.

  • In this example, we use the provided sample file data. First you need to download these CSV files and Upload them to the directory location on blob storage:

KYLIN_ACCOUNT
KYLINCALDT
KYLINCATEGORYGROUPINGS
KYLIN_COUNTRY
KYLIN_SALES

After modify the DDL with right location, Click +Database/Table on the right side of the page, select DDL Mode and enter the following table statement:

Note: The location in the following table statement is for example only and does not exist.

For Global Account:

\<your*blob*address\> is generally @.blob.core.windows.net

For China account:

\<your*blob*address\> is generally @.blob.core.chinacloudapi.cn

use default;
drop table if exists kylin_sales;
create table kylin_sales(TRANS_ID bigint,PART_DT date COMMENT "Order Date",
LSTG_FORMAT_NAME string COMMENT "Order Transaction Type",
LEAF_CATEG_ID bigint COMMENT "Category ID",
LSTG_SITE_ID int COMMENT "Site ID",
SLR_SEGMENT_CD smallint,
PRICE decimal(19,4) COMMENT "Order Price",
ITEM_COUNT bigint COMMENT "Number of Purchased Goods",
SELLER_ID bigint COMMENT "Seller ID",
BUYER_ID bigint COMMENT "Buyer ID",
OPS_USER_ID string COMMENT "System User ID",
OPS_REGION string COMMENT "System User Region")COMMENT "Sales order table,
fact table" ROW FORMAT DELIMITED FIELDS TERMINATED BY "," STORED AS TEXTFILE location "wasb://your_blob_address/sampledata/kylin_sales";

drop table if exists kylin_account;
create table kylin_account(
ACCOUNT_ID bigint ,
ACCOUNT_BUYER_LEVEL int COMMENT "Account Buyer Level" ,
ACCOUNT_SELLER_LEVEL int COMMENT "Account Seller Level" ,
ACCOUNT_COUNTRY string COMMENT "Account Country" ,
ACCOUNT_CONTACT string COMMENT "Account Contact Info" )
ROW FORMAT DELIMITED FIELDS TERMINATED BY "," STORED AS TEXTFILE location "wasb://your_blob_address/sampledata/kylin_account";

drop table if exists kylin_cal_dt;
create table kylin_cal_dt(
CAL_DT date COMMENT "Date, PK" ,
YEAR_BEG_DT date COMMENT "YEAR Begin Date" ,
QTR_BEG_DT date COMMENT "Quarter Begin Date" ,
MONTH_BEG_DT date COMMENT "Month Begin Date" ,
WEEK_BEG_DT date COMMENT "Week Begin Date" ,
AGE_FOR_YEAR_ID smallint ,
AGE_FOR_QTR_ID smallint ,
AGE_FOR_MONTH_ID smallint ,
AGE_FOR_WEEK_ID smallint ,
AGE_FOR_DT_ID smallint ,
AGE_FOR_RTL_YEAR_ID smallint ,
AGE_FOR_RTL_QTR_ID smallint ,
AGE_FOR_RTL_MONTH_ID smallint ,
AGE_FOR_RTL_WEEK_ID smallint ,
AGE_FOR_CS_WEEK_ID smallint ,
DAY_OF_CAL_ID int ,
DAY_OF_YEAR_ID smallint ,
DAY_OF_QTR_ID smallint ,
DAY_OF_MONTH_ID smallint ,
DAY_OF_WEEK_ID int ,
WEEK_OF_YEAR_ID tinyint ,
WEEK_OF_CAL_ID int ,
MONTH_OF_QTR_ID tinyint ,
MONTH_OF_YEAR_ID tinyint ,
MONTH_OF_CAL_ID smallint ,
QTR_OF_YEAR_ID tinyint ,
QTR_OF_CAL_ID smallint ,
YEAR_OF_CAL_ID smallint ,
YEAR_END_DT string ,
QTR_END_DT string ,
MONTH_END_DT string ,
WEEK_END_DT string ,
CAL_DT_NAME string ,
CAL_DT_DESC string ,
CAL_DT_SHORT_NAME string ,
YTD_YN_ID tinyint ,
QTD_YN_ID tinyint ,
MTD_YN_ID tinyint ,
WTD_YN_ID tinyint ,
SEASON_BEG_DT string ,
DAY_IN_YEAR_COUNT smallint ,
DAY_IN_QTR_COUNT tinyint ,
DAY_IN_MONTH_COUNT tinyint ,
DAY_IN_WEEK_COUNT tinyint ,
RTL_YEAR_BEG_DT string ,
RTL_QTR_BEG_DT string ,
RTL_MONTH_BEG_DT string ,
RTL_WEEK_BEG_DT string ,
CS_WEEK_BEG_DT string ,
CAL_DATE string ,
DAY_OF_WEEK string ,
MONTH_ID string ,
PRD_DESC string ,
PRD_FLAG string ,
PRD_ID string ,
PRD_IND string ,
QTR_DESC string ,
QTR_ID string ,
QTR_IND string ,
RETAIL_WEEK string ,
RETAIL_YEAR string ,
RETAIL_START_DATE string ,
RETAIL_WK_END_DATE string ,
WEEK_IND string ,
WEEK_NUM_DESC string ,
WEEK_BEG_DATE string ,
WEEK_END_DATE string ,
WEEK_IN_YEAR_ID string ,
WEEK_ID string ,
WEEK_BEG_END_DESC_MDY string ,
WEEK_BEG_END_DESC_MD string ,
YEAR_ID string ,
YEAR_IND string ,
CAL_DT_MNS_1YEAR_DT string ,
CAL_DT_MNS_2YEAR_DT string ,
CAL_DT_MNS_1QTR_DT string ,
CAL_DT_MNS_2QTR_DT string ,
CAL_DT_MNS_1MONTH_DT string ,
CAL_DT_MNS_2MONTH_DT string ,
CAL_DT_MNS_1WEEK_DT string ,
CAL_DT_MNS_2WEEK_DT string ,
CURR_CAL_DT_MNS_1YEAR_YN_ID tinyint ,
CURR_CAL_DT_MNS_2YEAR_YN_ID tinyint ,
CURR_CAL_DT_MNS_1QTR_YN_ID tinyint ,
CURR_CAL_DT_MNS_2QTR_YN_ID tinyint ,
CURR_CAL_DT_MNS_1MONTH_YN_ID tinyint ,
CURR_CAL_DT_MNS_2MONTH_YN_ID tinyint ,
CURR_CAL_DT_MNS_1WEEK_YN_IND tinyint ,
CURR_CAL_DT_MNS_2WEEK_YN_IND tinyint ,
RTL_MONTH_OF_RTL_YEAR_ID string ,
RTL_QTR_OF_RTL_YEAR_ID tinyint ,
RTL_WEEK_OF_RTL_YEAR_ID tinyint ,
SEASON_OF_YEAR_ID tinyint ,
YTM_YN_ID tinyint ,
YTQ_YN_ID tinyint ,
YTW_YN_ID tinyint ,
KYLIN_CAL_DT_CRE_DATE string ,
KYLIN_CAL_DT_CRE_USER string ,
KYLIN_CAL_DT_UPD_DATE string ,
KYLIN_CAL_DT_UPD_USER string )
COMMENT "Date Dimension Table" ROW FORMAT DELIMITED FIELDS TERMINATED BY "," STORED AS TEXTFILE location "wasb://your_blob_address/sampledata/kylin_cal_dt";

drop table if exists kylin_country;
create table kylin_country(
COUNTRY string,
LATITUDE double ,
LONGITUDE double ,
NAME string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "," STORED AS TEXTFILE location "wasb://your_blob_address/sampledata/kylin_country";

drop table if exists kylin_category_groupings;
create table kylin_category_groupings(
LEAF_CATEG_ID bigint COMMENT "Category ID, PK" ,
LEAF_CATEG_NAME string ,
SITE_ID int COMMENT "Site ID, PK" ,
CATEG_BUSN_MGR string ,
CATEG_BUSN_UNIT string ,
REGN_CATEG string ,
USER_DEFINED_FIELD1 string COMMENT "User Defined Field1" ,
USER_DEFINED_FIELD3 string COMMENT "User Defined Field3" ,
KYLIN_GROUPINGS_CRE_DATE string ,
KYLIN_GROUPINGS_UPD_DATE string COMMENT "Last Updated Date" ,
KYLIN_GROUPINGS_CRE_USER string ,
KYLIN_GROUPINGS_UPD_USER string COMMENT "Last Updated User" ,
META_CATEG_ID decimal ,
META_CATEG_NAME string COMMENT "Level1 Category" ,
CATEG_LVL2_ID decimal ,
CATEG_LVL3_ID decimal ,
CATEG_LVL4_ID decimal ,
CATEG_LVL5_ID decimal ,
CATEG_LVL6_ID decimal ,
CATEG_LVL7_ID decimal ,
CATEG_LVL2_NAME string COMMENT "Level2 Category" ,
CATEG_LVL3_NAME string COMMENT "Level3 Category" ,
CATEG_LVL4_NAME string ,
CATEG_LVL5_NAME string ,
CATEG_LVL6_NAME string ,
CATEG_LVL7_NAME string ,
CATEG_FLAGS decimal ,
ADULT_CATEG_YN string ,
DOMAIN_ID decimal ,
USER_DEFINED_FIELD5 string ,
VCS_ID decimal ,GCS_ID decimal ,
MOVE_TO decimal ,
SAP_CATEGORY_ID decimal ,
SRC_ID tinyint ,
BSNS_VRTCL_NAME string )
COMMENT "Detail category inforamtion, Dimension Table" ROW FORMAT DELIMITED FIELDS TERMINATED BY "," STORED AS TEXTFILE location "wasb://your_blob_address/sampledata/kylin_category_groupings";

Click Submit , After the system detects it, you can view the table structure just defined under the DEFAULT database.

  • Go to the Management-Project page, click +Project, please name the project Kylingence_demo, and select the default file data source for the data source, select Expert Mode for the project type, click Submit and you can then view the project you
    just created in the project list.

Start Cluster

  • Go to the Management-Cluster page. When you first use Kyligence Cloud, you need to set the cluster resources. Click the Add button to set the username and password for accessing the cluster edge node.
  • By default, Kyligence Cloud provides three types of cluster configurations. You can adjust the number of instance models and node number as needed. After the cluster is started, you can switch between the three cluster specifications by one-click according to the system load. You can also customize more configurations by clicking +Customize.
  • After filling out the cluster information form, click Submit and Start in the lower right corner to automatically create a Spark cluster. The creation process takes about 3-5 minutes.

Create Model

  • Before building the data model, you need to synchronize the tables to the project. First go to the Data-Sync Data page, click the Load Tables icon , select all the tables under the DEFAULT database, then click on Load metadata bottom to load.

Data sampling will be performed by default during the process of synchronizing metadata. You can find the automatically triggered Sample Table task in Data-Job page. After the task is completed, you can view the sample data of the source table in the Data-Sync Data-interface-Sample Data.

  • Then you can create the model in the Data-Model interface and enter the model editing interface to visually complete the creation of the multidimensional model. Here is a brief overview of the following four steps:

step one:Defining fact table

From the list of data source tables on the left, drag the fact table KYLIN_SALES from this sample dataset to the center of the modeling canvas, then click the Settings icon in the upper right corner of the table to select the type of the table as fact table.

step two:Defining dimension table

The dimension table in this sample dataset is:KYLINCALDT, KYLINCATEGORYGROUPINGSKYLIN_ACCOUNTKYLIN_COUNTRY

Since the KYLIN_ACCOUNT and KYLIN_COUNTRY tables contain both sellers and buyer users, we can drag them out twice, modifying the table aliases to SELLER_ACCOUNT, BUYER_ACCOUNT, and SELLER_COUNTRY and BUYER_COUNTRY

step three:Establish table and table join relationship

In this example, we drag the time foreign key field PART_DT of the fact table KYLIN_SALES to the primary key field CAL_DT of the time dimension table KYLINCALDT, and modify the connection mode to INNER JOIN in the popup window, the connection condition Is
KYLINSALES.PARTDT = KYLINCALDT.CAL_DT. Click the OK button to save the connection.

Refer to the above method and set all connection conditions (as shown below):

1) KYLIN_SALES INNER JOIN KYLINCALDT connection condition: KYLINSALES.PARTDT = KYLINCALDT.CAL_DT

2) KYLIN_SALES INNER JOIN KYLINCATEGORYGROUPINGS connection condition:KYLINSALES.LEAFCATEG_ID = KYLINCATEGORYGROUPINGS.LEAFCATEGID , KYLINSALES.LSTGSITEID = \KYLINCATEGORYGROUPINGS.SITEID\

3) KYLIN_SALES INNER JOIN BUYER_ACCOUNT (alias of KYLIN_ACCOUNT) connection condition: KYLINSALES.BUYERIDBUYERACCOUNT.ACCOUNTID

4) KYLIN_SALES INNER JOIN SELLER_ACCOUNT (alias of KYLIN_ACCOUNT) connection condition: KYLINSALES.SELLERIDSELLERACCOUNT.ACCOUNTID

5) BUYER_ACCOUNT (alias of KYLIN_ACCOUNT) INNER JOIN BUYER_COUNTRY (alias of KYLIN_COUNTRY) connection condition: BUYERACCOUNT.ACCOUNTCOUNTRY = BUYER_COUNTRY.COUNTRY

6) SELLER_ACCOUNT (alias of KYLIN_ACCOUNT) INNER JOIN SELLER_COUNTRY(alias of KYLIN_COUNTRY) connection condition: SELLERACCOUNT.ACCOUNTCOUNTRY = SELLER_COUNTRY.COUNTRY

step four:Add dimensions and measures

dimension :In this case we choose: KYLINSALES.PARTDT, SELLERCOUNTRY.NAME , BUYERCOUNTRY.NAME , KYLINCATEGORYGROUPINGS.LEAFCATEGNAME , KYLINCATEGORYGROUPINGS.CATEGLVL2NAME , KYLINCATEGORYGROUPINGS.CATEGLVL3NAME

measure :In this case we choose: KYLINSALES.PRICE(Aggregation method:SUM) , KYLINSALES.ITEM_COUNT(Aggregation method:SUM)

Click the Save button at the bottom right of the model editing interface, the system will prompt you to set the time partition column in the pop-up window, this setting can help you load data incrementally according to the time partition column. If you do not need to load the
data incrementally, please select No Partition in the selection box under the Time Partition column tab, after which the system will load the data in full for the model. In this case, we select No Partition.

  • Then click Add Index in the pop-up prompt window. Click Edit Aggregation Group to go to the Edit Aggregation Group page under the Aggregate Index.The aggregation group uses the following four concepts to control the combination of dimensions:
    • Include: Select the dimensions that need to appear in the index from the list of dimensions in the model.
    • Mandatory: The dimension corresponding to the business angle that must be analyzed, such as the customer statistics.
    • Hierarchy:Dimensions with hierarchical relationships, such as country, province, and city.
    • Joint:Dimensions that must be joined together, such as supplier and order date must appear at the same time.

In this case, select all six dimensions to Include and add KYLINSALES.PARTDT as Mandatory.

  • After creating the model and editing the index, you need to load the data for the model, and when you load the data, will also build the index based on the model definition. Models that have not been loaded data (which have no data) cannot serve query analysis. In this case, we choose to load all the data. Click Load Data on the right side of the model and you will be prompted to load all the data for the model.

Analyze Data

This chapter uses the built-in BI tools for analysis. The specific steps are as follows:

Create dataset

  • create a new dataset: After the data is loaded, you can analyze your data freely. First click on Analysis-BI Tools-Connect to the built-in Kyligence Insight in the navigation bar to enter the Kyligence Insight page. Click Log in and enter the Dataset page.Then click the +Dataset button in the top left corner and select the Dataset Usage as SQL.
  • Define dataset: First name the dataset as Kyligence_Sales in Basic Information and click Next.

Drag the desired model to the right in Define Relationships and click Next.

  • Define Semantics: In Defining Semantics, you can define as follows:
    • Click the edit button on the right of the dimension to change the dimension name, for example, rename KYLINSALESPART_DT in the KYLIN_SALES table to "sales_date".
    • Click the edit button on the right of the measure to change the measure name.
    • Click the +Add Hierarchy button to create a hierarchy.For example, create a new hierarchy called "ProductType" and add \KYLINCATEGORYGROUPINGS.CATEGLVL2NAME* and *KYLINCATEGORYGROUPINGS.CATEGLVL3_NAME\ in order to form a hierarchy of product types.
    • Click the +Calculated Measure button and enter Calculated Measure Name and Expression to create a calculated measure。For example, create a calculated measure named "SalesYear" with the expression: year(KYLINSALES.PART_DT) to directly display the annual sales.

  • Once all the definitions have been completed, click the Submit button in the lower right corner to save the newly created data set.

Create Chart

Click Chart in the navigation bar, then click the +Chart button in the upper left corner, select the new Kyligence_Sales dataset, and click Submit to enter the analysis interface.

Drag the required dimensions and measures to the right, then click Run Query in the top left corner to run the query and get the analysis chart.

For example, first click the Click to change visualization type button in the page and select the visualization type as Pivot Table. Drag the newly created hierarchy "ProductType" as the dimension, drag "GMVSUM" as the measure, drag the KYLINCALDT.CAL_DT into the filter,
select the past 10 years,then you can get the past 10 years sales data of each product.

Once you have the analysis result set, you can save your chart by clicking the Save button on the page, or you can download the query result set by clicking the Export CSV button in the upper right corner.

Want to know what Kyligence can do for you?