Apache Kylin Through the Eyes of the Founders – Episode Three

Samantha Berlant

Marketing Communications Manager

Jul. 10, 2020

This article is part of a series of conversations with the founding members of Apache Kylin and Kyligence on the origins of Apache Kylin. You can find the first two installments here: Episode One, Episode Two.

Episode Three: Milestones

Apache Kylin is an open source distributed analytical data warehouse built with big data in mind. Through a combination of multi-dimensional cubes, a plug-in architecture and precomputation technology, Kylin can provide near-constant, sub-second query latency no matter the size of your dataset, cutting the time and manpower adopters need for effective analysis.

This recipient of multiple industry awards has been adopted by over a thousand organizations worldwide seeking a solution to the problem of storing and analyzing big data fast enough for their insights to make an impact on their business. 

This is the origin story of the unexpected hero of modern big data analytics, Apache Kylin, as told by its inventors. 

It is an absolute pleasure to sit down and speak with the six founding members of both the open source, top-level ASF project Apache Kylin and its international, enterprise-ready counterpart, Kyligence. As we carry on with our discussion on the origins of Kylin, let’s talk about the growth of Apache Kylin from your perspectives and experience.

What have been the key milestones in the evolution of Apache Kylin from its infancy till now? What would you say the most significant developments were that made it possible for Kylin to have the success it does today?

L U K E: There are several major milestones. The most significant is for sure when we graduated to a top-level project with the Apache Software Foundation. This is key to building trust within the community because a lot of people trust your project when you graduate from the incubation program to a top-level ASF project.

Another milestone before that was in September of 2015, when we won an award from the magazine InfoWorld. They pick 10 projects to be the top open source big data tools of the year. This award came out of nowhere for us.

I first heard that we won in the middle of the night at about 2 A.M. – I did a lot of overtime then! I went to my Twitter account and saw a lot of people had mentioned me and were saying congratulations. I was very curious about why they were congratulating me – and for what? I had no idea!

Then, I was like – “Oh good! Oh my god, we won an industry award!” I did a little bit of research and thought, “Oh my god, this is very cool.” So, I wrote an email to the team, my managers and the senior VPs of eBay. A VP replied within minutes – this VP is now our board advisor – asking me how much I paid them, and I replied that they never contacted us. This was a really great moment. We had been recognized by the industry and it brought a lot of confidence to the team.

**Kylin Quickly Catches the Attention of Industry Leaders like Arun Murthy, Debashis Sasha, and Chris Riccomini, 2014**

S H A O F E N G: It definitely did. Everyone felt incredible and was amazed that the project had been selected as one of the best open source tools for big data.

One year later, we had left eBay and founded Kyligence. I remember one morning I was travelling to Beijing when I heard that Kylin was again getting the Bossie (Best Open Source Software) Award from InfoWorld for 2016.

I feel very proud of that because InfoWorld is an American media platform and most of Kylin’s developers are far from America, yet the project was still recognized by this platform. We all feel very proud of that. That morning is one of my favorite memories from working on Apache Kylin.

As for key technical milestones, we released Kylin 2.0 in 2017, which introduced newly implemented Spark support. This was a major release for Kylin, along with the latest release, version 3.0, which introduced real-time streaming.

Also in 2017, we submitted a proposal to present on Kylin with Spark at the Spark Summit, a big data event in San Francisco, CA. They accepted our topic, so Luke and I flew to San Francisco to give a presentation introducing Kylin.

Two years later, Kyligence’s first paid customer in the U.S. said they knew of Kylin because of this summit, which made us feel very good because others were now seeing the potential that we see in Kylin.

**Shaofeng Shi, San Francisco Spark Summit 2017**

D O N G: Two key milestones come to mind for me. The first one is when Kylin rolled out version 2.0, as Shaofeng mentioned. This was the first version that brought a lot of new features and capabilities and also moved Kylin to a new architecture, which is a lambda architecture.

This architecture is the foundation for the whole ecosystem of Kylin. Now, we have a lot of users across different industries like banking and manufacturing, and for everyone, regardless of whether they are using the open source or enterprise version, the foundation of the whole architecture comes from Apache Kylin 2.0. You could say these are the roots of the whole architecture.

The second milestone is Kyligence, which has released enterprise products based on Apache Kylin. We created this startup company to build a better Kylin community, so that we could serve more commercial companies and global enterprises like Microsoft, Visa, China UnionPay and so many others.

**Dong Li, Azure & Kylin Workshop 2018**

Hongbin, Yang – what comes to your mind when you think of the main Kylin milestones?

H O N G B I N: I don’t know when this milestone happened exactly . . . When Apache Kylin was first open sourced, it had a very limited impact on the industry. I remember Luke and I went to Beijing to hold a Meetup. I remember clearly that very, very few people came out.

I believe there were three people that joined our first Meetup. So, that was a very unforgettable memory for me because at that time I was wondering how people in the industry would accept our project. But, not long after that, we held our second and third Meetups and we had more and more people joining each time.

So, I don’t know when this milestone was achieved but I could clearly see that more and more people were accepting our ideas and our project.

Y A N G: The journey of Apache Kylin is the story of the first open source project both for eBay and for China. Kylin was 100% designed and developed in CCoE (eBay's Chinese center that manages the data warehouse, Hadoop and other BI platforms).

The extremely fast rise of this project from open source to Apache incubation, to winning two Bossies and becoming a top-level project, shows the level of resonance this technology had within the big data and open source communities.

What a journey! Thank you for sharing so many highlights with us. When we continue this conversation in our next episode, I’d like to hear more in this vein – your favorite memories from the time you’ve spent working together on Apache Kylin.

Stay tuned for our next episode! If you missed the beginning of this story, check out Episode One and Episode Two.

Additional Resources 

Roaring Elephant Podcast with Dong Li - Episode 93 – Apache Kylin: Extreme OLAP Engine for Big Data

How Apache Kylin Is Rapidly Changing the Way We Approach Big Data – Q&A with Apache Kylin Committer, Kaige Liu

What’s New with Apache Kylin 3.0?

Making Distinct Counting Work for Big Data

Getting Started with Apache Kylin

Get the Most From Your Kylin Deployment with These Resources

Further Reading Is Available on Our Apache Kylin Blog

The Apache Software Foundation

Apache Kylin

About the Founders 

Luke Han is the Co-Founder and CEO of Kyligence and Co-Founder of Apache Kylin, the first Apache Software Foundation top-level project developed in China. He is responsible for Kylin’s strategic planning, development roadmap, product design, and more, and is committed to developing the Apache Kylin global community and ecosystem. He has served as Head of Big Data Products in eBay’s Global Analytics Infrastructure Division, Chief Advisor to Actuate China, and Technical Director of Power Excellence East China.

Yang Li is the Co-Founder and CTO of Kyligence, Co-Founder of Apache Kylin and member of the Project Management Committee (PMC). Previously, he was the Senior Architect of Big Data in eBay’s Global Analytics Infrastructure, Vice President at Morgan Stanley, and during his time with IBM, he received the Outstanding Technology Contribution Award. Yang has more than 10 years of hands-on experience in big data analytics; he has focused on parallel computing, data indexing, relational mathematics, approximation algorithms, compression algorithms and other cutting-edge technologies. Over the past 15 years, Yang has directly driven the development of OLAP technology in the big data space. 

Dong Li is the Founding Member and Senior Director of Product and Innovation at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data technology development. Previously, he was a Senior Engineer in eBay’s Global Analytics Infrastructure Department, a Software Development Engineer for Microsoft Cloud Computing and Enterprise Products, and a core member of the Microsoft Business Products Dynamics Asia Pacific team where he participated in the development of a new generation of cloud-based ERP solutions. 

Shaofeng Shi is a Partner and Chief Software Architect at Kyligence, Apache Kylin Core Developer (Committer) and Chairman of the Project Management Committee (PMC Chair) where he focuses on big data analytics and cloud computing technologies. Previously, he was a Senior Data Engineer in eBay’s Global Analytics Infrastructure Department and a Cloud Computing Software Architect at IBM. 

Hongbin Ma is the Vice President of Research and Development at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data infrastructure and platforms. He joined eBay as Apache Kylin’s Chief Committer. Previously, he was a core contributor to Trinity, Microsoft Research Asia’s graph database. He has contributed to Apache Kylin’s storage engine, query optimization, test coverage and other areas and is currently the technical leader of Kyligence Enterprise data warehouse products.

Jason Zhong is a Partner and Senior Director at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC). He has worked in eBay’s Global Analytics Infrastructure Division and been involved in operational automation product development as well as Kylin’s development. After joining Kyligence, he worked in both research and development before becoming responsible for business sales and business development transformation. He has won consecutive Kyligence sales titles and is currently the Head of Kyligence South Division.

About the Author

Samantha Berlant is the Marketing Communications Manager at Kyligence and a big fan of AI, machine learning, and science-fiction. She spent several years leading content analytics projects at Facebook and Instagram and has been a writer and editor for over a decade. Samantha believes in the power of accessible data and her favorite Star Trek character is, coincidentally, Data.
