This article is part of a series of conversations with the founding members of Apache Kylin and Kyligence on the origins of Apache Kylin. You can find the second installment here: Episode Two.
Episode One: The Rise of Kylin
If you’ve already decided to ignore OLAP as a potential solution for your big data problems, think again. There is an open source project dedicated to bringing OLAP back to big data, and it’s doing so in a big way.
This recipient of multiple industry awards has been adopted by over a thousand organizations worldwide seeking a solution to the problem of storing and analyzing big data fast enough for their insights to make an impact on their business.
Apache Kylin is a distributed analytical data warehouse built with big data in mind. Via a clever combination of multi-dimensional cubes and precalculation technology, Apache Kylin can deliver near-constant, sub-second query latency regardless of the size of your dataset, cutting the time and manpower adopters need for effective analysis.
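To make the precalculation idea concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not Kylin's actual implementation: the fact table, the dimension names, and the `build_cube` and `query` helpers are all hypothetical. The point it tries to show is that once every combination of dimensions has been aggregated ahead of time, a query becomes a lookup whose cost does not depend on the number of raw rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy fact table: (country, category, year, sales).
DIMENSIONS = ("country", "category", "year")
ROWS = [
    ("US", "toys", 2013, 10),
    ("US", "books", 2013, 5),
    ("CN", "toys", 2014, 7),
    ("US", "toys", 2014, 3),
]


def build_cube(rows, dimensions):
    """Precompute SUM(sales) for every subset of dimensions.

    Each subset of dimensions is one "cuboid"; the cube is the set of all cuboids.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dim_indexes in combinations(range(len(dimensions)), size):
            cuboid = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in dim_indexes)
                cuboid[key] += row[-1]  # last column is the sales measure
            cube[tuple(dimensions[i] for i in dim_indexes)] = dict(cuboid)
    return cube


def query(cube, dimensions, **filters):
    """Answer SUM(sales) under equality filters with a lookup, not a scan."""
    dims = tuple(d for d in dimensions if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0)


CUBE = build_cube(ROWS, DIMENSIONS)
print(query(CUBE, DIMENSIONS, country="US"))                   # 18
print(query(CUBE, DIMENSIONS, country="US", category="toys"))  # 13
print(query(CUBE, DIMENSIONS))                                 # 25
```

The trade-off in this sketch is that the build step does all the heavy work once, and the number of cuboids doubles with each added dimension (three dimensions give eight cuboids here), so storage and build time are spent up front in exchange for fast reads.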
This is the origin story of the unexpected hero of modern big data analytics, Apache Kylin, as told by its inventors.
It is such an honor to be sitting down with the founders of both Apache Kylin and Kyligence. Thank you for taking this time to share your experiences with us.
The six of you worked together at eBay on a little open source project now known as Apache Kylin, a globally admired OLAP on big data platform and a top-level project with the Apache Software Foundation. That wasn’t the end of it though. You had a vision for what Kylin could become, so you left eBay together to form an enterprise-ready version of Kylin that you called Kyligence.
Let’s start at the very beginning of this story. How did each of you end up working at eBay on this Kylin project? And secondly, what made you want to be a part of Kylin in the beginning? Out of all the open source projects and all the projects at eBay, what drew you in to this one specifically?
L U K E: Apache Kylin came about in the middle of 2013. eBay’s CCoE (the center that manages the data warehouse, Hadoop and other BI platforms) was facing a lot of challenges on the analytics side. I led big data products at that time, serving under Debashis Saha, who was eBay's VP for the commerce platform.
A side note on Debashis: he's been a major supporter of both Kylin and Kyligence from the beginning. He understood the vision behind the solution more clearly than most people, and that support has carried over into his role as an advisor for Kyligence.
I managed all of the BI and big data platforms inside eBay and offered infrastructure-level support to all of our customers. Before I joined eBay, I had been working in data management and analytics for more than 10 years.
So, to address the issues we were facing, we began brainstorming ideas for how we could help our customers and analysts quickly access their data and unleash the data's value for them.
This brainstorm was the very beginning. We formed a very big collective of projects we called Yangtze, after the river in China. Yangtze was a huge effort spanning ETL, BI tools, the OLAP side and BI front end, as well as the infrastructure level.
Kylin was one of our OLAP projects under Yangtze. In the end, Kylin was the only one that survived.
Kylin must have shown significant promise early on then. What did you think of the project at this stage?
L U K E: At that moment, I didn’t think the industry had a component or technology like this available. In the very beginning, I just wanted people to start thinking about how we could make this technology work and how we could solve the challenges we were encountering in big data.
We didn’t have any thoughts about forming a startup company; we just wanted our customers to be happy. We wanted our technology to be well-leveraged for our customers on the business side so it could make an impact internally at eBay.
I thought it was a very small initiative at first, but when we open sourced it, I realized the world was changing. On the first day after we released Kylin to the world, we got a lot of great feedback on our project – even Hortonworks said it was great.
Then Ted Dunning, chief architect at MapR, invited us to the Apache Software Foundation. That was the moment I realized we had made something different. This project was historic for eBay as well, because we made a huge technical impact on the open source community.
“Historic” is definitely an appropriate label for Kylin. You have all worked very hard and created something special with Kylin. Yang, you were the first to join Luke on Kylin. How did you become interested in the project?
Y A N G: I left Morgan Stanley and joined eBay in early 2014 specifically for the Apache Kylin project. The link between me and Apache Kylin is a guy called Xu Jiang. He and I have known each other for a long time. At that point, he was aware that I was looking for a more technical opportunity, so he approached me about this project he was working on.
Xu was planning to leave eBay and was looking for a technical successor to take his place on the Kylin project. I gladly accepted his offer and took on the technical lead role for Apache Kylin.
Xu and Luke are the two biggest founding cornerstones of Kylin. Together, they came up with the idea for Kylin and it was their support and contributions that formed the very early team and got management support and sponsorship for the project.
Luke led the project management side of things and Xu led the technical side, until I took his place.
You and Luke certainly made a lot of headway in those very early days. Hongbin, you joined around that same time. How did you hear about Kylin?
H O N G B I N: I first heard about Kylin when I interviewed at eBay. Working on Apache Kylin was my first job after graduation. Before I joined the eBay team, I was an intern at Microsoft where I gained some background knowledge that was relevant to Kylin.
When I joined the team, Luke and Xu Jiang were just starting to work on the project. Xu was starting to work on the Kylin prototype when he interviewed me.
I was really drawn in by his personality and passion for the project more than by the project itself at first, because it was so new. I remember Xu once told us that Kylin was a once in a lifetime opportunity for all of us.
J A S O N: I couldn’t agree more. I joined the team in 2015. I was already at eBay on another team that worked near the Kylin team. We all worked long hours, but I noticed that the Kylin team in particular very often worked late, and I was curious about why they worked so hard.
Sometimes I could hear them arguing in a meeting room and I would wonder why. Most of the time we were peaceful at work, and arguing was not common. So, I was interested in the team before I knew what Kylin was.
Once I understood the project, I was very interested in it as well. Within eBay, there are many data workers and we use many different data tools. We were always working with data but on traditional software like Teradata and sometimes a new tool like Spark or Hadoop. But we faced some problems that meant we couldn’t react quickly to business needs.
When I learned what Kylin was trying to accomplish, I knew it could really solve some critical and common problems, and I saw Kylin’s great capacity for growth in the future. I was very excited when I got the opportunity to join this team.
Shaofeng and Dong, you both joined the project a little later. What brought you to Kylin?
S H A O F E N G: I started working on Kylin when I joined eBay in November of 2014, just after the project became open source. Before that, I worked in the IBM lab in the cloud domain, but I was more interested in big data.
I joined this team because I thought big data would be the next big trend in IT. The project itself, a powerful analytics engine on big data, drew me in as well.
D O N G: I joined Apache Kylin at the end of September 2015 when I joined eBay as a software engineer. Kylin was already an open source, incubating project with the Apache Software Foundation.
As a developer at that point in my career, I had the passion and willingness to make contributions to the open source community. I wanted to make my contributions and my code visible to everyone and benefit users around the world because the open source community has a global reach.
I also wanted to join this team specifically because this is an ASF open source project. Joining this project helped me become an Apache Committer, which means I got an apache.org email address, which I think is a very cool thing – it’s the recognition of a developer.
So, there was a sense of what you might call “street cred” for a developer joining this team, but it was also about the project itself.
Apache Kylin belongs to the big data technology industry. At that time, big data was the major change taking place in tech. As a young developer, I thought that big data was something very cool and something that I could make the focus of my career for many years to come.
I felt this was the right direction for me and that I could work my way up quickly. Just three months after I joined, the project graduated from ASF incubation and became a top-level project.
Could you explain what the designation “top-level project” means for ASF?
D O N G: Of course. “Top-level project” is a recognition from the ASF. It verifies that Kylin is a mature open source product. In order to graduate from incubation, a project needs to meet some criteria such as its license, security, management policy and quality. So, the graduation was proof that Apache Kylin had done a good job in those areas.
How did Apache Kylin become an open source project? What was that process like for your team?
L U K E: Apache Kylin was contributed by eBay to the Apache Software Foundation. Before the six of us left eBay, we operated using the open source Apache Kylin as our single source of truth.
We didn’t have two copies, one for open source and one for eBay. We only had one, which was very important and made life very easy.
In September 2014, we decided to take Apache Kylin live into production within eBay. The next month, on October 1st, with the permission of our senior management, I uploaded all the source code to GitHub. We quickly got a lot of responses from the community.
Then our mentor, Ted Dunning, invited us to the Apache Software Foundation later in October 2014. With his great support and guidance, we submitted a proposal to the Apache Software Foundation, and they accepted us.
We became an incubated project with the Apache Software Foundation in November 2014. Less than a year later, in October 2015, we decided to raise a graduation vote and we passed it. So, in December 2015 we graduated as an official Apache Software Foundation top-level project.
Did anything change about the project once you made it open source?
L U K E: When we contributed Kylin to the Apache Software Foundation, the only change to the project was that it really opened us up to the community. We suddenly gained a lot of adopters and a lot of contributors.
The entire community came to us. This made us feel very, very good because at eBay, we were just an infrastructure team serving a few business customers but, suddenly, we found a lot of huge companies were leveraging our technology.
When did you make the decision to leave eBay and form Kyligence? How did that decision affect the relationship between eBay and Kylin?
S H A O F E N G: In 2016, the original six members left eBay and formed Kyligence Inc. We wanted to make something new because we saw the opportunities in this new technology, where it was trending and the market for it, so we formed Kyligence to support customers in the finance and telecom domains and beyond.
The Kylin team at eBay kept growing. Soon, some new engineers were invited to be Kylin Committers or PMC members, following The Apache Way.
Today, we maintain very good cooperation on the project because we have the same interest in it. The PMC has members from different companies including Kyligence, eBay, Meituan, JD and others. We all hope this project will grow quickly and we work together to help move it forward.
What an origin story! I look forward to continuing this conversation in our next episode, The Journey Begins, where we will dive into the details of your individual work on Apache Kylin and learn some of the secrets that make this project tick. Stay tuned!
About the Founders
Luke Han is the Co-Founder and CEO of Kyligence and Co-Founder of Apache Kylin, the first Apache Software Foundation top-level project developed in China. He is responsible for Kylin's strategic planning, development roadmap, product design, and more, and is committed to developing the Apache Kylin global community and ecosystem. He has served as Head of Big Data Products in eBay's Global Analytics Infrastructure Division, Chief Advisor to Actuate China, and Technical Director of Power Excellence East China.
Yang Li is the Co-Founder and CTO of Kyligence, Co-Founder of Apache Kylin and member of the Project Management Committee (PMC). Previously, he was the Senior Architect of Big Data in eBay's Global Analytics Infrastructure, Vice President at Morgan Stanley, and during his time with IBM, he received the Outstanding Technology Contribution Award. Yang has more than 10 years of hands-on experience in big data analytics; he has focused on parallel computing, data indexing, relational mathematics, approximation algorithms, compression algorithms and other cutting-edge technologies. Over the past 15 years, Yang has directly driven the development of OLAP technology in the big data space.
Dong Li is the Founding Member and Senior Director of Product and Innovation at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data technology development. Previously, he was a Senior Engineer in eBay's Global Analytics Infrastructure Department, a Software Development Engineer for Microsoft Cloud Computing and Enterprise Products, and a core member of the Microsoft Business Products Dynamics Asia Pacific team where he participated in the development of a new generation of cloud-based ERP solutions.
Shaofeng Shi is a Partner and Chief Software Architect at Kyligence, Apache Kylin Core Developer (Committer) and Chairman of the Project Management Committee (PMC Chair) where he focuses on big data analytics and cloud computing technologies. Previously, he was a Senior Data Engineer in eBay's Global Analytics Infrastructure Department and a Cloud Computing Software Architect at IBM.
Hongbin Ma is the Vice President of Research and Development at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data infrastructure and platforms. He joined eBay as Apache Kylin's Chief Committer. Previously, he was a core contributor to Trinity, the graph database from Microsoft Research Asia. He has contributed to Apache Kylin's storage engine, query optimization, test coverage and other areas and is currently the technical leader of Kyligence Enterprise data warehouse products.
Jason Zhong is a Partner and Senior Director at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC). He has worked in eBay's Global Analytics Infrastructure Division and been involved in operational automation product development as well as Kylin's development. After joining Kyligence, he worked in both research and development before becoming responsible for business sales and business development transformation. He has won consecutive Kyligence sales titles and is currently the Head of Kyligence South Division.
About the Author
Samantha Berlant is the Marketing Communications Manager at Kyligence and a big fan of AI, machine learning, and science-fiction. She spent several years leading content analytics projects at Facebook and Instagram and has been a writer and editor for over a decade. Samantha believes in the power of accessible data and her favorite Star Trek character is, coincidentally, Data.