This article is part of a series of conversations with the founding members of Apache Kylin and Kyligence on the origins of Apache Kylin. You can find the first four installments here: Episode One, Episode Two, Episode Three, Episode Four.
Episode Five: An Education in Humanity
Apache Kylin is an open source distributed analytical data warehouse built with big data in mind. Via a clever combination of multi-dimensional cubes, plug-in architecture, and precomputation technology, Kylin can provide near-constant, sub-second query latency no matter the size of your dataset, cutting the time and manpower adopters need for effective analysis.
This recipient of multiple industry awards has been adopted by over a thousand organizations worldwide seeking a solution to the problem of storing and analyzing big data fast enough for their insights to make an impact on their business.
This is the origin story of the unexpected hero of modern big data analytics, Apache Kylin, as told by its inventors.
We are now moving deeper into our conversation with the founding members of the open source project, Apache Kylin. I imagine there has been no shortage of lessons to learn as you built a new technology like this up from scratch.
What are some things you have learned from working on the Apache Kylin project, both personally and professionally?
Y A N G: Personally, I think I learned how to be more social, more influential. It’s not that I didn’t know social marketing and communication were important; Apache Kylin was just the first real practice I had with them and the first successful results I got out of them. That was the special thing about it.
Technically, working on Kylin was definitely a big challenge, and it was a very big achievement for me as well. Apache Kylin gave me the first big batch of external recognition I’ve had in my life. It was the first time I was aware that a piece of work I created was widely appreciated. People appreciated, for example, how we use HBase coprocessors to improve storage performance, how we use dictionary technology to get a very good compression rate for pre-calculated data, how we use and extend Calcite so it can optimize a query given the cube pre-calculation we have, and how we use a plugin architecture to make the whole system extensible for future developments. So, I’ve been everywhere in the system. It’s difficult to say which part I liked most. From a technical point of view, I liked every part of it. It’s more the public recognition that surprised me.
I can imagine the level of global recognition Kylin has received has been quite a different experience from a typical day in the life of a developer.
H O N G B I N: Definitely. When I started out with Kylin, I was a junior software engineer, so everything was new to me. I’ve played so many roles over the last few years. I guess, looking back, Senior Engineer was the easiest chapter in my career path because I had a lot of time to focus on pure technology problems. When you move into a management role, you have to consider other factors like the stability of the team, the roadmap of the product, and I think that’s much more difficult than solving a pure technology problem.
On the technical side, I’ve learned a lot, of course. What I also learned from this project is that open source works in China. Before our project, there were very few successful open source projects in China, but after what we achieved, I noticed a lot of other companies and other people being successful in the open source community – meaning both with Apache Kylin and the open source community as a whole. The community is very active now and many, many people are involved.
So, Kylin has been the trend-setter in that regard. Jason, what has this project taught you?
J A S O N: I think I’ve learned the most simply from working with this team. Before I joined the Kylin team, I sometimes heard them arguing in the meeting room. We work in peace most of the time at our office, so I was not used to arguing. After I joined the Kylin team, I saw how they worked. Sometimes they argued with each other just to find the right and best result. From this, I learned that sometimes we don’t have to be so peaceful; we need to argue – professionally – for the results. Don’t argue for the person, argue for the result.
That is an important distinction. When a group of passionate people work together, things can easily get out of hand if you don’t see eye to eye, but it sounds like you all kept your eyes on the goal and, through your discussions, heated or otherwise, you built some incredible technology.
S H A O F E N G: There’s definitely a lot to learn about the technology by working on Apache Kylin. On the non-technology side, I have learned how to be open-minded and listen to our users. They give us a lot of feedback and we can learn many things by understanding their perspective. Developers have a different perspective. We tend to think from the technology side, not from the end user’s side. Sometimes a community user will tell us about their experiences or their problems and that helps us think through the problem from another angle. This helps us improve our design to be more usable and more suitable for different users.
Secondly, I’ve learned to collaborate. Because this is an open source project, we have developers from many different companies. When they want to join the project and make a contribution, we need to guide them through setting up their environment and help them download the code and make their first contribution. Then, we need to review their code, give them suggestions, and help them improve their implementation until their feature is merged into Kylin. This is very different from working on commercial software projects.
Thirdly, I’ve learned the importance of going offline to support the online open source project. The website and code repository are not enough on their own to teach people about Kylin. We also need to make more people aware of the project by hosting or joining events such as Meetups, conferences, or webinars and talking through our use cases there.
D O N G: Yes, exactly. We can learn so much more from those direct conversations with our users. I learned a lot from this experience overall. When I get in touch with the users of the community, I get a better understanding of the scenarios in which the technology is used and the value it gives them. In this role with Kylin, I can quickly get close to the users and get direct feedback. When I was just a developer at large companies, I didn’t have any opportunities to get close to the users and learn about the end-to-end scenarios. I was just writing code to fulfill some requirements. But with Kylin, I have more chances to see the whole picture. That is the biggest learning opportunity for me. It lets me know what the product means to the users, what they expect of it, and which scenarios they use it for. Talking directly with the users means I can quickly get answers to these questions and feed that information right back into the community of Kylin contributors and the PMC to help us guide the project to meet the needs of our users.
It is so important to keep that human user in mind, and it’s wonderful that you are all so passionate about connecting with your user base. I believe that is one crucial element of your success so far. Luke, as the initial driver of this project, I’m sure you’ve learned a lot and had to adapt to many different challenges over the years. What stands out to you as the most significant lessons you’ve taken away from your time with Kylin?
L U K E: There are several things I’ve learned. Apache Kylin is the first Apache top-level project contributed by Chinese developers, so the culture conflict between West and East has been very interesting.
Working with great people from the Apache Software Foundation, I learned a lot about how to communicate effectively with people from around the world – how to bring your opinion to them, how to understand each other, how to avoid the culture conflict, and how to hold open discussion and make open decisions.
Learning all of this has benefited not just me and the managers of Apache Kylin; I also bring these lessons, values, and this mindset to Kyligence, which is an international company. I very much agree with respectful, open decision making. It’s very important for forming a solid team. By working on Kylin, I really learned a lot about how to conquer culture conflict and how to keep people talking openly and with respect.
This project is also very important for the global open source community to learn how a project from China can work with the global community. We are not a mystery; we aren’t incomprehensible. We’re all the same human beings. We just speak a different language, but we are the same.
Given these challenges, Kylin’s outcome could have been a very different one if you had not made such an effort to learn and teach effective cross-culture communication. We are all human, brought together by technology.
In our next conversation, I’d like to learn more about the other obstacles you’ve all faced and overcome, the challenges you’ve conquered, to bring this project to fruition. Stay tuned for our next episode!
Q&A with Apache Kylin Committer, Kaige Liu – How Apache Kylin Is Rapidly Changing the Way We Approach Big Data
Learn About Real-Time Streaming - What’s New with Apache Kylin 3.0?
4-Part Series on Count Distinct – Making Distinct Counting Work for Big Data
Further Reading Is Available on Our Apache Kylin Blog
Roaring Elephant Podcast with Dong Li – Episode 93 – Apache Kylin: Extreme OLAP Engine for Big Data
About the Founders
Luke Han is the Co-Founder and CEO of Kyligence and Co-Founder of Apache Kylin, the first Apache Software Foundation top-level project developed in China. He is responsible for Kylin's strategic planning, development roadmap, product design, and more, and is committed to developing the Apache Kylin global community and ecosystem. He has served as Head of Big Data Products in eBay's Global Analytics Infrastructure Division, Chief Advisor to Actuate China, and Technical Director of Power Excellence East China.
Yang Li is the Co-Founder and CTO of Kyligence, Co-Founder of Apache Kylin, and member of the Project Management Committee (PMC). Previously, he was the Senior Architect of Big Data in eBay's Global Analytics Infrastructure, Vice President at Morgan Stanley, and during his time with IBM, he received the Outstanding Technology Contribution Award. Yang has more than 10 years of hands-on experience in big data analytics; he has focused on parallel computing, data indexing, relational mathematics, approximation algorithms, compression algorithms, and other cutting-edge technologies. Over the past 15 years, Yang has directly driven the development of OLAP technology in the big data space.
Dong Li is the Founding Member and Senior Director of Product and Innovation at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data technology development. Previously, he was a Senior Engineer in eBay's Global Analytics Infrastructure Department, a Software Development Engineer for Microsoft Cloud Computing and Enterprise Products, and a core member of the Microsoft Business Products Dynamics Asia Pacific team where he participated in the development of a new generation of cloud-based ERP solutions.
Shaofeng Shi is a Partner and Chief Software Architect at Kyligence, Apache Kylin Core Developer (Committer), and Chairman of the Project Management Committee (PMC Chair) where he focuses on big data analytics and cloud computing technologies. Previously, he was a Senior Data Engineer in eBay's Global Analytics Infrastructure Department and a Cloud Computing Software Architect at IBM.
Hongbin Ma is the Vice President of Research and Development at Kyligence, an Apache Kylin Core Developer (Committer) and member of the Project Management Committee (PMC) where he focuses on big data infrastructure and platforms. He joined eBay as Apache Kylin's Chief Committer. Previously, he was a core contributor to Trinity, Microsoft Research Asia's graph database. He has contributed to Apache Kylin's storage engine, query optimization, test coverage, and other areas and is currently the technical leader of Kyligence Enterprise data warehouse products.
Jason Zhong is a Partner and Senior Director at Kyligence, an Apache Kylin Core Developer (Committer), and a member of the Project Management Committee (PMC). He has worked in eBay's Global Analytics Infrastructure Division and been involved in operational automation product development as well as Kylin's development. After joining Kyligence, he worked in both research and development before becoming responsible for business sales and business development transformation. He has won consecutive Kyligence sales titles and is currently the Head of the Kyligence South Division.
About the Author
Samantha Berlant is the Marketing Communications Manager at Kyligence and a big fan of AI, machine learning, and science fiction. She spent several years leading content analytics projects at Facebook and Instagram and has been a writer and editor for over a decade. Samantha believes in the power of accessible data and her favorite Star Trek character is, coincidentally, Data.