MLSQL is a new SQL variant designed for big data and AI scenarios. It is open source under the Apache License 2.0. With MLSQL, users can perform self-service machine learning and AI tasks on large-scale datasets on top of Ray and Spark by writing just a few lines of SQL, without worrying about the differences in programming paradigms between PySpark and Ray. MLSQL optimizes its distributed engine by combining Spark and Ray and improving the efficiency of the underlying data exchange between them. Users can also run the same piece of code on any Ray cluster of their choice. In this presentation, Kyligence Head of Product Dong Li will outline the basics of MLSQL with a live demo and take a deep dive into how MLSQL implements Spark+Ray on the engine side to build a single, efficient substrate for big data and AI.
Ray Summit brings together developers, ML practitioners, data scientists, DevOps, and cloud-native architects interested in building scalable data & AI applications with Ray, the open-source Python framework for distributed computing. Topics include: ML in production, MLOps, deep & reinforcement learning, cloud computing, serverless, and Ray libraries.