1. Apache Hudi was introduced.
2. The Alluxio distributed caching service was integrated to cache hot data in memory or on local disk, while cold data stays in S3. This improved query performance and reduced the number of S3 API calls, which in turn minimized network bandwidth usage and further reduced costs.
3. Reads and writes were separated. Kyligence Cloud uses two Spark clusters: one for data processing and computation, the other for online queries.
4. Airflow was used to schedule the pipeline fully automatically, reducing the end-to-end latency of incremental data updates to less than 1 hour.
5. All data calculations can be performed within the VPC, which is accessible only via a VPN, ensuring high security.
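
To make the caching idea in item 2 concrete, here is a minimal sketch of a read-through cache that keeps recently used ("hot") objects in a small in-memory tier and falls back to a slower cold store on a miss. It is not Alluxio's API and not the actual deployment: the `ReadThroughCache` class, the `cold_store` dictionary standing in for S3, and the segment names are all hypothetical, and LRU eviction is an assumption chosen for simplicity.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache: a hot tier in front of a slower cold store."""

    def __init__(self, fetch_cold: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch_cold = fetch_cold      # called only on a cache miss
        self._capacity = capacity          # max number of objects in the hot tier
        self._hot: "OrderedDict[str, bytes]" = OrderedDict()
        self.cold_reads = 0                # how many times the cold store was hit

    def read(self, key: str) -> bytes:
        if key in self._hot:
            # Hit: mark the object as most recently used and serve it locally.
            self._hot.move_to_end(key)
            return self._hot[key]
        # Miss: fetch from the cold store, then keep a copy in the hot tier.
        self.cold_reads += 1
        value = self._fetch_cold(key)
        self._hot[key] = value
        if len(self._hot) > self._capacity:
            # Evict the least recently used object to stay within capacity.
            self._hot.popitem(last=False)
        return value


# Hypothetical stand-in for an object store such as S3.
cold_store = {
    "segment-a": b"alpha",
    "segment-b": b"beta",
    "segment-c": b"gamma",
}

cache = ReadThroughCache(cold_store.__getitem__, capacity=2)
requests = ["segment-a", "segment-a", "segment-b", "segment-a", "segment-c", "segment-b"]
for key in requests:
    cache.read(key)

print(f"{cache.cold_reads} cold reads for {len(requests)} requests")
```

Repeated reads of the same object are served from the hot tier, so the number of cold-store calls is lower than the number of requests. That is the effect item 2 describes for S3 API calls and network traffic, although a production cache also has to handle concurrency, invalidation, and tiering across memory and disk.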