The World’s Largest Company Without a Data Warehouse: What We Learned from the Complete Removal of Oracle from Amazon.com

Author
Brian Chen
Kyligence
Sep. 24, 2020

Jeff Barr, Chief Evangelist at Amazon Web Services (AWS), published an article on his official blog entitled "Migration Complete – Amazon's Consumer Business Just Turned off its Final Oracle Database," officially announcing that the migration of the core transaction system database was complete. This isn't the first such migration the company has undertaken. Amazon had already completed the migration of its traditional data warehouses to Redshift in 2018.

This post discusses the significance of this major milestone and why businesses are migrating traditional data warehouses to the cloud. I've worked in data warehousing and BI for more than a decade and have taken part in many traditional data warehouse migration projects for large enterprises.

As early as the AWS re:Invent conference in 2018, AWS announced that all of Amazon's Oracle-based data warehouses would be turned off and migrated to Redshift-based cloud data warehouses. In the talk "Under the Hood: How Amazon Uses AWS Services for Analytics at a Massive Scale," AWS walked through the migration process for the Oracle data warehouse. I was fortunate to attend this talk in person, and below I share some thoughts on data warehouse migration based on my own experience.


TL;DR What Have We Learned


1.    There are compelling reasons to migrate off traditional DWs

Chief among reasons to migrate are the limitations of performance, scalability, and concurrency. Once the DW is hit by more than a handful of users, the headaches begin. DWs also have high price tags and high maintenance and support costs.

2.    It works! Cloud analytics platforms can now meet and surpass the requirements of DWs

Amazon reduced database costs by over 60%, and AWS customers regularly report cost savings of 90% by switching from Oracle to the cloud. Latency of Amazon's consumer-facing applications was reduced by 40%. The use of managed cloud services reduced database admin overhead by 70%.

3.    Migrating DWs to the cloud is challenging, but doable

Amazon identified technical, business, and financial challenges as the categories of hurdles they needed to overcome:

  • Technical Challenges – Rationalizing different data types, platform functions, SQL syntax, etc.
  • Business Challenges – Avoiding or minimizing disruption to normal business operations
  • Financial Challenges – Migration costs involved staffing, licensing, and tweaking the expense model


Why migrate away from traditional data warehouses?

Amazon shared three major reasons why Oracle-based traditional data warehouses should be migrated:

Poor Scalability

Oracle is based on an all-in-one architecture that tightly couples compute and storage. It cannot scale quickly enough to keep up when data volumes grow explosively.

High Maintenance Cost

It takes more than 100 hours of downtime per month to upgrade and patch the system.

Large Capital Expense

Many companies spend billions of dollars on data warehouses every year (the exact cost to Amazon is not known).

Figure 1: Challenges Faced by AWS Based on Oracle Data Warehouse

In recent years, the migration of traditional data warehouses has spread like a prairie fire across many industries, most visibly in the internet industry. In addition to the points above, the following are also common reasons why enterprises need to migrate traditional data warehouses:


Closed Technology

The traditional data warehouse technology system is relatively closed. Enterprises need technology that is more flexible, independent, and controllable, so they can make enhancements or optimizations according to their needs, instead of being restricted to the product development roadmap of suppliers.


Monolithic Usage

The traditional data warehouse is mainly used to store and process structured data and often serves enterprise reporting applications. With the continuous push for digital transformation, enterprises place increasingly demanding requirements on their data warehouse or data platform and expect it to support more innovative applications such as self-service analysis, real-time computation, graph computation, and machine learning.


Governance in Global Markets

In the domestic financial industry in particular, traditional data warehouse systems and software are essentially monopolized by foreign products. As the ongoing trade war continues to escalate, the domestication of core financial technologies has been raised to a strategic level, and the domestication of data warehouses in the financial industry has become imperative for the future.


Challenges faced by data warehouse migration

Data warehouse migration is a huge undertaking, especially for a large-scale data warehouse. It requires a great deal of manpower and material resources, and the process brings a variety of challenges. Some enterprises have lost an entire team to the hard, tedious work of the data reconciliation phase of migration, in which results from the old and new systems are compared.
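Part of that verification work can be scripted. The following is a minimal sketch, not the tooling Amazon used: it compares row counts and an order-independent checksum for each table in the old and new systems. The function names are hypothetical, and two in-memory SQLite databases stand in for the source and target warehouses.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table):
    """Return (row_count, checksum) for one table.

    The checksum adds up a hash of every row, so it does not depend on
    the order in which the database returns the rows.
    """
    count = 0
    checksum = 0
    # Table names cannot be bound as parameters; `table` is trusted input here.
    for row in conn.execute(f"SELECT * FROM {table}"):
        digest = hashlib.md5(repr(row).encode("utf-8")).hexdigest()
        checksum = (checksum + int(digest, 16)) % (1 << 64)
        count += 1
    return count, checksum


def reconcile(old_conn, new_conn, tables):
    """Return the names of tables whose fingerprints differ."""
    return [t for t in tables
            if table_fingerprint(old_conn, t) != table_fingerprint(new_conn, t)]


def demo():
    """Build two stand-in databases and report the tables that disagree."""
    old, new = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
    for conn in (old, new):
        conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
        conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
        conn.executemany("INSERT INTO customers VALUES (?, ?)",
                         [(1, "a"), (2, "b")])
    old.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    # Simulate a migration defect: one amount differs in the new system.
    new.executemany("INSERT INTO orders VALUES (?, ?)", [(2, 20.0), (1, 9.0)])
    return reconcile(old, new, ["orders", "customers"])
```

In a real migration the two systems usually return values in different client-side types (for example, decimals versus floats), so rows would need to be normalized before hashing, and very large tables would be fingerprinted inside the database rather than fetched row by row.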

Generally, data warehouse migration needs to overcome the following three major challenges:


Technical Challenges

How can we design a sound technical approach that ensures the smooth migration of models, data, scripts, and upper-level applications? In particular, it's necessary to address the technical differences between the old and new systems, such as differences in data types, functions, SQL syntax, stored procedures, and result consistency. This is especially important when the tech stacks of the two systems differ, as in the current mainstream migration from traditional databases to big data platforms.
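To make the data type differences concrete, here is a small sketch of one step a migration script might perform: rewriting Oracle column types in a DDL statement. The mapping table and the function name are assumptions made for illustration, not the mapping Amazon used; any real mapping should be checked against the target platform's documentation.

```python
import re

# Illustrative mapping only (an assumption for this sketch).
TYPE_MAP = {
    "VARCHAR2": "VARCHAR",
    "NUMBER": "NUMERIC",
    "DATE": "TIMESTAMP",  # Oracle DATE also stores a time of day
}

# Match any mapped type name as a whole word, ignoring case.
_TYPE_PATTERN = re.compile(
    r"\b(" + "|".join(TYPE_MAP) + r")\b", re.IGNORECASE
)


def translate_column_types(ddl):
    """Replace Oracle type names in a DDL string with their mapped names."""
    return _TYPE_PATTERN.sub(lambda m: TYPE_MAP[m.group(1).upper()], ddl)


if __name__ == "__main__":
    oracle_ddl = (
        "CREATE TABLE orders (id NUMBER(10), note VARCHAR2(200), created DATE)"
    )
    print(translate_column_types(oracle_ddl))
```

A regex rewrite like this is deliberately naive: it would also rename a column that happens to be called date, and it ignores qualifiers such as VARCHAR2(20 CHAR). Production tools parse the SQL instead, which is one reason this category of work is listed as a challenge in its own right.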


Business Challenges

In the process of migration, how can we reduce the impact on the business and achieve as smooth a migration as possible? For this, you'll need not only technical support, but also communication with management. System instability may occur from time to time, especially just after the new platform has launched. It is essential to first establish a rapid response mechanism and strategy to reduce the impact on the business.


Cost Challenges

The migration process requires you to invest a lot of manpower and resources. How can we improve the automation of migration, boost the migration efficiency, and effectively control the migration cost? This requires the development of various automated tools, such as data lineage analysis tools and SQL script migration tools, to greatly reduce the dependence on manpower. 
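As a rough illustration of what a data lineage analysis tool does, the sketch below extracts the target and source tables from simple INSERT ... SELECT scripts and derives an order in which tables can be migrated so that every table comes after the tables it depends on. The function names and the regex-based extraction are assumptions for this example; it requires Python 3.9 or later for graphlib.

```python
import re
from graphlib import TopologicalSorter

_TARGET = re.compile(r"\bINSERT\s+INTO\s+([\w.]+)", re.IGNORECASE)
_SOURCE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql):
    """Return (target_table, source_tables) for one INSERT ... SELECT script."""
    match = _TARGET.search(sql)
    if match is None:
        raise ValueError("no INSERT INTO target found")
    target = match.group(1).lower()
    sources = {name.lower() for name in _SOURCE.findall(sql)}
    return target, sources - {target}


def migration_order(scripts):
    """Return table names ordered so that sources precede their targets."""
    graph = {}  # table -> set of tables it reads from
    for sql in scripts:
        target, sources = extract_lineage(sql)
        graph.setdefault(target, set()).update(sources)
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    scripts = [
        "INSERT INTO mart.sales_report SELECT * FROM dw.daily_sales",
        "INSERT INTO dw.daily_sales SELECT o.day, SUM(o.amount) "
        "FROM stg.orders o JOIN stg.customers c ON o.cid = c.id GROUP BY o.day",
    ]
    for table in migration_order(scripts):
        print(table)
```

Regular expressions will misread constructs such as EXTRACT(day FROM ts), common table expressions, or dynamic SQL in stored procedures, so a real lineage tool would rely on a proper SQL parser. The dependency graph itself, however, is the piece that lets migration and verification be scheduled automatically rather than by hand.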


How do enterprises make migration successful?

The following figure shows the five best practices that are essential for an optimal AWS migration experience. Ensure communication systems are in place throughout the enterprise and you have a solid strategy, ready to implement, that accounts for the expected adjustment period and also minimizes the impact of the transition on the business.

Figure 2: Best Practices of AWS Data Warehouse Migration


Summary

The successful migrations carried out by internet companies such as Amazon vary in strategy, methods, and processes, so we can't settle on one standard migration practice. However, one thing they all have in common is a technically strong team that has worked through careful design, careful planning, continuous verification, iteration, and even trial and error.

Since joining Kyligence, I have participated in a number of projects for migrating traditional data warehouses to big data platforms. We have accumulated a lot of valuable first-hand experience on the migration of traditional data warehouses and have formed an overall plan for products, tools, methodologies, and services.


Here are some additional related resources:

Snowflake: The Good, Bad, and the Beautiful

The Evolution of Precomputation Technology

Migration Complete – Amazon’s Consumer Business Just Turned off its Final Oracle Database

Under the Hood: How Amazon Uses AWS Services for Analytics at a Massive Scale