Friday, January 8, 2021

Oracle BlueKai Data Management Platform scales to 1 Million transactions per second with Oracle Database Sharding deployed in Oracle Cloud Infrastructure


I recently had a conversation with Matt Abrams, who is Group Vice President of Engineering for Oracle BlueKai Data Management Platform, a part of Oracle Cloud Applications.

We talked about the use of the Sharding feature of Oracle Database for the Oracle BlueKai Data Management Platform.

This may be one of the biggest relational OLTP database deployments in the world.  

Below is an excerpt.

Could you please start by describing the scale of the deployment?

Matt: The Oracle Database shards are deployed across multiple availability domains in Oracle Cloud. Below are key metrics for primary databases:

Metric                   Value
Transactions             1 Million/second
Events                   30 Billion/day
API Calls                30 Billion/day
API Payload Size         125 Kilobytes/API call (Average)
Average read time        1.6 Milliseconds/API call
Average write time       2.5 Milliseconds/API call
Total database size      2.5 Petabytes
Rows in largest table    22 Billion
Redo generation rate     180 Terabytes/hour
Network traffic          1 Terabit/second
Total machines           52 Oracle Compute Instances
Total CPU                2,704 Cores
Total memory             38,740 Gigabytes

Oracle Database shards are configured for High Availability and Disaster Recovery to meet 99.99% uptime SLAs. 
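
To put a few of these figures in perspective, here is a small back-of-the-envelope calculation in Python. It only re-derives averages from the numbers quoted above. It assumes load is spread evenly over the day and over the 52 instances (which real traffic is not) and a 365-day year for the SLA budget; the variable names are illustrative.

```python
# Figures quoted in the interview.
API_CALLS_PER_DAY = 30_000_000_000
INSTANCES = 52
TOTAL_CORES = 2_704
TOTAL_MEMORY_GB = 38_740
UPTIME_SLA = 0.9999

# Assumptions for the arithmetic (not from the interview).
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60

# Average API call rate if calls were spread evenly across the day.
avg_calls_per_second = API_CALLS_PER_DAY / SECONDS_PER_DAY

# Average resources per compute instance.
cores_per_instance = TOTAL_CORES / INSTANCES
memory_gb_per_instance = TOTAL_MEMORY_GB / INSTANCES

# Downtime allowed per year under a 99.99% uptime SLA.
downtime_budget_minutes = (1 - UPTIME_SLA) * MINUTES_PER_YEAR

print(f"Average API calls/second: {avg_calls_per_second:,.0f}")
print(f"Cores per instance:       {cores_per_instance:.0f}")
print(f"Memory per instance (GB): {memory_gb_per_instance:.0f}")
print(f"Downtime budget (min/yr): {downtime_budget_minutes:.2f}")
```

Under these assumptions, 30 billion API calls a day averages out to roughly 347,000 calls per second, each instance has 52 cores and 745 GB of memory on average, and a 99.99% SLA leaves about 52.6 minutes of downtime per year.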

Very cool. Seems like a very interesting application!

Matt: Yes, this is the Data Management Platform (DMP), formerly known as BlueKai, a company that was acquired by Oracle in 2014. The DMP ingests and organizes audiences, provides insights and analytics on them, and activates them on a large number of advertising platforms.

Efficiently processing data on this scale requires a significant amount of innovation in various areas including: 

◉ Automatic content classification algorithms
◉ Bespoke probabilistic data structures used to perform unique counting and overlap analysis at a massive scale
◉ Autonomic infrastructure services that automatically adapt to the dynamic nature of our data processing environments
◉ Robust audience taxonomies
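
The interview does not say which probabilistic data structures the DMP uses. As one illustration of the general idea behind unique counting and overlap analysis, here is a minimal Python sketch of a K-minimum-values (KMV) sketch, a well-known technique chosen here only as an example. It keeps the k smallest hash values it has seen and estimates the number of distinct items from how small the k-th value is. The class name, the parameter k, and the `user-…` identifiers are all hypothetical.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64  # hashes are 64-bit integers


class KMVSketch:
    """K-minimum-values sketch: remembers only the k smallest hashes seen."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []     # negated hashes, so the largest kept hash is at index 0
        self._kept = set()  # hashes currently kept, used to skip duplicates

    def add(self, item):
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        self._add_hash(int.from_bytes(digest, "big"))

    def _add_hash(self, h):
        if h in self._kept:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._kept.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._kept.discard(evicted)
            self._kept.add(h)

    def estimate(self):
        """Estimated number of distinct items added so far."""
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct items: exact
        kth_smallest = -self._heap[0] / HASH_SPACE  # normalized to [0, 1)
        return (self.k - 1) / kth_smallest

    def union(self, other):
        """Sketch of the union of the two underlying sets."""
        merged = KMVSketch(min(self.k, other.k))
        for h in self._kept | other._kept:
            merged._add_hash(h)
        return merged


def estimate_overlap(a, b):
    """Overlap estimate via inclusion-exclusion: |A| + |B| - |A union B|."""
    return a.estimate() + b.estimate() - a.union(b).estimate()


if __name__ == "__main__":
    segment_a, segment_b = KMVSketch(k=4096), KMVSketch(k=4096)
    for i in range(50_000):
        segment_a.add(f"user-{i}")
    for i in range(25_000, 75_000):
        segment_b.add(f"user-{i}")
    print("Estimated unique users in A:", round(segment_a.estimate()))
    print("Estimated overlap of A and B:", round(estimate_overlap(segment_a, segment_b)))
```

The appeal of this family of structures is that memory use depends on k rather than on the number of items, and sketches can be merged without revisiting the raw data. The price is an approximate answer whose error shrinks as k grows.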
 
These are impressive numbers. How did Oracle Sharding help DMP meet its business objectives?

Matt: Oracle Sharding helped us improve in the following areas.

Scalability: Sharding’s linearly scalable architecture allows us to continue to scale as transactions and data volume grow, with no impact on latency. Having linear scalability is absolutely critical for us to be able to support our business growth.

Availability: Sharding improved application availability by providing fault isolation. Any issue with a given shard has no impact on the availability of other shards.
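
The fault-isolation point can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not how Oracle Sharding routes requests: Oracle's system-managed sharding uses consistent hashing, while the example below uses a plain hash-modulo mapping for brevity. The shard count, key names, and function names are hypothetical.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, for illustration only


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a sharding key to a shard number with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def available_fraction(keys, failed_shards, num_shards=NUM_SHARDS):
    """Fraction of keys still reachable when the given shards are down."""
    reachable = sum(1 for k in keys if shard_for(k, num_shards) not in failed_shards)
    return reachable / len(keys)


if __name__ == "__main__":
    keys = [f"profile-{i}" for i in range(10_000)]
    print("All shards up:     ", available_fraction(keys, failed_shards=set()))
    print("Shard 0 unavailable:", available_fraction(keys, failed_shards={0}))
```

In this model, losing one of eight shards affects only the keys that hash to it, roughly one eighth of them, while requests for every other key proceed as usual.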

Oracle DMP is deployed in Oracle Cloud Infrastructure regions with multiple availability domains. We have spread our primary shards across all the availability domains in a given region. We are using Oracle Data Guard to maintain a replica/standby instance in a different availability domain from that of the primary instance.

Due to the high transaction rate coupled with the write intensity of our application, we do see occasional hardware failures. In recent instances of such failures, the failover was seamless.

By using multiple availability domains and multiple regions in Oracle Cloud Infrastructure along with Oracle Data Guard, we have protection from failure of an availability domain or even an entire region.
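
The placement rule described above (primaries spread across availability domains, each standby in a different domain from its primary) can be modeled in a few lines of Python. This is a simplified sketch, not Oracle Data Guard configuration: the round-robin rule, the `AD-1`/`AD-2`/`AD-3` names, and both functions are assumptions made for illustration, and the real role transition is performed by Data Guard rather than by application code.

```python
def place_shards(num_shards, availability_domains):
    """Spread primaries round-robin and put each standby in the next domain."""
    if len(availability_domains) < 2:
        raise ValueError("need at least two availability domains")
    n = len(availability_domains)
    placement = {}
    for shard in range(num_shards):
        placement[shard] = {
            "primary": availability_domains[shard % n],
            "standby": availability_domains[(shard + 1) % n],
        }
    return placement


def fail_over(placement, failed_ad):
    """Model the loss of one availability domain.

    Shards whose primary was in the failed domain are served by their former
    standby; shards whose standby was there keep their primary but lose
    their standby until it is rebuilt elsewhere.
    """
    after = {}
    for shard, roles in placement.items():
        primary, standby = roles["primary"], roles["standby"]
        if primary == failed_ad:
            primary, standby = standby, None
        elif standby == failed_ad:
            standby = None
        after[shard] = {"primary": primary, "standby": standby}
    return after


if __name__ == "__main__":
    domains = ["AD-1", "AD-2", "AD-3"]
    before = place_shards(6, domains)
    after = fail_over(before, "AD-1")
    for shard in before:
        print(shard, before[shard], "->", after[shard])
```

Because no shard keeps its primary and standby in the same domain, every shard still has a serving primary after a single availability domain is lost.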

Performance: This might sound counterintuitive, but with Oracle Sharding, we saw our performance improve compared to other key-value stores we have used in the past.

Stability: And there is something to be said about the stability of Oracle Database. Since going live with Oracle Database, we haven’t had a single database outage due to software issues.

The holiday season tends to be our busiest time of the year. In the past, with key-value stores, we typically saw about 3 to 4 percent of requests fail (and be retried) during peak loads. This year with Oracle Database, the holiday season came and went, and we did not see any such failures.

What are the key reasons for adopting Oracle Sharding?

Matt: Oracle DMP’s data processing ecosystem has grown organically over time. We have systems for data streaming, real-time key-value databases, distributed batch data processing, and workflow management to name just a few. More systems mean more complexity and cost. Complexity comes in many forms such as:

◉ Workflow management and coordination between disparate systems
◉ Data consistency issues between systems
◉ Lack of ACID transactions places the burden on the application
◉ Storing the same data multiple times in different systems to support various use cases is expensive

In the past, once your data volume and velocity reached a certain threshold, it became impractical and usually impossible to use traditional RDBMS technology. Scaling vertically hits both cost and practical limits and consolidates the blast radius for system failure. Key-value stores can scale horizontally, but until now that has meant sacrificing features that a traditional database would provide.  

This changes with the release of Oracle Sharding. With Oracle Sharding we have an ACID-compliant and horizontally scalable database that is capable of supporting both near real-time key-value use cases and complex analytics operations.

With Oracle Database’s converged architecture, we now can use a single data system that drastically reduces complexity, decreases cost, and allows us to simplify our data architecture by consolidating logic and data into a single data store that meets a diverse set of needs.

What alternative solutions did Oracle DMP explore before adopting Oracle Database? In what ways is Oracle Database superior to those alternatives?

Matt: Oracle DMP has used a variety of key-value stores alongside more traditional databases for years. When thinking of alternatives to an Oracle Sharded Database, you aren’t thinking about a single database. Instead, you are thinking about a group of databases that each perform in one specific area. The promise of a sharded Oracle Database is that it can perform in all of the areas we need it to.

Any lessons learned, or advice to other companies in a similar position?

Matt: Over the past 20 years we’ve been trained to make compromises in our data processing systems because legacy databases couldn’t meet our scalability and availability requirements. With Oracle Sharding we have the opportunity to rethink and simplify our data architectures. My advice is not to look at a sharded database as a replacement for one system or function, but instead to use it as an opportunity to simplify your data architecture and claw back some of the compromises you may have made in the past for the sake of scalability and availability.

How does Oracle DMP expect the deployment of Oracle Database Sharding technology to impact the business overall and the customers it supports?

Matt: Over time we expect to deliver new features to our customers more quickly thanks to the simplified data processing architecture. Now that we have full transactions and the ability to run analytics queries in the same environment, when we run our real-time data queries, we will be able to offer more precise and lower latency responses to queries generated on behalf of our customers.

Any thoughts on the potential adoption of other Oracle technologies that will further strengthen the architecture?

Matt: Oracle Sharding + Oracle Cloud Infrastructure is a powerful combination.  

Sharding represents a fundamental shift in how Oracle Database technology works.

Elasticity is a critical value proposition here.

Source: oracle.com
