Friday, July 10, 2020

Oracle Database 20c Preview

Innovations in Oracle Database


In this blog entry, we'll take a look at some of the new features in Oracle Database 20c, which has just been refreshed to version 20.3 on Oracle Cloud's DBCS platform.

Oracle Database 20c is the next release of Oracle's multi-model database. Like the versions that preceded it, Oracle Database 20c provides industry-leading scalability, availability and security for both OLTP and analytical workloads. It supports relational, JSON, XML, spatial, graph, columnar, OLAP and unstructured data, enabling you to focus on building applications without having to worry about how to persist such data.

Support and Previous Releases


You can find details on the significant features in our previous yearly release in the following posts:

Oracle Database 18c
Oracle Database 19c

Oracle Database 20c is a yearly short-term support release, allowing users to try out new functionality or take advantage of cutting-edge features for applications that could benefit from them. While we are very excited about this latest release, Oracle recommends that most users consider upgrading to Oracle Database 19c, since it provides long-term support all the way through to April 2026.

Converged Database


Since the initial releases of Oracle Database, Oracle has taken the approach that storing and managing data in a single database platform makes more sense than breaking it up and storing it in single-use engines. Using multiple independent engines inevitably results in issues with data integrity, consistency and security. By using a single engine that provides best-of-breed support for all of the major data types and development paradigms, users benefit from all of Oracle Database's key capabilities, such as ACID transactions, read consistency, a centralised security model, parallel scans and DML, online backups and point-in-time recovery, regardless of the approach you take to storing the data.

The decision to centralise your data inside Oracle Database doesn't mean sacrificing the ability to build applications using whatever design approach you think is appropriate. Oracle supports the creation of single-database applications as well as those adopting event-driven or microservice paradigms. Key to this approach is the use of Oracle's Multitenant architecture to provide each service with its own virtual database (PDB). This still allows you to manage many PDBs as one, and it simplifies the federation of the database via inter-PDB SQL operations.

And, if you need to support millions of concurrent users or geographically distribute your database because of regulatory requirements, Oracle Database Sharding makes it simple to do this while still providing a converged data model.

Oracle Database 20c New Features


Oracle Database 20c introduces many new features, far more than can be covered in this short blog post.

Let's go through some of the significant enhancements in Oracle Database 20c Preview release.

Blockchain Tables


Blockchain as a technology has promised much in terms of solving many of the problems associated with the verification of transactions. While considerable progress has been made in bringing this technology to the enterprise, a number of problems exist, with arguably the largest being the complex nature of building applications that can support a distributed ledger.

To simplify the adoption of this exciting technology, Oracle Database 20c introduces Blockchain Tables. These tables operate like any normal heap table, but with a number of important differences. The most notable is that rows are cryptographically hashed as they are inserted into the table, ensuring that a row can no longer be changed at a later date.


This essentially creates an insert only table. Blockchain Tables don't allow users to update or delete rows. Users are also prevented from truncating the data, dropping partitions or dropping the table within certain time limits.

This important capability means that other users can trust that the data held in the blockchain table is an accurate record of events. Oracle Database 20c enables you to run a process that will verify all of the records are consistent with their hash signature.
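As a minimal sketch of what creating such a table looks like (the table name, columns and retention windows below are illustrative, not from the original post):

```sql
-- Illustrative sketch: table name, columns and retention windows are made up
CREATE BLOCKCHAIN TABLE bank_ledger (
  bank           VARCHAR2(30),
  deposit_date   DATE,
  deposit_amount NUMBER
)
NO DROP UNTIL 31 DAYS IDLE
NO DELETE UNTIL 16 DAYS AFTER INSERT
HASHING USING "SHA2_512" VERSION "v1";
```

The retention clauses are what enforce the time limits on dropping the table or deleting rows, and the verification process mentioned above is exposed through the DBMS_BLOCKCHAIN_TABLE package.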

SQL Macros


It is not unusual for a SQL statement to grow in complexity as the number of joins increases, or as the operations performed on the retrieved data become more involved. It is also not uncommon for developers to try to solve this problem by using stored procedures and table functions to simplify these commonly used operations. This works extremely well to simplify code, but can potentially sacrifice some performance as the SQL engine context-switches with the PL/SQL engine. In Oracle Database 20c, SQL Macros solve this problem by allowing SQL expressions and table functions to be replaced by calls to PL/SQL functions that return a string literal to be inserted into the SQL we want to execute. It's an incredibly simple concept, and one that C and Rust programmers will be familiar with. The following trivial example shows it in action.

First, let's create a table and insert a few rows.

CREATE TABLE line_items (id integer, name varchar2(30), item_type varchar2(30), price float);
insert into line_items values (1, 'Red Red Wine', 'ALCOHOL', 15.6);
insert into line_items values (2, 'Its Cold Out There Heater', 'RADIATOR', 200.49);
insert into line_items values (3, 'How Sweet It Is Cake', 'FOOD', 4.56);

The SQL below calculates the value added tax on the rows in our LINE_ITEMS table:

select id,
  case
    when item_type = 'ALCOHOL' then round(1.2 * price, 2)
    when item_type = 'SOLAR PANEL' then round(1.05 * price, 2)
    when item_type = 'RADIATOR' then round(1.05 * price, 2)
    else price end as total_price_with_tax
from line_items; 

However in Oracle Database 20c we can simplify it by creating a function with the new SQL_MACRO keyword and returning a string.

create or replace function total_price_with_tax(the_price float, the_item_type varchar2)
  return varchar2 SQL_MACRO(SCALAR) is
begin
  return q'[case
    when the_item_type = 'ALCOHOL' then round(1.2 * the_price, 2)
    when the_item_type = 'SOLAR PANEL' then round(1.05 * the_price, 2)
    when the_item_type = 'RADIATOR' then round(1.05 * the_price, 2)
    else the_price end]';
end;
/

We can then simply reference the SQL Macro inside of a select statement. The SQL that's executed is exactly the same as the original SQL Statement without the overhead of a context switch each time the row is fetched to execute our function.

SQL > select id, total_price_with_tax(price, item_type) from line_items;

  ID   TOTAL_PRICE_WITH_TAX(PRICE,ITEM_TYPE)
---- --------------------------------------
   1                                  18.72
   2                                 210.51
   3                                   4.56

The same approach can be used for parametrised views and polymorphic table functions.
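To sketch the parametrised-view case, a table macro is declared with SQL_MACRO(TABLE) and returns the text of a query; references to the formal parameter are substituted with the caller's argument. The function name and predicate below are made up for illustration, reusing the LINE_ITEMS table from earlier:

```sql
-- Illustrative sketch of a table macro acting as a parametrised view
create or replace function items_of_type(p_type varchar2)
  return varchar2 SQL_MACRO(TABLE) is
begin
  -- p_type is replaced with the caller's argument when the SQL is expanded
  return q'[select id, name, price
            from line_items
            where item_type = p_type]';
end;
/

select * from items_of_type('FOOD');
```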

Oracle Database In-Memory Enhancements


Analysing data using a columnar model can result in massive performance improvements when compared to doing the same operations using a row-based model. However, updating data is significantly faster when using data held in rows. Oracle Database In-Memory is unique in that it allows you to benefit from both approaches. With this capability you can run your relational or JSON application unchanged and Oracle Database will maintain a columnar store supporting blazingly fast real-time analytical queries.

Oracle Database 20c introduces three major improvements to enhance performance and ease of use when using Oracle Database In-Memory functionality.

◉ Database In-Memory Vector Joins : Through the use of its newly enhanced Deep Vectorization SIMD Framework, Oracle Database In-Memory can accelerate operations like hash joins on columns held inside of the In-Memory column store. In the case of a hash join, the join is broken down into smaller operations that can be passed to the vector processor. The key-value table used is SIMD optimized and used to match rows on the left and right-hand sides of the join. This approach can result in join performance improvements of up to 10 times over traditional methods.

◉ Self-Managing In-Memory Column Store : When Oracle Database In-Memory was first released, you had to explicitly declare which columns were to be populated into the In-Memory Column Store. This gave you a high degree of control if memory was tight. In Oracle Database 18c, we introduced functionality that would automatically place objects in the Column Store if they were actively used and remove objects that weren't. However, you still had to indicate which objects you wanted considered. In Oracle Database 20c, when you set INMEMORY_AUTOMATIC_LEVEL to HIGH, all objects are considered. This automatic memory management significantly simplifies the job of managing the column store.

◉ In-Memory Hybrid Columnar Scans : It is often not possible to have every column of every table populated in the Column Store because memory is limited. In many instances this isn't an issue, but every once in a while you may encounter a query which needs some of the data (columns) from the Column Store and some data that's only available in the row store. In previous versions of Oracle Database In-Memory, such queries would simply run against the row store. In Oracle Database 20c we can now use both: the optimizer can elect to scan the Column Store and fetch projected column values from the row store if needed. This can result in significant performance improvements.
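As a sketch of enabling the self-managing column store described above (this assumes an In-Memory column store has already been sized via INMEMORY_SIZE on your instance):

```sql
-- Let the database decide which objects to populate and evict
ALTER SYSTEM SET INMEMORY_AUTOMATIC_LEVEL = HIGH;
```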


Hybrid Columnar Scan

Native JSON Datatype


We introduced support for JSON in Oracle Database 12c (12.1.0.2). It allowed JSON to be stored in the database inside of a varchar2 or a LOB (CLOB or BLOB). This meant it was possible to build applications with the flexibility offered by a schemaless design model but benefiting from the power of the Oracle Database. You could query the JSON documents using standard SQL, take advantage of advanced analytics, index individual attributes or whole documents and process billions of JSON documents in parallel. We also provided tools to discover what attributes made up the JSON documents and trivially create relational views on top of the collections. It was also possible for developers to treat the Oracle Database as if it were a NoSQL Database by accessing it with the SODA (Simple Object Data API) APIs available for Java, Node.js, Python, C and REST.

In Oracle Database 20c we are improving our JSON support by offering a Native data type, "JSON". This means that instead of having to parse JSON on read or update operations, the parse only happens on an insert and the JSON is then held in an internal binary format which makes access much faster. This can result in read and update operations being 4 or 5 times faster and updates to very large JSON documents being 20 to 30 times faster.

CREATE TABLE j_order (
  id     INTEGER PRIMARY KEY,
  po_doc JSON
);
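With the native type in place, documents can be inserted as ordinary strings and queried with simple dot notation (the document shape below is made up for illustration):

```sql
-- Hypothetical order document; the attribute names are illustrative
INSERT INTO j_order VALUES (
  555,
  '{"poNumber": 555, "address": {"city": "San Jose"}, "phones": []}'
);

-- Dot-notation access over the JSON column
SELECT o.po_doc.address.city
FROM   j_order o
WHERE  o.id = 555;
```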

The new data type wasn't the only JSON change in Oracle Database 20c. Oracle also added a new JSON function, JSON_TRANSFORM, which makes it much simpler to update and remove multiple attributes in a document in a single operation.

UPDATE j_order SET po_doc = JSON_TRANSFORM( po_doc,
                                            SET '$.address.city' = 'Santa Cruz',
                                            REMOVE '$.phones[*]?(@.type = "office")' )
 WHERE id = 555;

And of course, we also added compatibility for the new JSON data type to our drivers and to utilities like Data Pump and GoldenGate.

Machine Learning for Python and AutoML


Machine Learning has been built into Oracle Database since the release of Oracle 8i. It doesn't require analysts or data scientists to extract data onto a file system or into a specialist database; instead, it allows you to leverage the power of Oracle Database and build models with over 30 machine learning algorithms running directly on data held in your tables. This approach of moving the algorithms to the data minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment.

In Oracle Database 20c we are introducing functionality to make it even simpler for users to take advantage of this capability by providing Python machine learning interfaces to Oracle Database. This new client complements the R and SQL interfaces already available. Data scientists can now work in an environment they feel comfortable in, and simply treat Oracle Database as a high-performance compute engine. You can use programmatic structures similar to those in the scikit-learn and pandas frameworks, which become simple proxies for tables in the database, and then call the Oracle Database ML algorithms using Python function calls rather than SQL.

Oracle Machine Learning for Python also looks to simplify the process of selecting and tuning the right model, which can be a complicated and time-consuming exercise. New AutoML functionality can select the model and the relevant attributes, and then tune the hyperparameters. This dramatically reduces the time required to create a model that accurately makes predictions or classifies your data.


In Oracle Database 20c we are also adding machine learning support for the MSET-SPRT and XGBoost algorithms, and the Adam optimization solver for the Neural Network algorithm.

Other Notable Enhancements


There are over 100 enhancements in Oracle Database 20c, and whilst I'd love to cover them all in detail, I'd simply be repeating a lot of content from the Oracle Database New Features Guide. However, there are a few more features that I think are of particular note.

Expression-based init.ora parameters : It's now possible to base database parameters (init.ora) on calculations made against the configuration of the system, e.g. setting the database parameter CPU_COUNT to half the number of CPUs available to the operating system.

Automatic Zone Maps : Exadata can now automatically create Zone Maps based on the predicates used in queries. Previously this was a manual operation requiring you to understand how the data would be accessed. This can dramatically reduce the number of blocks that need to be scanned.

Optimised Graph Models : Graphs can consist of millions or even billions of edges and vertices and so the storage optimisations we've made to the graph capabilities in Oracle Database 20c preview release can result in big space and performance improvements for your models.

Sharding Enhancements : To make it easier to develop Java applications against Oracle Sharding, we've introduced a new Java data source that makes it simple to obtain connections without having to define the shard key or manage the connection explicitly. We have also made sharding more fault-tolerant by automatically looking for alternate shards if the shard you are working on fails during execution.

Persistent Memory Support : In Oracle Database 20c we provide support for databases running on top of Persistent Memory file systems. PMEM file systems can offer significant latency and bandwidth improvements over traditional file systems using SSDs or mechanical disks. However, the applications using them need to understand how to write to them safely and the most efficient way to use them in conjunction with other OS resources. Oracle Database 20c's implementation provides atomic writes, safeguarding against partial writes during unexpected power outages. It also offers fast I/O operations using memory copy. In addition, it uses the database buffer cache efficiently by bypassing it and reading directly from PMEM storage.

Source: oracle.com
