Wednesday, July 3, 2024

Unlock the Power of Your Data: Seamlessly Create and Refresh Datasets from Files Stored in OCI Object Storage

In this post, I provide an overview and examples of creating datasets from files stored in Oracle Cloud Infrastructure (OCI) Object Storage. I also explore the new centralized file storage capabilities, which let you schedule periodic reloads of the data stored in Object Storage. I walk through creating an OCI Resource connection, creating a dataset with that connection, using the new UI to search and navigate compartments, buckets, and objects, and building datasets from the files you select. Finally, I describe how you can manually reload the dataset, or schedule a periodic reload, after the files in Object Storage are updated.

What Is OCI Object Storage?


OCI Object Storage enables you to securely store any type of data in its native format. With built-in redundancy, OCI Object Storage is ideal for building modern applications that require scale and flexibility, because it can be used to consolidate multiple data sources for analytics, backup, or archive purposes.

Creating an OCI Resource Connection


To access files stored in OCI Object Storage, you first create an OCI Resource Connection that uses an API key. This is the same type of connection required for connecting Oracle Analytics to OCI Functions and to OCI models such as Vision and Language.
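As a rough illustration of what an API-key-based connection depends on, the sketch below checks an OCI SDK-style profile for the fields that API key signing uses. The field names follow the standard OCI SDK and CLI configuration file; the function name `missing_api_key_fields` and the sample values are hypothetical, and the Oracle Analytics connection dialog may label or collect these values differently.

```python
import configparser

# Fields used for API key signing in the standard OCI SDK/CLI config file.
REQUIRED_FIELDS = ("user", "tenancy", "fingerprint", "key_file", "region")


def missing_api_key_fields(config_text, profile="DEFAULT"):
    """Return the required fields that are absent or empty in the given profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # An unknown profile is treated as an empty one, so every field is reported.
    section = parser[profile] if profile in parser else {}
    return [field for field in REQUIRED_FIELDS if not section.get(field, "").strip()]


# Hypothetical sample profile; the OCIDs and fingerprint are placeholders.
SAMPLE = """
[DEFAULT]
user = ocid1.user.oc1..exampleuser
tenancy = ocid1.tenancy.oc1..exampletenancy
fingerprint = 12:34:56:78:9a:bc:de:f0:12:34:56:78:9a:bc:de:f0
region = us-ashburn-1
"""

print(missing_api_key_fields(SAMPLE))  # the sample omits key_file
```

A check like this can catch an incomplete set of values before you try to save the connection.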

Creating Datasets from Files in OCI Object Storage


After the connection has been created, you can start creating datasets from files in the OCI Object Storage buckets. On the home page, click Create Dataset. Notice that the OCI Resource Connection is displayed as one of the data sources in the Create Dataset dialog.

Region Selection


After selecting the OCI Connection, review the default region in the dialog and, if necessary, change it with the drop-down list. The same dialog lets you search for the right compartments, buckets, and objects, which can include folders, subfolders, and files.

Navigating and Searching Compartments


After selecting or keeping the default region, either manually navigate or enter a full or partial search string to search all the compartments. The search results are filtered to display only those compartments that meet the search criteria. The search is a wildcard, case-insensitive search.
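The filtering behavior described above can be sketched in a few lines. This is only an illustration that assumes a partial search string matches anywhere in the name, regardless of case; the function `search_names` and the compartment names are hypothetical, and the dialog's exact matching rules may differ.

```python
def search_names(names, query):
    """Return the names that contain the query, ignoring case.

    An empty query matches every name, which mirrors browsing without a filter.
    """
    needle = query.strip().lower()
    return [name for name in names if needle in name.lower()]


# Hypothetical compartment names.
compartments = ["Analytics-Prod", "analytics-dev", "Finance", "HR_Analytics"]

print(search_names(compartments, "ANALY"))
print(search_names(compartments, "fin"))
```

The same kind of matching applies at each level of the dialog: compartments, buckets, and objects.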

Navigating and Searching Buckets


After clicking the correct compartment where the buckets reside, do the same type of navigation or wildcard search on the buckets. After locating the bucket where the files reside, click it and notice that all the objects in the selected bucket are displayed on the right-hand panel.

Navigating and Searching Objects and Selecting a File


Again, manually navigate the objects in the bucket, which can consist of folders, subfolders, and files, or perform a wildcard, case-insensitive search. After locating the file to import into the dataset, click OK. The system imports the file into Oracle Analytics Cloud (OAC) and shows a preview of its contents for review. After the review, click OK to bring the file into the Dataset Editor. There, a representative sample is extracted and the deep semantic profile is triggered, and the results for the file's contents are displayed as Data Quality Insights.
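It can help to know that Object Storage keeps a flat list of object names within a bucket, and that folders and subfolders are derived from the `/` characters in those names. The sketch below shows how one level of such a hierarchy can be derived from flat names. The function `list_level` and the object names are hypothetical, and this is not how the Oracle Analytics dialog is implemented.

```python
def list_level(object_names, prefix="", delimiter="/"):
    """Return (folders, files) that sit directly under `prefix`.

    Folders are inferred from the delimiter in the object names; nothing
    is read from a real bucket here.
    """
    folders, files = set(), set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # skip an object whose name is exactly the prefix
        head, sep, _ = rest.partition(delimiter)
        if sep:
            folders.add(prefix + head + delimiter)  # deeper content lives here
        else:
            files.add(name)  # a file directly at this level
    return sorted(folders), sorted(files)


# Hypothetical object names in one bucket.
objects = [
    "sales/2024/q1.csv",
    "sales/2024/q2.csv",
    "sales/summary.xlsx",
    "customers.csv",
]

print(list_level(objects))
print(list_level(objects, prefix="sales/"))
```

Drilling into a folder corresponds to calling the function again with that folder's name as the prefix.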

Adding Another File from the Same OCI Connection


After creating the table based on the file from the OCI Object Storage bucket, notice that the connection (My OCI Connection), the resource (OCI Object Storage), and the imported file are listed in the left-hand panel. To add another file from the same connection, click the icon to the right of the resource. After clicking that icon, the navigation dialog is displayed again, and you can drill into the bucket again to get the second file. Add as many files as you need and join them to create the dataset. You can also join files from OCI Object Storage with database tables and other files.

Extra Credit – Scheduling a Dataset Reload


A major advantage of creating datasets from files in OCI Object Storage buckets is that you can build a recurring workflow in which an upstream process periodically places updated files with the same name in the same bucket. You can then schedule dataset refreshes to automatically update the cached data from those updated files. You can set up either a one-time or a recurring schedule, and you can check the details of a schedule to see the last run time and the next scheduled run. Together, these capabilities keep visualizations based on file datasets up to date with the latest data.
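To make the idea of a recurring schedule concrete, the sketch below computes the next scheduled run from a start time and a fixed interval. Oracle Analytics manages its own schedules; the function `next_run` and the dates used here are hypothetical and only illustrate the arithmetic behind a "next scheduled run" value.

```python
from datetime import datetime, timedelta


def next_run(first_run, interval, now):
    """Return the first scheduled time that falls strictly after `now`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if now < first_run:
        return first_run
    # Count the whole intervals already elapsed, then step to the next one.
    periods = (now - first_run) // interval + 1
    return first_run + periods * interval


# Hypothetical daily reload that started at 06:00 on July 3, 2024.
first = datetime(2024, 7, 3, 6, 0)
daily = timedelta(days=1)

print(next_run(first, daily, now=datetime(2024, 7, 5, 9, 30)))
```

If the upstream process replaces the file shortly before each scheduled time, every reload picks up the newest version of the file.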

Call to Action


We hope you've enjoyed this overview of creating datasets from files in OCI Object Storage buckets. We encourage you to start creating datasets from your own files stored in buckets, and we hope you find the experience both powerful and user-friendly. Keep exploring self-service data modeling, and stay tuned for upcoming blog posts, where we'll share more tips and tricks on both new and existing features of our product.

Source: oracle.com
