Showing posts with label Oracle AI. Show all posts

Saturday, May 18, 2024

How to help AI models generate better natural language queries

Using natural language to query your data is an easy way to answer business questions. One question I’m often asked is, “how can this work on my data? Have you seen my table and column names? The names are meaningless.” Fear not! It is possible when you’re using Autonomous Database.

There is no magic. If your table and column names aren’t descriptive, you can help the large language model (LLM) interpret the meaning of tables and columns by using a built-in database feature called “comments”. Comments are descriptions or notes about a table or column’s purpose or usage. The better the comment, the more likely the LLM will know how to use that table or column to generate the right query.

Adding Comments to your tables and columns


Let’s take an example. My database has 3 tables. The table names and columns are meaningless:

TABLE1
CREATE TABLE table1 (
c1 NUMBER,
c2 VARCHAR2(200),
c3 NUMBER
);
TABLE2
CREATE TABLE table2 (
c1 TIMESTAMP,
c2 NUMBER,
c3 NUMBER,
c4 NUMBER,
c5 VARCHAR2(100),
c6 NUMBER,
c7 NUMBER
);
TABLE3
CREATE TABLE table3 (
c1 NUMBER,
c2 VARCHAR2(30)
);

There is zero chance that an LLM will know that these tables represent movies, genres and streams. We can fix that ambiguity by adding database comments:

TABLE1
COMMENT ON TABLE table1 IS 'Contains movies, movie titles and the year it was released';
COMMENT ON COLUMN table1.c1 IS 'movie ids. Use this column to join to other tables';
COMMENT ON COLUMN table1.c2 IS 'movie titles';
COMMENT ON COLUMN table1.c3 IS 'year the movie was released';
TABLE2
COMMENT ON TABLE table2 IS 'transactions for movie views - also known as streams';
COMMENT ON COLUMN table2.c1 IS 'day the movie was streamed';
COMMENT ON COLUMN table2.c2 IS 'genre ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c3 IS 'movie ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c4 IS 'customer ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c5 IS 'device used to stream, watch or view the movie';
COMMENT ON COLUMN table2.c6 IS 'sales from the movie';
COMMENT ON COLUMN table2.c7 IS 'number of views, watched, streamed';
TABLE3
COMMENT ON TABLE table3 IS 'Contains the genres';
COMMENT ON COLUMN table3.c1 IS 'genre id. use this column to join to other tables';
COMMENT ON COLUMN table3.c2 IS 'name of the genre';
 
That’s it! The meaningless table and column names can now be understood by the LLM using Select AI.
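If the descriptions already live in a data dictionary or spreadsheet, the COMMENT ON statements can be generated rather than typed by hand. The following is a minimal sketch in Python; `comment_statements` is a hypothetical helper written for this illustration, not part of any Oracle library, and the input reuses the table3 descriptions from above.

```python
def comment_statements(table, table_comment, column_comments):
    """Build COMMENT ON statements for one table and its columns."""
    def quote(text):
        # Double any single quote so it survives inside a SQL string literal.
        return text.replace("'", "''")

    statements = [f"COMMENT ON TABLE {table} IS '{quote(table_comment)}'"]
    for column, comment in column_comments.items():
        statements.append(
            f"COMMENT ON COLUMN {table}.{column} IS '{quote(comment)}'"
        )
    return statements


# Descriptions for table3, taken from the example in this post.
statements = comment_statements(
    "table3",
    "Contains the genres",
    {
        "c1": "genre id. use this column to join to other tables",
        "c2": "name of the genre",
    },
)
for statement in statements:
    print(statement + ";")
```

The printed statements can then be run in any SQL client connected to the schema that owns the tables.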

Set up your Select AI profile to use comments


A Select AI profile encapsulates the information needed to interact with an LLM. It includes the AI provider, the model to use, the source tables used for natural language queries – and whether comments should be passed to the model for SQL generation.

begin
  -- "comments": "true" enables the use of comments
  dbms_cloud_ai.create_profile(
    profile_name => 'myprofile',
    attributes =>
        '{"provider": "azure",
          "azure_resource_name": "my_resource",
          "azure_deployment_name": "my_deployment",
          "credential_name": "my_credential",
          "comments": "true",
          "object_list": [
            {"owner": "moviestream", "name": "table1"},
            {"owner": "moviestream", "name": "table2"},
            {"owner": "moviestream", "name": "table3"}
          ]
         }'
  );

  dbms_cloud_ai.set_profile(
    profile_name => 'myprofile'
  );
end;
/
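
The attributes argument is a JSON document, and JSON has no comment syntax, so it can help to build and validate the string outside the database before pasting it into the call. The sketch below does this in Python with the standard json module; the resource, deployment and credential names are the placeholders used in this post.

```python
import json

# Same attributes as the profile in this post.
attributes = {
    "provider": "azure",
    "azure_resource_name": "my_resource",
    "azure_deployment_name": "my_deployment",
    "credential_name": "my_credential",
    "comments": "true",  # enable the use of comments
    "object_list": [
        {"owner": "moviestream", "name": name}
        for name in ("table1", "table2", "table3")
    ],
}

attributes_json = json.dumps(attributes, indent=2)

# Round-trip check: fails loudly if the document is not valid JSON.
assert json.loads(attributes_json) == attributes
print(attributes_json)
```

Generating the object list in a loop also avoids small typos, such as a stray space in an owner name, that would otherwise only surface when the profile is used.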

Run your queries


You can now start asking questions using natural language against your complex schema. Even though the table and column names are meaningless, the LLM is able to identify the appropriate tables and columns through the comments and generate a query.

Summary

There is no magic! Properly describing your data will help you use natural language to get answers. Comments not only help an LLM formulate queries successfully, they also help you understand your data!

Monday, January 29, 2024

New AI capabilities with Oracle Analytics

Oracle Analytics delivers ongoing innovation throughout the year and offers a wide range of AI capabilities. It provides users with capabilities augmented by AI, Machine Learning, and Data Science, and it leverages the integrated Oracle Cloud Infrastructure (OCI) AI Services.

Oracle Analytics offers 3 types of AI capabilities:

1. Built-in, AI-augmented capabilities.
2. Integrated ML capabilities.
3. Integrated AI capabilities.

AI capabilities of Oracle Analytics - Author: Philippe Lions, VP Product Management, Oracle.
 
Here's a list to help you discover some of the key AI features for Oracle Analytics:

Built-in AI Augmented capabilities

AI Auto-Insights: This feature generates data visualizations automatically from a specific dataset. It will create an optimal visualization for the data elements selected.

AI Auto Insights, Explain and Segmentation

Explain: This feature uses Machine Learning to automatically explain a metric or attribute, such as “Attrition”, with a single click. It calculates the correlations, drivers, segmentations and anomalies for the dataset.


1-Click ML/AI feature: this feature enables you to create statistics and ML-generated insights, like clustering with ML or forecasting, using algorithms like ARIMA or ETS with a single click.


ML Integrated Capabilities

Drag and Drop ML and Data Science: this feature allows you to train models using numeric prediction, multi-classifier, binary classifier, or clustering with different built-in algorithms. These ML algorithms can be customized, trained, tuned, and then published to the wider analytics user community.

ML Data Flow with Prediction by Low Code No Code
 
Integrated AI capabilities

Oracle Cloud offers multiple AI services: https://www.oracle.com/artificial-intelligence/ai-services/

Currently, Oracle Analytics is integrated with 3 of them and more are planned in our Product Roadmap. Integration means that you can create an AI service in OCI, register the AI model, and ingest the results into Oracle Analytics as a dataset to create visual data stories.

Below are some key OCI AI services currently integrated with Oracle Analytics:

1. AI Vision
2. AI Language
3. AI Document Understanding

Below you can review these AI services and discover data visualizations built using them.

AI Vision Integration

AI Vision provides Object Detection and Image Classification. There are live demos for each in the public OAC instance.

A blog and demo showing object detection is available here:

Data Visualization showcasing the use of an OCI AI Vision - Author: Benjamin Arnulf

A blog and demo showing image classification is available here:

Data Visualization showcasing the use of an OCI AI Image Classification trained model with 3,000 public MRI - Author: Benjamin Arnulf

AI Language Integration

Oracle Analytics has the capability to analyze language using AI:

Oracle Analytics data visualization showing the use of an OCI AI Language with a pre-trained model - Author: Philippe Lions

AI Document Understanding

Oracle Analytics is integrated with OCI AI Document Understanding and can recognize and extract values from passports, receipts, resumes (CVs), invoices and more. See a blog and demo below:

Oracle Analytics Data Visualization showing the use of OCI AI Document Understanding with a pre-trained model for key value extraction. Authors: Benjamin Arnulf and Philippe Lions

AI Digital Assistant

AI Digital Assistant will be integrated into Oracle Analytics, as announced in T.K. Anand's keynote at OCW23. In summary, you will be able to ask questions like “What is the root cause for attrition and the trend for the past 3 quarters?”, and the integration will leverage Cohere or other large language models (LLMs) to create an answer and generate relevant data visualizations.

Oracle Analytics and the OCI AI Digital Assistant using large language models to generate answers and data visualizations.

Source: oracle.com

Friday, January 19, 2024

Generative AI Chatbot using LLaMA-2, Qdrant, RAG, LangChain & Streamlit


In the evolving landscape of conversational artificial intelligence (AI), the Retrieval-Augmented Generation (RAG) framework has emerged as a pivotal innovation, particularly in enhancing the capabilities of chatbots. RAG addresses a fundamental challenge in traditional chatbot technology: The limitation of relying solely on pretrained language models, which often leads to responses that lack current, specific, or contextually nuanced information. By seamlessly integrating a retrieval mechanism with advanced language generation techniques, RAG-based systems can dynamically pull in relevant and up-to-date content from external sources. This ability not only significantly enriches the quality and accuracy of chatbot responses but also ensures that they remain adaptable and informed by the latest information.

In an era where users expect highly intelligent and responsive digital interactions, the need for RAG-based systems in chatbots has become increasingly critical, marking a transformative step in realizing truly dynamic and knowledgeable conversational agents. Traditional chatbots, constrained by the scope of their training data, often struggle to provide up-to-date, specific, and contextually relevant responses. RAG overcomes this issue by integrating a retrieval mechanism with language generation, allowing chatbots to access and incorporate external, current information in real time. This approach not only improves the accuracy and relevance of responses but also enables chatbots to handle niche or specialized queries more effectively. Furthermore, RAG’s dynamic learning capability ensures that chatbot responses remain fresh and adapt to new trends and data.

By providing more detailed and reliable information, RAG significantly enhances user engagement and trust in chatbot interactions, marking a substantial advancement in the field of conversational AI. This technique is particularly useful in the context of chatbots for the following reasons:

  • Enhanced knowledge and information retrieval: RAG enables a chatbot to pull in relevant information from a large body of documents or a database. This feature is particularly useful for chatbots that need to provide accurate, up-to-date, or detailed information that isn’t contained within the model’s pretrained knowledge base.
  • Improved answer quality: By retrieving relevant documents or snippets of text as context, RAG can help a chatbot generate more accurate, detailed, and contextually appropriate responses. This capability is especially important for complex queries where the answer might not be straightforward or requires synthesis of information from multiple sources.
  • Balancing generative and retrieval capabilities: Traditional chatbots are either generative (creating responses from scratch) or retrieval-based (finding the best match from a set of predefined responses). RAG allows for a hybrid approach, where the generative model can create more nuanced and varied responses based on the information retrieved, leading to more natural and informative conversations.
  • Handling long-tail queries: In situations where a chatbot encounters rare or unusual queries (known as long-tail queries), RAG can be particularly useful. It can retrieve relevant information even for these less common questions, allowing the generative model to craft appropriate responses.
  • Continuous learning and adaptation: Because RAG-based systems can pull in the latest information from external sources, they can remain up-to-date and adapt to new information or trends without requiring a full retraining of the model. This ability is crucial for maintaining the relevance and accuracy of a chatbot over time.
  • Customization and specialization: For chatbots designed for specific domains, such as medical, legal, or technical support, RAG can be tailored to retrieve information from specialized databases or documents, making the chatbot more effective in its specific context.

The need for vector databases and embeddings


To build a retrieval-augmented generation system, we must capture the nuanced, semantic relationships inherent in human language and complex data patterns. Traditional databases, which are structured around exact keyword matches, often fall short in this regard. Vector databases, in contrast, use embeddings (dense, multidimensional representations of text, images, or other data types) to capture these subtleties. By converting data into vectors in a high-dimensional space, these databases enable more sophisticated, context-aware searches. This capability is crucial in retrieval-augmented generation, where the goal is not just to find the most directly relevant information but to generate responses that are semantically aligned with the query. Because embedding models are trained on large datasets, embeddings can encapsulate a vast array of relationships and concepts, allowing for more intuitive, accurate, and efficient retrieval and generation of information.
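
To make the idea of semantic closeness concrete, here is a small sketch in Python that ranks words by cosine similarity. The three-dimensional vectors are invented for illustration only; real embedding models emit hundreds or thousands of dimensions, and the values here do not come from any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings": related words point in similar directions.
embeddings = {
    "film": [0.9, 0.1, 0.0],
    "movie": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.2, 0.9],
}

query = embeddings["film"]
ranked = sorted(
    (word for word in embeddings if word != "film"),
    key=lambda word: cosine_similarity(query, embeddings[word]),
    reverse=True,
)
print(ranked)
```

A keyword match would treat "film" and "movie" as unrelated strings, whereas the vectors place them close together and far from "invoice".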

In this post, we use the Llama2 model and deploy an endpoint using Oracle Cloud Infrastructure (OCI) Data Science Model Deployment. We create a question and answering application using Streamlit, which takes a question and responds with an appropriate answer.

High-level solution overview



Deployment of the solution uses the following steps:

  1. The user provides a question through the Streamlit web application.
  2. The Streamlit application invokes the predict call API to the model deployment.
  3. The model deployment invokes LangChain to convert user questions into embeddings.
  4. The function invokes a Qdrant Service API to send the request to the vector database to find the top k similar documents.
  5. The function creates a prompt with the user query and the similar documents as context and asks the large language model (LLM) to generate a response.
  6. The response is provided from the function to the API gateway, which is sent to the Streamlit server.
  7. The user can view the response on the Streamlit application.

Getting started


This post walks you through the following steps:

  1. Setting up the Qdrant database instance
  2. Building Qdrant with Langchain
  3. Setting up RAG
  4. Deploying a Streamlit server

To implement this solution, you need an OCI account, familiarity with LLMs, and access to OCI OpenSearch and OCI Data Science Model Deployment. You also need access to GPU instances, preferably A10.2. We used the following GPU instances to get started.


The workflow diagram moves through the following steps:

  1. Pass the query to the embedding model to semantically represent it as an embedded query vector.
  2. Pass the embedded query vector to our vector database.
  3. Retrieve the top-k relevant contexts, measured by k-nearest neighbors (KNN) between the query embedding and all the embedded chunks in our knowledge base.
  4. Pass the query text and retrieved context text to our LLM.
  5. The LLM generates a response using the provided content.
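
The retrieval part of this workflow (steps 1 to 3) can be sketched in plain Python. This is an illustration under stated assumptions, not the deployed implementation: `embed` is a toy word-count stand-in for a real embedding model, the knowledge base is three invented sentences, and the brute-force scan replaces the vector database.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Embed the query, score every chunk, and keep the k best."""
    query_vector = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: similarity(query_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:k]


# Invented knowledge base chunks for the example.
knowledge_base = [
    "Qdrant stores vectors and supports similarity search",
    "Streamlit renders the chat user interface",
    "The A10 GPU hosts the Llama2 model deployment",
]

context = top_k("which component supports similarity search", knowledge_base, k=1)
print(context)
```

A vector database performs the same ranking, but with indexes that avoid scoring every chunk on every query.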

Hardware requirements


For the deployment of our models, we use an OCI setup based on the NVIDIA A10 GPU. In our scenario, we deployed a 7b-parameter model on an NVIDIA A10.2 instance. We suggest using the Llama 7b model with the VM.GPU.A10.2 shape (two A10 GPUs with 24 GB of memory each).
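
A rough back-of-the-envelope check shows why this shape is a reasonable fit. The sketch below counts only the memory for the model weights and assumes exactly 7 billion parameters; the KV cache, activations and framework overhead come on top and are not modeled here.

```python
def weight_memory_gib(parameters, bytes_per_parameter):
    """Memory for the weights alone, in GiB (no KV cache or activations)."""
    return parameters * bytes_per_parameter / 1024**3


params_7b = 7_000_000_000  # assumed parameter count for a "7b" model
fp16 = weight_memory_gib(params_7b, 2)  # 16-bit weights
fp32 = weight_memory_gib(params_7b, 4)  # 32-bit weights

print(f"fp16: {fp16:.1f} GiB, fp32: {fp32:.1f} GiB")
```

Under these assumptions, half-precision weights need about 13 GiB and fit within the 24 GB of a single A10, while full-precision weights at about 26 GiB would have to be split across both GPUs.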

Prerequisites


Set up the key prerequisites before you can proceed to run the distributed fine-tuning process on OCI Data Science:

  • Configure a custom subnet with a security list to allow ingress into any port from the IPs originating within the CIDR block of the subnet to ensure that the hosts on the subnet can connect to each other during distributed training.
  • Create an Object Storage bucket to save the documents that are provided at the time of ingestion into the vector database.
  • Set the policies to allow OCI Data Science resources to access OCI Object Storage buckets, networking, and others.
  • Access the token from HuggingFace to download the Llama2 model. To fine-tune the model, you first must access the pretrained model. Obtain the pretrained model from Meta or HuggingFace. In this example, we use the HuggingFace access token to download the pretrained model from HuggingFace by setting the HUGGING_FACE_HUB_TOKEN environment variable.
  • Log group and log from logging service to monitor the progress of the training.
      • Go to OCI Logging and select Log Groups.
      • Select an existing log group or create one.
      • In the log group, create one predict log and one access log.
          • Select Create custom log.
          • Specify a name (predict|access) and select the log group you want to use.
          • Under "Create agent configuration," select Add configuration later.
          • Select Create agent configuration.
  • Notebook session: Used to initiate the distributed training and to access the fine-tuned model
  • Install the latest version of Oracle Accelerated Data Science (ADS) with the command, pip install oracle-ads[opctl] -U

Deploying the Llama2 model


Refer to the blog, Deploy Llama 2 in OCI Data Science, which describes how to deploy a Llama2 model on an A10.2 instance.

To estimate model memory needs, Hugging Face offers a Model Memory Calculator. Furthermore, for insights into the fundamental calculations of memory requirements for transformers, Eleuther has published an informative article on the subject. Use the custom egress functionality while setting up the model deployment to access the Qdrant database.

Setting up the Qdrant database


To set up the Qdrant database, you can use the following options:

- Create a Docker container instance
- Use a Python client

Initialize Qdrant with Langchain


Qdrant integrates smoothly with LangChain, and you can use Qdrant within LangChain with the VectorDBQA class. The first step is to compile all the documents that act as the foundational knowledge for our LLM. Imagine that we place these in a list called docs. Each item in this list is a string containing segments of paragraphs.
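
One simple way to produce such a list is to split the source text on blank lines and pack paragraphs into chunks of bounded size. The helper below is a hypothetical sketch for illustration; `chunk_paragraphs` and the 500-character default are assumptions, and LangChain's own text splitters offer more options.

```python
def chunk_paragraphs(text, max_chars=500):
    """Split on blank lines, then pack paragraphs into chunks of roughly max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Keep growing the chunk; a single oversized paragraph stays whole.
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
docs = chunk_paragraphs(sample, max_chars=40)
print(docs)
```

Smaller chunks give more precise matches at retrieval time, while larger chunks carry more surrounding context into the prompt.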

Qdrant initialization

The next task is to produce embeddings from these documents. To illustrate, we load the embedding model through LangChain's LlamaCppEmbeddings wrapper:

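from langchain.vectorstores import Qdrant
from langchain.embeddings import LlamaCppEmbeddings
import qdrant_client

# Load the embeddings model.
# model_folder_directory must be set to the path of the local model file.
embedding = LlamaCppEmbeddings(model_path=model_folder_directory, n_gpu_layers=1000)

# Get your Qdrant URL and API key (replace the placeholders with your own values)
url = "<your Qdrant URL>"
api_key = "<your Qdrant API key>"

# Set up the Qdrant client
client = qdrant_client.QdrantClient(
    url=url,
    api_key=api_key
)

# Wrap the client as a LangChain vector store
qdrant = Qdrant(
    client=client, collection_name="my_documents",
    embeddings=embedding
)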

Qdrant upload to vector database

# If adding for the first time, this method recreates the collection.
# texts is a list of documents to convert to embeddings and store in the vector DB.
qdrant = Qdrant.from_texts(
    texts,
    embedding,
    url=url,
    api_key=api_key,
    collection_name="my_documents"
)

# Add further texts to the vector DB by calling the same object
qdrant.add_texts(texts)

Qdrant retrieval from vector database

Qdrant offers several retrieval options, including batch search, range search, geospatial search, and a choice of distance metrics. Here, we use similarity search based on the prompt question.

qdrant = Qdrant(
    client=client, collection_name="my_documents",
    embeddings=embedding
)

# Similarity search: prompt is the user's question as a string
docs = qdrant.similarity_search(prompt)

Setting up RAG


We use the prompt template and QA chain provided by LangChain to build the chatbot. Together they pass the context and question directly to the Llama2-based model.

from langchain.chains.question_answering import load_qa_chain
from langchain.prompts.prompt import PromptTemplate

template = """You are an assistant to the user, you are given some context below, please answer the query of the user with as detail as possible

Context:\"""
{context}
\"""

Question:\"""
{question}
\"""

Answer:"""

# Wrap the template so the chain knows which variables to fill
qa_prompt = PromptTemplate(template=template, input_variables=["context", "question"])

# llm is the LangChain LLM object for the deployed Llama2 endpoint
chain = load_qa_chain(llm, chain_type="stuff", prompt=qa_prompt)

# Retrieve docs from the Qdrant vector DB based on the user prompt
docs = qdrant.similarity_search(user_prompt)

# The "stuff" chain fills {context} from input_documents
answer = chain({"input_documents": docs, "question": user_prompt}, return_only_outputs=True)['output_text']
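
The "stuff" chain type places all retrieved documents into a single prompt. The sketch below shows roughly what that amounts to in plain Python, without LangChain; the template is simplified, the two retrieved sentences are invented, and LangChain's own implementation differs in detail.

```python
TEMPLATE = (
    "You are an assistant to the user. Answer the question using the context.\n\n"
    "Context:\n{context}\n\n"
    "Question:\n{question}\n\n"
    "Answer:"
)


def build_prompt(documents, question):
    """Join every retrieved document into one context block and fill the template."""
    context = "\n\n".join(documents)
    return TEMPLATE.format(context=context, question=question)


# Invented stand-ins for the documents returned by the similarity search.
retrieved = [
    "Qdrant is a vector database.",
    "LangChain can use Qdrant as a vector store.",
]
prompt = build_prompt(retrieved, "What is Qdrant?")
print(prompt)
```

Because every document goes into one prompt, the number and size of retrieved chunks must stay within the model's context window.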

Hosting Streamlit application


To set up Compute instances and host the Streamlit application, follow the readme on GitHub.


Source: oracle.com