Wednesday, March 20, 2024

How to help AI models generate better natural language queries

Using natural language to query your data is an easy way to answer business questions. One question I’m often asked is, “how can this work on my data? Have you seen my table and column names? The names are meaningless.”  Fear not! It is possible when you’re using Autonomous Database.

There is no magic. If your table and column names aren’t descriptive, you can help the large language model (LLM) interpret the meaning of tables and columns by using a built-in database feature called “comments”. Comments are descriptions or notes about a table or column’s purpose or usage. And, the better the comment, the more likely the LLM will know how to use that table or column to generate a the right query.

Adding Comments to your tables and columns


Let’s take an example. My database has 3 tables. The table names and columns are meaningless:

TABLE1
CREATE TABLE table1 (
c1 NUMBER,
c2 VARCHAR2(200),
c3 NUMBER
TABLE2
CREATE TABLE table2 (
c1 TIMESTAMP,
c2 NUMBER,
c3 NUMBER,
c4 NUMBER,
c5 VARCHAR2(100),
c6 NUMBER,
c7 NUMBER
)
TABLE 3
CREATE TABLE table3 (
c1 NUMBER,
c2 VARCHAR2(30)
)

There is zero chance that a natural language query will know that these tables represent movies, genres and streams. We can fix that ambiguity by adding database comments:

TABLE1
COMMENT ON TABLE table1 IS 'Contains movies, movie titles and the year it was released';
COMMENT ON COLUMN table1.c1 IS 'movie ids. Use this column to join to other tables';
COMMENT ON COLUMN table1.c2 IS 'movie titles';
COMMENT ON COLUMN table1.c3 IS 'year the movie was released';
TABLE2
COMMENT ON TABLE table2 IS 'transactions for movie views - also known as streams';
COMMENT ON COLUMN table2.c1 IS 'day the movie was streamed';
COMMENT ON COLUMN table2.c2 IS 'genre ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c3 IS 'movie ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c4 IS 'customer ids. Use this column to join to other tables';
COMMENT ON COLUMN table2.c5 IS 'device used to stream, watch or view the movie';
COMMENT ON COLUMN table2.c6 IS 'sales from the movie';
COMMENT ON COLUMN table2.c7 IS 'number of views, watched, streamed';
TABLE3
COMMENT ON TABLE table3 IS 'Contains the genres';
COMMENT ON COLUMN table3.c1 IS 'genre id. use this column to join to other tables';
COMMENT ON COLUMN table3.c2 IS 'name of the genre';
 
That’s it! The meaningless table and column names can now be understood by the LLM using Select AI.

Set up your Select AI profile to use comments

A Select AI profile encapsulates the information needed to interact with an LLM. It includes the AI provider, the model to use, the source tables used for natural language queries – and whether comments should be passed to the model for SQL generation.

begin

  dbms_cloud_ai.create_profile(

    profile_name => 'myprofile',
    attributes =>       
        '{"provider": "azure",
          "azure_resource_name": "my_resource",                    
          "azure_deployment_name": "my_deployment",
          "credential_name": "my_credential",
          "comments":"true",  -- enable the use of comments
          "object_list": [
            {"owner": "moviestream", "name": "table1"},
            {"owner": "moviestream", "name": "table2"},
            {"owner": " moviestream", "name": "table3"}             
          ]          
          }'
    );

    dbms_cloud_ai.set_profile(
        profile_name => 'myprofile'
    );
end;
/

Run your queries

You can now start asking questions using natural language against your complex schema. Even though the table and column names are meaningless, the LLM is able to identify the appropriate tables and columns through the comments and generate a query:

How to help AI models generate better natural language queries

Source: oracle.com

Related Posts

0 comments:

Post a Comment