The first query is very slow

I use the omnisciDB to execute query SQL,When the data is not in memory, the query is very slow;For example: the first execute SQL “select var1 from my_table” , It took 300 ms.But the second execute SQL “select var1 from my_table” , It took 10 ms;How can I solve the problem of slow first query?

HI @ShaoZaoWang,

First of all, I want to welcome you to the Omnisci Community forum.

The first time you run a query, several reasons will run slower than subsequent executions.

  • The query itself needs to be parsed, optimized, and compiled (we generate an LLVM program for each query). Subsequent execution reuse the code generated the first time.

  • If the data you are asking for isn’t in the CPU and/or GPU cache, it needs to be read from the disk to populate caches

Both processes take some time, and we can only mitigate these pre-warming caches and the optimizer using the db-query-list parameter in omnisci.conf file or using cli and restarting the database.

The processes quite straightforward, and you can read more about them here.

https://docs.omnisci.com/troubleshooting/optimizing-performance#preloading-data

if you haven’t defined any user and you are on the default database, you can use a file like that to warm up

USER admin omnisci {
select count(col1),count(col2),count(col3) from my_table1;
select col1,col2,col3 from my_table1 limit 10;
select col1 from my_table1 where col2 = 10 limit 10;
}

the first query would warm up the caches reading the col1,col2, and col3 columns into CPU/GPU memory; the two subsequent queries should warm up the cache of the optimizer, so assuming you have enough memory to fit the col1,col2, and col3 columns in the cache, the query will run at full speed since the first run.

the first time you will use a query that has to be parsed and optimized you will get a performance hit, but at least the data will be already in memory, so the execution will be an half-way between the slower and faster.

a query like

select col1,count(*) from my_table1 group by col1;

needs to be parsed, because is in the list, while this query

when you restart the database, connect with omnisql and check for memory cached by the queries you specified in the db-query-list file issuing the \memory_summary command

you should get an output like this

User admin connected to database omnisci
omnisql> \memory_summary
OmniSci Server CPU Memory Summary:
            MAX            USE      ALLOCATED           FREE
    98304.00 MB     1283.50 MB     4096.00 MB     2812.50 MB

OmniSci Server GPU Memory Summary:
[GPU]            MAX            USE      ALLOCATED           FREE
  [0]    32768.00 MB     1283.50 MB     2048.00 MB      764.50 MB

The memory in use is the one used to cache the columns in your queries

Tell us if the performances meet your expectation, by using the db-query-list*

Regards,
Candido