Discussions


Specify CPU Memory From Python

  • 1.  Specify CPU Memory From Python

    Posted 04-17-2019 15:11
    A query I am trying to run is too large to fit in GPU memory, so if I try to run it by default, I get the error: Exception: Query would require a scan without a limit on table(s). Using omnisql from the command line, I can run the query on the CPU by prefacing it with \cpu. The closest equivalent I could find in the PyMapD library is the function select_ipc, but when I run the query by calling select_ipc, I get the same error. Is there another way to run a query using the CPU from Python without disabling the watchdog?
    #Core
    #Connectors


  • 2.  RE: Specify CPU Memory From Python

    Posted 04-17-2019 17:08
    Hi @James DelVesco

    You can do the following to get into a CPU data frame:
    query = 'SELECT xxx FROM <tablename>'
    df = connection.execute(query)

    Here connection is the handle you acquired by calling pymapd's connect function to connect to the OmniSci DB.

    Regards,
    Veda


  • 3.  RE: Specify CPU Memory From Python

    Posted 04-18-2019 11:34
    Hi @Veda Shankar,
    That's how we're querying from Python currently and it selects from GPU memory on the server.


  • 4.  RE: Specify CPU Memory From Python
    Best Answer

    Posted 04-17-2019 23:50
    Edited by Candido Dessanti 04-18-2019 14:23
    Hi @James DelVesco,

    There is no dedicated method for executing a single query on CPU, but you can put the session in CPU mode with the set_execution_mode method; it's a bit of a hack, but it works:

    from pymapd import connect

    conn = connect(...)  # replace ... with your connection parameters
    conn._client.set_execution_mode(conn._session, 2)  # session set to CPU mode
    df = conn.execute("your query here")  # the query will be executed on CPU
    conn._client.set_execution_mode(conn._session, 1)  # session set to GPU mode
    df = conn.execute("your query here")  # the query will be executed on GPU

    You can check that it's working by looking at the server logs; you should find messages like these:

    I0418 08:32:10.788362 130774 MapDHandler.cpp:3975] User mapd sets CPU mode.
    I0418 08:32:23.793658 130774 MapDHandler.cpp:3971] User mapd sets GPU mode.
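    If you would rather check this from a script than by eye, here is a minimal sketch that scans log lines for the two messages shown above. The function name and the regular expression are my own and are based only on those two sample lines; other server versions may word the message differently.

```python
import re

# Matches lines such as "... MapDHandler.cpp:3975] User mapd sets CPU mode."
# The pattern is an assumption derived from the two sample log lines only.
MODE_LINE = re.compile(r"User (\S+) sets (CPU|GPU) mode\.")


def last_execution_mode(log_lines, user):
    """Return "CPU" or "GPU" for the latest mode switch by `user`, or None."""
    mode = None
    for line in log_lines:
        match = MODE_LINE.search(line)
        if match and match.group(1) == user:
            mode = match.group(2)  # later lines overwrite earlier ones
    return mode
```

    Pass it any iterable of log lines (for example an open log file) together with the database user name.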

    It works the same way as in the mapdql utility.
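    To avoid leaving the session stuck in CPU mode when a query fails, the mode switch can be wrapped in a context manager. This is only a sketch: cpu_mode, GPU_MODE and CPU_MODE are names I made up, the numeric values 1 and 2 are the ones used above, and it relies on the same private _client and _session attributes, which could change between pymapd releases. It also assumes the session should go back to GPU mode afterwards.

```python
from contextlib import contextmanager

# Values taken from the snippet above: 1 = GPU mode, 2 = CPU mode.
GPU_MODE = 1
CPU_MODE = 2


@contextmanager
def cpu_mode(conn):
    """Run the enclosed queries in CPU mode, then switch back to GPU mode.

    `conn` is assumed to be a pymapd connection exposing the private
    `_client` and `_session` attributes (not part of the public API).
    """
    conn._client.set_execution_mode(conn._session, CPU_MODE)
    try:
        yield conn
    finally:
        # Restore GPU mode even if the query raised an exception.
        conn._client.set_execution_mode(conn._session, GPU_MODE)
```

    Usage would then be "with cpu_mode(conn):" followed by an indented conn.execute(...) call; everything inside the block runs on CPU.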

    Can you share the problematic query and the DDLs?


  • 5.  RE: Specify CPU Memory From Python

    Posted 04-18-2019 11:48
    Edited by James DelVesco 04-18-2019 11:48
    Hi @Candido Dessanti,

    Thanks for your help. Now the query is running, which is encouraging, but it doesn't seem to be returning any results.
    The query in question is SELECT XID, DoW, ToD, VV FROM IPA WHERE XID = 1
    and the DDL for the table is
    CREATE TABLE IPA (
    XID INTEGER,
    ToD TEXT ENCODING DICT,
    DoW TEXT ENCODING DICT,
    AAD SMALLINT,
    VV SMALLINT,
    ST TEXT ENCODING DICT);




  • 6.  RE: Specify CPU Memory From Python

    Posted 04-18-2019 14:23
    Sorry @Candido Dessanti, I made a mistake in my code. Everything is running properly. Thanks for your help!


  • 7.  RE: Specify CPU Memory From Python

    Posted 04-18-2019 23:57
    Don't worry @James DelVesco,

    I'm glad you are able to continue your evaluation of the OmniSci database.

    Anyway, try more complex queries with aggregates and joins. Simple projection queries with little filtering and a high-cardinality result don't make the software shine, although the GPU does offer a decent speedup on filtering.
    You can also run this query multiple times, filtering on the rowid logical column. If your table has 1B rows, you would launch the first query with the added condition rowid between 0 and 99999999,
    the second with rowid between 100000000 and 199999999, and so on; that way the query should be able to run on GPU.
    (I don't think you are running out of GPU memory, because the only column the GPU needs is XID; all the others are needed only for the projection, which is done on CPU.)
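    The rowid chunking can be scripted. Here is a minimal sketch that only builds the query strings; rowid_chunk_queries is a name I made up, and it assumes the rowid values start at 0 and are contiguous, and that the base query already has a WHERE clause so the range can be appended with AND.

```python
def rowid_chunk_queries(base_query, total_rows, chunk_size):
    """Yield copies of `base_query` restricted to consecutive rowid ranges.

    Assumes rowid runs from 0 to total_rows - 1 and that `base_query`
    already ends with a WHERE clause (the range is appended with AND).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_rows, chunk_size):
        # BETWEEN is inclusive, so the last rowid of a chunk is start + size - 1.
        end = min(start + chunk_size, total_rows) - 1
        yield f"{base_query} AND rowid BETWEEN {start} AND {end}"
```

    For a 1B-row table with 100M-row chunks this yields ten queries; each one can be passed to conn.execute and the results concatenated on the client.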