Is there a way for the CPU and GPU to operate simultaneously without being idle?

  • 1.  Is there a way for the CPU and GPU to operate simultaneously without being idle?

    Posted 02-06-2019 23:17

    As I understand it, MapD processes a query by moving data from disk into CPU memory and then from CPU memory into GPU memory.
    I'd like to know whether these steps overlap, or whether each step starts only after the previous one has completed.

    Q) Is there a way for the CPU and GPU to operate simultaneously without being idle?



  • 2.  RE: Is there a way for the CPU and GPU to operate simultaneously without being idle?

    Posted 02-07-2019 05:21

    If you are asking whether a single query uses multi-threaded CPU processing, it does, but only a little. If you are asking about the memory manager, it is mainly serial.

    Try enabling enable-debug-timer on an OmniSci instance to see how the software works.

    This is the log of a simple query run on a cold-started instance.

    I0207 08:08:45.705850  5948 Calcite.cpp:376] User mapd catalog mapd sql 'select avg(arrdelay) from flights ;'
    I0207 08:08:46.122820  5955 FileMgr.cpp:183] Completed Reading table's file metadata, Elapsed time : 6ms Epoch: 1 files read: 15 table location: '/opt/mapd_storage/data/mapd_data/table_1_169/'
    I0207 08:08:46.126930  5955 Catalog.cpp:2934] Instantiating Fragmenter for table flights took 10ms
    I0207 08:08:46.263607  5948 Calcite.cpp:395] Time in Thrift 5 (ms), Time in Java Calcite server 552 (ms)
    I0207 08:08:46.263666  5948 measure.h:80] Timer end                           parse_to_ra                         parse_to_ra: 4692 elapsed 558 ms
    I0207 08:08:46.263679  5948 measure.h:73] Timer start                     execute_rel_alg                     execute_rel_alg: 3935
    I0207 08:08:46.263691  5948 measure.h:73] Timer start                         getExecutor                         getExecutor:  109
    I0207 08:08:46.263701  5948 measure.h:80] Timer end                           getExecutor                         getExecutor:  109 elapsed 0 ms
    I0207 08:08:46.263711  5948 measure.h:73] Timer start                  executeRelAlgQuery                  executeRelAlgQuery:  119
    I0207 08:08:46.263716  5948 measure.h:73] Timer start           executeRelAlgQueryNoRetry           executeRelAlgQueryNoRetry:  138
    I0207 08:08:46.263850  5948 measure.h:73] Timer start                    executeRelAlgSeq                    executeRelAlgSeq:  359
    I0207 08:08:46.263860  5948 measure.h:73] Timer start                   executeRelAlgStep                   executeRelAlgStep:  395
    I0207 08:08:46.263893  5948 measure.h:73] Timer start                     executeWorkUnit                     executeWorkUnit: 1621
    I0207 08:08:46.263900  5948 measure.h:73] Timer start                     get_table_infos                     get_table_infos:  259
    I0207 08:08:46.263909  5948 measure.h:80] Timer end                       get_table_infos                     get_table_infos:  259 elapsed 0 ms
    I0207 08:08:46.263916  5948 measure.h:73] Timer start                Exec_executeWorkUnit                     executeWorkUnit:  986
    I0207 08:08:46.263947  5948 measure.h:73] Timer start             execution_dispatch_comp                     executeWorkUnit: 1032
    I0207 08:08:46.264796  5948 measure.h:73] Timer start                      buildJoinLoops                      buildJoinLoops:  189
    I0207 08:08:46.264811  5948 measure.h:80] Timer end                        buildJoinLoops                      buildJoinLoops:  189 elapsed 0 ms
    I0207 08:08:46.484344  5948 measure.h:80] Timer end               execution_dispatch_comp                     executeWorkUnit: 1032 elapsed 220 ms
    I0207 08:08:46.484449  5971 measure.h:73] Timer start              execution_dispatch_run                          operator(): 1056
    I0207 08:08:46.484474  5971 measure.h:73] Timer start                         fetchChunks                         fetchChunks: 1682
    I0207 08:08:46.485982  5971 BufferMgr.cpp:309] ALLOCATION slab of 4194304 pages (2147483648B) created in 1 ms GPU_MGR:0
    I0207 08:08:46.486006  5971 BufferMgr.cpp:309] ALLOCATION slab of 8388608 pages (4294967296B) created in 0 ms CPU_MGR:0
    I0207 08:08:46.567407  5971 measure.h:80] Timer end                           fetchChunks                         fetchChunks: 1682 elapsed 82 ms
    I0207 08:08:46.567466  5971 measure.h:73] Timer start           executePlanWithoutGroupBy           executePlanWithoutGroupBy: 1941
    I0207 08:08:46.567474  5971 measure.h:73] Timer start                        lauchGpuCode                       launchGpuCode:   97
    I0207 08:08:46.572422  5971 measure.h:80] Timer end                          lauchGpuCode                       launchGpuCode:   97 elapsed 4 ms
    I0207 08:08:46.574157  5971 measure.h:80] Timer end             executePlanWithoutGroupBy           executePlanWithoutGroupBy: 1941 elapsed 6 ms
    I0207 08:08:46.574188  5971 measure.h:80] Timer end                execution_dispatch_run                          operator(): 1056 elapsed 89 ms
    I0207 08:08:46.574270  5948 measure.h:80] Timer end                  Exec_executeWorkUnit                     executeWorkUnit:  986 elapsed 310 ms
    I0207 08:08:46.574337  5948 measure.h:80] Timer end                       executeWorkUnit                     executeWorkUnit: 1621 elapsed 310 ms
    I0207 08:08:46.574347  5948 measure.h:80] Timer end                     executeRelAlgStep                   executeRelAlgStep:  395 elapsed 310 ms
    I0207 08:08:46.574352  5948 measure.h:80] Timer end                      executeRelAlgSeq                    executeRelAlgSeq:  359 elapsed 310 ms
    I0207 08:08:46.574370  5948 measure.h:80] Timer end             executeRelAlgQueryNoRetry           executeRelAlgQueryNoRetry:  138 elapsed 310 ms
    I0207 08:08:46.574440  5948 measure.h:80] Timer end                    executeRelAlgQuery                  executeRelAlgQuery:  119 elapsed 310 ms
    I0207 08:08:46.574453  5948 measure.h:73] Timer start                        convert_rows                        convert_rows: 4143
    I0207 08:08:46.574470  5948 measure.h:80] Timer end                          convert_rows                        convert_rows: 4143 elapsed 0 ms
    I0207 08:08:46.574479  5948 measure.h:80] Timer end                       execute_rel_alg                     execute_rel_alg: 3935 elapsed 310 ms
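
    To read a log like this more easily, the "Timer end" lines can be pulled out and compared. The following is a minimal sketch, not part of OmniSci: the regular expression is written against the line layout shown above and is an assumption about the format in general, and the names TIMER_END, parse_timer_ends and SAMPLE are hypothetical. SAMPLE holds a few lines copied from the log above.

```python
import re

# Assumed layout of a "Timer end" line, based only on the log shown above:
#   I<MMDD> <time> <thread> measure.h:<line>] Timer end <label> <func>: <n> elapsed <ms> ms
TIMER_END = re.compile(
    r"^\s*I\d{4} [\d:.]+\s+(?P<thread>\d+) measure\.h:\d+\] "
    r"Timer end\s+(?P<label>\S+)\s+(?P<func>\S+):\s*\d+ elapsed (?P<ms>\d+) ms"
)


def parse_timer_ends(log_text):
    """Return (thread_id, label, elapsed_ms) for every 'Timer end' line."""
    rows = []
    for line in log_text.splitlines():
        m = TIMER_END.match(line)
        if m:
            rows.append((int(m["thread"]), m["label"], int(m["ms"])))
    return rows


# A few lines copied verbatim from the log above.
SAMPLE = """\
I0207 08:08:46.263666  5948 measure.h:80] Timer end                           parse_to_ra                         parse_to_ra: 4692 elapsed 558 ms
I0207 08:08:46.484344  5948 measure.h:80] Timer end               execution_dispatch_comp                     executeWorkUnit: 1032 elapsed 220 ms
I0207 08:08:46.567407  5971 measure.h:80] Timer end                           fetchChunks                         fetchChunks: 1682 elapsed 82 ms
I0207 08:08:46.572422  5971 measure.h:80] Timer end                          lauchGpuCode                       launchGpuCode:   97 elapsed 4 ms
I0207 08:08:46.574188  5971 measure.h:80] Timer end                execution_dispatch_run                          operator(): 1056 elapsed 89 ms
I0207 08:08:46.574270  5948 measure.h:80] Timer end                  Exec_executeWorkUnit                     executeWorkUnit:  986 elapsed 310 ms
"""

if __name__ == "__main__":
    for thread, label, ms in parse_timer_ends(SAMPLE):
        print(f"thread {thread}  {label:<26} {ms:>4} ms")
```

    Read this way, the compile step (220 ms) and the run step (89 ms) add up to 309 ms of the 310 ms Exec_executeWorkUnit timer, and the run step starts only after the compile step ends, so the two did not overlap for this query. Within the run step, fetchChunks takes 82 ms and the GPU launch only 4 ms.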
    

    My guess is that you will get some simultaneous CPU/GPU processing if you run multiple queries concurrently, provided they still have to be parsed and compiled.
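
    The idea of overlapping one query's CPU work with another query's GPU work can be illustrated with a toy model. This is only a sketch and does not call OmniSci: time.sleep stands in for both stages, the durations are made up, and the assumption that each stage handles one query at a time is mine, not something the log confirms. All names here (run_query, run_serial, run_concurrent) are hypothetical.

```python
import threading
import time

CPU_STAGE_S = 0.05  # stand-in for parse + compile (hypothetical duration)
GPU_STAGE_S = 0.05  # stand-in for fetch + kernel launch (hypothetical duration)

# Assumption for this toy model: each stage serves one query at a time.
cpu_stage = threading.Lock()
gpu_stage = threading.Lock()


def run_query():
    """Simulate one query: a CPU-side stage followed by a GPU-side stage."""
    with cpu_stage:
        time.sleep(CPU_STAGE_S)
    with gpu_stage:
        time.sleep(GPU_STAGE_S)


def run_serial(n):
    """Run n simulated queries one after another; return wall time in seconds."""
    start = time.perf_counter()
    for _ in range(n):
        run_query()
    return time.perf_counter() - start


def run_concurrent(n):
    """Submit n simulated queries at once; return wall time in seconds."""
    threads = [threading.Thread(target=run_query) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"serial:     {run_serial(4):.2f} s")
    print(f"concurrent: {run_concurrent(4):.2f} s")
```

    In this model the concurrent run finishes sooner because one query's CPU stage runs while the previous query's GPU stage is still busy. Whether a real server behaves the same way depends on its scheduler, so the debug timer log remains the thing to check.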