Is there a memory replacement mechanism for mapd?


#1

Hi Mapd experts,

I generated a table with 1.2B rows of data (about 120 GB) and tried to execute a query in GPU mode. I got this exception:

Exception: Query couldn't keep the entire working set of columns in GPU memory

I have read this discussion (Query couldn't keep the entire working set of columns in GPU memory).

Does MapD have a memory replacement mechanism for when GPU memory is not enough to execute a query on a large table?

My machine's hardware:
CPU: 2x Intel E5-2683
MEM: 128 GB
Video cards: 4x GeForce GTX 1080, each with 8 GB of memory

Thanks,
Mike


#2

As far as I know, yes: MapD automatically manages the space available in GPU and CPU memory.

nvidia-smi is your friend: with this command you can quickly check whether you have memory trouble (in my case TensorFlow was eating all the memory of my poor single GTX 1080).

Cheers.


#3

You can also try the \memory_summary command from mapdql to get the amount of CPU and GPU memory actively used. It's more accurate than nvidia-smi since it only counts memory actually in use (MapD generally allocates memory in fixed-size blocks, 2GB by default).

https://www.mapd.com/docs/latest/mapd-core-guide/mapdql/?highlight=memory_summary


#4

Thanks for your answers.

I tried the \memory_summary command. It shows this:

CPU RAM IN BUFFER USE :  28274.91 MB
GPU VRAM USAGE (in MB's)
GPU     MAX    ALLOC    IN-USE     FREE
 0  7985.44  7985.44   7324.22   661.22
 1  7985.44  7985.44   7324.22   661.22
 2  7985.44  7985.44   7034.68   950.76
 3  7985.44  7985.44   6591.80  1393.64

I also checked the mapd_server.ERROR log file, which contains two error messages:

E0610 10:33:42.996304  2117 BufferMgr.cpp:370] ALLOCATION failed to find 599870464B throwing out of memory GPU_MGR:1
E0610 10:33:43.041529  2118 BufferMgr.cpp:370] ALLOCATION failed to find 599870464B throwing out of memory GPU_MGR:0

It seems mapd_server cannot allocate more memory on GPU.
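To put the failed request next to the FREE column, here is a minimal Python sketch. It is not part of MapD: `parse_summary` and `bytes_to_mb` are hypothetical helpers, the figures are copied from the output and log lines in this post, and treating "MB" as 2**20 bytes is an assumption.

```python
# GPU rows copied from the \memory_summary output in this post.
SUMMARY = """\
 0  7985.44  7985.44   7324.22   661.22
 1  7985.44  7985.44   7324.22   661.22
 2  7985.44  7985.44   7034.68   950.76
 3  7985.44  7985.44   6591.80  1393.64
"""
# Size from the BufferMgr "ALLOCATION failed to find ..." log lines.
REQUEST_BYTES = 599870464


def parse_summary(text):
    """Return {gpu_id: free_mb} from rows shaped GPU MAX ALLOC IN-USE FREE."""
    free = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5:
            gpu, _max_mb, _alloc_mb, _in_use_mb, free_mb = fields
            free[int(gpu)] = float(free_mb)
    return free


def bytes_to_mb(n):
    # Assumption: the summary's "MB" means 2**20 bytes.
    return n / 2**20


free_mb = parse_summary(SUMMARY)
request_mb = bytes_to_mb(REQUEST_BYTES)
print(f"request: {request_mb:.2f} MB")
for gpu, mb in sorted(free_mb.items()):
    print(f"GPU {gpu}: {mb:.2f} MB free, headroom {mb - request_mb:+.2f} MB")
```

On these numbers the request is smaller than the FREE figure on every GPU, so the FREE column alone does not explain the failure. The log does not say why no suitable region was found.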

Does this mean I cannot run a query over data larger than my machine's total GPU memory?

Is there a memory replacement mechanism between CPU memory and GPU memory (like virtual memory) that would let me query a large dataset with little GPU memory?

Thanks


#5

Hi,

MapD has two options when there is not enough GPU memory available to meet the requirements of executing a query.

The first option is to turn the watchdog off, which allows the query to run in stages on the GPU. MapD then orchestrates the transfer of data through the various layers of our data abstraction to move the data onto the GPU for execution.

The second option is to set the allow-cpu-retry option, which directs queries that do not fit in the available GPU memory to fall back and execute on the CPU.
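As a rough illustration of the two options, the Python sketch below renders them as `key = value` lines of the kind a server config file might hold. `render_options` is a hypothetical helper. `allow-cpu-retry` is the name used in this post, while the `enable-watchdog` spelling and the file syntax are assumptions, so check both against the documentation for your MapD version.

```python
def render_options(enable_watchdog: bool, allow_cpu_retry: bool) -> str:
    """Render the two options as 'key = value' lines (assumed syntax)."""
    options = {
        "enable-watchdog": enable_watchdog,   # assumed option name
        "allow-cpu-retry": allow_cpu_retry,   # name as given in this post
    }
    return "\n".join(
        f"{key} = {str(value).lower()}" for key, value in options.items()
    )


# Option 1 (watchdog off) and option 2 (CPU fallback) together.
print(render_options(enable_watchdog=False, allow_cpu_retry=True))
```

The two settings are independent, so either can be tried on its own.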

It is probably worth noting here that MapD Core is an in-memory database. If you believe your common use case is going to exhaust the VRAM of the available GPUs, we would recommend a sizing exercise to determine what size installation you need to solve the issue at hand. MapD can scale across multiple GPUs in a single machine (up to 20 physical GPUs, the most we have found in one machine, and up to 64 using GPU virtualization tools like Bitfusion Flex). MapD can also scale across multiple machines in a distributed model, allowing for many servers each with many cards. So the size of data that can be operated on is very flexible.

If you describe what you are ultimately trying to achieve with MapD, including your use case and some details of your schema and data sizing, we will be able to guide you more directly to an endpoint rather than chasing individual tidbits.

regards


#6

Hi dwayneberry,

Our data size is about 200~300GB (growing by hundreds of MB daily), and we want to build an internal instant-analysis system to support management. My task is to evaluate whether open-source MapD can serve as the base platform for this system.
We currently have only one server, equipped with 4 Nvidia cards, so we expect a single server to handle all the data. You mentioned that MapD can be distributed across multiple machines. Does the open-source version of MapD have this ability?

Thanks


#7

Hi,

The open source version of MapD does not support distributed deployment.

With 4x GTX 1080 you have about 32GB of VRAM, so approximately 10% of your data size.
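The arithmetic behind that estimate, as a small Python sketch. The card count, per-card memory and the 200~300GB range come from earlier in this thread, `vram_fraction` is a hypothetical helper, and the figure ignores any memory the driver or other processes reserve.

```python
def vram_fraction(gpu_count: int, vram_per_gpu_gb: float, data_gb: float) -> float:
    """Fraction of the dataset that fits in total VRAM at the same time."""
    return gpu_count * vram_per_gpu_gb / data_gb


for data_gb in (200, 300):  # data size range quoted in the thread
    share = vram_fraction(gpu_count=4, vram_per_gpu_gb=8, data_gb=data_gb)
    print(f"{data_gb} GB of data: VRAM holds {share:.1%}")
```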

What needs to be determined is how much of that data NEEDS to be on the GPUs for your set of queries, and what options there are to compress the data being used.

Is it possible to share your schema? You can also run \o on your tables from mapdql, and it will try to give you a memory-optimized schema for reloading your data, based on the current distributions in your data (so make sure you run it on a representative set of data).

What do ‘normal’ queries look like? Can you share an example of your query set? Are the queries being generated via Immerse?

Regards


#8

The system we are planning will collect log files from servers, format the logs into useful tables, and run some SQL scripts to analyze product status. We have only just started the technical investigation, so I am afraid I cannot share a schema. I expect the SQL queries to be quite simple: ordinary aggregate functions, grouping, sorting, joins and filters should be enough. Since it is for internal use, a command line is OK.