Hi, I’m trying to load a large (>5GB) CSV file into MapD using StreamImporter.
I have MapD in Docker (CPU only).
Here is what I do:
Start the container:
docker run -d -v $HOME/mapd-docker-storage:/mapd-storage -p 9090-9092:9090-9092 mapd/mapd-ce-cpu
Start the client:
docker exec -it <containerID> bin/mapdql
Create a table:
CREATE TABLE IF NOT EXISTS basic ( ...);
Import the file:
cat /mapd-storage/bas.csv | bin/StreamImporter basic mapd -u mapd -p HyperInteractive --delim ',' --batch 1000000
What happens is that some rows are imported, then the container suddenly terminates. In the logs I found this error: "Could not sync file to disk".
I’ve tried other import methods as well, but StreamImporter got the most rows imported before failing.
Thanks for any suggestions…
Here is the entire log:
Log file created at: 2018/03/22 08:25:31
Running on machine: 10bbe3099ee0
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0322 08:25:31.523133 8 MapDHandler.cpp:152] This build isn't CUDA enabled, will run on CPU
E0322 08:25:31.980234 8 MapDHandler.cpp:184] No GPUs detected, falling back to CPU mode
F0322 08:35:43.294337 54 FileMgr.cpp:529] Could not sync file to disk