Connect host directory data folder to docker container


I want to connect my mounted drive /mnt/data/ to an OmniSci Docker container and eventually load data from that location. I'd appreciate any pointers to get this up and running. Thanks!

This is how I created the container (mydocker):
docker run --name mydocker --runtime=nvidia -d -v /mnt/data:/omnisci-storage -p 6273-6280:6273-6280 omnisci/core-os-cuda:v5.4.1

I’ve also tried loading the file directly from /mnt/data, but I'm not having much luck:

docker exec -it mydocker /omnisci/bin/omnisql

    User admin connected to database omnisci
    omnisql> COPY yellowtaxiJan19 FROM '/mnt/data/yellow_tripdata_2019-01.csv' WITH (header='true');
    Exception: File or directory "/mnt/data/yellow_tripdata_2019-01.csv" does not exist.

Hi @bycxgto,

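Because OmniSci runs inside a container, it sees the container's filesystem layout, not the host's. Your -v /mnt/data:/omnisci-storage option mounts the host directory at /omnisci-storage inside the container, so the command would be: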

COPY yellowtaxiJan19 FROM '/omnisci-storage/yellow_tripdata_2019-01.csv' WITH (header='true');
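
If you end up scripting these loads, here is a minimal Python sketch of the same path translation. It assumes the -v /mnt/data:/omnisci-storage mapping from the docker run command above; the helper names to_container_path and copy_statement are hypothetical, just for illustration, and not part of any OmniSci tooling.

```python
from pathlib import PurePosixPath

# Assumed to match the -v option used when the container was created.
HOST_MOUNT = "/mnt/data"
CONTAINER_MOUNT = "/omnisci-storage"


def to_container_path(host_path, host_mount=HOST_MOUNT,
                      container_mount=CONTAINER_MOUNT):
    """Translate a host path under the bind mount to the path the container sees."""
    # relative_to() raises ValueError if host_path is not under host_mount,
    # i.e. the file would not be visible inside the container at all.
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_mount))
    return str(PurePosixPath(container_mount) / relative)


def copy_statement(table, host_path):
    """Build a COPY FROM statement that uses the container-side path."""
    return (f"COPY {table} FROM '{to_container_path(host_path)}' "
            "WITH (header='true');")


print(copy_statement("yellowtaxiJan19", "/mnt/data/yellow_tripdata_2019-01.csv"))
```

A path outside /mnt/data raises ValueError here, which mirrors the "does not exist" error omnisql reports for files the container cannot see.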

Thanks @jpharvey, that works: the file path is now /omnisci-storage/ rather than the host directory.
Allow me to go off topic: is there a way to load files directly into OmniSci without defining the tables before copying?

It’s not possible to have OmniSci create it automagically with a single COPY FROM command. Our Enterprise product has Immerse which can infer the schema when you do an upload through the web interface, and our pyomnisci Python library also has a way to infer the schema on load.
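
As a rough sketch of the pyomnisci route, assuming pyomnisci and pandas are installed on the host, that the server is reachable on localhost at the default port, and that you are using the default admin credentials (all of these are assumptions about your setup, and the function name is hypothetical). Note that the Python client runs on the host, so it reads the file from the host path, not the container path.

```python
def load_csv_with_inferred_schema(csv_path, table_name,
                                  user="admin", password="HyperInteractive",
                                  host="localhost", dbname="omnisci"):
    """Load a CSV into OmniSci, letting pyomnisci create the table if needed."""
    # Imported here so the sketch can be read or imported without the
    # third-party packages installed.
    import pandas as pd
    from pyomnisci import connect

    # Read the file on the host side; pandas infers the column dtypes.
    df = pd.read_csv(csv_path)

    con = connect(user=user, password=password, host=host, dbname=dbname)
    try:
        # load_table creates the table from the DataFrame's dtypes when it
        # does not already exist, then loads the rows.
        con.load_table(table_name, df)
    finally:
        con.close()
    return len(df)
```

You would call it with something like load_csv_with_inferred_schema("/mnt/data/yellow_tripdata_2019-01.csv", "yellowtaxiJan19"). The inferred types come from pandas, so check the resulting table definition before relying on it for large loads.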

It’s also possible to use omnisql to detect the schema first, using the \detect command. You could capture the output of that and create the schema prior to running COPY FROM.

Both Immerse and omnisql use the detect_column_types API endpoint. Have a look at its definition if you are going to use Thrift directly. I find the Python Thrift bindings the most convenient reference:
