Currently the parsing, optimization, and parts of rendering can overlap between queries, but most of the execution occurs in single file, i.e. one query at a time. We may relax this in the future so that it covers only the GPU portion of the execution (so that things like CPU reductions can occur in parallel).
We’ve found that, in general, you get the most throughput on the GPU by giving each query all of the resources, so that it doesn’t have to contend for things like buffer or cache memory. And if queries complete very quickly, you get low latency even with many simultaneous queries.
I think that in your case the real issue may be that the queries are falling back to CPU for the reasons I mentioned above (this should be fixed in 3.1, targeted for next week). That would explain why adding more GPUs is not giving you a performance boost.
However, for simple queries on relatively small datasets we are considering supporting execution on subsets of GPUs (smaller than the total number of GPUs) so that different GPU groups can execute at the same time. The expected gains from this configuration would come from parallelizing the “fixed overheads” of each query across MapD servers on the same node. Right now you can emulate this behavior by running multiple MapD servers on the same node, mapping each to a different set of GPUs with the --num_gpus flag (see here for the relevant docs). Note that in this case each MapD server needs its own database; since your systems are read-only, this can be done by copying the directory for the original database n times, where n is the number of servers you want to run, as sketched below. Depending on the query workload, this may give you anything from no speedup to a decent speedup over running one monolithic server.
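To make that concrete, here’s a rough sketch of splitting a 4-GPU node into two 2-GPU servers. Everything apart from the --num_gpus flag itself (the paths, ports, positional data-directory argument, and the use of CUDA_VISIBLE_DEVICES to pin each server to its GPUs) is an assumption on my part, so check the docs for your version:

```bash
# Sketch only: two read-only MapD servers on one 4-GPU node.
# Flag/binary spellings other than --num_gpus are assumptions --
# verify against the docs for your release.

DATA=/var/lib/mapd/data              # original database directory

# One copy of the database per server (fine here since it's read-only).
cp -r "$DATA" /var/lib/mapd/data-a
cp -r "$DATA" /var/lib/mapd/data-b

# Server A: pinned to GPUs 0-1.
CUDA_VISIBLE_DEVICES=0,1 mapd_server /var/lib/mapd/data-a \
    --num_gpus 2 --port 9091 &

# Server B: pinned to GPUs 2-3, on a different port to avoid a clash.
CUDA_VISIBLE_DEVICES=2,3 mapd_server /var/lib/mapd/data-b \
    --num_gpus 2 --port 9092 &
```

Your client (or a simple load balancer) would then spread queries across the two ports.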
Obviously we should eventually provide a more elegant solution to this, i.e. let the user do it all from one server.
My recommendation would be to try the 3.1 release to see if it gives you acceptable latency with multiple queries, and if it’s not enough, try the replication technique described above. Let us know how it goes!