JDBC Custom Socket Factory to use Unix Domain Sockets


#1

Hi guys,

I might be interested in running the JDBC connection over Unix Domain Sockets instead of HTTP or TCP for higher throughput. I have experience with this on the Postgres/MariaDB drivers by providing a custom socket factory; is such functionality also easy to implement on MapD?
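
For context, on Postgres this was only a matter of a URL parameter pointing at a socket factory; something along these lines (just a sketch, assuming junixsocket's AFUNIXSocketFactory is on the classpath, and the socket path is illustrative):

// Sketch: PostgreSQL JDBC over a Unix domain socket via a custom socket factory.
// Assumes junixsocket's AFUNIXSocketFactory; adjust the socket path for your setup.
import java.sql.Connection;
import java.sql.DriverManager;

public class PgUnixSocketExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://localhost/mydb"
                + "?socketFactory=org.newsclub.net.unix.AFUNIXSocketFactory$FactoryArg"
                + "&socketFactoryArg=/var/run/postgresql/.s.PGSQL.5432";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("Connected over a Unix domain socket: " + !conn.isClosed());
        }
    }
}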

The idea is to use the very fast CQEngine as additional in-RAM data storage on the same host, where results from MapD would be stored temporarily and forwarded regularly to a queue for later persistence.

Thx for response.

Ladislav


#2

I’m not exactly following your question, but I believe the answer is no: you need to go through either TCP or HTTP(S), because we use Apache Thrift for communication between the client and OmniSci.

So the JDBC driver, at its core, is the Thrift bindings generated for Java. What you do after the JDBC client receives the payload is up to you, of course, but I don’t think you have any choices beyond the ones provided by Thrift.

Note that one of the Thrift methods uses IPC to provide pointers to the query results, so if you really need speed, you can use those methods to read the query results straight out of CPU or GPU RAM.
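
For reference, here is a rough sketch of what sits underneath the JDBC driver. It assumes the default binary Thrift endpoint (port 6274 unless you changed it); the actual client stub class comes from the bindings generated from the server’s Thrift IDL, so the names mentioned in the comments are illustrative:

// Sketch: talking to the server with raw Thrift bindings instead of JDBC.
// Assumes the default binary Thrift endpoint; the client stub class comes from
// the bindings generated from the server's .thrift IDL (names are illustrative).
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class ThriftTcpClientSketch {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TSocket("localhost", 6274);
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        // Hand `protocol` to the generated client stub (e.g. new MapD.Client(protocol)),
        // then call the connect / sql_execute methods defined in the IDL.
        transport.close();
    }
}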


#3

Ok, but if I do something like:

import java.io.File;
import java.io.IOException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFastFramedTransport;
import org.apache.thrift.transport.TIOStreamTransport;
import org.apache.thrift.transport.TTransport;
import org.newsclub.net.unix.AFUNIXServerSocket;
import org.newsclub.net.unix.AFUNIXSocket;
import org.newsclub.net.unix.AFUNIXSocketAddress;

public class ThriftUnixDomainSocketServer {
    public static void main(String[] args) throws IOException {
        // Bind a junixsocket server socket to a filesystem path
        File socketFile = new File("thrift.sock");
        if (socketFile.exists())
            socketFile.delete();
        AFUNIXServerSocket serverSocket = AFUNIXServerSocket.newInstance();
        serverSocket.bind(new AFUNIXSocketAddress(socketFile));
        // Accept a connection and wrap its streams in a framed binary Thrift transport
        AFUNIXSocket socket = (AFUNIXSocket) serverSocket.accept();
        TTransport ttransport = new TFastFramedTransport(new TIOStreamTransport(socket.getInputStream(), socket.getOutputStream()));
        TProtocol tprotocol = new TBinaryProtocol(ttransport);
        // tprotocol is ready to use!
    }
}

I can get a domain-socket-based server that way and then connect to it via a domain socket, right?
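
And on the client side I imagine something like this (just a sketch, assuming junixsocket and the same framed binary protocol on both ends):

// Sketch: client side of the Unix-domain-socket Thrift transport above.
import java.io.File;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFastFramedTransport;
import org.apache.thrift.transport.TIOStreamTransport;
import org.apache.thrift.transport.TTransport;
import org.newsclub.net.unix.AFUNIXSocket;
import org.newsclub.net.unix.AFUNIXSocketAddress;

public class ThriftUnixDomainSocketClient {
    public static void main(String[] args) throws Exception {
        // Connect to the socket file the server bound above
        AFUNIXSocket socket = AFUNIXSocket.newInstance();
        socket.connect(new AFUNIXSocketAddress(new File("thrift.sock")));
        TTransport ttransport = new TFastFramedTransport(new TIOStreamTransport(socket.getInputStream(), socket.getOutputStream()));
        TProtocol tprotocol = new TBinaryProtocol(ttransport);
        // tprotocol is ready to hand to a generated Thrift client stub
    }
}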

This idea is really more about exploring the capabilities of GPU acceleration and enhancing it with one of the fastest in-memory Java collections with SQL-table-like querying: https://github.com/npgall/cqengine

The idea is that the output of GPU processing (queries) will be written to a larger CQEngine buffer on the same host and will be available for fast reads, in a more materialized form, to third-party connectors.
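
To illustrate the buffer part, a rough sketch of what I have in mind with CQEngine (the ResultRow type and its attribute are just placeholders for whatever the query results actually map to):

// Sketch: an indexed in-memory CQEngine buffer for materialized query results.
// ResultRow and ROW_ID are placeholders for the real result shape.
import com.googlecode.cqengine.ConcurrentIndexedCollection;
import com.googlecode.cqengine.IndexedCollection;
import com.googlecode.cqengine.attribute.Attribute;
import com.googlecode.cqengine.attribute.SimpleAttribute;
import com.googlecode.cqengine.index.hash.HashIndex;
import com.googlecode.cqengine.query.option.QueryOptions;
import com.googlecode.cqengine.resultset.ResultSet;
import static com.googlecode.cqengine.query.QueryFactory.equal;

public class ResultBuffer {
    static class ResultRow {
        final long id;
        ResultRow(long id) { this.id = id; }
    }

    static final Attribute<ResultRow, Long> ROW_ID =
            new SimpleAttribute<ResultRow, Long>("rowId") {
                public Long getValue(ResultRow row, QueryOptions queryOptions) { return row.id; }
            };

    public static void main(String[] args) {
        IndexedCollection<ResultRow> buffer = new ConcurrentIndexedCollection<>();
        buffer.addIndex(HashIndex.onAttribute(ROW_ID));

        // Rows arriving from the GPU queries get added to the indexed buffer...
        buffer.add(new ResultRow(42L));
        // ...and third-party connectors can read them back quickly via indexed queries
        ResultSet<ResultRow> hits = buffer.retrieve(equal(ROW_ID, 42L));
        System.out.println("matches: " + hits.size());
    }
}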

This is part of a POC I started working on, and it is limited to a single node at the moment.

NOTE: The whole idea is about getting rid of HTTPS and/or TCP on the way to the maximum throughput and lowest latency possible.


#4

Honestly, this is just something you’d have to try and report back.


#5

I will describe my motivation: we are also evaluating ClickHouse here (open source and quite fast). Some people say the GPU beast is super expensive while we can do something similar on low-cost hardware. My idea is to link ClickHouse with MapD on a single DGX machine so they can interact in some way (not sure what the outcome will be, but I will give it a try). I will report back if I get any kind of result.