Discussions

Expand all | Collapse all

Connect Superset/Looker to Mapd

  • 1.  Connect Superset/Looker to Mapd

    Posted 10-02-2017 06:59

    Our business analysts are familar with tools like Looker (https://looker.com) and Apache Superset (https://github.com/apache/incubator-superset).
    What would be the best way of connecting Mapd to these query/visualization tools?
    We’re currently running Mapd Community Edition on AWS for evaluation purposes.



  • 2.  RE: Connect Superset/Looker to Mapd

    Posted 10-02-2017 13:57

    I’ve asked a similar question on the Looker forum recently, here is what I got in reply:



  • 3.  RE: Connect Superset/Looker to Mapd

    Posted 10-02-2017 14:19

    Hi Dimitri,

    Thanks for your reply!
    Did you get looker to work with mapd?
    If so: how are you currently connecting the two?
    And if not: did you contact mapd directly about this?



  • 4.  RE: Connect Superset/Looker to Mapd

    Posted 10-02-2017 14:39

    I’m still testing it locally with an assumption that I will be able to connect somehow



  • 5.  RE: Connect Superset/Looker to Mapd

    Posted 10-24-2017 14:39

    Hi @jjdr ,

    i just finished a minimal (read-only) implementation of a sqlalchemy dialect for Mapd; i can share the code if you want to test



  • 6.  RE: Connect Superset/Looker to Mapd

    Posted 10-24-2017 19:57

    Hi @aznable, that’s awesome to hear! Is this something you’d be willing to share with the community?



  • 7.  RE: Connect Superset/Looker to Mapd

    Posted 10-25-2017 01:33

    Of course but I have to do a package (at least a basic setup) and clean up the code; it’s working but I haven’t tested extensively

    I already did a pull request to include the mapd query runner I wrote for Redash; I guess it will be available in the next days.

    I am a bit short on time lately so it’s taking more time than expected to write connectors for most promising open source dv tools



  • 8.  RE: Connect Superset/Looker to Mapd

    Posted 10-26-2017 15:36

    Sounds great @aznable! Would be happy to test your code, did you put it on github/somewhere else?
    If you don’t want to share publicly you can also send me a DM of course.



  • 9.  RE: Connect Superset/Looker to Mapd

    Posted 10-28-2017 04:35

    UHno problems to share

    First of all you have to install pymapd the python driver of mapd

    Is available in pipy repository so you can install with pip install pymapd, the you have to download this file

    sqlalchemy_mapd.zip (38.1 KB)

    to install you have to launch set setup.py with install command so

    python setup.py install

    the connect string format is
    mapd://username:password@host:port/database_name
    so mapd://mapd:HyperInteractive:yourhost:9091/mapd

    i suggest to try with the newest release of mapd because superset for every statetemt calls a create_engine, then a connect and a close on connect that does not close anything on database (it release the connection to the sqlalchemy pool, but instead reusing on subsequent call it close the connection and reopen…crazy) so with just 8 session you are running out of connection quite fast.

    you can try to add “engine_params”: {“poolclass” : “NullPool” } on extra section of database connection, but on my windows enviroment isn’t working with any dialect; even calling directly sqlalchemy create_engine fail, but maybe is a problem of my enviroment only

    if you want time grains avaiable on superset you have to add this class on superset db_engine_specs.py file

    class MapdEngineSpec(BaseEngineSpec):
    engine = ‘mapd’
    time_grains = (
    Grain(“Time Column”, _(‘Time Column’), “{col}”),
    Grain(“second”, _(‘second’), “DATE_TRUNC(second, {col})”),
    Grain(“minute”, _(‘minute’), “DATE_TRUNC(minute, {col})”),
    Grain(“hour”, _(‘hour’), “DATE_TRUNC(hour, {col})”),
    Grain(“day”, _(‘day’), “DATE_TRUNC(day, {col})”),
    Grain(“week”, _(‘week’), “DATE_TRUNC(week, {col})”),
    Grain(“month”, _(‘month’), “DATE_TRUNC(month, {col})”),
    Grain(“quarter”, _(‘quarter’), “DATE_TRUNC(quarter, {col})”),
    Grain(“year”, _(‘year’), “DATE_TRUNC(year, {col})”),
    )

    the performance with Mapd is very good, but it would be better without all those reconnections to database; they are costing 0.1/0.2 seconds fo lag

    i dont know is sql lab is working because on my installation, it simply does not work with any database, but you will get the table and columns list, so i am confident it will work for you.

    any feedback is appreciated, so i will be able to improve the code



  • 10.  RE: Connect Superset/Looker to Mapd

    Posted 11-01-2017 10:21

    Will report back as soon as I have had time to try this out, thanks again for sharing!



  • 11.  RE: Connect Superset/Looker to Mapd

    Posted 07-23-2019 02:33
    Well @Candido Dessanti, I cannot have access to sqlalchemy_mapd file. I am really interesting in using Apache Superset with MapD Core and your comment is the only helpful one.


  • 12.  RE: Connect Superset/Looker to Mapd

    Posted 07-23-2019 04:48
    @Silviu-Daniel Vijiala

    I shared the file in Mediafire because I'm having troubles with Company Proxy.
    You can download http://www.mediafire.com/file/q32ry3nv0lzi1qm/sqlalchemy_mapd.zip/file

    I've not tested the driver since October 2017, and both SuperSet and pymapd driver have changed, so cross fingered.
    If you face problems, feel free to report, but it'll take time set up an environment with Apache Superset


  • 13.  RE: Connect Superset/Looker to Mapd

    Posted 12-12-2019 19:39
    Hi @Candido Dessanti. Thank you for posting your code! I'm giving it a go here but running into an error right off the bat. I'm new to Omnisci and SQLAlchemy dialect definition so I'm hoping you might be able to point me in the right direction.

    I had started out by getting Superset and Omnisci environments going but Superset is throwing an exception when trying to connect using your dialect. It looked at the code Superset runs and it's doing a simple test by all creating a  DB engine with SQLAlchemy and running a select.

    I've copied the failing Superset connection test code and run it externally. Note that I made two changes in your code to switch from "mapd://" to  "omnisci://" in the connection string - believe that's what the newer pymapd is using now.
    from sqlalchemy.engine import Dialect, Engine, url, create_engine
    from contextlib import closing
    from sqlalchemy import and_, or_, select
    
    engine = create_engine("omnisci://admin:HyperInteractive@localhost:6274/omnisci")
    
    with closing(engine.connect()) as conn:
        conn.scalar(select([1]))
        print ("SUCCESS")
    


    The error:

    sqlalchemy.exc.DBAPIError: (pymapd.exceptions.Error) Exception: Non-empty LogicalValues not supported yet
    [SQL: SELECT 1]

    Full traceback:
    /home/user/ve/omni-db-connector/bin/python /mnt/NTFS_D/Documents/PyCharmProjects/superset/omni-db-connector/sqlalchemy_mapd/example_checking_dialect.py
    /home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/default.py:385: SAWarning: Exception attempting to detect unicode returns: DBAPIError('(pymapd.exceptions.Error) Exception: Non-empty LogicalValues not supported yet')
      "detect unicode returns: %r" % de
    Traceback (most recent call last):
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/pymapd/cursor.py", line 117, in execute
        nonce=None, first_n=-1, at_most_n=-1)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/omnisci/mapd/MapD.py", line 1598, in sql_execute
        return self.recv_sql_execute()
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/omnisci/mapd/MapD.py", line 1627, in recv_sql_execute
        raise result.e
    omnisci.mapd.ttypes.TMapDException: TMapDException(error_msg='Exception: Non-empty LogicalValues not supported yet')
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
        cursor, statement, parameters, context
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
        cursor.execute(statement, parameters)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/pymapd/cursor.py", line 119, in execute
        raise _translate_exception(e) from e
    pymapd.exceptions.Error: Exception: Non-empty LogicalValues not supported yet
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/mnt/NTFS_D/Documents/PyCharmProjects/superset/omni-db-connector/sqlalchemy_mapd/example_checking_dialect.py", line 13, in <module>
        conn.scalar(select([1]))
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 914, in scalar
        return self.execute(object_, *multiparams, **params).scalar()
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
        return meth(self, multiparams, params)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
        return connection._execute_clauseelement(self, multiparams, params)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
        distilled_params,
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
        e, statement, parameters, cursor, context
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
        util.raise_from_cause(sqlalchemy_exception, exc_info)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb, cause=cause)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
        raise value.with_traceback(tb)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
        cursor, statement, parameters, context
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
        cursor.execute(statement, parameters)
      File "/home/user/ve/omni-db-connector/lib/python3.7/site-packages/pymapd/cursor.py", line 119, in execute
        raise _translate_exception(e) from e
    sqlalchemy.exc.DBAPIError: (pymapd.exceptions.Error) Exception: Non-empty LogicalValues not supported yet
    [SQL: SELECT 1]
    (Background on this error at: http://sqlalche.me/e/dbapi)
    
    Process finished with exit code 1
    


    Any pointers to get me on the right track would be very welcome!




  • 14.  RE: Connect Superset/Looker to Mapd

    Posted 12-12-2019 21:01
    OK progress. What I've found is that I can actually just remove that test and Superset will connect :)

    That test will need to be addressed at some point, but still experimenting here.

    Using the sample data that comes along with the omnisci-db docker image I was able to add a Deck.gl viz of the flights_2008_10k dataset. Awesome.

    When trying to add any of the other tables in the sample db: 'omnisci_states', 'omnisci_counties', 'omnisci_countries', I am getting errors when Superset tries to determine the column types on those tables. On further inspection (and a couple of tiny code corrections) I see that it's failing when it encounters OGC spatial type columns, specifically 'MULTIPOLYGON'. SQLAlchemy doesn't have built-in types for those, GeoAlchemy would be needed for that. From the looks of it, Superset doesn't actually have native support for spatial types! When I look at the Superset examples, there's a polygon dataset for San Fransisco, but those polygons are actually interpreted as a text field.

    I'm hoping I've missed something there.


  • 15.  RE: Connect Superset/Looker to Mapd

    Posted 12-13-2019 05:52
    Hi Adam - 

    I suspect your issue from your first post is that OmniSci doesn't support this as a query, as it doesn't have a table reference. It's actually the backend that's passing you the error, it just goes through pymapd/SQLAlchemy, so it is not as clear as it could be what the actual issue is.
    conn.scalar(select([1]))


    As far as the geospatial types, that's generally an issue in a lot of places (since pandas doesn't natively support geo types). 




  • 16.  RE: Connect Superset/Looker to Mapd

    Posted 12-18-2019 03:00
    Hi @Randy Zwitch,

    I implemented default_from in the dalect to manage this kind of call, but you need at least a table in the database; created a table called DUAL with a row on it ;)



  • 17.  RE: Connect Superset/Looker to Mapd

    Posted 12-13-2019 08:19

    Hi @Adam Valair,

    The error you are getting is normal, because actually omnisci database doesn't support this kind of query, but this error wouldn't be a blocking one.

    Maybe I posted the worng version (shame on me).

    Anyway the driver has to be upgraded, because I wrote it for Mapd 3.xx, so I should add delete, upgrade, ias and new datatypes. I'll try to work on in it on the WE, then I'll publish on github.

    If you face any problem, feel free to post here ;)





  • 18.  RE: Connect Superset/Looker to Mapd

    Posted 12-14-2019 11:53
    Thanks @Randy Zwitch and @Candido Dessanti for your replies.
    ​​​​​
    I would be happy to contribute to the repo @Candido Dessanti if you want to post the url here.



  • 19.  RE: Connect Superset/Looker to Mapd

    Posted 12-17-2019 10:50
    Hi @Adam Valair,

    did some progress too



    now the sqllab is working.

    About spatial datatypes, unlikely superset doesn't like out WKT format. I'll chack later what's the problem


  • 20.  RE: Connect Superset/Looker to Mapd

    Posted 12-17-2019 16:19
    Nice!
    The errors with the OGC fields were only that they weren't mapped to anything, so it would fail with an exception and stop. For example, when superset encountered the "MULTIPOLYGON" field it didn't know what to do. If those fields are actually just WKT then they could be mapped to TEXT and then dealt with differently in the UI; I'm very new to omnisci and so I'm not sure what the storage of the OGC field types actually looks like.


  • 21.  RE: Connect Superset/Looker to Mapd

    Posted 12-17-2019 22:31
    I tried to load the ogc fields as text dictionary encoded, but superset didn't like the format (the idea was to store them in two different fields, the first native to use Geospatial functions and the second dictionary encode for group by and rendering).

    I exported the ogc fields with the copy command and reimported on a text field, but the objects complained about format


  • 22.  RE: Connect Superset/Looker to Mapd

    Posted 12-24-2019 02:19
    Hi @Adam Valair,

    I have cleaned up the code in the sqlalchemy dialect and rebranded to omnisci, so now it works a little better with apache superset

    I have placed the files needed in this mediafire folder https://www.mediafire.com/folder/c1f9q2hkkfu1l/

    The zip file contains the dialect, and you can install with the provided setup.py file as usual. the second file is needed to enable time grains for the dialect and has to be placed into db_engine_spec subfolder of superset installation

    I verified the objects in superset didn't support WKT, and they group on polygons fields; as the time of writing, this isn't supported on omnisci (you can group on text encoded and numeric fields only), so the only way display polygon's maps with deck.gl polygons, you have to store the polygons into text encoded fields in a compatible format (tried with JSON format on omnisci_states table joined with flights_2008_7m)


    to do that I created a view that joins the provided flights_2008_7m with a modified omnisci_states table

    CREATE VIEW flights_2008_7M_geo AS 
    select f.*,
    o.omnisci_geo origin_geo,
    d.omnisci_geo destination_geo 
    from flights_2008_7M as f join omnisci_states2 as o on f.origin_state=o.abbr 
    join omnisci_states2 as d on f.dest_state=d.abbr;

    Of course, storing the geometry on a text column prevents us from using any geospatial function. As a workaround, we can store the geometry in both formats using the text for the group by / rendering and the native columns for the rest.
    To manage the test connection issue and logical tables in general,just create a table called dual inserting a row into it. The dialect will write the query pointing to the DUAL table.
    If you face problems or new ideas, please provide feedback