Define geospatial data type

Hi,

How should I define geospatial data columns like points, polygon and multipolygon?

I’m attempting to load .csv and .geojson files containing these data types inside a docker container (omnisci/core-os-cuda:v5.4.1) and define them as GEOMETRY(MULTIPOLYGON, 4326) NOT NULL ENCODING COMPRESSED(32)) or GEOMETRY(POINT, 4326). They are all in longitude/ latitude but I’m not having too much luck loading them. Thanks!

Examples from the New York Subway dataset
> CREATE TABLE IF NOT EXISTS NYCSubway (

Division TEXT ENCODING DICT,
Line TEXT ENCODING DICT,
StationName TEXT ENCODING DICT,
StationLatitude FLOAT,
StationLongitude FLOAT,
Route1 TEXT ENCODING DICT,
Route2 TEXT ENCODING DICT,
Route3 TEXT ENCODING DICT,
Route4 TEXT ENCODING DICT,
Route5 TEXT ENCODING DICT,
Route6 TEXT ENCODING DICT,
Route7 TEXT ENCODING DICT,
Route8 SMALLINT,
Route9 SMALLINT,
Route10 SMALLINT,
Route11 SMALLINT,
EntranceType TEXT ENCODING DICT,
Entry TEXT ENCODING DICT,
ExitOnly TEXT ENCODING DICT,
Vending TEXT ENCODING DICT,
Staffing TEXT ENCODING DICT,
StaffHours TEXT ENCODING DICT,
ADA TEXT ENCODING DICT,
ADANotes TEXT ENCODING DICT,
FreeCrossover TEXT ENCODING DICT,
NorthSouthStreet TEXT ENCODING DICT,
EastWestStreet TEXT ENCODING DICT,
Corner TEXT ENCODING DICT,
EntranceLatitude FLOAT,
EntranceLongitude FLOAT,
StationLocation GEOMETRY(POINT, 4326) NOT NULL ENCODING COMPRESSED(32),
EntranceLocation GEOMETRY(POINT, 4326) NOT NULL ENCODING COMPRESSED(32));

omnisql> COPY NYCSubway FROM ‘/omnisci-storage/NYC_Transit_Subway_Entrance_And_Exit_Data.csv’ WITH (header=‘true’, lonlat=‘true’);

Result

Loaded: 0 recs, Rejected: 1868 recs in 1.050000 secs

Hi @bycxgto,

You can look at the logs to find out what is going wrong with your import of data.

the logs are located into [mapd_storage]/data/mapd_logs/omnisci_server.INFO that deafults to
/var/lib/omnisci/data/mapd_log/omnisci_server.INFO

For each record discarded, you should find the reason for rejects.

tried to import the file into the table and I’m getting those errors

2020-11-17T18:15:42.745738 E 16 12 geo_types.cpp:666 Geospatial Import Error: GeoGeoFactory Error: unsupported geometry type
2020-11-17T18:15:42.745753 E 16 12 Importer.cpp:2064 Input exception thrown: Failed to extract valid geometry from row 87 for column StationLocation. Row discarded. Data: (BMT Brighton Brighton Beach 40.577621 -73.961376 B Q Stair YES YES FULL FALSE TRUE Branch 7 St Brighton Beach Rd SE 40.577746 -73.960231 (40.577621, -73.961376) (40.577746, -73.960231))

the geometry is invalid because the importer is expecting the ind of geometry is going to import (POINT, POLYGON, or whatever) while into the dataset nothing is specified or if you want to use the lonlat parameter, you have to remove the double quotes and parenthesis, because the loader is expecting two float numbers.

changing the file this way

Division,Line,Station Name,Station Latitude,Station     Longitude,Route1,Route2,Route3,Route4,Route5,Route6,Route7,Route8,Route9,Route10,Route11,Entrance Type,Entry,Exit Only,Vending,Staffing,Staff Hours,ADA,ADA Notes,Free Crossover,North South Street,East West Street,Corner,Entrance Latitude,Entrance Longitude,Station Location,Entrance Location
    BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,NONE,,FALSE,,FALSE,4th Ave,25th St,SW,40.660489,-73.99822,40.660397, -73.998091,40.660489, -73.99822
    BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,FULL,,FALSE,,FALSE,4th Ave,25th St,SE,40.660323,-73.997952,40.660397, -73.998091,40.660323, -73.997952

COPY NYCSubway FROM ‘/omnisci-storage/NYC_Transit_Subway_Entrance_And_Exit_Data.csv’ WITH (header=‘true’, lonlat=‘true’);
Loaded: 2 recs, Rejected: 0 recs in 0.094000 secs

Hi @candido.dessanti thanks for pointing it out. Turns out the StationLocation and EntranceLocation were created as (lat,lon) pair not the other way round. So the format isn’t supported and hence the error.

Yap, I am providing some examples of how to change the file.

I’m not sure you can mix lonlat with other geometries imports, but I want to find out :wink:

Changing this way

    BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,NONE,,FALSE,,FALSE,4th Ave,25th St,SW,40.660489,-73.99822," POINT(40.660397 -73.998091)"," POINT(40.660489 -73.99822)"
    BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,FULL,,FALSE,,FALSE,4th Ave,25th St,SE,40.660323,-73.997952,"POINT (40.660397 -73.998091)","POINT (40.660323 -73.997952)"

with this copy command
COPY NYCSubway FROM ‘/omnisci-storage/NYC_Transit_Subway_Entrance_And_Exit_Data.csv’ WITH (header=‘true’);

works; and you can also define station latitude/longitude and Entrance latitude/longitude as points turning the lonlat to false mixed with other geometries

COPY NYCSubway3 FROM ‘/omnisci-storage/NYC_Transit_Subway_Entrance_And_Exit_Data.csv’ WITH (header=‘true’,lonlat=‘false’)

omnisql> select * from NYCSubway3;
            Division|Line|StationName|Station|Route1|Route2|Route3|Route4|Route5|Route6|Route7|Route8|Route9|Route10|Route11|EntranceType|Entry|ExitOnly|Vending|Staffing|StaffHours|ADA|ADANotes|FreeCrossover|NorthSouthStreet|EastWestStreet|Corner|Entrance|StationLocation|EntranceLocation
            BMT|4 Avenue|25th St|POINT (-73.9980909619471 40.6603969729787 )|R|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|Stair|YES|NULL|YES|NONE|NULL|FALSE|NULL|FALSE|4th Ave|25th St|SW|POINT (-73.998219959437 40.6604889224565 )|POINT (40.6603969729787 -73.9980909619471)|POINT (40.6604889224565 -73.998219959437)
            BMT|4 Avenue|25th St|POINT (-73.9980909619471 40.6603969729787 )|R|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|Stair|YES|NULL|YES|FULL|NULL|FALSE|NULL|FALSE|4th Ave|25th St|SE|POINT (-73.9979519899925 40.6603229607736 )|POINT (40.6603969729787 -73.9980909619471)|POINT (40.6603229607736 -73.9979519899925)

the values loaded with the point (lon lat) are wrong, while the format is accepted and they are formally valid

so we should also swap the values inside the brackets

Division,Line,Station Name,Station Latitude,Station Longitude,Route1,Route2,Route3,Route4,Route5,Route6,Route7,Route8,Route9,Route10,Route11,Entrance Type,Entry,Exit Only,Vending,Staffing,Staff Hours,ADA,ADA Notes,Free Crossover,North South Street,East West Street,Corner,Entrance Latitude,Entrance Longitude,Station Location,Entrance Location
BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,NONE,,FALSE,,FALSE,4th Ave,25th St,SW,40.660489,-73.99822," POINT(-73.998091 40.660397 )"," POINT(-73.99822 40.660489 )"
BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,FULL,,FALSE,,FALSE,4th Ave,25th St,SE,40.660323,-73.997952,"POINT (-73.998091 40.660397 )","POINT (-73.997952 40.660323 )"

awesome thanks! helps to preserve all column this way. I suppose all geospatial queries won’t support (lat,lon) pairs anyway, so at some point I need to swap the position and create points as (lon,lat)…!

The problem was the formats like “( lat, lon )” that’s not supported

Still, if you have the latitude before longitude on your dataset, it’s not a problem because you need to set to false the lonlat parameter (or not set at all because false is the default).

I am going to correct the previous posts because the setting of lonlat to true was wrong.

When you specify in the format POINT ( lat lon ), this parameter hasn’t any effect

1 Like

@bycxgto, You have also the option to create a new table and import geometry files using the copy command, setting to true the geo parameter.

Let’s say we have a shapefile like NYC Taxi Zones.zip and we want to import in a new table, e only need to run this command from omnisql

copy nyc_taxi_zones from '/omnisci-storage/NYC Taxi Zones.zip' with (geo='true')

We will get a newly created table called nyc_taxi_zones with every attribute of the shapefile

CREATE TABLE test_geo (
  borough TEXT ENCODING DICT(32),
  location_i DOUBLE,
  objectid DOUBLE,
  shape_area DOUBLE,
  shape_leng DOUBLE,
  zone TEXT ENCODING DICT(32),
  omnisci_geo GEOMETRY(MULTIPOLYGON, 4326) ENCODING COMPRESSED(32));