Assign lat/lon records to cells on a custom grid



I have a table with lat, lon and total sales per record (each lat, lon is a record) and I want to compute averages and total records for all points that are inside a polygon (as defined in a shapefile).

In my understanding this general case is not currently supported, due to ST_CONTAINS not currently supported by MapD core (Spatial intersections,,,

In light of this I decided to change the problem a little bit to what I believe is a much simpler problem that can use ST_DISTANCE:

  • Divide the US in a fishnet (grid) of x miles by x miles and make it into a mapD table of this type:
    grid_cell_name, grid_lat_centroid, grid_lon_centroid, geometry - similar to this:

  • Then assign to each record (lat, lon) a cell within the grid by calculating the distance for each record to each centroid point in all cells in the grid - note: here of course multiple records will be assign to the same grid. In this case my “shapefile” is basically the fishnet.

I think I can use ST_DISTANCE which seems to be supported and do a SELECT MIN DISTANCE to assign the corresponding grid. In theory sounds doable but I wanted to get some advise as of which query to use - please assume one table is called us_grid and the other one is called records.

note: If I have billions of records this could be expensive since I’d have to compute n*m distances and then get the minimum for each record - I assume some trick on along the lines of sorting both tables, records table and grid table would make for a better approach.

All advise is welcome here, thanks!


Hi @dfernan,

Note that both ST_Contains and ST_Distance are not currently supported, geo features are in the works.

As you may have seen in Assign lat/lon records to cells on a custom grid
there is distance_in_meters function that you can call to find the distance between two WGS-84 positions. Example (taken from tests): SELECT DISTANCE_IN_METERS(-74.0059, 40.7217,-122.416667 , 37.783333) FROM test LIMIT 1;

But, unless I’m missing something, you don’t need to calculate any distances to assign each record a cell in your grid. You have the coordinates for each record - you can get the cell right away, just pick the starter lat0/lon0 and then divide each record’s diff(lat/lon-lat0/lon0) by a cell size based on granularity of your grid.


@dash, thanks for the help! I was stuck on thinking about general shapes for general shapefiles but you are absolutely right for the fishnet case. One potential problem would be the difference in degrees be affected by different lat, lon regions? This since difference in degrees and difference in distance in meters/miles is different in different regions? do you have some suggestions on how to correct this? This wouldn’t be an issue in small total area covered by the fishnet but if i cover the whole world it could be a problem?

Thanks in advance!


Can you work with spheroid cells? The sizes of these cells could be identical in terms of degrees, e.g. 5° x 5°. They will be a bit different in terms of the actual area. You can pick a territory, say 50° x 50°, and then assign each record to one of the 100 cells. Would that work for your calculations? You can try playing with the cell size, see if 1° x 1° or 0.5° x 1° cells would work better. You probably want to keep your calculations in WGS84.

Alternatively you may try projecting your coords if that would simplify things. If area of focus is small you could use something simple that’s already supported like Web Mercator.


I might be missing something but we can natively do square or hexagonal binning in SQL and already use that for heatmapping (see the example in our Vega demo here).

The sql for hexagonal binning looks something like this (obviously with appropriate values substituted for all the dollar sign variables):

SELECT reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon), $minHexXBounds, $maxHexXBounds, conv_4326_900913_y(lat), $minHexYBounds, $maxHexYBounds, $hexWidth, $hexHeight, 0, 0, $width, $height) as x, reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon), $minHexXBounds, $maxHexXBounds, conv_4326_900913_y(lat), $minHexYBounds, $maxHexYBounds, $hexWidth, $hexHeight, 0, 0, $width, $height) as y, AVG(amount) as val FROM fec_contributions_oct WHERE (conv_4326_900913_x(lon) >= $minHexXBounds AND conv_4326_900913_x(lon) <= $maxHexXBounds) AND (conv_4326_900913_y(lat) >= $minHexYBounds AND conv_4326_900913_y(lat) <= $maxHexYBounds) GROUP BY x, y

Not sure if this is what you’re looking for but wanted to throw it out there.


hey, nice, @darwin this may be exactly what I’m looking for! the vega app it’s not connecting but from the query it does seem to be exactly what I am trying to do - would you mind checking the connection issue?

I have a couple of follow up questions:
(1) How could I specify the size of the hexagons in miles?
(2) Could I make it such that the hexagon size is fixed regardless of zoom in/zoom out. This would be useful since I would want the user to decide on a fixed hexagon size.
(3) Could I also add a pop-up box that shows the values of for each hexagon when the user scrolls on them?