Simple problem really. I’m trying to calculate a set of percentiles (e.g. 50, 68, 90) on groups within a table. I’ve noticed that percentile_cont is listed as a reserved word in the documentation, but the function is not currently implemented. I tried defining the function myself using the UDF extension, to no avail, and after looking online for several hours I’ve almost given up.
As an example, using the nyc_tree_2015_683k dataset, I would like to be able to do the following (or something close to it):
select zipcode, count(tree_dbh), avg(tree_dbh), percentile(tree_dbh, 0.50), percentile(tree_dbh, 0.68), percentile(tree_dbh, 0.90) from nyc_tree_2015_683k group by zipcode;
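To make the desired output concrete, here is a minimal pure-Python sketch of the aggregation I'm after. The rows are made up for illustration, the helper name percentile_cont is my own, and I'm assuming the percentile should interpolate linearly between neighbouring values, as the standard SQL percentile_cont does.

```python
from collections import defaultdict
from statistics import mean


def percentile_cont(values, p):
    """Linearly interpolated percentile of values, with p in [0, 1].

    Hypothetical helper: assumes percentile_cont-style semantics,
    i.e. the target position is p * (n - 1) in the sorted values.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a percentile of an empty group")
    pos = p * (len(ordered) - 1)
    lo = int(pos)                        # index at or below the position
    hi = min(lo + 1, len(ordered) - 1)   # next index, clamped to the end
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


# Made-up (zipcode, tree_dbh) rows standing in for nyc_tree_2015_683k.
rows = [
    (10001, 2), (10001, 4), (10001, 6), (10001, 8),
    (10002, 10), (10002, 20), (10002, 30),
]

# Equivalent of "group by zipcode".
groups = defaultdict(list)
for zipcode, tree_dbh in rows:
    groups[zipcode].append(tree_dbh)

# One summary row per zipcode, mirroring the columns in the query.
summary = {
    zipcode: {
        "count": len(values),
        "avg": mean(values),
        "p50": percentile_cont(values, 0.50),
        "p68": percentile_cont(values, 0.68),
        "p90": percentile_cont(values, 0.90),
    }
    for zipcode, values in sorted(groups.items())
}

for zipcode, stats in summary.items():
    print(zipcode, stats)
```

This only works once the rows have been pulled out of the database, so it is a description of the result rather than a substitute for doing the calculation in SQL.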
I should also mention that I have tried executing the query in a Jupyter notebook using
*edited to align with the following post