Type not serializable error - meaning?


#1

Hi,
I am new to MapD, so please bear with my ignorance about a few topics.

I have set up MapD v4.0 on my laptop (CPU only, Ubuntu 16.06). I want to validate the TPC-DS benchmark locally before trying it on a GPU node, so I generated a small 1 GB test data set and loaded it into MapD tables.

Now I am trying to run the queries generated in the ANSI dialect (please correct me if a different dialect should be used). I am getting a serialization error for a couple of queries and want to understand what it means and what can be done to fix the queries.

Here is one example query -
select count(distinct cs_order_number) as order_count,
       sum(cs_ext_ship_cost) as total_shipping_cost,
       sum(cs_net_profit) as total_net_profit
from catalog_sales cs1, date_dim, customer_address, call_center
where d_date between '1999-5-01' and (cast('1999-5-01' as date) + INTERVAL '60' day)
  and cs1.cs_ship_date_sk = d_date_sk
  and cs1.cs_ship_addr_sk = ca_address_sk
  and ca_state = 'ID'
  and cs1.cs_call_center_sk = cc_call_center_sk
  and cc_county in ('Williamson County', 'Williamson County', 'Williamson County', 'Williamson County', 'Williamson County')
  and exists (select * from catalog_sales cs2 where cs1.cs_order_number = cs2.cs_order_number and cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk)
  and not exists (select * from catalog_returns cr1 where cs1.cs_order_number = cr1.cr_order_number)
order by count(distinct cs_order_number);

Error - Exception: Exception occurred: type not serializable: [$cor0] (type com.google.common.collect.SingletonImmutableSet)

I could not get much information from the log, but I ran the query in sub-parts, and it looks like the EXISTS function is not supported, which is why the query fails.

Q 1 - Is there a list of supported/unsupported functions available?
Q 2 - Does a serialization error always indicate an unsupported function/operation? How should one debug and fix such queries?

Let me know if any further information is needed from my end.

~mbaxi


#2

@mbaxi welcome, and thanks for trying us out.

For a list of supported functions for querying and manipulation, start here in the documentation. Earlier topics in this section address data definition.

I’ll look into the serialization error with some of our engineers. However, we do have some EXISTS functionality. I like your debugging strategy of breaking down the query; perhaps you can explore the syntax while I see what else I can find out?


#3

@mbaxi it looks like the problem is that we don’t yet support correlated subqueries, and that should be the cause of the serialization error. We recommend that you rewrite your query without them if possible and see if the error recurs.
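
As a rough illustration of such a rewrite, here is a minimal sketch that swaps the two correlated predicates for uncorrelated IN / NOT IN subqueries and checks that both forms agree. Several things here are assumptions: the tables are cut down to three columns with made-up rows, the check runs on SQLite (via Python's sqlite3) rather than MapD, and whether MapD 4.0 accepts the rewritten form has not been verified. The rewrite also assumes cs_order_number is never NULL, as you would expect for a key column.

```python
import sqlite3

# Toy stand-ins for the TPC-DS tables (hypothetical rows, reduced columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog_sales (cs_order_number INTEGER, cs_warehouse_sk INTEGER, cs_net_profit REAL);
CREATE TABLE catalog_returns (cr_order_number INTEGER);
INSERT INTO catalog_sales VALUES
  (1, 10, 5.0), (1, 11, 7.0),                  -- two warehouses, not returned
  (2, 10, 3.0), (2, 10, 4.0),                  -- one warehouse only
  (3, 10, 1.0), (3, 12, 2.0),                  -- two warehouses, but returned
  (4, 10, 6.0), (4, NULL, 8.0), (4, 13, 9.0);  -- two warehouses plus a NULL
INSERT INTO catalog_returns VALUES (3);
""")

# Original shape: both subqueries reference the outer alias cs1 (correlated).
correlated_sql = """
SELECT COUNT(DISTINCT cs_order_number), SUM(cs_net_profit)
FROM catalog_sales cs1
WHERE EXISTS (SELECT * FROM catalog_sales cs2
              WHERE cs1.cs_order_number = cs2.cs_order_number
                AND cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk)
  AND NOT EXISTS (SELECT * FROM catalog_returns cr1
                  WHERE cs1.cs_order_number = cr1.cr_order_number)
"""

# Rewritten shape: neither subquery references the outer query.
# - "another row with a different warehouse" becomes "the order has more
#   than one distinct warehouse"; the IS NOT NULL check mirrors the fact
#   that <> never matches a NULL warehouse on the outer row.
# - NOT EXISTS becomes NOT IN over non-NULL returned order numbers.
uncorrelated_sql = """
SELECT COUNT(DISTINCT cs_order_number), SUM(cs_net_profit)
FROM catalog_sales
WHERE cs_warehouse_sk IS NOT NULL
  AND cs_order_number IN (SELECT cs_order_number
                          FROM catalog_sales
                          GROUP BY cs_order_number
                          HAVING COUNT(DISTINCT cs_warehouse_sk) > 1)
  AND cs_order_number NOT IN (SELECT cr_order_number
                              FROM catalog_returns
                              WHERE cr_order_number IS NOT NULL)
"""

correlated_result = conn.execute(correlated_sql).fetchone()
uncorrelated_result = conn.execute(uncorrelated_sql).fetchone()
print(correlated_result, uncorrelated_result)  # both (2, 27.0) on this toy data
```

The same two predicates could be dropped into the full TPC-DS query in place of the EXISTS / NOT EXISTS clauses, leaving the joins to date_dim, customer_address and call_center as they are. It is worth comparing results against another engine on the 1 GB data set before relying on it.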


#4

Thanks for the reply, this is helpful. One quick question: does MapD recommend a star schema or denormalized tables?


#5

Joins in MapD are very fast; naturally a flat table is a little faster, but it's up to you to decide which best fits your app.