Partition Table based on Date?

Please let us know if there is any near-future plan to release a partition-by-date option for tables, which would allow dropping a partition and copying a partition's data?

Regards,
sumit

Hi @Sumit,

We don't have any near-term plan to implement this kind of partitioning with features like drop, truncate, and so on, but we are on the verge of releasing a new version of the OmniSci database, which recovers space when you delete a significant number of rows from a table.

You can couple this with the speed of DELETE and ITAS (INSERT INTO ... SELECT) operations, and with fragment skipping, which reads from disk and processes only the slices of a table that satisfy the filter conditions of the query.
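To make that concrete, here is a minimal sketch of how a partition-style "drop and copy" can be emulated with plain DML. The table and column names (`orders_archive`, `orders_staging`) are illustrative, not from a real schema:

```sql
-- "Drop" a month-sized partition: delete the rows in that date range.
-- On 5.6, the pages freed here are reclaimed and reused automatically.
DELETE FROM orders_archive
 WHERE O_ORDERDATE BETWEEN '1995-01-01' AND '1995-01-31';

-- "Copy" a partition: ITAS (INSERT INTO ... SELECT) from a staging table.
-- Ordering by the date column keeps each fragment's date range narrow,
-- which helps fragment skipping at query time.
INSERT INTO orders_archive
SELECT O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, O_ORDERPRIORITY
  FROM orders_staging
 WHERE O_ORDERDATE BETWEEN '1996-05-01' AND '1996-05-31'
 ORDER BY O_ORDERDATE;
```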

With the latest release (5.5.x), we also added a table parameter that limits the number of epochs kept per table; that should resolve the problem you had in the past, where the massive number of batches you did with Kafka generated a huge amount of metadata.
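For reference, a sketch of how that parameter is used; the `max_rollback_epochs` / `MAX_ROLLBACK_EPOCHS` syntax below follows the 5.5+ documentation, and the table and column names are hypothetical:

```sql
-- Create a table that keeps at most 3 rollback epochs of metadata,
-- so frequent small batch inserts don't accumulate unbounded metadata.
CREATE TABLE kafka_events (
  event_ts TIMESTAMP,
  payload TEXT ENCODING DICT(32)
) WITH (max_rollback_epochs = 3);

-- Or cap it on an existing table.
ALTER TABLE kafka_events SET MAX_ROLLBACK_EPOCHS = 3;
```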

e.g.

We have a table called orders_32m that has 4608 data pages, 709 of which are free:

omnisql> show table details orders_32m;
    table_id|table_name|column_count|is_sharded_table|shard_count|max_rows|fragment_size|max_rollback_epochs|min_epoch|max_epoch|min_epoch_floor|max_epoch_floor|metadata_file_count|total_metadata_file_size|total_metadata_page_count|total_free_metadata_page_count|data_file_count|total_data_file_size|total_data_page_count|total_free_data_page_count
    32|orders_32m|6|false|0|4611686018427387904|32000000|3|35|35|31|31|1|16777216|4096|3999|18|9663676416|4608|709

We delete some data from 1995 (this takes around 2 seconds):

omnisql> delete from orders_32m where O_ORDERDATE between '1995-01-01' and '1995-07-31';
omnisql> show table details orders_32m;
    table_id|table_name|column_count|is_sharded_table|shard_count|max_rows|fragment_size|max_rollback_epochs|min_epoch|max_epoch|min_epoch_floor|max_epoch_floor|metadata_file_count|total_metadata_file_size|total_metadata_page_count|total_free_metadata_page_count|data_file_count|total_data_file_size|total_data_page_count|total_free_data_page_count
    32|orders_32m|6|false|0|4611686018427387904|32000000|3|38|38|34|34|1|16777216|4096|4010|18|9663676416|4608|769

Then we insert some data from 1996 (57M records, around 11 seconds):

omnisql> insert into orders_32m select O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY from orders where O_ORDERDATE between '1996-05-01' and '1996-07-31' order by O_ORDERDATE;
omnisql> show table details orders_32m;
    table_id|table_name|column_count|is_sharded_table|shard_count|max_rows|fragment_size|max_rollback_epochs|min_epoch|max_epoch|min_epoch_floor|max_epoch_floor|metadata_file_count|total_metadata_file_size|total_metadata_page_count|total_free_metadata_page_count|data_file_count|total_data_file_size|total_data_page_count|total_free_data_page_count
    32|orders_32m|6|false|0|4611686018427387904|32000000|3|39|39|36|36|1|16777216|4096|4006|18|9663676416|4608|734

As you can see, space is recovered by the first delete (60 pages freed, 709 to 769) and reused by the subsequent insert (35 pages occupied, 769 to 734), so OmniSci now reuses the freed space without the need to optimize the table manually.

Then we query a particular date range from the table, and only the fragments that could contain the data are read from disk and processed:

omnisql> select sum(O_TOTALPRICE), extract(month from O_ORDERDATE) from orders_32m where O_ORDERDATE between '1998-03-01' and '1998-03-15' group by 2;
EXPR$0|EXPR$1
1273388234745.33|3
1 rows returned.
Execution time: 606 ms, Total time: 607 ms
omnisql> \memory_summary
OmniSci Server CPU Memory Summary:
            MAX            USE      ALLOCATED           FREE
    51423.15 MB     1353.27 MB     4096.00 MB     2742.73 MB

OmniSci Server GPU Memory Summary:
[GPU]            MAX            USE      ALLOCATED           FREE
  [0]     9681.76 MB      610.35 MB     4096.00 MB     3485.65 MB
  [1]     9681.76 MB      437.75 MB     4096.00 MB     3658.25 MB
  [2]     9681.39 MB      305.18 MB     4096.00 MB     3790.82 MB

The date and decimal columns for the table's 400M records take up 4.8GB on disk, but only about 1.3GB has been read (see the CPU memory summary above) because of fragment skipping.

So while we don't have date partitioning yet, with the current and upcoming features we have something that probably performs better than databases with such a partitioning scheme.

Regards,
Candido.

Hi @candido.dessanti

I love seeing your in-depth reply with evidence and sample queries… I am going through it and will try to simulate the same in the lab.

Regards,
sumit

Hi @Sumit,

I'm sorry for the late reply and the not-so-good samples (you should get better results than mine, so take them as a "not best case" scenario).

Automatic reclamation of freed space is part of the 5.6 release, which hasn't shipped yet because of a regression that is being corrected, while the table metadata feature has been available since 5.5.2.

I think it's better to wait for the 5.6 release to do all the testing; I think it's just a matter of days.

Candido

hi @candido.dessanti ,

I will wait for 5.6 and simulate it in the lab after that.

-Sumit


Hi @Sumit ,

OmniSci 5.6 was released yesterday; you can download it using the usual link, or upgrade/install using rpm/apt.

Happy testing.

Candido