Observed that after deleting the Data from table, OS partition size not changed. Please let us know how we can free the Data on OS level after executing the Delete command.
Use Case: Create a table and load Data. run the Daily job and delete the 7 Days Old Data from Table.
@sumit unfortunately, you have come upon a known issue. At this time, we don’t free up space after data is removed from our structures.
Given the typical size of datasets that our users want to put into OmniSci, you can be sure that we’re working on this one!
@sumit unfortunately, you have come upon a known issue. At this time, we don't free up space after data is removed from our structures.
Given the typical size of datasets that our users want to put into OmniSci, you can be sure that we're working on this one!
Hi @Sumit SrivastavaThanks for trying OmniSci and engaging with us on the forum.First a little bit of background on what you are seeing. The data on the disk that you are referring to as not reducing should be though of as a log file for the given table not just the current content of the table. It contains ALL the states the data has even been in since the creation of the table. So although you logically think a delete action will reduce the amount of data required to be stored it actually increases it as it is another mutation on the data and hence more information is required to be written into storage to identify those changes. Summary is DELETE adds to the size of the on disk store.
The optimize table mentioned by Randy, will actually also increase the size of the on disk store, but tidy up the logical contents of the table. Let me try to explain a bit more. When a DELETE occurs a record in the table is logically tomb-stoned, meaning it is no longer logical visible to the execution engine but still takes up space in the fragment (A fragment represents a horizontal partition of a set of rows of a table. A set of fragments makes up a table, by default there are 32M rows in a fragment) When Optimize is run these tomb-stoned records are actually removed from the fragment, but as that is a mutation of the data MORE information is stored on the disk. Summary is OPTIMIZE can also add to the size of the on disk store.There are currently two sql operations that will reduce the on disk footprint of a table. TRUNCATE TABLE and DROP TABLE. Probably neither of which are useful to your case.
The options you have currently if size of on disk representation is becoming a concern is to consider using a table created with a MAX_ROWS which will automatically drop off old fragments as new data is loaded in keeping your table at a predetermined size. In your case you would still run the DELETES etc but this is assuming you are adding data to your table in date based order and can approximate what the largest size is for the window of data you are interested in.
The second option is to use CREATE TABLE AS SELECT to create a new table at the point you wish to delete the last seven days.
For either of these options we would need to understand your actual data usage and ingestion patterns to better understand what best works for you.In an up coming release we will be reducing the amount of data being kept in the physical store for a TABLE to reduce the filesystem overhead. Keep an eye out for that.
We also have plans for `COMPACTION or VACUUM` commands to allow a user to control the size on disk over time.Hope this helpsRegards