Restore deleted tables

Hi,

So is there a way to restore deleted tables or rollback options to the last checkpoint? Not sure how to do that, if anyone has any idea how to do a database backup for Omniscidb, that would be great!!

Looking forward to your answers - thanks Dipanwita.

Hi @dipanwita2019,

You have various ways to do backups omniscidb

First, you physically backup all the files comprising the DB itself, and this could be useful in case of a hardware failure like a broken disk.
To do that, you have to stop the services and create an archive of $OMNISCI_STORAGE data-dir.

tar zxvf /some_path/omnisci_storage_backup.tar.gz /omnisci_storage/

Second, you can use the dump/restore table commands to backup single tables in the database.

DUMP TABLE table1 TO '/some_path/table1.dmp' WITH (COMPRESSION='gzip');

You can read about those commands here

Dump/restore can also be a fast way to duplicate a table. For example, you can dump table1 and use the dump to restore a table that you will call table2.

Third, use the copy to command to do logical dumps in text format; with a copy to command, you can specify a query so that you can filter, aggregate, etc. the data you are exporting

COPY (select * from table1 where col1 between 1 and 10000000) to '/somepath/table1.csv';

This is a link to the docs.

Then if you accidentally deleted or inserted, or updated records, you can revert the table to a previous epoch. Still, I suggest using this in case of an emergency, not as a normal way to roll back the data.

There are two hidden commands in omnisql called.

\gte and \ste

The first will return the actual epoch of the table, and the second will set the epoch you will specify

so as an example, we have a table with some records, and we delete some records accidentally

omnisql> select count(*) from fact_sales_16;
EXPR$0
123534963
1 rows returned.
Execution time: 9 ms, Total time: 10 ms
omnisql> \gte fact_sales_16
table epoch is 3
omnisql> delete from fact_sales_16 where extract(month from the_date) = 12;
omnisql> select count(*) from fact_sales_16;
EXPR$0
113205120
1 rows returned.
Execution time: 57 ms, Total time: 58 ms

to restore the records we deleted, we have to set the previous epoch

omnisql> \ste "fact_sales_16 3"
table epoch set
omnisql> select count(*) from fact_sales_16;
EXPR$0
123534963
1 rows returned.
Execution time: 48 ms, Total time: 48 ms

This would cover all the options, also the undocumented ones, you can use to save your data with omniscdb.

Regards,
Candido

1 Like

Just to add to @candido.dessanti’s answer, you can also run: (Warning: this is undocumented/not officially supported)

SHOW TABLE DETAILS <table_name> to get various metadata about the table, including the current epoch (max_epoch), and the lowest epoch you can roll back to (min_epoch).`

In addition to \ste, you can also roll back a table to a certain epoch via ALTER TABLE <table_name> SET epoch = <epoch_number>

For example:

omnisql> insert into epoch_test (a) values (1);
omnisql> insert into epoch_test (a) values (2);
omnisql> insert into epoch_test (a) values (3);
omnisql> insert into epoch_test (a) values (4);
omnisql> insert into epoch_test (a) values (5);
omnisql> show table details epoch_test;
table_id|table_name|column_count|is_sharded_table|shard_count|max_rows|fragment_size|max_rollback_epochs|min_epoch|max_epoch|min_epoch_floor|max_epoch_floor|metadata_file_count|total_metadata_file_size|total_metadata_page_count|total_free_metadata_page_count|data_file_count|total_data_file_size|total_data_page_count|total_free_data_page_count
430|epoch_test|3|false|0|4611686018427387904|32000000|3|5|5|2|2|1|16777216|4096|4088|1|536870912|256|254
omnisql> \gte epoch_test
table epoch is 5
omnisql> select * from epoch_test;
a
1
2
3
4
5
omnisql> alter table epoch_test set epoch = 1;
Cannot set epoch for table (1, 430) lower than the minimum rollback epoch (2).
omnisql> alter table epoch_test set epoch = 3;
omnisql> show table details epoch_test;
table_id|table_name|column_count|is_sharded_table|shard_count|max_rows|fragment_size|max_rollback_epochs|min_epoch|max_epoch|min_epoch_floor|max_epoch_floor|metadata_file_count|total_metadata_file_size|total_metadata_page_count|total_free_metadata_page_count|data_file_count|total_data_file_size|total_data_page_count|total_free_data_page_count
430|epoch_test|3|false|0|4611686018427387904|32000000|3|3|3|2|2|1|16777216|4096|4092|1|536870912|256|254
omnisql> \gte epoch_test
table epoch is 3
omnisql> select * from epoch_test;
a
1
2
3

Again note that as @candido.dessanti mentioned, only use this capability if you know what you’re doing, as its currently not an officially supported/documented feature. And as always, making regular backups is highly recommended.

2 Likes

Hi,

I suggest using the commands suggested by @todd because they are available using any driver (JDBC, ODBC, PYTHON), so you aren’t restricted to omnisql’s use, but as Todd and Me mentioned, don’t use as a regular ROLLBACK.

1 Like

So I tried using the commands Todd referred but through the pymapd library, but not working. Do I have to use command line to run the OmnisciSQL commands you mentioned or do you know any equivalent python library (like pymapd) that will allow me to interact with the database and run the queries.

With omnisql, the \gte and \ste Commander works with almost any version of omniscidb.

When I come back home, I will check why the commands aren’t working with pymapd (it would be called pyomnisci in the latest versions).

Probably the commands are fairly new?

Hi @dipanwita2019,

I did some test with pyomnisci driver and the database and everything is fine, but the version of omniscidb has to be at least the 5.5 to use the show table details and alter table command.

using the pyomnisci driver and a version of the database of at least 5.5, everything would works

Python 3.7.10 (default, Feb 26 2021, 18:47:35) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyomnisci import connect
>>> con = connect(user="admin", password="HyperInteractive", host="localhost",dbname="omnisci")
>>> cursor=con.execute("show table details flights_2008_7m")
>>> list(cursor)
[(4, 'flights_2008_7M', 58, 0, 0, 4611686018427387904, 2000000, -1, 2, 2, -2147483648, -2147483648, 1, 16777216, 4096, 3811, 3, 1610612736, 768, 127)]
>>> con.execute("alter table flights_2008_7M set epoch=1")
<omnisci.cursor.Cursor object at 0x7f6b2cf74bd0>

The \gte and the \ste command are only avaible using the omnisql command, and they works on almost every version of the database, but you can’t use them with any programmatic driver.

Bests,
Candido

Hi,
Thank you for the update.
Just to make sure that I understand correctly, so once I set the table epoch to a number, and I delete the table accidentally how can I revert back to the epoch number? Is it only through gte and also using OmniSQL not through pyomnisci API ?

Hi,

When you run DDL on tables, like INSERTs, UPDATEs, and DELETEs, the EPOCH of the table is modified automatically by the system to keep track of those modifications.

So as an example, if you have a table with an EPOCH of 120 and you insert a record, the table will have an EPOCH of 121, then you run a delete that wipes 10000 records, the table will have an EPOCH of 122.

To recover the table before the insert, you must set the EPOCH to 120 (using alter table or set is the same). If you want to recover the table after the insert, you have to set it to 121.

To be clear, what do you mean by “delete the table accidentally?”. If you drop the table, you can recover only with a dump, while if you delete some records, you can use the “alter table”/ste command

regards,
Candido