Posts

Showing posts from May, 2023

Migrate Hive tables to Iceberg in Trino

Trino version 411 introduced the 'migrate' procedure in the Iceberg connector. This procedure converts existing Hive tables in ORC, Parquet, or Avro format to Iceberg tables. This article explains the details of the procedure. If you currently run a CREATE TABLE AS SELECT statement to convert Hive tables to Iceberg, I would recommend trying this procedure instead. It will be much faster because it doesn't rewrite files.

The procedure accepts three arguments: schema_name, table_name, and the optional recursive_directory. The possible values for recursive_directory are true, false, and fail. The default is fail, which throws an exception if a nested directory exists under the table or partition location.

CALL iceberg.system.migrate(schema_name => 'testdb', table_name => 'customer_orders', recursive_directory => 'true');

Let me explain the details of the implementation next.

Generate Iceberg schema object based on Hive table definition
Iterate ove...