Great comparison.
A few things I have experienced:
- Delta streaming is good, although we don't have a lot of streaming use cases.
- An equivalent of Delta liquid clustering (incremental) is something missing from Iceberg, iirc.
- I noticed when working on Iceberg that you cannot create a table without a catalog. In Delta, you can just write directly to S3 and it will be a Delta table, independent of any catalog.
Hi Zach, great article, thank you!
Iceberg fanboy too… On the drawback you mention with hidden partitioning: Trino just released 472, and you can now query partitions via a hidden metadata column.
https://github.com/trinodb/trino/issues/24301
Hi, Zach. Interesting article 👍🏽
I’m keen to test Iceberg asap.
The table comparing Delta, Iceberg, and Hudi says that Delta doesn’t allow renaming or dropping columns, but it does, via column mapping:
https://docs.delta.io/latest/delta-column-mapping.html#delta-column-mapping
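For anyone wondering why column mapping makes rename and drop possible, here is a toy sketch of the idea in plain Python. It is not Delta's implementation: the `MappedTable` class, the `col-0`-style physical names, and the in-memory "files" are all hypothetical. The point it tries to show is that the table keeps a logical-to-physical name mapping in metadata, so a rename or drop only touches that mapping and the data files stay as they were.

```python
class MappedTable:
    """Toy table with a logical -> physical column name mapping (hypothetical)."""

    def __init__(self, columns):
        # Each logical column gets a stable physical name that never changes.
        self.mapping = {name: f"col-{i}" for i, name in enumerate(columns)}
        self.files = []  # each "file" is a dict keyed by physical column name

    def append(self, row):
        # Data is stored under physical names, not the user-facing ones.
        self.files.append({self.mapping[k]: v for k, v in row.items()})

    def rename_column(self, old, new):
        # Metadata-only change: no data file is rewritten.
        self.mapping[new] = self.mapping.pop(old)

    def drop_column(self, name):
        # Metadata-only change: the values stay in the files but are no longer readable.
        del self.mapping[name]

    def read(self):
        # Resolve logical names to physical names at read time.
        return [
            {logical: f.get(physical) for logical, physical in self.mapping.items()}
            for f in self.files
        ]


table = MappedTable(["user_id", "event_time"])
table.append({"user_id": 1, "event_time": 100})
files_before = [dict(f) for f in table.files]

table.rename_column("event_time", "ts")
after_rename = table.read()

table.drop_column("user_id")
after_drop = table.read()
```

In this sketch `table.files` is identical before and after the rename and the drop; only `table.mapping` changes.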
Makes sense
Hello 👋 Zach, Thanks for the great article. It's really helpful to compare Delta, Iceberg, and Hudi!
Quick question: I noticed in the Delta example that ZORDER is applied on a single column (event_time). Based on the Delta Lake documentation, Z-Ordering tends to be most beneficial when used on multiple columns, especially for queries with varying filters. For a single column, it seems a simple sort or clustering might suffice.
I would love to hear your thoughts. Is there an added benefit to using ZORDER in this specific case?
Thanks again for sharing this!
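A small sketch that may help with the single-column question. It uses a plain Morton (bit-interleaving) key in Python, which is the textbook Z-order curve and only an approximation of what Delta does internally; the `interleave_bits` helper and the sample values are made up for illustration. With one column there is nothing to interleave, so the key is the value itself and the ordering is the same as an ordinary sort. With two columns the key mixes bits from both, which is where Z-ordering starts to differ from sorting.

```python
def interleave_bits(*coords: int, bits: int = 8) -> int:
    """Morton (Z-order) key: interleave the low `bits` bits of each coordinate."""
    key = 0
    n = len(coords)
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * n + dim.
            key |= ((value >> bit) & 1) << (bit * n + dim)
    return key


# One column: the Z-order key equals the value, so the order is a plain sort.
event_times = [5, 3, 9, 1, 7]
single_col_order = sorted(event_times, key=lambda t: interleave_bits(t))
plain_sort = sorted(event_times)

# Two columns: neighbouring keys stay close in both dimensions.
rows = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
```

Here `single_col_order` and `plain_sort` come out identical, while `z_sorted` starts with the 2x2 block `(0, 0), (1, 0), (0, 1), (1, 1)` rather than walking one column first.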
In this paragraph, you say 'If you use Copy-on-Write strategy, files are compacted when data is written. This makes for slower reads.' Is this a typographical error? Did you mean this makes for 'slower writes', as compaction needs to be done at write time? Amazing blog, and very educational and instructive at the same time. The relational database analogy is brilliant.
Yes it’s a typo. Lemme fix
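To make the corrected statement concrete, here is a toy model of the two strategies in plain Python. It is a sketch under heavy simplification: the class names, the in-memory dicts standing in for data files, and the counters are all hypothetical, and no real table format works at this level. It only shows where the work happens: copy-on-write rewrites data on every write, while merge-on-read appends cheaply and reconciles when the table is read.

```python
class CopyOnWriteTable:
    """Every upsert rewrites the whole base 'file' (cost paid at write time)."""

    def __init__(self, rows):
        self.base = dict(rows)
        self.rows_rewritten = 0

    def upsert(self, key, value):
        new_base = dict(self.base)  # copy the existing file
        new_base[key] = value
        self.rows_rewritten += len(new_base)  # all rows are written again
        self.base = new_base

    def read(self):
        return dict(self.base)  # nothing to merge, reads are cheap


class MergeOnReadTable:
    """Upserts append to a delta log (cheap); reads merge base and log."""

    def __init__(self, rows):
        self.base = dict(rows)
        self.log = []
        self.merges_on_read = 0

    def upsert(self, key, value):
        self.log.append((key, value))  # no rewrite at write time

    def read(self):
        merged = dict(self.base)
        for key, value in self.log:  # cost paid at read time
            merged[key] = value
            self.merges_on_read += 1
        return merged


initial = {1: "a", 2: "b", 3: "c"}
cow = CopyOnWriteTable(initial)
mor = MergeOnReadTable(initial)
for key, value in [(2, "B"), (4, "d")]:
    cow.upsert(key, value)
    mor.upsert(key, value)

cow_result = cow.read()
mor_result = mor.read()
```

Both tables return the same rows. The copy-on-write table has rewritten 7 rows across its two upserts, and the merge-on-read table has done 2 merge steps during its single read.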