We’ve seen the rise and fall of Apache Hive as the default choice for data lakes.
Hive lost its footing for a few big reasons.
It was too tightly coupled to Hadoop and Spark.
Hive, the query engine, became a painful reminder of the slow MapReduce days of data engineering.
It didn’t handle small updates very effectively.
These major drawbacks from Hive …