Data Engineer’s Lunch #45: Apache Livy
In Data Engineer’s Lunch #45: Apache Livy, we discussed Apache Livy, a REST API for interacting with Spark Clusters. It […]
Data Engineer’s Lunch #45: Apache Livy Read More »
In Apache Cassandra Lunch #65: Spark Cassandra Connector Pushdown, we discussed Spark predicate pushdown in the context of the Spark
Apache Cassandra Lunch #65: Spark Cassandra Connector Pushdown Read More »
In this blog post, the first in a series about Open Source Data Catalogs, we will be talking about an
Open Source Data Catalog Overview: CKAN Read More »
In Data Engineer’s Lunch #15: Introduction to Jenkins, we discussed Jenkins, the automation platform. The live recording of the Data
Data Engineer’s Lunch #15: Introduction to Jenkins Read More »
TableAnalyzer is a tool for analyzing Cassandra (CFStats/TableStats) output that visualizes variance in metrics between nodes. We use TableAnalyzer to
Using TableAnalyzer – Anant’s Tool for Analysis of Cassandra Tables Read More »
Data lakes are a tool for long-term data storage. They can be implemented on-premises for use cases requiring high
Apache Spark Companion Technologies: Data Lakes Read More »