Data lakes are a tool for long-term data storage. They can be implemented on-premises for use cases requiring high security or in the cloud for more accessible solutions. The Databricks runtime includes code specifically for easing the connection between Spark and data lake technologies, as well as its own companion tech, Delta Lake. Delta Lake makes interacting with data in data lakes easier and more consistent, but it is possible to work with data lakes without it, as we will see today. [Read more…] about Apache Spark Companion Technologies: Data Lakes
In case you missed it, the fourth installment of our weekly Data Engineer’s Lunch was presented by guest speaker Will Angel. It covered the topic of using Airflow for data engineering. Airflow is a scheduling tool for managing data pipelines. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is embedded below. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now! [Read more…] about Data Engineer’s Lunch #4: Airflow for Data Engineering
The Apache Cassandra database has gained popularity because it offers scalability and high availability without compromising performance. Many applications running today were built using relational database technology; however, this technology doesn’t offer the scalability or availability that Cassandra does. This is why many people are considering the switch to Cassandra. In this post, we will cover everything you need to know about switching from a relational database to Cassandra. [Read more…] about Migrating from a Relational Database to Cassandra: Why, Where, When, and How
Big data is a field that covers ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be handled by traditional software. In this post, we take a look at some of the biggest and best technologies available. [Read more…] about Big Data Technologies
In today’s society, it is considered common practice to distribute your business technology operations in an effort to maximize your potential return, whether that return comes as tangible or intangible assets. This process is not merely separating a company into parts but also paving the path for exponential growth. [Read more…] about Scaling Cloud Web & Data Technologies
The biggest Apache Cassandra event in a couple of years, DataStax Accelerate, is happening in National Harbor, MD from May 21 to 23.
We’re excited to have been picked as presenters for one of the most significant big data conferences in the world.
- Thursday, May 23 – 3:00-3:40pm – Part of Track 5 (Exploring Transformational Use Cases) – How a Global Quick Serve Restaurant Increased Customer Engagement and Brand Loyalty – by Rahul Singh