Data Engineer’s Lunch #10: NoSQL – Part 1

In Data Engineer’s Lunch #10: NoSQL – Part 1, we discussed NoSQL datastores. Specifically, we discussed different types of key-value stores. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!

Overview

Origins of NoSQL

NoSQL databases originated when large companies like Google ran into trouble using SQL databases to store their data. A single machine could no longer handle the volume of data needed to run a resource-intensive service like Google Search. SQL databases could only spread data across multiple machines via sharding. Sharding works around the CPU, memory, and storage limits of a single machine. However, it also greatly increases development complexity, since every application needs to know which shard holds the data and how to access it. Sharding also does not keep data balanced between nodes on its own, which adds even more development complexity and potential delays.
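
To make the routing burden concrete, here is a minimal sketch of application-side sharding in Python. It is an illustration only, not how any particular database does it: the shard names, the in-memory `storage` dictionaries standing in for separate machines, and the `shard_for`, `put`, and `get` helpers are all assumptions made for this example.

```python
import hashlib

# Hypothetical shard names; in a real system each would be a separate database server.
SHARDS = ["shard-0", "shard-1", "shard-2"]

# In-memory dictionaries stand in for the data held on each shard.
storage = {name: {} for name in SHARDS}


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key, so every client routes a key the same way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def put(key, value):
    # The application has to work out the shard before it can write.
    storage[shard_for(key)][key] = value


def get(key):
    # The same routing step is required before every read.
    return storage[shard_for(key)].get(key)


put("user:42", {"name": "Ada"})
print(shard_for("user:42"), get("user:42"))
```

With a simple modulo scheme like this one, changing the number of shards remaps most keys to a different shard, which is one reason rebalancing sharded data takes extra engineering effort.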

NoSQL Data Storage

Key-value storage underlies a large share of NoSQL data storage. Because there are no relationships between data comparable to the table relationships in SQL databases, NoSQL queries rely on knowing exactly which data is being requested.
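
The difference between asking for data by key and asking for it by content can be sketched with a plain Python dictionary standing in for a key-value store. The sample records and the `get_by_key` and `find_by_city` helpers are hypothetical names used only for this illustration.

```python
# A dictionary stands in for a key-value store; the records are made-up sample data.
store = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Grace", "city": "New York"},
    "user:3": {"name": "Alan", "city": "London"},
}


def get_by_key(key):
    """Direct lookup: cheap, but the caller must already know the key."""
    return store.get(key)


def find_by_city(city):
    """No key available: every value has to be scanned and inspected."""
    return sorted(key for key, value in store.items() if value["city"] == city)


print(get_by_key("user:2"))
print(find_by_city("London"))
```

The second function works, but it touches every record. This is why data models for key-value stores are usually designed around the lookups the application is known to need.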

List of NoSQL Databases

  • Memcached
  • Cassandra
  • Dynamo
  • Elasticsearch
  • Solr
  • Redis
  • MongoDB
  • Neo4j
  • JanusGraph
  • HugeGraph
  • ArangoDB

Key-Value Storage

Key-value stores keep data in a dictionary format: each value is associated with a unique key. This is somewhat similar to table-based storage, where each field in a row has a value associated with it. Since key-value storage is only concerned with storing one kind of thing (a key-value pair), supporting technologies that extend its functionality can stay relatively simple. Underneath more complicated NoSQL storage, such as document stores or column-family stores, there is often a layer that is a key-value store.
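
The layering idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the design of any real database: `KeyValueStore` and `DocumentStore` are hypothetical classes, documents are serialized as JSON, and the `collection/id` key layout is chosen purely for illustration.

```python
import json


class KeyValueStore:
    """A minimal key-value layer: opaque byte keys mapped to opaque byte values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


class DocumentStore:
    """A toy document layer that only adds key naming and JSON serialization."""

    def __init__(self, kv):
        self._kv = kv

    @staticmethod
    def _key(collection, doc_id):
        # Assumed key layout for this sketch: "<collection>/<id>".
        return f"{collection}/{doc_id}".encode("utf-8")

    def insert(self, collection, doc_id, document):
        payload = json.dumps(document).encode("utf-8")
        self._kv.put(self._key(collection, doc_id), payload)

    def find_by_id(self, collection, doc_id):
        raw = self._kv.get(self._key(collection, doc_id))
        return None if raw is None else json.loads(raw)


kv = KeyValueStore()
docs = DocumentStore(kv)
docs.insert("users", "42", {"name": "Ada", "tags": ["admin"]})
print(docs.find_by_id("users", "42"))
```

The key-value layer never inspects the bytes it stores. Everything that makes this a "document store" lives in the thin layer above it, which is the sense in which the supporting technology can stay simple.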

List of Key-Value Stores / NoSQL Databases that use Key-Value Storage

  • RocksDB
  • MongoDB
  • Cassandra
  • Kafka
  • S3
  • Redis
  • HBase

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra but also to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!