Search
Close this search box.
sql-databases

Data Engineer’s Lunch #7: SQL Databases

In Data Engineer’s Lunch #7: SQL Databases, we discuss relational concepts, popular SQL databases, and advantages and disadvantages. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend Data Engineer’s Lunch in person, it is hosted every Monday at 12 PM EST. Register here now!

In Data Engineer’s Lunch #7: SQL Databases, we discuss relational concepts including the history of RDBMS, the general need for SQL databases, rules of design, and normalization. We also discuss popular SQL databases, and their advantages and disadvantages. If you want a more in-depth discussion, be sure to watch the live recording of Data Engineer’s Lunch #7 embedded below! Don’t forget to like and subscribe while you watch it!

Relational Concepts

  • History of the Database
    • We talk through specific slides in the above deck; as well as, add additional history during the live recording. Be sure to check it out below!
  • General Need
    • It is widely used in the Business Intelligence tool.
    • Data Manipulation and data testing are done through SQL.
      • Languages for RDBMS
        • DDL – Data Definition
        • DML – Data Modification
    • Data Science tools depend highly on SQL. Big data tools such as Spark, Impala are dependent on SQL.
    • It is one of the most in-demand industrial skills.
  • Rules of Design
  • “Normalization”

Popular SQL Databases

Advantages / Disadvantages

  • Advantages
    • Faster Query Processing
      • ACID
      • Large amount of data is retrieved quickly and efficiently.
      • Operations like Insertion, deletion, manipulation of data is also done in almost no time.
    • No Coding Skills
      • For data retrieval, large number of lines of code is not required.
      • All basic keywords such as SELECT, INSERT INTO, UPDATE,
    • Standardized Language
      • Due to documentation and long establishment over years, it provides a uniform platform worldwide to all its users.
    • Portable
      • It can be used in programs on PCs, servers, laptops independent of any platform (Operating System, etc). Also, it can be embedded with other applications as per need/requirement/use.
    • Interactive Language
      • Easy to learn and understand, answers to complex queries can be received in seconds.
    • Multiple data views
  • Disadvantages
    • Fixed set of columns.
    • NOSQL – Rows that define new columns on-demand without altering the schema.
    • Pivot queries that create columns dynamically, depending on the distinct data values found as they search the data.
    • To Achieve ACID, need to lock data
    • Limit to how much can happen on One Machine
    • Scaling only happens via Master/Slave Replication of the whole data set
    • OR via Sharding (manual or automatic) that separates data

Resources

If you missed last week’s Data Engineer’s Lunch #6: Common Data Formats Used in Data Engineering be sure to check it out! As mentioned above, the live recording of Data Engineer’s Lunch #7 is embedded below. Also, check out our YouTube page for more videos and the Data Engineer’s Lunch playlist here! Don’t forget to subscribe while you are there!

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!