Data Engineer’s Lunch #84: Cool and Interesting Things from AWS re:Invent 2022

As AWS Re:Invent 2023 approaches, this seems like a good time to reflect on the last event and share some of the lessons I learned in 2022.

“This whole conference was about data.”

It didn’t take us long to realize that 80% of the companies in the expo and virtually every session of Re:Invent 2022 were focused on describing what individuals and enterprises could do to better exploit their data.

AWS EMR – Open Source Tools AWS-style

AWS Re:Invent taught me that the AWS EMR tool is dramatically misnamed. More than just Elastic MapReduce, AWS EMR is the managed way to run a variety of open source tools on AWS:

  • Spark
  • Presto
  • Flink
  • HBase
  • Hive
  • and so many more

AWS EMR claims to be the industry-leading cloud big data solution for petabyte-scale data processing, interactive analytics, and machine learning using open-source frameworks such as Apache Spark, Apache Hive, and Presto.

The brilliance of EMR is that it harvests PRs from the open source community and delivers them on AWS. Each new release of an open source tool is followed by an update on AWS EMR within a few days. AWS takes the needs of the open source community and delivers on them, sometimes faster than the community itself can.
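To make "open source tools on AWS" a little more concrete, here is a minimal sketch of a cluster request that asks EMR to install Spark, Hive, and Presto. The dictionary is shaped like the keyword arguments of boto3's `run_job_flow` call. The cluster name, release label, instance types, instance counts, and the helper name `build_emr_cluster_request` are assumptions for illustration, not settings taken from the conference. The role names are the EMR defaults and may differ in your account.

```python
import json


def build_emr_cluster_request(name, release_label, applications):
    """Build keyword arguments shaped like boto3's EMR run_job_flow call.

    Hypothetical helper: every concrete value here is an assumption.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Each open source tool is requested by name.
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",  # assumed size
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": "m5.xlarge",  # assumed size
                    "InstanceCount": 2,
                },
            ],
            # Shut the cluster down once there are no steps left to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default EMR instance profile name
        "ServiceRole": "EMR_DefaultRole",      # default EMR service role name
    }


request = build_emr_cluster_request(
    "lunch-84-demo", "emr-6.9.0", ["Spark", "Hive", "Presto"]
)
print(json.dumps(request, indent=2))

# With boto3 installed and AWS credentials configured, the request could be
# submitted with:
#   boto3.client("emr").run_job_flow(**request)
```

Nothing here talks to AWS. The script only builds and prints the request, so it is safe to run while deciding which applications and release label you actually want.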

AWS Migration Tools and Insights

  • 7Rs of Migration to AWS
    • AWS’s logical framework for assessing migration includes the 7Rs: Retire, Retain, Rehost, Relocate, Repurchase, Replatform, and Refactor. These are the options for how to handle data sources and applications that need to move during the migration.

  • Tools to Assist Migration
    • Schema Conversion Tool
    • Data Migration Fleet Advisor
      • Large migrations, such as those with dozens of data sources or microservices, will benefit from the Migration Fleet Advisor, a tool that manages inventories of services and can help users better estimate the costs and time associated with migrations to the AWS ecosystem.
  • Tools to Assist with Making the Business Case for Migration

One interesting thing to note about the AWS migration stack: it doesn’t often include Cassandra.

The Scale of Data is Changing

A single molecule of penicillin generates a really, really big number. A molecule of penicillin has 41 atoms and, by the count given, 285 electrons. Even assuming that each electron has only 2 states, the number of possible states is 2^285, which is the state space of 285 qubits.
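To get a feel for how big 2^285 is, here is a quick sketch of the arithmetic in Python. It assumes the figure from the text (285 two-state electrons). The function names are just illustrative.

```python
# Assumption taken from the text above: 285 electrons, 2 states each.
N_TWO_STATE_ELECTRONS = 285


def state_count(n):
    """Number of distinct configurations of n independent two-state systems."""
    return 2 ** n


def decimal_digits(value):
    """How many decimal digits it takes to write a positive integer."""
    return len(str(value))


states = state_count(N_TWO_STATE_ELECTRONS)
print(f"2^{N_TWO_STATE_ELECTRONS} = {states}")
print(f"That is a {decimal_digits(states)}-digit number.")
```

The result is an 86-digit number, and every additional two-state electron doubles it. That doubling is why this kind of problem outgrows classical hardware so quickly.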

To do simulation and modeling at this scale, we either need to vastly constrain the simulation space we are working within, a task that requires high levels of expertise, or we need to step into the quantum age!

AWS Outposts extend the cloud to your premises

AWS Outposts let high availability users retain access to their cloud in the case of a global failure.

  • All AWS resources available locally
  • Connected to cities and regions
  • Limit loss during service outages

New Instance Types

  • For AI/ML
    • Hpc7g
      • High memory bandwidth and 200 Gbps of Elastic Fabric Adapter (EFA) network bandwidth; can be used with AWS ParallelCluster
    • C7gn
      • Graviton3E processors with AWS Nitro cards that reduce the I/O load on the CPU
  • For the beefiest of nodes
    • I4i.metal
      • 1024 GB RAM, 128 vCPUs, 75 Gbps of network bandwidth, and 40 Gbps of EBS bandwidth

Common Perspectives at AWS Re:Invent 2022

Terraform is NOW (even AWS services recommend and utilize Terraform)

AWS has so much power that they will eventually turn everything on their platform into a service. Their model, building a service that they need, improving it, and then releasing it as a service for their customers, has been EXTREMELY successful. They will continue to deliver and release services.

AWS will be making everything serverless – they want to manage all the infrastructure. Though there have been suggestions in the last year that people still like having control over services at the machine level, the serverless argument, pay for only what you use, isn’t going anywhere soon.

The AWS Partner Network

The AWS Partner Network is composed of over 100,000 vendors.

  • Each vendor has a given number of AWS certified employees.
  • Each vendor offers a few well-defined and repeatable services.

According to AWS: “We have so much business and so many partners, but I don’t have enough people to connect them all.”

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. You can also learn more at our partner site with the Apache Foundation: Planet Cassandra.

Anant

We are a technology company that specializes in building business platforms. If you’d like to learn more about our experience at Re:Invent, contact us and ask for Nick!
