My (non-AI) AWS re:Invent 24 picks


re:Invent 24 has come to a close and, as predicted, a huge portion of the event was devoted to generative AI. While there were some pretty interesting announcements surrounding Bedrock and Q, hidden amongst the RAGs were a few of my favorite releases that had nothing to do with AI.

Starting with the data-oriented announcements from Matt Garman's keynote, what better service to talk about than the biggest home for your data: S3. Everyone's favorite blob store got some really interesting data science-oriented features this year that reinforced something I've come to realize over the past few years: data lakes live in S3. First up is S3 Table Buckets, a new bucket type that brings a number of improvements for buckets storing Apache Iceberg tables.

AWS S3 Table Buckets announcement by Matt Garman

Just by using a different bucket type, you automatically get higher performance for queries against the Parquet files stored in S3. In addition, table buckets automatically take care of compaction and cleanup of unreferenced files to help keep your data lake optimized as it continues to scale.

S3 Metadata solves a problem that many of us have probably implemented custom solutions for in the past, likely through some separate "sidecar" datastore that works in conjunction with your S3 bucket blobs.

AWS S3 Metadata announcement by Matt Garman

This feature builds on the S3 Table Buckets feature to make blob metadata easily queryable, and S3 will keep that metadata up to date for you as bucket contents change. While this might seem like one of the less "flashy" announcements of this year's re:Invent, I think it is one of the most broadly applicable for a large swath of S3 users. This is also a classic example of AWS identifying a piece of undifferentiated work that their customers are building and finding ways to remove that burden. We like this, AWS. Do more of this, please!

Moving away from S3 for the last of the keynote announcements: Aurora DSQL. This is the closest we have as of now to a SQL version of DynamoDB, and that alone is a pretty exciting prospect.

Aurora DSQL announcement by Matt Garman

Aurora DSQL leverages Amazon's Time Sync service to keep clocks tightly synchronized across regions, enabling strong consistency even on multi-region tables. As fancy as this sounds, there are some necessary tradeoffs that come with building a globally available datastore. Note that the slide says "PostgreSQL-compatible" rather than implying that Aurora DSQL perfectly implements PostgreSQL. A number of PostgreSQL features remain unsupported, and only a subset of the SQL dialect is available. I suspect both of these lists will change slightly as the service moves toward general availability, though some omissions are almost certainly required as a byproduct of enabling this level of consistent multi-region operation. As someone who's been focused quite a bit on resilience lately, I'll be very interested to see how this service evolves. For workloads that can operate within the constraints, Aurora DSQL could be a game changer for systems that need multi-region active/active operation.

If you're looking for a deep dive on Aurora DSQL, Marc Brooker has some fantastic posts on his blog. He also gave two great breakout sessions on the topic at re:Invent: DAT424 and DAT427.

That wraps up the major announcements that came out of the keynotes this year. Werner's keynote on Thursday morning didn't contain a single announcement, instead inspiring us builders to embrace "simplexity". To me this seems like a byproduct of cloud becoming "boring", and this isn't a bad thing! As the technology continues to mature, I'd expect to see fewer big announcements and instead a continued refinement of the services that we're all building on today. It's clear that generative AI is the current industry trend, and I applaud AWS for taking a full-stack approach to supporting that space, from the custom-built training hardware all the way up to tools such as Q.

As is tradition for re:Invent, there were also a number of announcements that came in the weeks leading up to the event in Vegas (known as "pre:Invent").

One that I've been really excited for is the ability to now set up custom domain names for private API Gateways. This is a feature that has been sorely needed to make API Gateway easier to use without exposing the gateway to the internet. Previously, private gateways could only use their auto-assigned URL that had the format https://{rest-api-id}-{vpce-id}.execute-api.{region}.amazonaws.com/{stage}, or one of the other variations that involved passing special headers as part of the request. I know that I personally have implemented a number of different workarounds in the past to get friendlier URLs for my endpoints, and it will be really nice to be able to do this natively now.
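To make it concrete why those auto-assigned URLs were awkward to hand out, here is a minimal Python sketch that fills in the format quoted above. The function name and every ID in the usage example are made up for illustration; only the URL template comes from the text.

```python
def private_api_url(rest_api_id: str, vpce_id: str, region: str, stage: str) -> str:
    """Build the auto-assigned URL for a private REST API.

    Follows the template quoted in the post:
    https://{rest-api-id}-{vpce-id}.execute-api.{region}.amazonaws.com/{stage}
    """
    return (
        f"https://{rest_api_id}-{vpce_id}"
        f".execute-api.{region}.amazonaws.com/{stage}"
    )


# Hypothetical IDs, purely for illustration.
url = private_api_url("abc123", "vpce-0123456789abcdef0", "us-east-1", "prod")
print(url)
```

Every caller has to know the API ID, the VPC endpoint ID, the region, and the stage, which is exactly the coupling a custom domain name removes.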

Serverless services got a host of nice features and improvements to continue rounding out their offerings. Aurora Serverless v2 can now scale all the way down to zero ACUs, which finally addresses one of the chief complaints about that service. Previously the minimum was 0.5 ACUs, which meant the baseline ACU cost was nearly $44/month, generally running counter to the serverless tenet of only paying for execution time.
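For anyone curious where that "nearly $44/month" figure comes from, here is a rough back-of-the-envelope sketch in Python. The $0.12 per ACU-hour rate is my assumption (roughly the us-east-1 list price at the time), and 730 is the average number of hours in a month; check the pricing page for your region before relying on either. It covers compute only, so storage and I/O charges are not included.

```python
# Assumed values, not taken from the announcement itself.
PRICE_PER_ACU_HOUR = 0.12   # USD, assumed us-east-1 Aurora Serverless v2 rate
HOURS_PER_MONTH = 730       # average month: 8760 hours / 12


def monthly_floor_cost(min_acus: float,
                       price_per_acu_hour: float = PRICE_PER_ACU_HOUR,
                       hours: int = HOURS_PER_MONTH) -> float:
    """Compute-only cost of an idle cluster pinned at its minimum capacity."""
    return round(min_acus * price_per_acu_hour * hours, 2)


old_floor = monthly_floor_cost(0.5)  # previous minimum of 0.5 ACUs
new_floor = monthly_floor_cost(0.0)  # new minimum of zero ACUs
print(f"old floor: ${old_floor:.2f}/month, new floor: ${new_floor:.2f}/month")
```

Under those assumptions the old floor works out to 0.5 × 0.12 × 730 = $43.80 per month for a database that may be doing nothing at all.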

Speaking of price savings in serverless-land, DynamoDB on-demand got significant price cuts with no changes needed for users. AWS doesn't get into details as to why this price cut has come about, but presumably they've found ways to significantly reduce their operating costs for the service and are passing those savings along to us as customers. Benefitting from the optimizations that the awesome engineers over at AWS make is an understated perk of building on a public cloud, and this is once again the type of announcement that I think everyone will agree is a great thing.

Finally, it wouldn't be a discussion about serverless without including everyone's favorite compute service: Lambda! The improvements to Lambda are smaller but great to see nonetheless. For Python and .NET users, SnapStart is now available. SnapStart was first announced for Java at re:Invent 2022 as a way to reduce cold start times by snapshotting and caching a function just after initialization so that invocations can load the cached state rather than re-running initialization on every cold start. Unlike the Java implementation, Python and .NET come with some additional cost, so you'll need to do some evaluation to determine if it's worth it for your functions. That said, I can see this being useful for certain applications, particularly in the Python world where certain data science libraries can get quite large. The last main Lambda announcement was the release of the Python 3.13 and Node.js 22 runtimes. New runtimes come all the time as languages get updated so this isn't groundbreaking, but it's still nice to see for users of both languages who haven't otherwise moved on to providing their own runtimes.
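To illustrate which part of a function SnapStart targets, here is a minimal, self-contained Python sketch of the usual Lambda layout: expensive setup at module level, cheap work inside the handler. Nothing here calls AWS or enables SnapStart. `expensive_init` and the `INIT_CALLS` counter are hypothetical stand-ins for importing heavy libraries or loading a model.

```python
INIT_CALLS = 0  # counts how often the expensive setup runs (illustration only)


def expensive_init() -> dict:
    """Hypothetical stand-in for slow startup work, e.g. importing large
    data science libraries or loading a model into memory."""
    global INIT_CALLS
    INIT_CALLS += 1
    return {"model": "loaded"}


# Module-level code runs once per execution environment, during the init
# phase. This is the work a cold start normally repeats, and the state it
# produces is what a post-initialization snapshot would capture.
STATE = expensive_init()


def handler(event, context):
    # Per-invocation work only: reuse the already-initialized state.
    return {"model": STATE["model"], "echo": event.get("name")}


# Two invocations in the same environment share a single initialization.
first = handler({"name": "a"}, None)
second = handler({"name": "b"}, None)
```

The more work that lives above the handler, the more there is for a snapshot to save, which is why functions with large imports seem like the most likely candidates.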

While this was certainly not an exhaustive list (that post would be far too long!), these are some of the announcements that I'm personally really excited about from this year's re:Invent cycle. If you're looking for more, I recommend checking out https://aws-news.com for all the good stuff that's been announced over the last few weeks.