DENORMALIZE YOURSELF AND FACE TO WEB SCALE. STATE IS A FUCK

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > YOSPOS > DENORMALIZE YOURSELF AND FACE TO WEB SCALE. STATE IS A FUCK

Asleep Style: Oct 20, 2010

gather round everybody, it's time for an effort post on the biggest piece of poo poo that AWS has ever created: their Quantum Ledger Database, aka QLDB

from their docs: Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log.

I'm pretty sure that the docs used to say BLOCKCHAIN a lot but they've toned that down apparently

QLDB is a noSQL document db, but the most important thing to understand about it is that it's journal-first. it doesn't store data directly, it stores transactions. you want your data? no problem, the query will replay the transaction history and reconstruct your data. this architecture has Implications
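here's a toy sketch of what journal-first means, in python. to be clear this is not QLDB's actual internals, it's just the idea described above: the append-only list of transactions is the source of truth, and "the data" is whatever you get by replaying it. the `Journal` class and its method names are made up for illustration

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Journal:
    """Toy append-only journal (hypothetical, not QLDB's implementation)."""
    entries: List[Tuple[str, str, Any]] = field(default_factory=list)

    def append(self, op: str, doc_id: str, data: Any = None) -> None:
        # Entries are only ever appended, never edited or removed.
        self.entries.append((op, doc_id, data))

    def replay(self, upto: Optional[int] = None) -> Dict[str, Any]:
        """Rebuild state by applying the first `upto` entries in order."""
        state: Dict[str, Any] = {}
        for op, doc_id, data in self.entries[:upto]:
            if op == "delete":
                state.pop(doc_id, None)
            else:  # "insert" and "update" both store the latest revision
                state[doc_id] = data
        return state


journal = Journal()
journal.append("insert", "a", {"balance": 10})
journal.append("update", "a", {"balance": 25})
journal.append("insert", "b", {"balance": 5})
journal.append("delete", "b")

current = journal.replay()      # state after every entry
as_of_two = journal.replay(2)   # state as of the second entry
```

replaying a prefix of the journal is where the free audit log comes from. replaying it on every read is where the pain comes from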

let's start with some of the nice things, since that will be brief. you get audit logs more or less for free, since the whole transaction history is right there. it's serverless, so spinning up a new ledger is easy. cloud costs are low compared to something like postgres in RDS

now for some of the downsides. I'm going to rip this first one straight from the AWS docs, because they say it better than I can.

quote:

Backup and Restore
Can I take a snapshot or backup of my ledger?

Amazon QLDB does not support a backup and restore feature as of now. At present, an export to S3 functionality is available. Using this functionality you can export the contents of your QLDB journal to S3.

Can I restore my ledger to a particular point in time?

Amazon QLDB does not support a point-in-time restore feature as of now.

yes, that's right. no backups

you're also limited to only 20 tables. yes, it's a document db, not a relational db, so you won't be normalizing your data into a million tables, but with only 20 you will run out quickly. oh, also a dropped table counts as an inactive table, and you're limited to a total of 40 active and inactive tables. with enough new features and architecture changes in your application you may reach a point where your only option is to create an entirely new ledger

another limitation is that each table can only have 5 indexes. since the db is journal-first, god help you if you need to query something without an index. if you have a sufficiently large amount of data in a table, even an index won't save you because of the biggest problem of all:

30 second transaction timeout. if your query takes longer than 30 seconds, it times out and returns nothing. it takes a surprisingly small number of documents in a table before even a query like:

code:

SELECT id FROM <table>

will time out, EVEN THOUGH ID IS AN INDEXED FIELD. no problem, let's just add a LIMIT clause. what do you mean that's not supported? fine, we really need this query to run, can we beef up the instance? oh right, serverless

your best shot to actually get data out when you need it is to have a high-cardinality indexed field that you can set a range on. something like:

code:

SELECT created_time FROM <table>
    WHERE created_time >= `2024-03-01T00:00:00Z` AND
    created_time < `2024-03-01T00:15:00Z`

make the window small enough that you stop getting transaction timeouts, then slide it over whatever range you actually need. the catch is that without equality predicates you don't actually get an indexed lookup, you get a full table scan, and a window that's too wide means TRANSACTION TIMEOUT. if you have a gun to your head and need to build a greenfield QLDB-based application after reading this post, I suggest integer ids, so you can ask for exact ids with equality predicates
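a rough python sketch of both workarounds. it only builds the query strings, it doesn't talk to QLDB, so plug the output into whatever driver call you already have. the function names, the `orders` table, and the batch size are all made up. I'm also assuming that `IN` on an indexed field counts as an equality lookup, which is my reading of the QLDB docs, so check that before betting on it. table and field names are interpolated straight into the string, which is fine for a sketch and not fine for user input

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def ion_ts(t: datetime) -> str:
    """Format a datetime as an Ion timestamp in UTC (naive values are treated as UTC)."""
    if t.tzinfo is None:
        t = t.replace(tzinfo=timezone.utc)
    return t.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def window_queries(table: str, field: str, start: datetime, end: datetime,
                   step: timedelta) -> Iterator[str]:
    """Yield one query per half-open window [lo, hi) covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        # Ion timestamp literals are wrapped in backticks in PartiQL.
        yield (f"SELECT {field} FROM {table} "
               f"WHERE {field} >= `{ion_ts(lo)}` AND {field} < `{ion_ts(hi)}`")
        lo = hi


def id_batch_queries(table: str, ids: range, batch: int) -> Iterator[str]:
    """Yield IN lookups over consecutive integer ids, `batch` ids per query."""
    for i in range(0, len(ids), batch):
        chunk = ", ".join(str(n) for n in ids[i:i + batch])
        yield f"SELECT * FROM {table} WHERE id IN ({chunk})"


start = datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc)
time_queries = list(window_queries("orders", "created_time", start,
                                   start + timedelta(minutes=30),
                                   timedelta(minutes=15)))
id_queries = list(id_batch_queries("orders", range(1, 6), 2))
```

the windows are half-open so that sliding them never skips or double counts a document sitting exactly on a boundary. if a window still times out, shrink `step` and try again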

would you like to do a join? go gently caress yourself. it's supported, but journal-first means that it is going to take way too long. both your join criteria and your equality predicates had better be high-cardinality indexes

is QLDB the wrong choice for what mostly boils down to a CRUD app? absolutely, but that decision predates me and this was the VP's pet technology. the thing is I can't think of a situation where QLDB is possibly the right choice. there's no problem space where the benefits outweigh the downsides. dealing with your own custom audit table implementation is going to be an order of magnitude less trouble than QLDB.
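for a sense of scale, here's roughly what a roll-your-own audit table looks like. this is a minimal sketch using python's built-in sqlite3 as a stand-in for postgres, and the `account` / `account_audit` tables and trigger names are invented for the example. it gives you the change history, not the cryptographic verification

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

-- Append-only history: one row per change to account.
CREATE TABLE account_audit (
    audit_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    account_id  INTEGER NOT NULL,
    action      TEXT NOT NULL,
    old_balance INTEGER,
    new_balance INTEGER,
    changed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_audit (account_id, action, old_balance, new_balance)
    VALUES (NEW.id, 'insert', NULL, NEW.balance);
END;

CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_audit (account_id, action, old_balance, new_balance)
    VALUES (NEW.id, 'update', OLD.balance, NEW.balance);
END;

CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_audit (account_id, action, old_balance, new_balance)
    VALUES (OLD.id, 'delete', OLD.balance, NULL);
END;
""")

# Every write to account leaves a row behind in account_audit.
conn.execute("INSERT INTO account (id, balance) VALUES (1, 10)")
conn.execute("UPDATE account SET balance = 25 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

history = conn.execute(
    "SELECT action, old_balance, new_balance "
    "FROM account_audit ORDER BY audit_id"
).fetchall()
```

the current state lives in a normal table you can index, join, and back up, and the history is a side effect of writes instead of the thing every read has to go through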

that VP left so I was finally able to get permission to migrate our application to postgres. our integration test suite takes 45 minutes to run with QLDB. I've migrated ~25% of the tests so far and I still haven't cracked a minute. this is the most naive possible implementation: dropping tables and recreating them for every single test, using an ORM without any by-hand SQL queries. when the migration is done it's looking like the application is going to be somewhere between 10x and 100x faster, with a bunch of meat left on the bone for future optimizations if we ever need them. I can't wait to be rid of this horrible piece of poo poo

Mar 3, 2024 17:41

Asleep Style: Oct 20, 2010

PIZZA.BAT posted:

gotcha. so it's like most nosql databases only they turned off the journal cleanup, as a joke

yes, exactly. then the marketing folks got ahold of it and decided that actually the joke was the central feature

Mar 4, 2024 05:04

Asleep Style: Oct 20, 2010

redleader posted:

someone coined the phrase "data puddle" for something we're doing (for bad/dumb reasons) and i hate it so much

a data puddle is what pizza.bat's client built for their 50 documents

Mar 31, 2024 00:04
