rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

PIZZA.BAT posted:

i'm hearing vector databases are the new hotness that's getting all the vc attention

finally, scalable databases!

Cybernetic Vermin
Apr 18, 2005

Subjunctive posted:

are vector databases now doing LLM-style random walk everywhere? I thought it was just nearest-neighbour with lots of dimensions

no real randomness (it's not that central to llms either, though it is involved in sampling), but if you're starting from text and want to turn it into some high-dimensional point in 2024, i think you probably *want* something very llm-ish doing the embedding.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a photograph of george w. bush in 2003
unlike chatbots, vector databases are actually useful. you can basically use them to do semantic search. you query for butts, it returns both farts and poops, because those vectors have high cosine similarity to your query. the embedding into vectors is where the machine learning magic happens. but it doesn't do generative chatbot bullshit, it just returns the search results. people are then using these results to submit the original query to chatgpt with additional context in the prompt (they call this retrieval-augmented generation, as though it was even deserving of a name), but you don't have to do that part
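
A minimal sketch of the cosine-similarity lookup described above. The three-dimensional vectors are made up by hand for illustration; real embeddings come out of a model and have hundreds or thousands of dimensions, and a real vector database would normally use an approximate index instead of this brute-force scan.

```python
import math

# Hypothetical, hand-written "embeddings"; real ones come from a model.
EMBEDDINGS = {
    "farts": [0.9, 0.8, 0.1],
    "poops": [0.8, 0.9, 0.2],
    "spreadsheets": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, k=2):
    """Return the k stored terms most similar to query_vector (brute force)."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda term: cosine_similarity(query_vector, EMBEDDINGS[term]),
        reverse=True,
    )
    return ranked[:k]


# Pretend this is what the embedding model produced for the query "butts".
butts = [0.85, 0.85, 0.15]
results = search(butts)
print(results)
```

The retrieval-augmented generation step would just paste `results` into a prompt; nothing in the search itself is generative.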

redleader
Aug 18, 2005

Engage according to operational parameters

DELETE CASCADE posted:

people are then using these results to submit the original query to chatgpt with additional context in the prompt (they call this retrieval-augmented generation, as though it was even deserving of a name), but you don't have to do that part

so now they're sending prompts through a thesaurus. wonderful

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

redleader posted:

so now they're sending prompts through a thesaurus. wonderful

bdid did it first

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
christ, i'm still actively hating mongodb and now they've got them newfangled vector database things i need to be furious about too?

also i like postgres more than mysql in large part because it actually treats your data as if it's valuable. mysql lets you (or used to let you, maybe it changed) do poo poo like dropping tables that still had foreign key references on them if you just added something to the ddl statement that was basically the equivalent of "trust me bro", whereas postgres will firmly tell you to get hosed if you try bullshit like that

i will also never not be angry at mysql for using the "long" type in their c client library api which means it's 32 bit on x86 and x64 windows, while it's 32 bit on x86 linux but 64 bit on x64 linux. combine that with having to pass long* values to get query results and you're in for some exciting times if you're writing code that should run on all four of those platforms and someone just carelessly shoves a cast from int to long in the wrong spot; doubly so if they're developing on a windows machine since it'll work just fine there
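
A quick way to check the sizes on whatever machine you happen to be on. This is only a sketch using Python's `ctypes`, not the MySQL C client itself; it reports how wide the platform's C `int` and `long` are, which is the thing that differs between x64 windows and x64 linux.

```python
import ctypes
import platform

# Size in bytes of the C types on the platform this script runs on.
int_size = ctypes.sizeof(ctypes.c_int)
long_size = ctypes.sizeof(ctypes.c_long)
pointer_size = ctypes.sizeof(ctypes.c_void_p)

print(platform.system(), platform.machine())
print(f"int: {int_size * 8} bit, long: {long_size * 8} bit, "
      f"pointer: {pointer_size * 8} bit")

# Writing through a long* that really points at an int is only harmless
# where the two types happen to be the same size.
same_size = int_size == long_size
print("int and long are the same size here:", same_size)
```

Where the last line prints `False`, a library writing a `long` through a pointer to an `int` overruns the variable, which is the bug that stays hidden on a windows dev machine.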

redleader
Aug 18, 2005

Engage according to operational parameters

Deep Dish Fuckfest posted:

i will also never not be angry at mysql for using the "long" type in their c client library api which means it's 32 bit on x86 and x64 windows, while it's 32 bit on x86 linux but 64 bit on x64 linux

lmao, incredible

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a photograph of george w. bush in 2003

Deep Dish Fuckfest posted:

also i like postgres more than mysql in large part because it actually treats your data as if it's valuable. mysql lets you (or used to let you, maybe it changed) do poo poo like dropping tables that still had foreign key references on them if you just added something to the ddl statement that was basically the equivalent of "trust me bro", whereas postgres will firmly tell you to get hosed if you try bullshit like that

postgres won't save you from yourself if you use the appropriate incantation (username post combo!!!)

code:
# create table a (id int primary key);
CREATE TABLE
# create table b (a_id int references a);
CREATE TABLE
# drop table a;
ERROR:  cannot drop table a because other objects depend on it
DETAIL:  constraint b_a_id_fkey on table b depends on table a
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
# drop table a cascade;
NOTICE:  drop cascades to constraint b_a_id_fkey on table b
DROP TABLE
e: however in postgres you could put all of the above into a transaction, and roll it back once you realize you hosed up. mysql ddl is not transactional
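
A rough sketch of what rolling back ddl looks like. It uses SQLite as a stand-in, purely because it ships with Python and also has transactional ddl; that substitution is an assumption of this example, not something from the post. In postgres the equivalent would be wrapping the drop in `BEGIN;` and `ROLLBACK;` in psql.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/ROLLBACK statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")


def table_names(connection):
    """Names of the tables currently in the schema."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn.execute("BEGIN")
conn.execute("DROP TABLE a")
tables_inside_transaction = table_names(conn)  # the drop is visible here
conn.execute("ROLLBACK")
tables_after_rollback = table_names(conn)      # the drop has been undone

print(tables_inside_transaction, tables_after_rollback)
```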

Hed
Mar 31, 2004

Fun Shoe

rotor posted:

finally, scalable databases!

it's a vector database, not a scalar one

ADINSX
Sep 9, 2003

Wanna run with my crew huh? Rule cyberspace and crunch numbers like I do?

DELETE CASCADE posted:

unlike chatbots, vector databases are actually useful. you can basically use them to do semantic search. you query for butts, it returns both farts and poops, because those vectors have high cosine similarity to your query. the embedding into vectors is where the machine learning magic happens. but it doesn't do generative chatbot bullshit, it just returns the search results. people are then using these results to submit the original query to chatgpt with additional context in the prompt (they call this retrieval-augmented generation, as though it was even deserving of a name), but you don't have to do that part

Oh hey this is what I’m doing at work. We’re using a vector DB to store a bunch of insights extracted from driving data for the purposes of finding it later to feed into simulators for self driving cars. So if your self driving algorithm is messing up when it’s wet out and you’re approaching an overpass you could type in “overpass while raining” or something like that and get some really accurate results. It’s pretty incredible how well it works

We demoed it at IAA last year and the CEO stopped by; the scientists did all the cool vector extraction stuff but were gonna demo either Jupyter notebooks or some hacked up voxel install when I came in to make a little web app, hosted it on ECS, set up a login page blah blah blah… it took a solid month but it was really fun. It was so well received it changed the direction of our project… which I think upset some PMs because when they got back I was not mentioned in the org wide project acknowledgements lol. I’m not mad though please don’t tell the newspaper I got mad

Edit to anonymize it a bit

ADINSX fucked around with this message at 04:15 on Mar 22, 2024

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal

DELETE CASCADE posted:

postgres won't save you from yourself if you use the appropriate incantation (username post combo!!!)

code:
# create table a (id int primary key);
CREATE TABLE
# create table b (a_id int references a);
CREATE TABLE
# drop table a;
ERROR:  cannot drop table a because other objects depend on it
DETAIL:  constraint b_a_id_fkey on table b depends on table a
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
# drop table a cascade;
NOTICE:  drop cascades to constraint b_a_id_fkey on table b
DROP TABLE
e: however in postgres you could put all of the above into a transaction, and roll it back once you realize you hosed up. mysql ddl is not transactional

well yeah no poo poo you're literally asking postgres to delete dependencies too, which is fine, and it means that whatever invariants you're enforcing in what's left in your db will still be correct. mysql will let you drop a table without cascading the drop to dependencies! as in you have constraints that refer to a table that doesn't exist anymore. because who gives a poo poo about data integrity.

also i just looked at the mysql doc to see if things changed and lol

quote:

The RESTRICT and CASCADE keywords do nothing. They are permitted to make porting easier from other database systems.

hmm yes excellent this is exactly what i want; shoving sql statements that work in one db into another one that doesn't implement them but deliberately accepts them to prevent errors from being raised is a great feature. mysql!

and agreed on the transactional ddl; makes upgrading stuff way less stressful

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

Hed posted:

it's a vector database, not a scalar one

vectors are easily scalable and dont leave jaggies

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a photograph of george w. bush in 2003

Deep Dish Fuckfest posted:

mysql will let you drop a table without cascading the drop to dependencies! as in you have constraints that refer to a table that doesn't exist anymore. because who gives a poo poo about data integrity.

:stare: i was not aware of this lol

PIZZA.BAT
Nov 12, 2016


:cheers:


do they not maintain the schema in their own internal system table? wtf? how can your ddl not be transactional?

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
bold of you to assume there's some internal system table

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


just follow our Devs and set no keys or relationships in the first place

polyester concept
Mar 29, 2017

when i first started learning php and mysql in 2003, I had no idea that you could even define foreign keys as part of the db schema itself. everything was built on raw queries and hope. so many orphans in those days.

Captain Foo
May 11, 2004

we vibin'
we slidin'
we breathin'
we dyin'

DELETE CASCADE posted:

:stare: i was not aware of this lol

username post combo

Ocean of Milk
Jun 25, 2018

oh yeah

Deep Dish Fuckfest posted:

mysql will let you drop a table without cascading the drop to dependencies! as in you have constraints that refer to a table that doesn't exist anymore.

Is that also true for deleting individual rows? As in, in table "butt" I have a row with buttId 1 and in table "fart" I have a row with fartId 1 and a foreign key buttId 1, and I can delete the row in butt no-problemo, leaving the fart row with a dangling reference?

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

PIZZA.BAT posted:

do they not maintain the schema in their own internal system table? wtf? how can your ddl not be transactional?

there is a difference between “the schema text is stored transactionally”, which is indeed simple, and “the effects of changing the schema are handled transactionally”, which is a fair bit more involved

spankmeister
Jun 15, 2008






I haven't read the thread at all but i'm using sqlite3 atm and it's p deece. It's kind of weird that the schema is stored as SQL statements so half of the ALTER TABLE stuff won't work but hey ho
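
For anyone curious, a small sketch of what "the schema is stored as SQL statements" looks like in practice. The `butt` table is a made-up example; `sqlite_master` is SQLite's built-in schema table and its `sql` column holds the text of the statement that created each object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE butt (butt_id INTEGER PRIMARY KEY, label TEXT)")

# One row per schema object; the sql column is the CREATE statement text.
(stored_sql,) = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'butt'"
).fetchone()
print(stored_sql)

# ADD COLUMN is one of the ALTER TABLE forms SQLite does support;
# the stored statement text is updated to include the new column.
conn.execute("ALTER TABLE butt ADD COLUMN notes TEXT")
(altered_sql,) = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'butt'"
).fetchone()
print(altered_sql)
```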

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal

Ocean of Milk posted:

Is that also true for deleting individual rows? As in, in table "butt" I have a row with buttId 1 and in table "fart" I have a row with fartId 1 and a foreign key buttId 1, and I can delete the row in butt no-problemo, leaving the fart row with a dangling reference?

what happens depends on the foreign key definition and its "on delete" clause, but i don't think even mysql lets you have dangling references like that. the actual details of the behavior probably depend on the storage engine you're using, like many things in mysql. that said i could've sworn there was a way to get the behavior you're describing, but it's been a long time since i used mysql and i can't remember. the closest would probably be adding the "ignore" modifier to the delete statement where rows that would normally fail a constraint and cause execution to abort will just get skipped instead, ensuring you have no idea what exactly it is that you just deleted

why the gently caress you'd want that behavior let alone actually allow it is a mystery to me
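
A sketch of how the "on delete" clause changes what happens to the row in the question. It runs against SQLite because that ships with Python, so it shows the general idea and not mysql's exact behavior (which, as said, may depend on the storage engine). The `butt` and `fart` tables follow the question; the `poop` table is added here just to show the cascade case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE butt (butt_id INTEGER PRIMARY KEY);
    -- No ON DELETE clause: deleting a referenced butt is refused.
    CREATE TABLE fart (
        fart_id INTEGER PRIMARY KEY,
        butt_id INTEGER REFERENCES butt (butt_id)
    );
    -- ON DELETE CASCADE: deleting a butt deletes its poops too.
    CREATE TABLE poop (
        poop_id INTEGER PRIMARY KEY,
        butt_id INTEGER REFERENCES butt (butt_id) ON DELETE CASCADE
    );
    INSERT INTO butt VALUES (1), (2);
    INSERT INTO fart VALUES (1, 1);
    INSERT INTO poop VALUES (1, 2);
    """
)

try:
    conn.execute("DELETE FROM butt WHERE butt_id = 1")
    delete_refused = False
except sqlite3.IntegrityError:
    delete_refused = True  # fart 1 still points at butt 1

conn.execute("DELETE FROM butt WHERE butt_id = 2")  # allowed, and cascades
poops_left = conn.execute("SELECT COUNT(*) FROM poop").fetchone()[0]

print(delete_refused, poops_left)
```

Either way no dangling reference is left behind: the delete is either rejected or it takes the dependent rows with it.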

fresh_cheese
Jul 2, 2014

MY KPI IS HOW MANY VP NUTS I SUCK IN A FISCAL YEAR AND MY LAST THREE OFFICE CHAIRS COMMITTED SUICIDE
because bossman is breathing down my neck and IDGAF anymore ok fine here its gone are you fukin happy now its gone now goddamn

<bossman leaves>

gently caress. what did i just break. ahhhh fuckit will worry about it when something happens.

cowboy beepboop
Feb 24, 2001

looks like mariadb has System-Versioned Tables now which is nice
https://mariadb.com/kb/en/system-versioned-tables/

Bloody
Mar 3, 2013

my favorite database is a text file I store next to the executable

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
now i'm wondering if sqlite has an option to store data in human readable format

spankmeister
Jun 15, 2008






Deep Dish Fuckfest posted:

now i'm wondering if sqlite has an option to store data in human readable format

Yeah you just run strings on the db file
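
Joke aside, it does sort of work. A small sketch, with a made-up table and file name, that does roughly what running strings on the file would show: with SQLite's default UTF-8 text encoding, short text values and the schema statements sit in the file as plain bytes.

```python
import os
import sqlite3
import tempfile

# Write a tiny database to a real file (hypothetical table and contents).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes VALUES ('human readable text')")
conn.commit()
conn.close()

with open(path, "rb") as f:
    raw = f.read()

# Every SQLite 3 database file starts with this 16-byte magic string.
has_magic = raw.startswith(b"SQLite format 3\x00")
# The inserted value and the CREATE statement are both findable as raw bytes.
text_visible = b"human readable text" in raw
schema_visible = b"CREATE TABLE notes" in raw

print(has_magic, text_visible, schema_visible)
```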

PIZZA.BAT
Nov 12, 2016


:cheers:


i was just on a call with one of my customers where our machine learning guy was walking them through various use cases to help them brainstorm things that could be applicable for them. the discussion meanders around for a bit and we hit the inevitable, 'how many records does your dataset contain' question. they answered fifty. we asked, 'fifty... million? billion? what?'

no. just fifty. they bought an enterprise license from us to store fifty documents

props to my machine learning guy for gently taking that and going, 'ok so you may run into an issue where the model overcorrects here and blah blah blah' instead of laughing

fifty documents

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
maybe they're big documents?

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

PIZZA.BAT posted:

i was just on a call with one of my customers where our machine learning guy was walking them through various use cases to help them brainstorm things that could be applicable for them. the discussion meanders around for a bit and we hit the inevitable, 'how many records does your dataset contain' question. they answered fifty. we asked, 'fifty... million? billion? what?'

no. just fifty. they bought an enterprise license from us to store fifty documents

props to my machine learning guy for gently taking that and going, 'ok so you may run into an issue where the model overcorrects here and blah blah blah' instead of laughing

fifty documents

im a busy, handsome man with no time to figure out where to put these fifty documents. Heres a check, figure it out nerdlingers.

Bloody
Mar 3, 2013

each document is one entire SQLite database containing one billion rows

Bloody
Mar 3, 2013

and one column

graph
Nov 22, 2006

aaag peanuts

rotor posted:

im a busy, handsome man with no time to figure out where to put these fifty documents. Heres a check, figure it out nerdlingers.

wisdom

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

Deep Dish Fuckfest posted:

now i'm wondering if sqlite has an option to store data in human readable format

the SQLite disk format is readable by humans if that human is motivated enough

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
that's true of every file format

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

now you’re seeing the truth

Bloody
Mar 3, 2013

how to read sqlite as a human:
step 1. install sqlite

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal

Subjunctive posted:

now you’re seeing the truth

reminds me of that time i got curious what "real" file formats actually look like and read the specs for stuff like elf and dwarf and pdf and who knows what else

my conclusion was that everything is garbage and i'll just keep dumping raw memory to disk like god intended without giving a gently caress about things like padding or versioning, and most certainly not about endianness because those big endian freaks shouldn't be allowed to touch my data in the first place

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

pdf is a trip for sure

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome
fun fact: the first programming language i learned as an adult was postscript
