 
  • Locked thread
redleader
Aug 18, 2005

Engage according to operational parameters

MALE SHOEGAZE posted:

at my startup our architecture is microservices and a message queue.

each microservice has its own mongo database, because this reduces coupling. allowing the services to access the same database would increase coupling, so if one service needs access to data in another service, the originating service will just dump the entire collection into the pipeline, and the consuming service will write out all of the entries into its own database.

whenever an entity in the database is updated, the responsible service will emit an update event, and dump the entire object into the pipeline. consuming services will then write it to their own db, taking extreme care to update any and all data associations (a traditional DB would of course enforce these relationships for you, but it's no loss because keeping data in sync is a totally trivial problem compared to coupling, which is the worst problem).

the architects of this system did not concern themselves with concurrency, because data races are trivial compared to coupling. we've simply forced each consumer to run on a single thread, because concurrency issues are difficult to solve and we have more important problems such as reducing the amount of coupling in our system.

naturally, this system contains json entities that can be over 1mb compressed. if a user updates a single field on one of these entities, the entire 1mb model will get dumped into the queue. if they update the model twice (or 100 times) it will get dumped into the queue each time. this in no way overwhelms the system.

a few months back, i introduced an RPC mechanism so that we could at least make synchronous calls between services in response to user events. today my lead informed me that we're going to deprecate the RPC system because it increases coupling.

this is how you architect a system with 12 microservices that cannot handle more than 4 or 5 concurrent users before falling over. Fortunately, since everything is so decoupled, the system at least maintains the appearance of working.

:pwn:


NihilCredo
Jun 6, 2011

restrain anger in every way you can:
one display of it will defame you more than many virtues will commend you

tef posted:

needs more containers and to be ported to kubernetes by next month tia

also a service discovery layer using a state-of-the-art distributed key-value database implementing a sophisticated consensus algorithm, with a 500-line bespoke configuration file, in order to share three strings.

MononcQc
May 29, 2007

The thing I find interesting about the use of gRPC at Google is that, going by their SRE book, the entire setup requires an actively managed cluster, a very smart router dispatching requests on a one-to-one basis to a fully managed kubernetes-like back-end of workers, and so on. In the end it sounds like they're transforming the whole thing into what looks a lot more like AWS Lambda than your traditional RPC setup; they just used RPC as the building block for it.

Then you have an army of folks going in there and hand-tuning timeouts, because RPCs calling RPCs calling RPCs turn out to have very weird and tricky semantics when it comes to time limits and cancellation. Blog posts from google point out a need to respect special conventions, with unified internal interfaces that let such values be harmonized across the stack (a first argument 'context' that must be woven into all functions on the call path).
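
To make that 'context' convention concrete, here's a minimal Python sketch (not Google's actual API; all the names are made up): a context carrying an absolute deadline is passed as the first argument to everything on the call path, and each hop derives its own timeout from whatever budget is left.
code:
import time

class Context:
    """Toy request context: carries an absolute deadline down the call path."""
    def __init__(self, deadline):
        self.deadline = deadline                   # absolute monotonic time

    def remaining(self):
        return self.deadline - time.monotonic()    # budget left for this request

    def expired(self):
        return self.remaining() <= 0

def handle_request(ctx):
    # every function on the path takes ctx as its first argument
    if ctx.expired():
        raise TimeoutError("deadline exceeded before doing any work")
    return fetch_profile(ctx)

def fetch_profile(ctx):
    # the downstream call gets only the time that's actually left,
    # so timeouts stay harmonized instead of being hand-tuned per hop
    return rpc_call("profile.Get", timeout=max(ctx.remaining(), 0))

def rpc_call(method, timeout):
    # hypothetical transport stand-in; a real client would enforce `timeout`
    return {"method": method, "timeout": timeout}

ctx = Context(deadline=time.monotonic() + 0.250)   # 250 ms end-to-end budget
print(handle_request(ctx))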

The unmentioned challenges of getting that kind of RPC architecture to work at its best in practice are kind of scary. Maybe the next RPC will be better.

MononcQc fucked around with this message at 12:13 on Aug 9, 2017

Sapozhnik
Jan 2, 2005

Nap Ghost
i didn't read much of that article but it immediately conflates the actual rpc library programming model with the semantics of rpc protocols. i don't really give a poo poo about whether a given rpc is represented with a synchronous stub call or whether a message is built up incrementally and gets explicitly submitted to yield a promise that can be chained. grpc tends to look more like the latter if anything.

retry and rollback semantics and atomicity guarantees and whatever don't go away if you simply pretend they don't exist. that's not an argument for or against rpc.

coarse messages between well-separated concerns that minimize round trips and rollback complexity seem like they ought to be fine.

deeply chained and branching call stacks sliced across tens of machines seem like they'd have the problems you describe yeah. a higher-level abstraction would be useful there. but hey, it's google and they employ some of the best distributed systems people in the world. i'm sure they have some sort of improvements in mind here.

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

MALE SHOEGAZE posted:

at my startup our architecture is microservices and a message queue.

each microservice has its own mongo database, because this reduces coupling. allowing the services to access the same database would increase coupling, so if one service needs access to data in another service, the originating service will just dump the entire collection into the pipeline, and the consuming service will write out all of the entries into its own database.

whenever an entity in the database is updated, the responsible service will emit an update event, and dump the entire object into the pipeline. consuming services will then write it to their own db, taking extreme care to update any and all data associations (a traditional DB would of course enforce these relationships for you, but it's no loss because keeping data in sync is a totally trivial problem compared to coupling, which is the worst problem).

the architects of this system did not concern themselves with concurrency, because data races are trivial compared to coupling. we've simply forced each consumer to run on a single thread, because concurrency issues are difficult to solve and we have more important problems such as reducing the amount of coupling in our system.

naturally, this system contains json entities that can be over 1mb compressed. if a user updates a single field on one of these entities, the entire 1mb model will get dumped into the queue. if they update the model twice (or 100 times) it will get dumped into the queue each time. this in no way overwhelms the system.

a few months back, i introduced an RPC mechanism so that we could at least make synchronous calls between services in response to user events. today my lead informed me that we're going to deprecate the RPC system because it increases coupling.

this is how you architect a system with 12 microservices that cannot handle more than 4 or 5 concurrent users before falling over. Fortunately, since everything is so decoupled, the system at least maintains the appearance of working.

de my couple hole

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:
your org has it figured out: if your system can't be used it can't be broken

MononcQc
May 29, 2007

Sapozhnik posted:

retry and rollback semantics and atomicity guarantees and whatever don't go away if you simply pretend they don't exist. that's not an argument for or against rpc.
Even raw HTTP has built-in mechanisms for idempotence, cacheability, and error return values that can put the blame on either party, for example, because its designers knew theirs is a networked model, not a "let's pretend remote servers are on this machine" kind of deal. It's an argument against RPC because RPC, traditionally and currently, tends never to address these things, even though they've been known to be useful if not necessary for decades.
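
As a rough illustration of what those HTTP semantics buy you in practice (a sketch using the requests library; the retry policy itself is illustrative, not from the post): a client can retry only methods the spec defines as idempotent, and only when the status code puts the blame on the server.
code:
import time
import requests

# methods RFC 7231 defines as idempotent and therefore safe to replay
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}

def call(method, url, retries=3, **kwargs):
    resp = None
    for attempt in range(retries):
        resp = requests.request(method, url, **kwargs)
        if resp.status_code < 500:
            return resp              # success, or a 4xx that's the caller's fault: don't retry
        if method.upper() not in IDEMPOTENT:
            return resp              # replaying a POST could double-apply the request
        time.sleep(2 ** attempt)     # 5xx is the server's fault: back off and try again
    return resp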

This is not a pro-HTTP argument; HTTP can and often would make for a fairly lovely mechanism, but HTTP did the distsys poo poo better than most RPC systems do, even though there have been like 3 HTTPs and over a dozen RPC protocols.

Maybe the 24th RPC iteration will get there. It appears Finagle has a way to mark some failure types as retry-friendly now!

Sapozhnik posted:

deeply chained and branching call stacks sliced across tens of machines seem like they'd have the problems you describe yeah. a higher-level abstraction would be useful there. but hey, it's google and they employ some of the best distributed systems people in the world. i'm sure they have some sort of improvements in mind here.
Deeply-chained and branching call stacks sliced across tens of machines are getting to be the norm with the current microservice trend. Even Male Shoegaze's employer, with their 4 concurrent users, uses 12 microservices. If they were able to accept the "cost of coupling", these weird situations would appear sooner rather than later. Right now they just have the bad luck of having an even worse model to build on. RPC would be an improvement for them for sure.

Until google have those improvements, I don't really feel it's that great of a model to migrate to in all the places that don't employ "some of the best distributed systems people in the world", at least not without the full caveats and architecture workarounds google had to implement for any non-trivial set of services.

I mean for gently caress's sake, I went to a conference about reactive applications last year, and while half the presentations were about how Kafka got people to remove so many arrows from their architecture diagrams and replace them by one big Kafka box with a logo on it, at least a quarter of them were just about how to manage all the RPCs and other remote calls that were starting to be blocking and surprised everyone with the big nasty tail latencies when they used microservices. Microservices just make the problem much more apparent.

HoboMan
Nov 4, 2010

i just fire http requests everywhere and it seems like it works pretty well, op

HoboMan
Nov 4, 2010

in my api i am logging the data i send. the problem is some of these data sets are so big that serializing the data to the log is causing out of memory exceptions

i do need to log this so when poo poo breaks i can prove it's not my fault

and it's super helpful in debugging too, i guess

hobbesmaster
Jan 28, 2008

uhhh most programs don't run with logging set to TRACE for a reason

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe

HoboMan posted:

in my api i am logging the data i send. the problem is some of these data sets are so big that serializing the data to the log is causing out of memory exceptions

i do need to log this so when poo poo breaks i can prove it's not my fault

and it's super helpful in debugging too, i guess

Maybe cherry pick some fields from that blob that you know are relevant to your service metrics and can help you debug poo poo?
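
One way to act on that advice, as a sketch (the field names here are made up, not HoboMan's actual payload): log identifiers and counts instead of serializing the whole blob.
code:
import json
import logging

log = logging.getLogger("api.outbound")

def log_payload_summary(payload: dict) -> None:
    # cherry-pick a few known-relevant fields and never serialize the full
    # body, so huge payloads can't blow the heap just by being logged
    summary = {
        "request_id": payload.get("request_id"),
        "customer_id": payload.get("customer_id"),
        "record_count": len(payload.get("records", [])),
    }
    log.info("outbound payload: %s", json.dumps(summary, default=str))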

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

HoboMan posted:

in my api i am logging the data i send. the problem is some of these data sets are so big that serializing the data to the log is causing out of memory exceptions

i do need to log this so when poo poo breaks i can prove it's not my fault

and it's super helpful in debugging too, i guess

convert your data to protobufs and then stream it to your logging micro service

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
the problem with rpc as a design is that it encourages programmers to think of the action as something fairly simple and reliable

it also encourages programmers to write code with horrible latency because it’s so easy to serialize things that really could be happening in parallel, but async/await has that problem, too
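
A small sketch of that latency trap (hypothetical calls, using asyncio): three independent requests awaited one after another cost the sum of their latencies, while gathering them lets them overlap.
code:
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)        # stand-in for a remote call
    return name

async def serialized():
    # easy to write, but ~0.3 s total: each await hides a full round trip
    return [await fetch("user", 0.1),
            await fetch("orders", 0.1),
            await fetch("prefs", 0.1)]

async def overlapped():
    # the calls are independent, so run them concurrently: ~0.1 s total
    return await asyncio.gather(
        fetch("user", 0.1), fetch("orders", 0.1), fetch("prefs", 0.1))

print(asyncio.run(overlapped()))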

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


i think on reading that rpc (and grpc) has nothing to do with what i wanted for my side project at work so oops

I'm being switched to a different group soon, so I guess it's irrelevant now anyways~ it was still interesting to learn about though.

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat
i keep tcpdump running on every server and have it stream to kinesis for processing.

of course those generate more tcp data so i'm hitting the 500 shard limit so i had to make some API calls to amazon to dynamically stand up more streams for each server.

parsing the data in wireshark requires massive amounts of memory and is slow as dirt, but at least i can prove that it's not my fault when a server throws a 500 error.

MononcQc
May 29, 2007

CRIP EATIN BREAD posted:

i keep tcpdump running on every server and have it stream to kinesis for processing.

of course those generate more tcp data so i'm hitting the 500 shard limit so i had to make some API calls to amazon to dynamically stand up more streams for each server.

parsing the data in wireshark requires massive amounts of memory and is slow as dirt, but at least i can prove that it's not my fault when a server throws a 500 error.

you can try tshark for a lighter-weight command-line version of wireshark (it ships with it). You can use the same filters as you would in the GUI, but usually it struggles a lot less at handling huge dumps.

Lutha Mahtin
Oct 10, 2010

Your brokebrain sin is absolved...go and shitpost no more!

MononcQc posted:

I went to a conference about reactive applications last year, and while half the presentations were about how Kafka got people to remove so many arrows from their architecture diagrams and replace them by one big Kafka box with a logo on it, at least a quarter of them were just about how to manage all the RPCs and other remote calls that were starting to be blocking and surprised everyone with the big nasty tail latencies when they used microservices. Microservices just make the problem much more apparent.

Would you say that "it makes the terribleness of your systems more obvious" is a benefit of micro services? (I am a baby coder who probably won't make decisions that big for years, so don't feel like you need to write an essay here lol)

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat

MononcQc posted:

you can try tshark for a lighter-weight command-line version of wireshark (it ships with it). You can use the same filters as you would in the GUI, but usually it struggles a lot less at handling huge dumps.

i hope you know i was joking

HoboMan
Nov 4, 2010

i loving get it, but the dev on the other side of that api call is a little poo poo that refuses to even look onto a bug unless you can empirically prove that his poo poo is what hosed up

MononcQc
May 29, 2007

CRIP EATIN BREAD posted:

i hope you know i was joking

eeh, I've seen similar approaches before, so it wasn't much of a stretch. Like people replaying/rewriting packets on the fly so that a staging/pre-production stack gets actual production data coming in. Kinesis was the dumb part though, because iirc they have like a 5 qps limit by default and that would just not be possible without paging log files :shrug:

MononcQc
May 29, 2007

Lutha Mahtin posted:

Would you say that "it makes the terribleness of your systems more obvious" is a benefit of micro services? (I am a baby coder who probably won't make decisions that big for years, so don't feel like you need to write an essay here lol)

It's a benefit if you know you can fix it, have the time to do it, and are going to stick with a microservice architecture out of need already.

In a lot of cases, you just have people who want microservices because that's the new cool thing, but for whom a monolith would work really, really well for many years. In that case, they're giving themselves distributed systems problems they could avoid by scaling vertically for a long time. OTOH, if you know you're gonna get real big, it's good to keep these problems in mind, because they can impact product decisions in major ways and you can avoid them early (some things that work locally just aren't possible with large distributed systems).

MrMoo
Sep 14, 2000

MononcQc posted:

eeh, I've seen similar approaches before, so it was a stretch. Like people replaying/rewriting packets on the fly so that a staging/pre-production stack gets actual production data coming in. Kinesis was a dumb thing though because iirc they have like 5 qps limits by default and that would just not be possible without paging log files :shrug:

Generating pcap logs is a sensible option for trading environments when you don't want latency from inline logging. With OpenOnload it becomes near-free like a port mirror.

JawnV6
Jul 4, 2004

So hot ...

suffix posted:

https://deadlockempire.github.io/
this is pretty good to put the fear in you if you've ever thought oh threading isn't so hard just use a mutex

these were fun little games, but the ‘boss fight’ at the end is a bear. everything else was pretty straightforward, get one thread in the right place and cycle the aggressor, but that last one is the kind of failure you’d catch with batches of tests not inspection

Shaman Linavi
Apr 3, 2012

i put an executable up for one of my github projects and was able to download and run it on another computer
feeling a little less terrible today

Luigi Thirty
Apr 30, 2006

Emergency confection port.

Shaman Linavi posted:

i put an executable up for one of my github projects and was able to download and run it on another computer
feeling a little less terrible today

i uploaded my xcode project, tried to download it again and xcode crashed every time it tried to load

rip

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

JawnV6 posted:

these were fun little games, but the ‘boss fight’ at the end is a bear. everything else was pretty straightforward, get one thread in the right place and cycle the aggressor, but that last one is the kind of failure you’d catch with batches of tests not inspection

actually, you'd catch it by inspection by noting that the critical section isn't actually guarded by the monitor. who gives a poo poo if you can't figure out the exact sequence that breaks it immediately when the underlying flaw is obvious?

if you're relying on tests to catch a nondeterministic failure you're going to have a bad time at some point
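
To make that concrete, a minimal Python sketch of a critical section that isn't actually guarded (names are illustrative; this isn't the deadlockempire puzzle itself): the lock exists, but the check happens outside it, so two threads can both pass the check and the second pop hits an empty list.
code:
import threading

lock = threading.Lock()
items = []

def broken_pop():
    if items:                   # the check happens OUTSIDE the lock...
        with lock:
            return items.pop()  # ...so both threads can pass the check,
    return None                 #    and the loser pops from an empty list

def guarded_pop():
    with lock:                  # check and pop under the same lock
        return items.pop() if items else None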

JawnV6
Jul 4, 2004

So hot ...

Jabor posted:

actually, you'd catch it by inspection by noting that the critical section isn't actually guarded by the monitor. who gives a poo poo if you can't figure out the exact sequence that breaks it immediately when the underlying flaw is obvious?

if you're relying on tests to catch a nondeterministic failure you're going to have a bad time at some point
jesus why are folks so eager to start "ACTUALLY" poo poo in the TP thread of all places? are you literally arguing that there's no value in going through an exercise if you can heuristically guess a potential solution? gently caress's sake man

and yes, i have relied on tests to catch nondeterministic failures and you have used products that resulted from this methodology

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat
critical sections are the worst mechanism for multithreading that I can think of

Luigi Thirty
Apr 30, 2006

Emergency confection port.

I GOT IT AAAAAAAA

https://twitter.com/LuigiThirty/status/895525296505364480

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

wooooo

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope
is there some kind of a universal sentence tokenization library that will spit out a stream of words from any text, written in any script?

i know doing that correctly is Really Hard, but is there something that produces stable results, even if incorrect? i want to try some bayesian classification on tweets as an idiot side project, and it would be nice if it's not hardcoded to support only languages that separate words with whitespace.

is there a word for what i'm looking for? (a unicorn lol)

gonadic io
Feb 16, 2011

>>=
tps: i will never get over the fact that grails can't count how many unit tests it has so it reports
code:
| Running 61 spock tests... 110 of 61
| Running 61 spock tests... 111 of 61
| Running 61 spock tests... 112 of 61
| Running 61 spock tests... 113 of 61

HoboMan
Nov 4, 2010

Wheany posted:

is there some kind of a universal sentence tokenization library that will spit out a stream of words from any text, written in any script?

i know doing that correctly is Really Hard, but is there something that produces stable results, even if incorrect? i want to try some bayesian classification on tweets as an idiot side project, and it would be nice if it's not hardcoded to support only languages that separate words with whitespace.

is there a word for what i'm looking for? (a unicorn lol)

somethin like this?
https://nlp.stanford.edu/software/lex-parser.shtml

although being probabilistic it may not be stable

HoboMan fucked around with this message at 14:48 on Aug 10, 2017

tef
May 30, 2004

-> some l-system crap ->

Wheany posted:

is there some kind of a universal sentence tokenization library that will spit out a stream of words from any text, written in any script?

i know doing that correctly is Really Hard, but is there something that produces stable results, even if incorrect? i want to try some bayesian classification on tweets as an idiot side project, and it would be nice if it's not hardcoded to support only languages that separate words with whitespace.

nltk is probably your best bet

quote:

is there a word for what i'm looking for? (a unicorn lol)

segmentation ?

https://en.wikipedia.org/wiki/Text_segmentation#Word_segmentation
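
for the nltk route, a minimal sketch of tweet tokenization (note these tokenizers handle punctuation, hashtags, and handles fine, but they won't segment scripts that don't separate words with whitespace):
code:
from nltk.tokenize import TweetTokenizer

# reduce_len collapses "soooooo" -> "sooo", strip_handles drops @mentions
tknzr = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)
print(tknzr.tokenize("@somebody soooooo this is my #sideproject http://example.com :-)"))
# -> roughly: ['sooo', 'this', 'is', 'my', '#sideproject', 'http://example.com', ':-)']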

HoboMan
Nov 4, 2010

tokenizer is the word i think we are looking for

like this?
https://nlp.stanford.edu/software/tokenizer.shtml

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
i think it's finally time to learn c because unsafe rust is just confusing without a c background.

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
icu's word split iterator is sort of surprisingly good, and even supports a few languages that don't separate words with whitespace (by having a giant dictionary)
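
a sketch of the icu approach from python (via the PyICU bindings; the locale and text are just examples): the word break iterator yields boundary offsets, including for languages where it falls back to its dictionary.
code:
from icu import BreakIterator, Locale

def icu_words(text, locale="ja"):
    bi = BreakIterator.createWordInstance(Locale(locale))
    bi.setText(text)
    words, start = [], bi.first()
    for end in bi:                    # iterating yields successive boundary offsets
        token = text[start:end]
        start = end
        if token.strip():             # skip pure-whitespace segments
            words.append(token)
    return words

print(icu_words("私はカメラを買いました"))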

crazypenguin
Mar 9, 2005
nothing witty here, move along

Jabor posted:

if you're relying on tests to catch a nondeterministic failure you're going to have a bad time at some point

tests are really good at finding non-deterministic failures, just not the one-shot unit test kind.

if you haven't read through the jepsen blog, it's worth a look:

https://jepsen.io/analyses

it's all randomized property testing, and it's basically state-of-the-art for ensuring real-world concurrent systems actually work.
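
in the same spirit, at a much smaller scale, a hand-rolled sketch of randomized testing for a race (the real tools generate and shrink whole operation histories; this just hammers a deliberately unsafe counter with random trial sizes and checks an invariant afterwards):
code:
import random
import threading

class UnsafeCounter:
    def __init__(self):
        self.value = 0
    def incr(self):
        v = self.value            # read-modify-write with no lock: lost updates possible
        v += 1
        self.value = v

def one_trial(threads, ops):
    c = UnsafeCounter()
    workers = [threading.Thread(target=lambda: [c.incr() for _ in range(ops)])
               for _ in range(threads)]
    for t in workers: t.start()
    for t in workers: t.join()
    return c.value == threads * ops   # invariant: every increment is counted

if __name__ == "__main__":
    trials = [one_trial(random.randint(2, 8), random.randint(10_000, 50_000))
              for _ in range(20)]
    print(f"{trials.count(False)}/20 randomized trials lost updates")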

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

MALE SHOEGAZE posted:

i think it's finally time to learn c because unsafe rust is just confusing without a c background.

C is a good language to learn because it demonstrates the low-level inner workings of operations that modern higher-level languages have abstracted away, and also it teaches you how much you enjoy not having to write in C.


Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe
Yeah, unit tests can hide the concurrency problems in a piece of software that is divided into lots of modules. I remember writing a test that spun up hundreds of clients hitting the same endpoints to prove that there was a race condition in our code that was using row-level locking incorrectly, and the bug would pop up consistently, and faster the more clients I added. Then once I applied a possible fix I used the same test to show that this particular problem was fixed.
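
A stripped-down sketch of that kind of test (the URL, payload, and response fields are hypothetical placeholders, not the actual service): fire a few hundred concurrent requests at the same row, then check an invariant that only holds if the locking is right.
code:
import concurrent.futures
import requests

BASE = "http://localhost:8080"   # hypothetical service under test
CLIENTS = 200

def withdraw_once(_):
    r = requests.post(f"{BASE}/accounts/42/withdraw", json={"amount": 1})
    return r.status_code

def test_no_lost_updates():
    before = requests.get(f"{BASE}/accounts/42").json()["balance"]
    with concurrent.futures.ThreadPoolExecutor(max_workers=CLIENTS) as pool:
        ok = list(pool.map(withdraw_once, range(CLIENTS))).count(200)
    after = requests.get(f"{BASE}/accounts/42").json()["balance"]
    # every successful withdrawal must be reflected exactly once
    assert before - after == ok, (before, after, ok)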

  • Locked thread