p-lang thread: (now (have you (problems two)))

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > YOSPOS > p-lang thread: (now (have you (problems two)))

Shaggar: Apr 26, 2006

apache is the best cause its designed by and for professionals who work in the real world. MIT/BSD are ok, but apache is more modern and clearly defines rights, whereas they are kind of assumed to exist in MIT/BSD (afaik).

you're not gonna be able to control your open sores. using apache over a restrictive license (one that requires code kick backs, or worse, no proprietary software) gives users the ability to change their mind about contributing later on instead of skipping your license or ignoring it entirely.

# ? Jun 14, 2013 18:36

Blotto Skorzany: Nov 7, 2008; He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

Shaggar posted:

you're not gonna be able to control your open sores. using apache over a restrictive license (one that requires code kick backs, or worse, no proprietary software) gives users the ability to change their mind about contributing later on instead of skipping your license or ignoring it entirely.

"i violate copyright at work, therefore everyone else does too"

# ? Jun 14, 2013 19:00

Zombywuf: Mar 29, 2008

Who the gently caress is releasing code under a license that requires upstream contribution?

# ? Jun 14, 2013 19:20

Max Facetime: Apr 18, 2009

Otto Skorzeny posted:

"i violate copyright at work, therefore everyone else does too"

copyright infringement is easy, prevention less so, but not impossible if you make the source code totally unappealing. gpl helps with this

# ? Jun 14, 2013 19:43

Condiv: May 7, 2008; Sorry to undo the effort of paying a domestic abuser $10 to own this poster, but I am going to lose my dang mind if I keep seeing multiple posters who appear to be Baloogan.

With love,
a mod

https://www.youtube.com/watch?v=E3418SeWZfQ

shaggar was right?

# ? Jun 14, 2013 20:42

NickFendon: May 4, 2009

Why does nobody release stuff under the Boost license? It's like MIT but without the requirement to include the license with compiled output / binaries, right?

# ? Jun 15, 2013 01:32

Squinty Applebottom: Jan 1, 2013

NickFendon posted:

Why does nobody release stuff under the Boost license? It's like MIT but without the requirement to include the license with compiled output / binaries, right?

its identical

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software,

# ? Jun 15, 2013 01:36

MononcQc: May 29, 2007

Quick rundown of eventual consistency methods https://research.microsoft.com/pubs/192621/sigtt611-bernstein.pdf -- a nice quick read to have a general idea of what's out there.

# ? Jun 15, 2013 02:07

NickFendon: May 4, 2009

polpotpi posted:

its identical

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software,

in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.

# ? Jun 15, 2013 02:15

Squinty Applebottom: Jan 1, 2013

lol like I'm reading the whole drat thing

# ? Jun 15, 2013 02:26

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

NickFendon posted:

Why does nobody release stuff under the Boost license? It's like MIT but without the requirement to include the license with compiled output / binaries, right?

because its useless

# ? Jun 15, 2013 02:38

Nomnom Cookie: Aug 30, 2009

NickFendon posted:

Why does nobody release stuff under the Boost license? It's like MIT but without the requirement to include the license with compiled output / binaries, right?

Some C++ libs are. Java projects use Apache license because the ASF does

Also I don't mind if someone uses code I publish but no attribution? I'm not ok with that

# ? Jun 15, 2013 03:05

Nomnom Cookie: Aug 30, 2009

MononcQc posted:

Quick rundown of eventual consistency methods https://research.microsoft.com/pubs/192621/sigtt611-bernstein.pdf -- a nice quick read to have a general idea of what's out there.

is there anything better than last write wins and stupid limited things like replicated counters

# ? Jun 15, 2013 03:06

Gazpacho: Jun 18, 2004; by Fluffdaddy; Slippery Tilde

prefect posted:

artistic license

who doesn't want to be an artist?

also it's a joke on the phrase "artistic license", which i have always appreciated

autistic license: if you distribute a modified version of this code the author will send you e-mail tantrums until you agree to stop

# ? Jun 15, 2013 07:57

Sapozhnik: Jan 2, 2005; Nap Ghost

Nomnom Cookie posted:

Some C++ libs are. Java projects use Apache license because the ASF does

Also I don't mind if someone uses code I publish but no attribution? I'm not ok with that

Copyright law requires it irrespective of the license. If someone copies your code into their work, then whether or not you allow them to do that you still own the copyright for your work and it is illegal for them to plagiarise it (claim that they wrote it and not you, aka "moral rights") or remove any copyright notices you've added in there

http://en.wikipedia.org/wiki/ISC_license

ISC license best license, it's basically a "do whatever the gently caress you want" license except without all the poo poo that's required by copyright law anyway. My only complaint is the HURR IF IT'S IN CAPITAL LETTERS IT'S MORE LEGALER second paragraph.

# ? Jun 15, 2013 13:56

EMILY BLUNTS: Jan 1, 2005

polpotpi posted:

lol like I'm reading the whole drat thing

did you know?
there is software that can compare two bodies of text,

# ? Jun 15, 2013 15:04

MononcQc: May 29, 2007

Nomnom Cookie posted:

is there anything better than last write wins and stupid limited things like replicated counters

Last write wins is a simpler version of logical clocks and others, which aim to track logical dependencies in order to limit how often you actually need to make a decision that will lose data. CRDTs will have you design your data types such that they can be merged in any way or order and give stable results (the paper above represents a way to do it with sets via counters, but it's not always counters).
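To make the "merge in any way or order" part concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It is not the construction from the linked paper; the class, method names, and node names are made up for illustration.

```python
# Grow-only counter (G-Counter): a minimal CRDT sketch.
# Class, method and node names are hypothetical, chosen for illustration.

class GCounter:
    def __init__(self, counts=None):
        # One slot per node; each node only ever increments its own slot.
        self.counts = dict(counts or {})

    def increment(self, node, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[node] = self.counts.get(node, 0) + amount

    def value(self):
        # The counter's value is the sum of every node's slot.
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so replicas can merge in any order, any number of times.
        nodes = set(self.counts) | set(other.counts)
        return GCounter(
            {n: max(self.counts.get(n, 0), other.counts.get(n, 0)) for n in nodes}
        )


# Two replicas take writes independently, then exchange state.
a = GCounter()
b = GCounter()
a.increment("node_a")
a.increment("node_a")
b.increment("node_b")

ab = a.merge(b)
ba = b.merge(a)
```

Both merge orders end up with the same state, and merging a state with itself changes nothing, which is what lets replicas reconcile without coordinating first.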

It always depends on what you can allow yourself to lose or not. The general ideas remain limited by the CAP theorem: you can't have updates that work and are reflected on all nodes of a system during failures. Either you stop operations, or you allow some of them and deal with the fallout later.

A lot of it is either designing a system such that you get the optimal amount of communication required to guarantee consistency (through a quorum or whatever), or organizing (some or all) operations such that they can be reconciled safely when the cluster is healthy.
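For the quorum side, the usual rule of thumb is that with N replicas, a read quorum of R nodes and a write quorum of W nodes are guaranteed to share at least one node when R + W > N. A rough sketch that checks the rule by brute force for a small cluster (function names are made up for illustration):

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    # Pigeonhole rule: any r readers and any w writers out of n replicas
    # must share at least one node when r + w > n.
    return r + w > n


def always_intersect(n, r, w):
    # Brute-force check of the same property: try every possible
    # read set against every possible write set.
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


def majority(n):
    # Smallest quorum size that overlaps with itself.
    return n // 2 + 1
```

With 5 replicas, `majority(5)` is 3, and any two groups of 3 nodes share at least one member, so a read always reaches at least one node that saw the latest acknowledged write.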

# ? Jun 15, 2013 18:09

Max Facetime: Apr 18, 2009

So which software is the best for that?

# ? Jun 15, 2013 22:26

Notorious b.s.d.: Jan 25, 2003; by Reene

Max Facetime posted:

So which software is the best for that?

Best for what?

If you require partition tolerance, do you want a CP or an AP system? This is a subtle and difficult choice. AP is really hard to do unless you build your software from the ground up to support it (e.g. datomic + riak makes AP kinda the default choice when designing your product)

CP is easier to wrap your (well, my) head around, and requires less of your time spent thinking about CRDTs, but having the whole world grind to a halt in the face of a problem is kinda scary.

And before you say "...but I don't need partition tolerance" consider how easy it is to have a partition. e.g. some clients able to speak to only one server due to a network fault. (Yes, clients are part of your distributed system!)

# ? Jun 16, 2013 17:33

Cocoa Crispies: Jul 20, 2001; Vehicular Manslaughter!; Pillbug

Notorious b.s.d. posted:

And before you say "...but I don't need partition tolerance" consider how easy it is to have a partition. e.g. some clients able to speak to only one server due to a network fault. (Yes, clients are part of your distributed system!)

lol somebody read http://aphyr.com/tags/jepsen

# ? Jun 16, 2013 17:54

Notorious b.s.d.: Jan 25, 2003; by Reene

Cocoa Crispies posted:

lol somebody read http://aphyr.com/tags/jepsen

yep. actually i saw him present on the topic. it was p. sweet

i had never really thought about the client's angle and that was a scary moment

# ? Jun 16, 2013 18:08

Cocoa Crispies: Jul 20, 2001; Vehicular Manslaughter!; Pillbug

Notorious b.s.d. posted:

yep. actually i saw him present on the topic. it was p. sweet

i had never really thought about the client's angle and that was a scary moment

was that at ricon east? missed it to go to tefville instead

# ? Jun 16, 2013 18:53

tef: May 30, 2004; -> some l-system crap ->

# ? Jun 16, 2013 20:57

tef: May 30, 2004; -> some l-system crap ->

Notorious b.s.d. posted:

yep. actually i saw him present on the topic. it was p. sweet

did you find out what vector clocks are?

# ? Jun 16, 2013 21:00

spongeh: Mar 22, 2009; BREADAGRAM OF PROTECTION

tef posted:

did you find out what vector clocks are?

misread that as victor clocks, and i wanted to know more

# ? Jun 16, 2013 21:29

Captain Foo: May 11, 2004; we vibin'
we slidin'
we breathin'
we dyin'

spongeh posted:

misread that as victor clocks, and i wanted to know more

i misread it as vector cocks

# ? Jun 17, 2013 05:00

vapid cutlery: Apr 17, 2007; php: <? "it's george costanza" ?>

is victor still building Tsunami or was that someone else

# ? Jun 17, 2013 07:11

Max Facetime: Apr 18, 2009

Notorious b.s.d. posted:

Best for what?

If you require partition tolerance, do you want a CP or an AP system? This is a subtle and difficult choice. AP is really hard to do unless you build your software from the ground up to support it (e.g. datomic + riak makes AP kinda the default choice when designing your product)

CP is easier to wrap your (well, my) head around, and requires less of your time spent thinking about CRDTs, but having the whole world grind to a halt in the face of a problem is kinda scary.

And before you say "...but I don't need partition tolerance" consider how easy it is to have a partition. e.g. some clients able to speak to only one server due to a network fault. (Yes, clients are part of your distributed system!)

...best for those words you just mentioned?

# ? Jun 17, 2013 15:26

Cocoa Crispies: Jul 20, 2001; Vehicular Manslaughter!; Pillbug

Max Facetime posted:

...best for those words you just mentioned?

there's no best, just what works for what you need

# ? Jun 17, 2013 15:30

Max Facetime: Apr 18, 2009

availability seems a more interesting problem than consistency, if I understood those jepsen articles right

I want to use Cassandra's CQL because it's finally making those column families somewhat understandable but how do you figure out if there's similar pitfalls like postgresql, redis, mongodb and riak had?

# ? Jun 17, 2013 15:35

Max Facetime: Apr 18, 2009

Cocoa Crispies posted:

there's no best, just what works for what you need

so 1 step forward and 2 steps back

... thanks

# ? Jun 17, 2013 15:40

MononcQc: May 29, 2007

Max Facetime posted:

availability seems a more interesting problem than consistency, if I understood those jepsen articles right.

They're two sides of the same coin. It's "easy" to go all in for consistency (require a two-phase commit, require everyone to agree on the value, and stall otherwise), and it's easy to go all-in for availability (just drop all consistency requirements).
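A very rough in-memory sketch of the two-phase commit idea, assuming participants that never crash or time out (which is exactly the part that makes real systems stall). All class and function names are hypothetical.

```python
# Toy two-phase commit: no networking, no timeouts, no crash recovery.
# Names are hypothetical and only meant to show the shape of the protocol.

class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "init"

    def prepare(self):
        # Phase 1: promise to commit if asked, or refuse.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if the vote was unanimous, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


happy = [Participant("db1"), Participant("db2")]
unhappy = [Participant("db1"), Participant("db2", will_vote_yes=False)]
happy_result = two_phase_commit(happy)
unhappy_result = two_phase_commit(unhappy)
```

A single "no" vote aborts the whole transaction for everyone. In a real deployment a participant that simply never answers leaves the coordinator waiting, which is the "stall otherwise" part.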

It's harder to have consistency with multiple failures (quorums, see Paxos, Raft, ZAB), or to have high availability while trying to be as consistent as possible (Dynamo, CRDTs, etc.)
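Dynamo-style stores track causality with vector clocks, which came up earlier in the thread. A minimal sketch of how two clocks are compared, with clocks as plain dicts of node name to counter (function and node names are made up for illustration):

```python
def vc_increment(clock, node):
    # Return a copy of the clock with this node's counter bumped by one.
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def vc_compare(a, b):
    # Compare two vector clocks entry by entry (missing entries count as 0).
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # a happened after b
    return "concurrent"      # neither saw the other: a real conflict


base = vc_increment({}, "x")
left = vc_increment(base, "y")    # one replica writes on top of base
right = vc_increment(base, "z")   # another replica does the same, independently
```

Here `base` is "before" both `left` and `right`, while `left` and `right` are "concurrent". That last case is where last write wins would silently drop one of the two writes, and where a vector clock instead tells you there is something to reconcile.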

In the last few years, a lot of effort has been devoted to the latter category. The requirements for low latency, huge clusters (multiple hundreds of nodes) and very dynamic environments or cross-datacenter data sets make it very interesting to find solutions that can work well there. It probably looks more interesting because I don't think there has been nearly as much formal research in that area as there has been for approaches that want to get consistency first.

# ? Jun 17, 2013 16:49

spongeh: Mar 22, 2009; BREADAGRAM OF PROTECTION

Max Facetime posted:

availability seems a more interesting problem than consistency, if I understood those jepsen articles right

I want to use Cassandra's CQL because it's finally making those column families somewhat understandable but how do you figure out if there's similar pitfalls like postgresql, redis, mongodb and riak had?

We adopted Cassandra really early on (too early), and didn't keep up with updating it. We ran into huge performance issues that would cause it to become cripplingly slow under load, but didn't keep up with the various version breakages and API changes which made it really hard to just upgrade our cluster. I'm sure now that they've hit 1.0 it's a bit better, but even with CQL being relatively new, you'll probably want to make sure you keep up to date with new Cassandra updates, as even recently they still seemed to be making a lot of big changes between 1.x versions.

# ? Jun 17, 2013 20:00

Notorious b.s.d.: Jan 25, 2003; by Reene

tef posted:

did you find out what vector clocks are?

like all distributed systems ninjas i rely on memes to communicate c.s. concepts

do u even kno how memes work

Edit: http://teespring.com/doyouevenknow for all y'all not on the tef inside baseball circuit (smug, proven correct, impoverished)

Notorious b.s.d. fucked around with this message at 04:06 on Jun 18, 2013

# ? Jun 18, 2013 04:03

MeruFM: Jul 27, 2010

Are all these distributed, availability, consistency, etc problems just theoretical or do these issues actually happen in real life? And in what systems if so?

I've used some variant of SQL mostly and the biggest problems still arise from poo poo code (like 99.999% of all computer problems really) causing deadlock and rollbacks.

The problems seem interesting, but at what size does MySQL/oracle stop being good enough, provided nothing stupid going on?

MeruFM fucked around with this message at 09:04 on Jun 18, 2013

# ? Jun 18, 2013 08:53

Squinty Applebottom: Jan 1, 2013

mysql starts being poo poo the second you install it

# ? Jun 18, 2013 10:59

ultramiraculous: Nov 12, 2003; "No..."; Grimey Drawer

polpotpi posted:

mysql starts being poo poo the second you install it

but...but...facebook

# ? Jun 18, 2013 11:06

Squinty Applebottom: Jan 1, 2013

oracle is cool cause it gives DBAs more power than they should reasonably have in a storage backend to do weird terrible poo poo so its lots of fun to play around with

# ? Jun 18, 2013 11:15

Zombywuf: Mar 29, 2008

MeruFM posted:

The problems seem interesting, but at what size does MySQL/oracle stop being good enough, provided nothing stupid going on?

MySQL is never good enough; Postgres damnit.

In general though, a simple relational db is fine up to a few thousand transactions per minute provided you code it properly. Decent architecture can often make most of your CAP theorem problems just stop existing. In general it seems the problems that the research into distributed systems is good for are MMOs (reads and writes in equal proportion with everyone to everyone data sharing) and systems with over 100,000 simultaneous users and fast growth.

# ? Jun 18, 2013 11:52

MononcQc: May 29, 2007

You're fine with the usual SQL databases as long as your set of operations is simple enough that a single instance or a few more (if you have master-to-master replication) can deal with the load. When you go over that limit and you need to scale up the number of masters that can write, that's when you may start having serious data problems.

But even before you get there, you are hitting some distributed systems issues. Let's take caching as an example. Adding a cache in front of a website is basically going for something that breaks a bunch of assumptions: the DB, the cache server (say memcached, redis for some counters, or whatever), and multiple clients (HTTP cache) may all see different values for the same resource.

Usually, this isn't a big deal because people will generally use a single browser, and programmers will treat the DB as the authoritative copy and ignore the values taken by other components -- the DB is still the master, the rest is just replication.

But let's say we've got 5 front-end nodes, and because we want cache to be faster, we cache things locally on each front-end node rather than on a dedicated machine on the side (we might have better geo distribution, for example). Now, the timing of each request made can impact what a user hitting a front-end node will see.

Better, every time you refresh the page, you may get a different copy of the document, image, or whatever. It might not be a problem, but if what you're hosting is static content that visually impacts the site (like your templates, the CSS Stylesheet, and the background images it loads), you may end up with a request for the template on node A, the CSS from node B, and the images from node C. If someone hits this partially through a deploy, you may end up with a broken website for most users. Then if users cache it client-side, they might still see the issue minutes or hours after you've fixed it. Woops.
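A toy version of that scenario, shrunk to two front-end nodes with their own local caches in front of one authoritative store. Everything here (class name, file names, version strings) is made up for illustration, and there is no expiry or invalidation on purpose.

```python
# Toy model: each front-end node caches whatever it fetched first, forever.
# Names and values are hypothetical.

class FrontEnd:
    def __init__(self, origin):
        self.origin = origin   # shared authoritative store (the "DB")
        self.cache = {}        # this node's private cache

    def get(self, key):
        # Fill the local cache on first access, then always serve from it.
        if key not in self.cache:
            self.cache[key] = self.origin[key]
        return self.cache[key]


origin = {"template.html": "v1", "style.css": "v1"}
node_a = FrontEnd(origin)
node_b = FrontEnd(origin)

# Before the deploy, node A happens to serve (and cache) the template.
node_a.get("template.html")

# Deploy: the authoritative copies change.
origin.update({"template.html": "v2", "style.css": "v2"})

# After the deploy, one page load gets the template from A and the CSS from B.
page = (node_a.get("template.html"), node_b.get("style.css"))
```

The page ends up as a v1 template styled by v2 CSS, even though the authoritative store only ever held consistent pairs.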

Most of the time it's not gonna be a big deal -- you could invalidate the cache once the deploy is over and fetch a fresh copy (dropping state), change the URL of the new documents (my-background.png?v54 -- duplicate documents and differentiate between state and identity) or go to a CDN that could deal with it (where these guys deal with the distributed systems poo poo for you).
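One hedged sketch of the "change the URL" option: derive the version suffix from the file's content, so a changed file automatically gets a new URL and stale caches never match it. This swaps the hand-bumped `?v54` counter for a content hash; the function name and the `?v=` format are assumptions, not anything a specific framework mandates.

```python
import hashlib


def versioned_url(path, content):
    # Use a short prefix of the content's SHA-256 as the version tag:
    # same bytes give the same URL, different bytes give a different URL.
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


old_url = versioned_url("my-background.png", b"old image bytes")
new_url = versioned_url("my-background.png", b"new image bytes")
```

Since `old_url` and `new_url` differ, a template that references the new URL can never be paired with a cached copy of the old image.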

Is this a hard, customer-impacting problem that requires a distributed system expert to step in? Not really. There are simple mitigation techniques and people have been dealing with it informally for years already, with good enough results. You still have an authoritative copy of everything somewhere and can deal with poo poo.

Even if it's not really complex, it still shows some properties of distributed systems where you choose to lack consistency in favor of lowered latency, as described in PACELC. The more state needs to be replicated over many nodes, and the more operations run over the data set, the more difficult it gets.

# ? Jun 18, 2013 14:01
