Jerry Bindle
May 16, 2003

KARMA! posted:

bitbucket supports hg though. in fact im using it right now

i like this thread because i know i belong


Shaggar
Apr 26, 2006

Barnyard Protein posted:

yeah i voted for hg but the company i work for is obsessed with atlassian. what is bad about distributed? i mean for all practical purposes, except for being able to do stuff locally, its really similar to a centralized scm

shaggar what do you use?

svn

Bloody
Mar 3, 2013

lol

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





bitbucket is the best argument against git/hg i have ever used. to make a pull request follow these 18 steps

jony neuemonic
Nov 13, 2009



:confused:

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
let's not forget that pull requests are not a part of git at all

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
they are good though and if you're not using github you should probably do something like phabricator

Shaggar
Apr 26, 2006

MALE SHOEGAZE posted:

let's not forget that pull requests are not a part of git at all

idk what a pull request is but everyone talks about it in git all the time but its not part of git?

prefect
Sep 11, 2001

No one, Woodhouse.
No one.




Dead Man’s Band

Shaggar posted:

idk what a pull request is but everyone talks about it in git all the time but its not part of git?

it is a request for another git user to merge your branch into theirs

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

Shaggar posted:

idk what a pull request is but everyone talks about it in git all the time but its not part of git?

it's just a github web thing for reviewing and approving feature branches. you could have a similar thing in any scm.
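(rough sketch of the usual flow -- branch/remote names are made up, and the web step is the only part that isn't plain git:)
code:
  # do the work on a branch and push it somewhere the reviewer can see
  git checkout -b fix-the-thing
  git commit -am "fix the thing"
  git push origin fix-the-thing
  # then click "create pull request" in the github/bitbucket ui. the closest
  # thing built into git itself is mailing a summary generated with:
  git request-pull master https://example.com/your-fork.git fix-the-thing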

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
and in fact you probably should.

git is nice because local branching is p. sweet for development in general. being "distributed" is a bit overkill for that though, and you're probably loving up if your setup is more distributed than "central repository manages trunk and major branches; developers have local branching for WIP stuff"
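(a sketch of that setup, with invented names:)
code:
  # central repo owns trunk and the major branches
  git clone git@central.example.com:project.git && cd project
  # WIP lives in local-only branches, commit as messily as you like
  git checkout -b wip-parser-refactor
  git commit -am "half-working parser refactor"
  # only publish once it's something the team should actually see
  git push origin wip-parser-refactor:feature/parser-refactor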

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord
one of the things on my list is to sit down and learn more of mercurial and then try to use it for personal stuff. I want to believe it has a better interface

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

Jabor posted:

and in fact you probably should.

git is nice because local branching is p. sweet for development in general. being "distributed" is a bit overkill for that though, and you're probably loving up if your setup is more distributed than "central repository manages trunk and major branches; developers have local branching for WIP stuff"

being distributed works well for the "I want all of my systems to have my branches, without pushing them all to the true origin" case

like on my server I have a mirror of an open source project's central repository which has a "git remote update" run on it regularly, and all my other systems treat that as their origin. that way all of my experimental branches can be shared among my systems without them having to go through the upstream repo.

this is important not just to avoid noise: (1) I may not have commit rights on the central repo for a project, but instead have to send patches or pull requests, and (2) if I want to contribute any code to an open source project I have to get management & legal sign-off for each contribution, so I can't just share my branches with other people
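(roughly what that looks like, with made-up host names and paths:)
code:
  # on my server: a full mirror of the project's central repo
  git clone --mirror https://upstream.example.org/project.git /srv/git/project.git
  # cron runs this regularly to keep the mirror fresh
  git --git-dir=/srv/git/project.git remote update
  # every other machine of mine treats the mirror as its origin
  git clone myserver.example.com:/srv/git/project.git
  # experimental branches go back to the mirror, never to upstream
  git push origin experiment/foo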

jony neuemonic
Nov 13, 2009

Symbolic Butt posted:

one of the things on my list is to sit down and learn more of mercurial and then try to use it for personal stuff. I want to believe it has a better interface

it does, but i wound up going back to git anyway because using anything else feels like swimming against the current.

i have to use git at work though, so ymmv. if i worked in an hg shop i'd probably start using it for personal stuff again.

Valeyard
Mar 30, 2012


Grimey Drawer
I feel like I'm already getting better by having to dig through this horrendous code and try to figure stuff out. I'm getting more adept with the eclipse debugger too, by actually using it.

It's still frustrating and annoying though, because there's no such thing as a small change. So much logic is tightly coupled with swing components that there's a five-digit number of lines that can't feasibly be tested without extensive refactoring

Share Bear
Apr 27, 2004

MononcQc posted:

You need to restart your whole daemon after an unhandled exception.

16 pages late because i been working, this was a very good post, thank you

i am beginning to dislike our projects, mainly due to spring and hibernate, less so java the language itself

though i hate how many abstractions there are

qntm
Jun 17, 2009

Barnyard Protein posted:

yeah, the key to success is to not have any chucklefucks on your project, or having some way to force them to contribute

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer
lol, i just discovered that we have a vb control that binds an infinite number of html text fields to a vb text field. whoever designed the control couldn't figure out how to fix the memory leaks in his control in IE, so he just bound the GUI to a single instance of the vb control to use its logic in the local client instead of trying to handle it in the web client.

JawnV6
Jul 4, 2004

So hot ...

MononcQc posted:

But I'm not done yet. That's not where Erlang stops. In Programming Forth (Stephen Pelc, 2011), the author says "Debugging isn't an art, it's a science!" and provides the following (ASCII-fied) diagram:
code:
    find a problem ---> gather data ---> form hypothesis ----,
    .--------------------------------------------------------'
    '-> design experiment ---> prove hypothesis --> fix problem
Which then loops back on itself. By far, the easiest bit is 'finding the problem'. The difficult ones are forming the right hypothesis that will let you design a proper experiment and proving your fix works.

It's especially true of Heisenbugs that ruin your life by disappearing as you observe them.
i have so much to say about debugging

what's the proper way to blather on about facts learned in a proprietary environment that are worth getting out into the mainstream besides getting drunk with programmers

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

JawnV6 posted:

i have so much to say about debugging

what's the proper way to blather on about facts learned in a proprietary environment that are worth getting out into the mainstream besides getting drunk with programmers

car analogies?

JawnV6
Jul 4, 2004

So hot ...
it can be really unclear if you're suggesting an experiment that is intended to gather data or prove a hypothesis. more generally, you can be in a situation where a diagnostic test might even be rolled into a production fix.

assuming our repair shop is using JIRA, filling out a ticket shouldn't just direct the technician to change a setting and go for a test drive. it must include that context (gather/prove) because it helps restore some empathy to a text-only interface and generate buy-in. and in the best case, the technician will report out bits that might not have made it into the ticket if they didn't have the context (e.g. "turning left produced an odd sound, this rules out using this setting as a production fix")

this generalizes to the larger distinction between "problem space" and "solution space," especially when considering big-hammer approaches to data gathering (e.g. force single-threaded (single-piston?) mode, see if problem disappears)

compuserved
Mar 20, 2006

Nap Ghost

MononcQc posted:

You need to restart your whole daemon after an unhandled exception. This means that workflows unimpacted by the exception still need to abort and start over. That's fault tolerance gotcha #1. If you're serving 5,000 concurrent users, some of which are buying stuff online, or doing whatever (like playing dumb games in their browsers over a websocket, or using their one call to their lawyer), and an unrelated exception takes out their current session, you haven't been fault tolerant: a condition unrelated to their current activity (other than sharing the same program) has taken them out.

That's brittle. But there's more to it than that.

The way I like to put it: an extensive type-checking phase, strong test suite, exhaustive model checking, code reviews, linting, and so on, are all aiming at preventing errors from being in programs in the first place. These are required in Erlang, too. The idea is to make sure, as much as possible, that the program is doing the right thing. Yet errors still make it into programs, and everyone agrees that there are diminishing returns to weeding out all the bugs, potential and real.

It will cost you exponentially more money to weed out the trickier and trickier bugs (without introducing new ones). And so that's why truly error-free programs are gonna be written in vacuums, or on projects that use uncool tech and rules: NASA's The Power of 10: Rules for Developing Safety-Critical Code, for example, forbids using the heap altogether as too risky; they would even delay flights that would go over the new year because code there could have been buggy and it was better to delay by 2-3 weeks in December to January than taking the risk.

Now, those are extremes, and most of the industry isn't running under these rules, doesn't run Ada code despite its proven track record, and so on. That's because most people out there (both of us included) have an addendum to "software should be correct", and it's "software should be correct (within reasonable boundaries of cost and complexity)".

So we've got all of these things to prevent errors from being in programs, but oh so very few to handle errors in running systems. The approach we generally take is hygiene without an immune system: live in a clean bubble. But most humans don't need and won't go through those costs, because, well, we have an immune system and an ability to heal injuries.

The only mechanisms most software has that come close to that are things like redundancy and crashing fast with an easy way to bring something back. But the way we generally apply them is at the architectural level: run the same software in many nodes, restart the whole software as a unit, and so on. The finer-grained mechanisms at which we could handle unexpected errors within software haven't made it there, and for good reason: an exception in a Java thread, for example, risks leaving memory inconsistent so of course you gotta kill it all. Doing otherwise would be risking lovely rear end corruption and that would be worse. Fail-fast is good.

So what's the position of Erlang and why I'm saying it helps fault tolerance?
  • isolated immutable memory: killing or losing a process will not break the memory of other processes, and poses no risk to them.
  • preemptive scheduling: no process, even a runaway one, can starve another of CPU. You can however set priorities for critical tasks; this is soft real-time stuff (much like the per-process GC, which is not primarily for fault tolerance though)
  • processes dying may still affect a broader state of interdependencies. For this reason, Erlang adds links, which let you make "should-fail-with" relationships between processes explicit
  • because not all processes that depend on each other should die together, it also adds monitors, which let you detect a process' failure and react to it asynchronously on your own terms (see the sketch after this list)
  • pieces of your system will fail and you will want redundancy. Therefore the language is made to be network transparent.
  • because the network can fail at arbitrary times (and the other pieces of hardware too), all communication needs to be made via message passing, asynchronously
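
(a tiny sketch of links vs monitors -- the module name and exit reasons here are made up:)
code:
  -module(links_demo).
  -export([run/0]).

  run() ->
      %% link + trap_exit: the linked child's death arrives as an 'EXIT' message
      %% (without trap_exit, the child's crash would take us down with it)
      process_flag(trap_exit, true),
      Child = spawn_link(fun() -> exit(oops) end),
      receive
          {'EXIT', Child, Reason1} ->
              io:format("linked child died: ~p~n", [Reason1])
      end,
      %% monitor: strictly one-way, we just observe the death asynchronously
      {Pid, Ref} = spawn_monitor(fun() -> exit(oops) end),
      receive
          {'DOWN', Ref, process, Pid, Reason2} ->
              io:format("monitored process died: ~p~n", [Reason2])
      end.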

So those are the primitives. They give the right tools to make fault-tolerant systems, but are not enough yet. The big concept comes from supervisors, but more exactly from supervision trees.

When you boot your program, its various responsibilities are all encoded in the supervision tree. So if, for example, I'm building a program to count and report election results from a nationwide vote, my supervision tree could look like this:

code:
                             [root supervisor]
                              |               \
                    [tally_supervisor]       [live reporting sup]
                    /         |                 |        \
             [storage sup]    |       [session sup]      [web server sup]
            /          |      |          |                    / \ \ \
   [worker pool sup]   |      |         ...                [web requests/workers]
    / | | | |        [cache]  |
   [[workers]]]]              |
                              |
                        [district sup]
                        /  | | | | \
                    [various districts'
                     individual supervisors]
                        |               \
                    [counter_sup]       [ballot_sup]
                        |    \                  |
                    [tally] [ballot counting]  [ballot handling]
This system would have two OTP applications: a tally app and a live-reporting app. Both are under the VM's root supervisor. The overall tree is started depth-first, from left to right, synchronously. What this means is that my tally supervisor will make sure that the storage layer is up and ready, with its worker pool, before it even begins booting the subsystem in charge of district-specific handling (opening ballot boxes, reading the contents, etc.).

Once that is set up, the supervision tree will start booting that aspect, but will still make sure that the per-district tally process is in place before starting the handling of specific ballots. Only once all of this is at work and under way will the live-reporting app be allowed to boot.

The fun aspect is that supervisors fail from the leaves first, and gradually up to the root. This means that for tally handling to fail, enough ballot handling has to fail to kill the ballot supervisor, and that supervisor has to die often enough to kill the district supervisor. Only if too many district supervisors die does the failure bubble up further. And if the tally app fails, the live-reporting app gets taken down with it.

What's interesting here is that the supervision tree's individual supervisors can be programmed with various tolerance levels and strategies. Meaning I can say that I'll tolerate a child dying once per hour, or a million times a minute, or 5 times a second if I want. I can also tell them, once one of their children dies, to either restart it, let it go, or only restart it if it was an unexpected shutdown. I can also tell the supervisor that when one of its children dies, it should restart only that one, or restart all those booted after it, or restart all of its children.

This would, for example, let me specify (in a single line of code) that when ballot counting goes wrong, only restart the ballot counter, but if the tally for the entire regional office is going to poo poo, start over for the entire office. This is regardless of the specific exception, with the expectation that it was transient (as 99% of bugs are).
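
(a rough sketch of where that single line lives -- a supervisor's init callback; the module and child names here are invented, not the actual voting system:)
code:
  -module(office_sup).
  -behaviour(supervisor).
  -export([start_link/0, init/1]).

  start_link() ->
      supervisor:start_link({local, ?MODULE}, ?MODULE, []).

  init([]) ->
      %% one_for_one: a dead ballot counter restarts only the ballot counter.
      %% intensity/period: more than 5 deaths in an hour and this supervisor
      %% gives up and dies itself, letting the failure bubble up the tree
      %% (a parent using one_for_all would then restart the whole office).
      SupFlags = #{strategy => one_for_one, intensity => 5, period => 3600},
      Children = [#{id => tally,
                    start => {tally, start_link, []},
                    restart => permanent},
                  #{id => ballot_counter,
                    start => {ballot_counter, start_link, []},
                    restart => transient}],  %% restart only on abnormal exit
      {ok, {SupFlags, Children}}.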

So what becomes the general strategy? All your long-term, critical, must-be-safe functionality is packed up high in the supervision tree, closer to the beginning of the boot sequence. If it can't run, the system can't live. All your unsafe, risk-friendly operations are moved down the tree, near the leaves, where they can be allowed to fail. If they're really risky, they can be moved to another node entirely and keep transparently talking to the current one.

You can think of your state in 3 broad categories: static (rarely changes, known everywhere -- like configuration data), transient and computable (a TCP connection would be transient computable state, if the IP and port to connect to are known; I can then allow myself to lose and rebuild that connection again), or transient and uncomputable (user-submitted data when the user is gone, for example).

Static state is easy (it goes in a table somewhere, anyone can fetch it, it just needs to be there at boot time). Computable state is easy -- put the rootset in the supervisor and it can pass it back to the workers, and the workers can rebuild it. The uncomputable state is tricky. In the voting system above, it would be handled at the leaves, but once we know the state to be correct, it is moved into the persistent storage, where the system isn't allowed to fail. The workers either store it locally or offshore the data elsewhere.
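
(sketch of the "computable state" case: the supervisor holds the rootset (host/port), the worker rebuilds the connection after every restart; names are invented:)
code:
  -module(conn_worker).
  -behaviour(gen_server).
  -export([start_link/2, init/1, handle_call/3, handle_cast/2, handle_info/2]).

  start_link(Host, Port) ->
      gen_server:start_link(?MODULE, {Host, Port}, []).

  init({Host, Port}) ->
      %% the socket is transient-but-computable state: if this worker crashes
      %% and gets restarted, it just reconnects from the rootset it was given
      {ok, Sock} = gen_tcp:connect(Host, Port, [binary, {active, false}]),
      {ok, #{sock => Sock}}.

  handle_call(_Msg, _From, State) -> {reply, ok, State}.
  handle_cast(_Msg, State) -> {noreply, State}.
  handle_info(_Msg, State) -> {noreply, State}.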

Doing all of this yields very cool systems where the following is encoded in the program structure:
  • what is critical or not
  • what is allowed to fail or not
  • how software should boot according to which guarantees (what is a critical subcomponent or not)
  • how software should fail, meaning it defines the legal states of partial failures you find yourself in
  • how software is upgraded (because it can be upgraded live, based on the supervision structure)
  • how components depend on each other
Oh and logging of all exceptional cases, restarts, etc. is handled out of the box by supervisors (and overridable).

So this is good, as long as you say "well that's if you can be allowed failures". And I say 'of course!'. The reason this model is good is that most failures seen in the wild are transient failures.

The reason for this goes like this:



The bugs that are frequent and repeatable are easy to find in dev. Unless you ship a fundamentally broken product, you're gonna find them whether it's with types, tests, careful review, or users yelling at you on the phone.

The repeatable bugs that are in features that are infrequently used are going to be harder -- most likely less time will be spent weeding errors out of these because they're either unimportant or used by a minority of people.

Transient bugs that require thousands, millions, or billions of samples to show up are, statistically speaking, almost guaranteed never to show up (which is where exhaustive proofs and models come in handy if you have life-critical software).

So what will show up in production?



Frequent usage with frequent repeatability shouldn't be there unless your system or process is fundamentally flawed in ways no tech can save you from, or you made a huge mistake and production is very different from testing.

Rare usage and repeatable faults are gonna be what you have logs for. Maybe one or two users are gonna be angry and you'll need to spend time debugging it.

Then a fuckton of bugs are going to be that lovely rear end nondeterministic transient set of issues. Thankfully, those are often handled by restarts:



Trying again resets a state or waits until a weird combination is gone, and things work this time. Rare usage with repeatable faults gets a '?' because that 100% depends on the usage pattern.

In fact, systems I have deployed in production have had that level of error reporting:



That's right, 1.2 million exceptions a day for a while. Turns out they were transient and not customer-impacting. The system ran for 6 months before we had the bandwidth and took the time to address the issue. But it was running fine with no customer complaints. We got rid of the error to lower our bandwidth bill, actually.

But I'm not done yet. That's not where Erlang stops. In Programming Forth (Stephen Pelc, 2011), the author says "Debugging isn't an art, it's a science!" and provides the following (ASCII-fied) diagram:
code:
    find a problem ---> gather data ---> form hypothesis ----,
    .--------------------------------------------------------'
    '-> design experiment ---> prove hypothesis --> fix problem
Which then loops back on itself. By far, the easiest bit is 'finding the problem'. The difficult ones are forming the right hypothesis that will let you design a proper experiment and proving your fix works.

It's especially true of Heisenbugs that ruin your life by disappearing as you observe them.

So how do you go about it? GATHER MORE DATA. The more data you have, the easier it becomes to form good hypotheses. Traditionally, there are four big ways to do it in the server world:
  • Gather system metrics
  • Add logs and read them carefully
  • Try to replicate it locally
  • Get a core dump and debug that
Those are all available in Erlang, but they're often impractical:
  • System metrics are often wide and won't help with fine-grained bugs, but will help provide context
  • logs can generate lots and lots of expensive and useless data, and logging itself may cause the bug to stop happening. In fact, given transient bugs are unexpected, it's quite possible nothing will be logged about them and you'd need to go dive in, edit the code, deploy it, look at logs, and hope they show the right thing.
  • Replicating it locally without any prior information is more or less blind programming. Take shots in the dark until you figure out you've killed the right monster.
  • Core dumps are post-facto items. They often show you the bad state, but rarely how to get there.
More recently, systemtap/dtrace have come into the picture. These help a lot for some classes of bugs. However, I have not yet felt the need to run either in production. Why?

Because Erlang comes with tracing tools that trace every god drat thing for you out of the box. Process spawns? It's in. Messages sent and received? It's in. Function calls filtered with specific arguments or return values? it's in! Garbage collections? it's in. Processes that got scheduled for too long? it's in. Sockets that have their buffers full? It's in. Mailbox sizes, allocation of memory per type and process, layers of stacktraces leading to a currently running call? It's all in. Want to gather metrics about arrival rate of messages in specific workers? It's IN!

It's all out of the box. It's all available anywhere, and it's all usable in production, and can all be done safely (as long as you use a lib that forbids insane cases, like tracing the tracing system itself -- and these libs exist)
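
(for example, with the recon library, which is one of those safe tracing libs -- the module and function being traced here are made up:)
code:
  %% print at most 10 calls to mymod:handle_call/3, then stop tracing on its own
  recon_trace:calls({mymod, handle_call, '_'}, 10).

  %% same, but only calls whose first argument is {get, _}, with return values
  recon_trace:calls({mymod, handle_call,
                     fun([{get, _}, _From, _State]) -> return_trace() end}, 10).

  %% the built-in dbg application can do it too, just without the safety rails:
  dbg:tracer(),
  dbg:p(all, c),
  dbg:tpl(mymod, handle_call, x).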

So when I look at it all: most languages out there provide you the hygiene and the bubble. But all fault tolerance is handled by architecture, or through field surgery with a bonesaw to get rid of gangrene. Erlang, by contrast, gives you an actual immune system on top of everything else, and a way to design, run, and introspect systems that are live.

I hope this is specific enough.

i'm a little late but this post is mondo cool and good

Ralith
Jan 12, 2011

I see a ship in the harbor
I can and shall obey
But if it wasn't for your misfortune
I'd be a heavenly person today
Distributed VCSs are nice because the initial setup/infrastructure barrier to using one for any project, no matter how small, is basically zero. This means a lot of stuff can be easily version controlled that you might not otherwise bother with, conferring all the usual benefits.
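
(literally the whole setup cost, directory name made up:)
code:
  git init ~/some-little-script && cd ~/some-little-script
  git add -A && git commit -m "initial import"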

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

i use git to version control documents on my computer that only i will ever work on. i probably wouldnt bother doing that if i had to use something centralized, but its nice

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...
and if you don't want to bother making a repo on github or don't want it public, just clone it to your Dropbox for instant centralized repo!
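
(something like this, assuming ~/Dropbox is your synced folder and the paths are made up:)
code:
  # a bare repo inside dropbox plays the role of the central remote
  git init --bare ~/Dropbox/repos/sideproject.git
  cd ~/code/sideproject
  git remote add origin ~/Dropbox/repos/sideproject.git
  git push -u origin master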

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...
or if you're me, spin up an azure vm running gitlab (makes adding more contributors to your stupid side projects easy)

Stringent
Dec 22, 2004


image text goes here

the talent deficit posted:

bitbucket is the best argument against git/hg i have ever used. to make a pull request follow these 18 steps

don't forget the inability to search the source of a repository. bitbucket is hot garbage compared to github.

AWWNAW
Dec 30, 2008

or use steve jobs' operating system that automatically keeps revisions of your documents and lets you access them at will all with a pleasant user interface, it's quite incredible really

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
lol i've been loving up this stupid little bug over and over again.

also functional lang formatting style is a whole new ball game. no idea what i'm doing. I should just get a formatter or linter or something.

code:
  # look at the edges already touching task2 and decide whether the
  # (task2 -> task1, input) edge is new, a duplicate, mergeable, or ambiguous
  def edge_matches(dag, task1, task2, input) do
    :digraph.edges(dag, task2)
    |> Enum.map(fn e -> :digraph.edge(dag, e) end)
    |> Enum.filter(fn
      {_, _, t1, _} when t1 == task1 -> false
      {_, _, _, label} when label == input -> false
      {_, _, _, label} -> !(input in List.wrap(label))
    end)
    |> case do
      [] -> {:new, {}}
      [{e, _, _, ^input}] -> {:skip, {e, task2, task1, input}}
      [{e, _, _, label}] -> {:merge, {e, task2, task1, input ++ label}}
      _ -> {:error, {nil, task2, task1, input}}
    end
  end

  def merge_edge(dag, task1, task2, input) do
    case edge_matches(dag, task1, task2, input) do
      {:new, _} -> :digraph.add_edge(dag, task2, task1, input)
      {:skip, _} -> dag
      # reuse the existing edge id but give it the merged label
      {:merge, {e, _t2, _t1, label}} -> :digraph.add_edge(dag, e, task2, task1, label)
      {:error, v} -> {:error, v}
    end
  end
basically i have two structurally identical graphs with different label values on the edges and i'm trying to merge the labels together. it's really not a difficult problem but i've been at it for like 2 hours. time to stop programming.

oh yeah and I'm aware that I keep reordering task2 and task1. Don't ask me about it.


also "pipelines" are good.

DONT THREAD ON ME fucked around with this message at 02:55 on Oct 22, 2015

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
seriously, that pipe operator though. that pipe operator is some sick poo poo. it's a little bit confusing at first because it's not immediately obvious how it's going to curry your stuff around, but man

the pipe operator is the thing that makes bash, a terrible terrible terrible language, popular and usable. and wow what a surprise, it's really good in other languages too.

even joe armstrong likes it:
http://joearms.github.io/2013/05/31/a-week-with-elixir.html

Corla Plankun
May 8, 2007

improve the lives of everyone

Shaggar posted:

git is a disaster and nobody should use it. distributed version control is an oxymoron and one of the worst software development concepts of the past 10 years.

also I spent all morning tweaking shims and polyfills to make stuff work in IE8 and now I want to die but its still not as bad as using git

you're right but everywhere I know of that uses git uses it like a centralized version control with superfluous practice merges on the client side

brap
Aug 23, 2004

Grimey Drawer
centralized version control is just primitive and the best reason to use it is your team is too stupid to use distributed version control.

Hunter2 Thompson
Feb 3, 2005

Ramrod XTreme
i made you guys a video

https://www.youtube.com/watch?v=mnmIFu4WqE4

skip to 2:00 if you're an ADD

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

meatpotato posted:

i made you guys a video

https://www.youtube.com/watch?v=mnmIFu4WqE4

skip to 2:00 if you're an ADD

dang w

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
also mononcqc's erlang book is good

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
i think i've got it. i've got the erlang fever.

tef
May 30, 2004

-> some l-system crap ->
"x |> y means call x then take the output of x and add it as an extra argument to y in the first argument position"

it's almost like doing x.y
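
(concretely, with plain elixir stdlib calls:)
code:
  "hello elixir world"
  |> String.upcase()
  |> String.split()
  |> length()

  # is just another way of writing
  length(String.split(String.upcase("hello elixir world")))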

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

tef posted:

"x |> y means call x then take the output of x and add it as an extra argument to y in the first argument position"

it's almost like doing x.y

:mindblown:

holy poo poo thank you. so many things make sense now.

tef
May 30, 2004

-> some l-system crap ->
waiting for someone to tell me it's just flip (.)


pepito sanchez
Apr 3, 2004
I'm not mexican
i'm having a hard time finding decent resources on erlang and elixir. have everything set up for it on emacs :( buying books from outside the country isn't really much of an option for i am a poor. i was actually a bit surprised by the lack of free stuff to learn from out there on the two.
