weird
Jun 4, 2012

by zen death robot
i do

wait no i hate my life for other reasons

minidracula
Dec 22, 2007

boo woo boo

unixbeard posted:

how many people here use ocaml professionally and dont work at jane st
I'm using it at work, and I don't work at Jane St. Does that count?

I dunno if I wanna say anything about "professionally" in YOSPOS...

unixbeard
Dec 29, 2004

mnd posted:

I'm using it at work, and I don't work at Jane St. Does that count?

I dunno if I wanna say anything about "professionally" in YOSPOS...

It seems like such a niche language, I have used it a bit but the only place I know that does anything serious with it was them. What sort of stuff do you use it for?

minidracula
Dec 22, 2007

boo woo boo

unixbeard posted:

It seems like such a niche language, I have used it a bit but the only place I know that does anything serious with it was them. What sort of stuff do you use it for?
I am using it (currently) to build what's essentially a front-end to a compiler (kinda all I want to say right at the moment). I may not stick with it, but right now there's no problem. We'll see how it goes.

I should mention that although I am using OCaml (for this particular thing), I'm pretty sure no one else here is. You could safely dump this in the "research project" bucket at the moment. Though, that would apply to this entire effort, not just the use of OCaml.

JewKiller 3000
Nov 28, 2006

by Lowtax
i use ocaml professionally and i do not work at jane st :)

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av
how does it feel to be so fuckin kewl

minidracula
Dec 22, 2007

boo woo boo

hackbunny posted:

how does it feel to be so fuckin kewl
Feels good man.

MononcQc
May 29, 2007

car, cdr, heads and tails all sound so annoying when you've got pattern matching. I grew used to pattern matching and every time I'm in a language that tries to be somewhat functional and doesn't have it, I get very irritated.

Fortunately schemers and lispers from all around the place develop macro systems for that, but yeah.

gonadic io
Feb 16, 2011

>>=
serious q: what does ml have over haskell? I hear that it's faster, is that a direct consequence of its strict semantics?

edit: what I mean by that is that laziness is great for actually programming but high performance haskell is basically about getting rid of the laziness whenever possible if you know a value is going to be computed eventually anyway

MononcQc
May 29, 2007

I've heard good things about ML's module system and functor definitions (http://homepages.inf.ed.ac.uk/mfourman/teaching/mlCourse/notes/L11.html) but couldn't say how Haskell compares to that specifically.

Notorious b.s.d.
Jan 25, 2003

by Reene

gucci void main posted:

yeh the rest of it was collected before it could even be finished

CL has the opposite problem. they saw they had five or six featuresets across the participating vendors, and decided to include all of them


JewKiller 3000 posted:

common lisp has everything and nobody uses any of it

this is pretty much true

CL is such a big language it's like C++, every codebase uses its own subset of the thing

Notorious b.s.d.
Jan 25, 2003

by Reene

MononcQc posted:

car, cdr, heads and tails all sound so annoying when you've got pattern matching. I grew used to pattern matching and every time I'm in a language that tries to be somewhat functional and doesn't have it, I get very irritated.

Fortunately schemers and lispers from all around the place develop macro systems for that, but yeah.

so back when i was writing CL, i didn't know what pattern-matching was. i had destructuring-bind and i was happy

in the ensuing years, lispers have written eighteen libraries to get ml-style pattern matching.
eighteen.

http://www.cliki.net/pattern%20matching

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

Notorious b.s.d. posted:

in the ensuing years, lispers have written eighteen libraries to get ml-style pattern matching.
eighteen.

Olin Shivers of scheme shell fame wrote about the CL/Scheme habit of 80% solutions at length in the readme for his regex library

quote:

There's a problem with tool design in the free software and academic
community. The tool designers are usually people who are building tools for
some larger goal. For example, let's take the case of someone who wants to do
web hacking in Scheme. His Scheme system doesn't have a sockets interface, so
he sits down and hacks one up for his particular Scheme implementation. Now,
socket API's are not what this programmer is interested in; he wants to get on
with things and hack the exciting stuff -- his real interest is Web services.
So he does a quick 80% job, which is adequate to get him up and running, and
then he's on to his original goal.

Unfortunately, his quickly-built socket interface isn't general. It just
covers the bits this particular hacker needed for his applications. So the
next guy that comes along and needs a socket interface can't use this one.
Not only does it lack coverage, but the deep structure wasn't thought out well
enough to allow for quality extension. So *he* does his *own* 80%
implementation. Five hackers later, five different, incompatible, ungeneral
implementations had been built. No one can use each other's code.

The alternate way systems like this end up going over a cliff is that the
initial 80% system gets patched over and over again by subsequent hackers, and
what results is 80% bandaids and 20% structured code. When systems evolve
organically, it's unsurprising and unavoidable that what one ends up with is a
horrible design -- consider the DOS -> Win95 path.

As an alternative to five hackers doing five 80% solutions of the same
problem, we would be better off if each programmer picked a different task,
and really thought it through -- a 100% solution. Then each time a programmer
solved a problem, no one else would have to redo the effort. Of course, it's
true that 100% solutions are significantly harder to design and build than 80%
solutions. But they have one tremendous labor-savings advantage: you don't
have to constantly reinvent the wheel. The up-front investment buys you
forward progress; you aren't trapped endlessly reinventing the same awkward
wheel.

Examples: I've done this three times. The first time was when I needed an
emacs mode in graduate school for interacting with Scheme processes. I looked
around, and I found a snarled up mess of many, many 80% solutions, some for
Lisp, some for Scheme, some for shells, some for gdb, and so forth. These
modes had all started off at some point as the original emacs shell.el mode,
then had been hacked up, eventually drifting into divergence. The keybindings
had no commonality. Some modes recovered old commands with a "yank" type form,
on c-c y. Some modes recovered old commands with m-p and m-n. It was hugely
confusing and not very functional.

The right thing to do was to carefully implement one, common base mode for
process interaction, and to carefully put in hooks for customising this base
mode into language-specific modes -- lisp, shell, Scheme, etc. So that's what
I did. I carefully went over the keybindings and functionality of all the
process modes I could find -- even going back to old Lisp Machine bindings for
Zwei -- and then I designed and implemented a base mode called comint. Now,
all process modes are implemented on top of comint, and no one, ever, has to
re-implement this code. Users only have to learn one set of bindings for
the common functions. Features put into the common code are available for free
to all the derived modes. Extensions are done, not by doing a completely new
design, but *in terms of* the original system -- it may not be perfect, but
it's good enough to allow people to move on and do other things.

The second time was the design of the Scheme Unix API found in scsh. Most
Schemes have a couple of functions for changing directory, some minimal socket
hacking, and perhaps forking off a shell command with the system() C function.
But no one has done a complete job, and the functions are never compatible.
It was a classic 80%-solution disaster. So I sat down to do a careful, 100%
job -- I wanted to cover everything in section 2 of the Unix man pages, in a
manner that was harmonious with the deep structures of the Scheme language. As
a design task, it was a tremendous amount of work, taking several years, and
multiple revisions. But now it's done. Scsh's socket code, for instance,
*completely* implements the socket API. My hope in doing all this was that
other people could profit from my investment. If you are building your own
Scheme system, *you* don't have to put in the time. You can just steal the
design. Or the code.

The regexp notation in this document represents a third attempt at this kind
of design. Looking back, I'm amazed at how much time I poured into the design,
not to mention the complete reference implementation. I sold myself on doing a
serious job with the philosophy of the 100% design -- the point is to save
other people the trouble. If the design is good enough, then instead of having
to do your own, you can steal mine and use the time saved... to do your own
100% design of something *else*, and fill in another gap.

I am not saying that these three designs of mine represent the last word on
the issues -- "100%" is really a bit of a misnomer, since no design is ever
truly 100%. I would prefer to think of them as sufficiently good that they at
least present low-water marks -- future systems, I'd hope, can at least build
upon these designs, hopefully *in terms of* these designs. You don't ever have
to do *worse* -- you can just steal the design. If you don't have a
significantly better idea, I'd encourage you to adopt the design for the
benefits of compatibility. If you *do* have an improvement, email me about it,
so we can fold it in to the core design and *everyone* can win -- and we can
also make your improvement part of the standard, so that people can use your
good idea and *still* be portable.

But here's what I'd really like: instead of tweaking regexps, you go do your
own 100% design or two. Because I'd like to use them. If everyone does just
one, then that's all anyone has to do.

Notorious b.s.d.
Jan 25, 2003

by Reene

JewKiller 3000 posted:

i am not qualified to argue about call/cc but oleg kiselyov certainly is: http://okmij.org/ftp/continuations/against-callcc.html

this is a fun page

it is good to know that certain factions of the scheme community hate call/cc as much as i do

MononcQc
May 29, 2007

Otto Skorzeny posted:

Olin Shivers of scheme shell fame wrote about the CL/Scheme habit of 80% solutions at length in the readme for his regex library

This kind of thing always makes me a bit uncomfortable with the code I write.

The stuff I do for a living is developing on large server environments with certain strict behaviors to be had when dealing with overload, latency, throughput, or whatever. General 100% solutions tend to be much slower / less efficient / whatever property, because the general case will make compromises and assumptions about what is allowed or forbidden that do not necessarily apply to your particular use case, or might even be entirely at the opposite of the spectrum.

One example from the last couple of weeks is a logging library. One I keep recommending turned out to be too slow for our use case: under overload, its IO would become synchronous and lock up the node as a sequential bottleneck right in time-sensitive areas of the code. I had never encountered that with the lib before (hence why I keep recommending it), and I ended up writing a tiny replacement that caters to our use case by batching the data it receives and making things asynchronous, raising throughput while making latency slightly worse for individual lines. The replacement is a lovely logging library in terms of usability, but it eliminated all kinds of issues for us and definitely made things nicer and more predictable, without us needing to just log less data.

So I think I end up being the kind of person who releases 80% libraries here and there, and even though I try to document their narrow use cases, they're not useful for the general public a lot of the time. I'd like to release more general stuff (and I sometimes do), but it just wouldn't work the same in production for us because of what you can decide to bake in as an assumption of what you're allowed to do or not to do. I guess I'm contributing to making things worse in the Erlang library ecosystem :smith:

Posting Principle
Dec 10, 2011

by Ralp
yo was your talk archived?

MononcQc
May 29, 2007

Posting Principle posted:

yo was your talk archived?

http://oreillynet.com/pub/e/2877

It's gonna be in the original format for a couple of weeks/months, which means you gotta subscribe and then watch it in the weird rear end GUI that does the live streaming and stuff. After that time period, they'll put it on their youtube channel, which hopefully won't be too terrible of a format.

Nomnom Cookie
Aug 30, 2009



MononcQc posted:

This kind of thing always makes me a bit uncomfortable with the code I write.

The stuff I do for a living is developing on large server environments with certain strict behaviors to be had when dealing with overload, latency, throughput, or whatever. General 100% solutions tend to be much slower / less efficient / whatever property, because the general case will make compromises and assumptions about what is allowed or forbidden that do not necessarily apply to your particular use case, or might even be entirely at the opposite of the spectrum.

One example from the last couple of weeks is a logging library. One I keep recommending turned out to be too slow for our use case: under overload, its IO would become synchronous and lock up the node as a sequential bottleneck right in time-sensitive areas of the code. I had never encountered that with the lib before (hence why I keep recommending it), and I ended up writing a tiny replacement that caters to our use case by batching the data it receives and making things asynchronous, raising throughput while making latency slightly worse for individual lines. The replacement is a lovely logging library in terms of usability, but it eliminated all kinds of issues for us and definitely made things nicer and more predictable, without us needing to just log less data.

So I think I end up being the kind of person who releases 80% libraries here and there, and even though I try to document their narrow use cases, they're not useful for the general public a lot of the time. I'd like to release more general stuff (and I sometimes do), but it just wouldn't work the same in production for us because of what you can decide to bake in as an assumption of what you're allowed to do or not to do. I guess I'm contributing to making things worse in the Erlang library ecosystem :smith:

Java logging libraries include asynchronous appenders. Code I didn't write is the best code

Count Thrashula
Jun 1, 2003

Death is nothing compared to vindication.
Buglord
guys i'm gonna learn Haskell

because i have this problem i need to deal with where too many girls talk to me

Count Thrashula
Jun 1, 2003

Death is nothing compared to vindication.
Buglord
aw man almost two hockey avatar budz posts in a row

Opinion Haver
Apr 9, 2007

AlsoD posted:

serious q: what does ml have over haskell? I hear that it's faster, is that a direct consequence of its strict semantics?

edit: what I mean by that is that laziness is great for actually programming but high performance haskell is basically about getting rid of the laziness whenever possible if you know a value is going to be computed eventually anyway

it depends

this is obviously true if you have some gigantic thunk that does arithmetic because in general 'a + b' is smaller when it gets forced for obvious reasons. but if i have some very small thunk like 'f 20' which balloons into some massive data structure when forced (because, say, f x = Node x (f (x - 1)) (f (x - 1))), you want to evaluate f 20 as late as possible so that you're not stuck with this gigantic thing in memory
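
A rough sketch of the small-thunk, huge-result case. Python is strict, so a zero-argument lambda stands in for a Haskell thunk here. `Node`, `f` and `count_nodes` are illustrative names, and the cutoff at 0 is an assumption added so the strict version terminates (the definition in the post has no base case).

```python
# Sketch only: a closure plays the role of an unevaluated thunk.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"]
    right: Optional["Node"]


def f(x: int) -> Optional[Node]:
    """Strict version of f x = Node x (f (x - 1)) (f (x - 1)), cut off at 0."""
    if x == 0:
        return None
    return Node(x, f(x - 1), f(x - 1))


def count_nodes(tree: Optional[Node]) -> int:
    """Count the nodes that exist once the tree has been built."""
    if tree is None:
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


thunk = lambda: f(10)       # cheap to hold: one closure, no tree yet
tree = thunk()              # forcing it builds the whole structure
print(count_nodes(tree))    # prints 1023, i.e. 2**10 - 1
```

The unforced closure costs a few words of memory, while the forced value costs 2**n - 1 nodes. Here forcing early makes memory use worse, which is the opposite of the arithmetic case.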

i don't know much about ml but it always seemed to have a better module system to me; haskell's module system is just basic 'you can specify what you export, what you import, and the qualified name of the module if any' stuff.

MononcQc
May 29, 2007

Nomnom Cookie posted:

Java logging libraries include asynchronous appenders. Code I didn't write is the best code

The library has an asynchronous mode, but on some nodes this invariably led to overload. The problem was the sheer number of calls, combined with paging not being built in for some types of IO (i.e. disk IO had paging, but stdout didn't). That led to the async code progressively accumulating a larger backlog, at which point it toggles to synchronous mode automatically.

Then we never really intended to drop any of our log messages -- making it switch from 'INFO' to 'WARN' would have fixed the issue, but we wanted to see if it was possible to just make it handle everything at once -- and we made it work.

I'm trying to think of a way to port my stuff to the logging library everyone uses, but I'm not sure it's super usable for that.

Shaggar
Apr 26, 2006
entity framework loving blows

Shaggar
Apr 26, 2006
I cant believe people think orms make development faster.

FamDav
Mar 29, 2008
Jane street seemed like the most annoying place to work. Traders in the same space as engineers, all bunched together on one side of the building.

On ocaml, the lack of type classes is annoying but defining new infix operators is weird when you can't control precedence and fixity.

Nomnom Cookie
Aug 30, 2009



MononcQc posted:

The library has an asynchronous mode, but on some nodes this invariably led to overload. The problem was the sheer number of calls, combined with paging not being built in for some types of IO (i.e. disk IO had paging, but stdout didn't). That led to the async code progressively accumulating a larger backlog, at which point it toggles to synchronous mode automatically.

Then we never really intended to drop any of our log messages -- making it switch from 'INFO' to 'WARN' would have fixed the issue, but we wanted to see if it was possible to just make it handle everything at once -- and we made it work.

I'm trying to think of a way to port my stuff to the logging library everyone uses, but I'm not sure it's super usable for that.

i disagree with the design of that logging library. java has this guy (Ceki Gülcü) who seems to be obsessed with logging. he wrote log4j, then slf4j, then logback. each time leveraging key learnings from past utilizations. have you considered using logback its pretty good

Nomnom Cookie
Aug 30, 2009



Shaggar posted:

I cant believe people think orms make development faster.

its not an orm but jdbctemplate is a straight win over writing your own jdbc code

Notorious b.s.d.
Jan 25, 2003

by Reene

Nomnom Cookie posted:

i disagree with the design of that logging library. java has this guy (Ceki Gülcü) who seems to be obsessed with logging. he wrote log4j, then slf4j, then logback. each time leveraging key learnings from past utilizations. have you considered using logback its pretty good

they're all compatible with sane migration paths too

Notorious b.s.d.
Jan 25, 2003

by Reene

Shaggar posted:

I cant believe people think orms make development faster.

an ORM can make development faster if and only if your application owns the schema. (using an ORM to query somebody else's database or view is usually not a great idea for lots of reasons)

  • instead of writing (opaque) data to disk, serializing object data to an SQL database makes the data available to other apps

  • instead of making changes to an SQL schema and then updating an app, you can use an ORM to make changes to both data storage and business logic in a single place

  • you don't care about performance, and you just want to advertise "SQL database" compatibility for your product as quickly as possible. your ORM makes you essentially vendor-neutral. use hibernate criteria, instantly support 99.95% of the market

  • you work with people who suck poo poo at sql. an ORM can limit the damage by giving them a dumb criteria interface. it evens the playing field

there are lots of reasons to consider ORM even if it isn't a panacea

Notorious b.s.d. fucked around with this message at 17:10 on Sep 6, 2013

double sulk
Jul 2, 2010

i am terrible at sql and activerecord is OK. you might say it suits my needs.

Bloody
Mar 3, 2013

i use orms because then i only have to deal with being terrible in one language instead of two

MononcQc
May 29, 2007

Nomnom Cookie posted:

i disagree with the design of that logging library. java has this guy (Ceki Gülcü) who seems to be obsessed with logging. he wrote log4j, then slf4j, then logback. each time leveraging key learnings from past utilizations. have you considered using logback its pretty good

I'm not sure how portable to my case that architecture is. The inheritance of levels is nice, and the automation of level-checks before logging is good too. A lot of the principles are okay, but the problem I see there is that doing something like a "synchronized block" to provide thread safety when appending and formatting in that single sequential point is a lovely idea when you have >30,000 preemptively scheduled processes and multiple hundreds of log messages a second making it there.

From that point of view (and knowing we never turn off logging), it is better to format the log at the call site, and then batch them up to be sent to their final destination. This distributes work across all processes, and ensures that your sequential bottleneck is minimal, improving performance node-wide. You make a lot of barely-noticeable small pauses to format, compared to a few seconds-long pauses with the centralized approach, which suck when what you're doing is trying to set up connections or accepting requests or whatever.
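
A minimal sketch of "format at the call site, then batch", using Python threads in place of Erlang processes. This is not the library being discussed: `BatchingLogger`, its `sink` callable and the `batch_size` parameter are all assumptions made for illustration.

```python
import queue
import threading


class BatchingLogger:
    """Callers format their own lines; one writer thread flushes whole batches."""

    def __init__(self, sink, batch_size=100):
        self._sink = sink                # assumed: a callable taking a list of strings
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def log(self, template, *args):
        # Formatting runs here, on the caller's thread, so the single writer
        # never pays for it.
        self._queue.put(template % args)

    def close(self):
        self._queue.put(None)            # sentinel: flush what is left and stop
        self._writer.join()

    def _run(self):
        batch = []
        while True:
            line = self._queue.get()
            if line is None:
                break
            batch.append(line)
            if len(batch) >= self._batch_size:
                self._sink(batch)        # one write for the whole batch
                batch = []
        if batch:
            self._sink(batch)
```

The only shared step is the queue append; formatting is spread across callers and the sink sees one call per batch. A real version would presumably also flush on a timer so a half-full batch does not sit around indefinitely.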

Notorious b.s.d. posted:

they're all compatible with sane migration paths too

The migration path for Erlang -> Java on production software isn't the sanest around.

Shaggar
Apr 26, 2006

MononcQc posted:

I'm not sure how portable to my case that architecture is. The inheritance of levels is nice, and the automation of level-checks before logging is good too. A lot of the principles are okay, but the problem I see there is that doing something like a "synchronized block" to provide thread safety when appending and formatting in that single sequential point is a lovely idea when you have >30,000 preemptively scheduled processes and multiple hundreds of log messages a second making it there.

From that point of view (and knowing we never turn off logging), it is better to format the log at the call site, and then batch them up to be sent to their final destination. This distributes work across all processes, and ensures that your sequential bottleneck is minimal, improving performance node-wide. You make a lot of barely-noticeable small pauses to format, compared to a few seconds-long pauses with the centralized approach, which suck when what you're doing is trying to set up connections or accepting requests or whatever.


The migration path for Erlang -> Java on production software isn't the sanest around.

but aren't you synchronizing submits to the batch? how is that different?

uG
Apr 23, 2003

by Ralp
i like redbeans ORM for php because it just creates tables/columns if the table/column doesnt already exist. it seems like a great idea

Cocoa Crispies
Jul 20, 2001

Vehicular Manslaughter!

Pillbug

uG posted:

i like redbeans ORM for php because it just creates tables/columns if the table/column doesnt already exist. it seems like a great idea

every system i've used that just kinda makes columns when they don't exist has sucked lots of rear end

gucci void main posted:

i am terrible at sql and activerecord is OK. you might say it suits my needs.

i'm pretty great at sql and activerecord is fantastic

sql when i Care, orm when i don't

power botton
Nov 2, 2011

a php orm that just makes poo poo up about table schema as it goes along?

Cocoa Crispies
Jul 20, 2001

Vehicular Manslaughter!

Pillbug

git clone trooper posted:

a php orm that just makes poo poo up about table schema as it goes along?

yeah i didn't want to say anything

but really since it's php it's probably only set up for mysql which is the php of databases

uG
Apr 23, 2003

by Ralp
disclaimer: ive never written a line of php :getin:

MononcQc
May 29, 2007

Shaggar posted:

but aren't you synchronizing submits to the batch? how is that different?

Yes and no. There's a possible difference in the level of contention and operations done in the synchronous block.

At the lowest level of the VM, process mailboxes have two queues: an inner one which is locked by the receiver process while executing, and an outer one which other processes will use and thus not compete for a message queue lock with the executing process. When the inner queue is depleted the receiver process will lock the outer queue and move the entire thing to the inner one. Rinse and repeat.

This means the append operation is done synchronously, but the cost of consuming the queue is amortized (one lock per whole-queue transfer rather than one per message), so there is very little contention to deal with there.

At the library level, the process drains its own mailbox as fast as possible and puts it in a higher level queue where results are appended in 'pages' of a determined size to be pushed as one unit to the final writer.

The only locking you need to do is to access the tail pointer and plug your data there, to make it simple (full details in the source). The rest is done in isolation from the rest of the flow -- transferring data from the mailbox to a higher-level queue we can operate on to do filtering, segmenting data into appropriate page sizes, etc. There is no GC that's gonna hit that queue, no operation except appending to be done on it (and eventually transferring it), and it's part of the core logic of the entire runtime system, so you know it's gonna be one of the fastest things you can do even on multiple cores in that language.
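
The two-queue mailbox described above can be modelled roughly as follows. This is a simplified Python sketch, not the VM's actual implementation: the `Mailbox` class and its method names are invented, and the inner queue is treated as private to the receiver instead of being locked by it.

```python
import threading
from collections import deque


class Mailbox:
    """Toy model: senders contend only on the outer queue."""

    def __init__(self):
        self._outer = deque()                # senders append here, under the lock
        self._outer_lock = threading.Lock()
        self._inner = deque()                # touched only by the receiver

    def send(self, message):
        # The only synchronized step for a sender is a tail append.
        with self._outer_lock:
            self._outer.append(message)

    def receive(self):
        """Return the next message, or None if the mailbox is empty."""
        if not self._inner:
            # Inner queue depleted: take the whole outer queue in one locked
            # step instead of locking once per message.
            with self._outer_lock:
                self._inner, self._outer = self._outer, deque()
        if self._inner:
            return self._inner.popleft()
        return None
```

While the receiver works through `_inner` it holds no lock at all, so senders only ever compete with each other and with the occasional whole-queue swap.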

Nomnom Cookie
Aug 30, 2009



MononcQc posted:

I'm not sure how portable to my case that architecture is. The inheritance of levels is nice, and the automation of level-checks before logging is good too. A lot of the principles are okay, but the problem I see there is that doing something like a "synchronized block" to provide thread safety when appending and formatting in that single sequential point is a lovely idea when you have >30,000 preemptively scheduled processes and multiple hundreds of log messages a second making it there.

From that point of view (and knowing we never turn off logging), it is better to format the log at the call site, and then batch them up to be sent to their final destination. This distributes work across all processes, and ensures that your sequential bottleneck is minimal, improving performance node-wide. You make a lot of barely-noticeable small pauses to format, compared to a few seconds-long pauses with the centralized approach, which suck when what you're doing is trying to set up connections or accepting requests or whatever.


The migration path for Erlang -> Java on production software isn't the sanest around.

is logback formatting synchronized, i don't think it is. anyway what the async appender does is shove log events onto a queue for the writer thread to consume. pretty much what you're describing except that its not the default
