rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
yes but sbrk is really hard to use guys

would you rather use sbrk or this beautiful gc

actually one of these four beautiful completely reimplemented gc algorithms

one of which may actually leak everything that escapes the young generation

and one that might not be quite stable yet

something about boots, i dunno

but its totally parallel

just dont have too many boots whatever that means

but if its this or sbrk man you gotta ask yourself


Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
go sbrk yourself!! lol haha

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av
no sbrk on Windows, only (a weirdly designed incompatible quasi-equivalent of) mmap :confuoot:

e: VVV brk/sbrk was removed from the UNIX standard in version 3 (2001), I guess they realized it wasn't terribly portable or flexible (but it sure is low overhead and easy to implement! I envy smaller systems sometimes)

hackbunny fucked around with this message at 02:12 on Apr 4, 2014
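
For concreteness, here is a minimal sketch of what heap allocation on top of sbrk amounts to (C, assuming a POSIX system that still provides sbrk; illustrative only, not anyone's real allocator). The whole heap is one contiguous region and "allocating" is just bumping the program break, which is exactly why it's low overhead and easy to implement, and also why it's so inflexible: memory can only ever be returned to the OS from the very end, and nothing about it is thread-safe.

    /* toy bump allocator: grow the program break and hand out the old end */
    #include <stddef.h>
    #include <stdint.h>
    #include <unistd.h>

    static void *bump_alloc(size_t size)
    {
        /* keep allocations word-aligned */
        size = (size + sizeof(void *) - 1) & ~(sizeof(void *) - 1);

        void *old_break = sbrk((intptr_t)size);   /* returns the previous break */
        return (old_break == (void *)-1) ? NULL : old_break;
    }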

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
don't most modern mallocs eschew the use of sbrk anyways? i know that phkmalloc and its successors do at least, not sure how jemalloc and tcmalloc do things, i assume gnu malloc does the dumbest thing possible while using a bunch of asm intrinsics shrouded in a billion ifdefs

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
yes sbrk is a really terrible way to run a heap

i was subtly implying that most arguments about gc were born in a simpler age

a dumber age

and that people tend to contrast gc with explicit malloc/free as if there were literally no intermediate positions

and as if C++ programmers literally spend all their time chasing down use-after-free errors

but i was very subtle about my implications

subtle

ofc i say all this having just spent a few hours chasing down a use-after-free error

JewKiller 3000
Nov 28, 2006

by Lowtax
just don't free and you won't have that problem

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

JewKiller 3000 posted:

just don't free and you won't have that problem

So, basically a GC, but without the pauses.

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
Nice!

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

rjmccall posted:

ofc i say all this having just spent a few hours chasing down a use-after-free error

heh

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
it's me i'm the guy who only ever uses a pool allocator
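
For anyone following along, a pool allocator in its simplest form is just a pre-carved array of fixed-size slots threaded onto a free list, so alloc and free are a couple of pointer swaps. A minimal sketch in C (illustrative only; the slot size and names are made up):

    /* minimal fixed-size pool: carve a static buffer into equal slots and
       keep the free ones on a singly linked list */
    #include <stddef.h>

    #define SLOT_SIZE  64      /* every object must fit in one slot */
    #define SLOT_COUNT 1024

    typedef union slot {
        union slot *next;                 /* valid while the slot is free */
        char        payload[SLOT_SIZE];   /* valid while the slot is live */
    } slot;

    static slot  pool[SLOT_COUNT];
    static slot *free_list;

    void pool_init(void)
    {
        for (size_t i = 0; i + 1 < SLOT_COUNT; i++)
            pool[i].next = &pool[i + 1];
        pool[SLOT_COUNT - 1].next = NULL;
        free_list = &pool[0];
    }

    void *pool_alloc(void)
    {
        if (!free_list)
            return NULL;
        slot *s = free_list;
        free_list = s->next;
        return s->payload;
    }

    void pool_free(void *p)
    {
        slot *s = (slot *)p;   /* payload sits at the start of the slot */
        s->next = free_list;
        free_list = s;
    }

This is also why a use-after-free of a pooled object tends to show up as some other live object's data rather than as an immediate crash, which is relevant a few posts down.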

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

Soricidus posted:

seriously i know developers love to sperg about ui responsiveness but real world users don't give a gently caress if their program takes a second to react after they click on the "process butts" icon, they probably need a moment to decide what to do next anyway

features first, worry about optimisations like worker threads later

just gonna dogpile on here and say that you are literally murdering people with apps that pause like this

assume your app is used by merely 100,000 people every day, and it pauses for one second five times every time they use it

in two months - 63 days - your app will have burned an entire year of human time. further, this is time that is either gone - removed from existence - per the previous point about editing, or it is time that those users will spend in a very painful emotional state, raging at your unresponsive bullshit app

someday the geneva convention will bring people like you down for your crimes against humanity
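
For what it's worth, the arithmetic behind the 63-day figure does check out, taking a year as roughly 31.5 million seconds:

    100,000 users/day × 5 pauses × 1 s  =  500,000 s of waiting per day
    31,536,000 s in a year ÷ 500,000 s/day  ≈  63 days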

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

Dessert Rose posted:

just gonna dogpile on here and say that you are literally murdering people with apps that pause like this

assume your app is used by merely 100,000 people every day, and it pauses for one second five times every time they use it

in two months - 63 days - your app will have burned an entire year of human time. further, this is time that is either gone - removed from existence - per the previous point about editing, or it is time that those users will spend in a very painful emotional state, raging at your unresponsive bullshit app

someday the geneva convention will bring people like you down for your crimes against humanity

what if he made a videogame

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

Symbolic Butt posted:

what if he made a videogame

presumably those people are at least enjoying their time wasted actually using the software for its advertised purpose

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Otto Skorzeny posted:

it's me i'm the guy who only ever uses a pool allocator

it actually was pool-allocated

but we were removing it from lists and calling destructors anyway because we detected that it was "unused"

in fact it would still have been a bug in a gc language because the original bug was a byproduct of the inconsistency, not of memory corruption

and the memory corruption associated with the object was a really obvious clue about what was going wrong

so in some ways it was easier to debug because it wasnt a gc language

because deinitialization implicitly signals object invalidation like nothing else ever could

but i wanted to be nice to the gc fans

because someone has to

and its not going to be me

Bloody
Mar 3, 2013

Dessert Rose posted:

just gonna dogpile on here and say that you are literally murdering people with apps that pause like this

assume your app is used by merely 100,000 people every day, and it pauses for one second five times every time they use it

in two months - 63 days - your app will have burned an entire year of human time. further, this is time that is either gone - removed from existence - per the previous point about editing, or it is time that those users will spend in a very painful emotional state, raging at your unresponsive bullshit app

someday the geneva convention will bring people like you down for your crimes against humanity

i like numbers

self-flushing urinals save more time in one day than your entire life

Soricidus
Oct 21, 2010
freedom-hating statist shill

Dessert Rose posted:

just gonna dogpile on here and say that you are literally murdering people with apps that pause like this

assume your app is used by merely 100,000 people every day, and it pauses for one second five times every time they use it

in two months - 63 days - your app will have burned an entire year of human time. further, this is time that is either gone - removed from existence - per the previous point about editing, or it is time that those users will spend in a very painful emotional state, raging at your unresponsive bullshit app

someday the geneva convention will bring people like you down for your crimes against humanity

HORATIO HORNBLOWER
Sep 21, 2002

no ambition,
no talent,
no chance

Bloody posted:

i like numbers

self-flushing urinals save more time in one day than your entire life

flushless urinals are the best urinals

Shaggar
Apr 26, 2006
flushless urinals are disgusting.

Bloody
Mar 3, 2013

shaggar was wrong

JewKiller 3000
Nov 28, 2006

by Lowtax
all urinals are disgusting. great, little droplets of piss everywhere, puddle of piss on the floor

Shaggar
Apr 26, 2006
flushless urinals need to be cleaned like every 2 hours or they stink like crazy.

wolffenstein
Aug 2, 2002
 
Pork Pro

Shaggar posted:

flushless urinals need to be cleaned like every 2 hours or they stink like crazy.

same but you

Stringent
Dec 22, 2004



wolffenstein posted:

same but you

jooky
Jan 15, 2003

rjmccall posted:

it actually was pool-allocated

but we were removing it from lists and calling destructors anyway because we detected that it was "unused"

in fact it would still have been a bug in a gc language because the original bug was a byproduct of the inconsistency, not of memory corruption

and the memory corruption associated with the object was a really obvious clue about what was going wrong

so in some ways it was easier to debug because it wasnt a gc language

because deinitialization implicitly signals object invalidation like nothing else ever could

but i wanted to be nice to the gc fans

because someone has to

and its not going to be me

why do you make posts in this format

Dicky B
Mar 23, 2004

i read it like a really fat dude taking a raspy breath after each newline

double riveting
Jul 5, 2013

look at them go

Mr Dog posted:

idk GUI apps probably still need to be written in native code because of that whole "embarassing pause" thing

Android really suffers against iOS because it has to run a garbage-collecting VM on a device where memory is tight.

as you say, it's really about being memory-starved, rather than not native. as far as i know, modern garbage collection is wonderfully fast (faster than explicit frees, in fact) as long as it's okay to take about 2-3 times as much memory as your active set size. this is perfectly fine for a run-of-the-mill app on desktop.

the reason it gets so exacerbated on the relatively (!) low-memory mobile devices is mostly because everything tends to involve web, video, tons of hi-res pictures, or all of the above.

does (Java on) Android provide some support for taking over explicit memory management (thinking regions or something) for parts of the active set?

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
you also have to remember that dalvik is a much worse vm than hotspot

Zombywuf
Mar 29, 2008

double riveting posted:

modern garbage collection is wonderfully fast (faster than explicit frees, in fact)

This is transparently false.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

jooky posted:

why do you make posts in this format

inadequate exposure to philosophical marxism in my youth

Dicky B posted:

i read it like a really fat dude taking a raspy breath after each newline

p. much accurate

Sapozhnik
Jan 2, 2005

Nap Ghost

double riveting posted:

as you say, it's really about being memory-starved, rather than not native. as far as i know, modern garbage collection is wonderfully fast (faster than explicit frees, in fact) as long as it's okay to take about 2-3 times as much memory as your active set size. this is perfectly fine for a run-of-the-mill app on desktop.

the reason it gets so exacerbated on the relatively (!) low-memory mobile devices is mostly because everything tends to involve web, video, tons of hi-res pictures, or all of the above.

does (Java on) Android provide some support for taking over explicit memory management (thinking regions or something) for parts of the active set?

well you can always just write native code on Android, though the user interface APIs are all Java-only so you have to have at least some Java in your app.

Dalvik is pretty rubbish though, google doesn't make money on responsive user interfaces, they make money barcoding every aspect of your life so they can sell ads

double riveting
Jul 5, 2013

look at them go

Zombywuf posted:

This is transparently false.

I can't find the paper I got my numbers from, but I'll leave this here. It seems to imply that for HotSpot, even a factor of 1.2 is sufficient. It's true I don't know about the current state of Dalvik... Apparently it used to do simplistic mark-sweep, which is definitely not "modern". I was making a general argument as to whether responsive apps always require native code, which they clearly don't. In fact I think they generally should not, even on memory-constrained devices.

edit: actually, looking at the graphs quoted on that site, the paper he talks about may have been the very one i was thinking of. in the first graph cited, generational mark-sweep crosses the line at 1 pretty much at 3x heap size. more memory than that and you actually save time by not calling free or whatever so often. and that's in the 2004 paper being criticized as out-of-date.

double riveting fucked around with this message at 12:19 on Apr 7, 2014

Zombywuf
Mar 29, 2008

double riveting posted:

I can't find the paper I got my numbers from, but I'll leave this here. It seems to imply that for HotSpot, even a factor of 1.2 is sufficient. It's true I don't know about the current state of Dalvik... Apparently it used to do simplistic mark-sweep, which is definitely not "modern". I was making a general argument as to whether responsive apps always require native code, which they clearly don't. In fact I think they generally should not, even on memory-constrained devices.

Native and GC are not mutually exclusive, and unless your GC is pause-free it's going to be really hard to make a responsive app. The referenced paper is comparing Java to Java, i.e. code written under the assumption of GC. Code written with explicit management in mind will tend to be tighter; the same techniques can be used in GC languages, but they frequently aren't.


Basically, poo poo code gives poo poo apps, GC doesn't free you from having to worry about memory and I'm pissed that my phone doesn't have enough RAM to play Threes. Seriously, a loving sliding tile game and it needs 100s of MBs.

Cybernetic Vermin
Apr 18, 2005

the perfection of gc is one of those slightly annoying nerd myths that i think mostly exists because people like absolutes, when very few absolutes exist. gc is deeply flawed, and has the very distinct downside of being an all-or-nothing proposition if it is to work well and be easy to work with. pretty clearly desirable for robust software, but a source of many problems and often not at all desirable.

i sort of hold that a lot of runtime design (and the languages built on its assumptions) today operates on a slightly incorrect level, in that any application using an appreciable amount of memory (from say 200 megs and up) will have a small number of different classes/types, living in a fairly small number of collections (possibly implicit collections in the form of object graphs), but whose objects are still treated as individual units, each carrying its own tagging/vtables/metadata. the baseline assumption really should be that atomic pieces of data are owned by a collection, which carries the common metadata and lifetime-manages its contents. ideally every piece of data is only referred to from one place, and very commonly this is easy to achieve, and makes a lot of management superfluous. the relatively few objects that are not numerous belong to a singleton collection, which is more overhead than they would otherwise have, but averages out better across a reasonably sized application

unfortunately that type of thing is extremely unlikely to be brought about, since displacing existing design thought of an object graph of entirely independent tiny pieces of data would be a monumental undertaking
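
A rough sketch of what that "collections own their data" idea could look like in C (the names and the particle example are made up for illustration): the shared metadata and the lifetime management live once on the collection, the objects themselves are plain contiguous data referred to by index, and destroying the collection reclaims everything at once.

    /* collection-owned data: one allocation, one owner, handles instead of
       per-object pointers (illustrative sketch, not a real library) */
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        float x, y, z;        /* plain data, no header, no vtable */
    } particle;

    typedef struct {
        particle *items;      /* one allocation holds every object */
        size_t    count;
        size_t    capacity;
        /* any shared metadata (type tag, debug name, ...) would go here,
           paid for once per collection instead of once per object */
    } particle_set;

    int particle_set_init(particle_set *set, size_t capacity)
    {
        set->items = malloc(capacity * sizeof *set->items);
        set->count = 0;
        set->capacity = capacity;
        return set->items ? 0 : -1;
    }

    /* objects are referred to by index (a handle), so the collection
       remains the single owner */
    size_t particle_set_add(particle_set *set, particle p)
    {
        if (set->count == set->capacity)
            return (size_t)-1;            /* full; a real version would grow */
        set->items[set->count] = p;
        return set->count++;
    }

    void particle_set_destroy(particle_set *set)
    {
        free(set->items);                 /* the whole population goes at once */
        memset(set, 0, sizeof *set);
    }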

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Otto Skorzeny posted:

don't most modern mallocs eschew the use of sbrk anyways? i know that phkmalloc and its successors do at least, not sure how jemalloc and tcmalloc do things, i assume gnu malloc does the dumbest thing possible while using a bunch of asm intrinsics shrouded in a billion ifdefs

anonymous mmap which is also not standard but everyone supports it b/c it makes sense


Zombywuf posted:

Native and GC are not mutually exclusive, and unless your GC is pause-free it's going to be really hard to make a responsive app. The referenced paper is comparing Java to Java, i.e. code written under the assumption of GC. Code written with explicit management in mind will tend to be tighter; the same techniques can be used in GC languages, but they frequently aren't.


Basically, poo poo code gives poo poo apps, GC doesn't free you from having to worry about memory and I'm pissed that my phone doesn't have enough RAM to play Threes. Seriously, a loving sliding tile game and it needs 100s of MBs.

we need region tracking and poo poo, for once modern c++ actually gets that poo poo right
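
For reference, the anonymous mmap mentioned above is roughly this (C, POSIX-ish; MAP_ANONYMOUS is the common-but-historically-nonstandard flag in question). Unlike sbrk, each chunk is independent, so an allocator can carve it up however it likes and hand it back to the OS with munmap in any order; this sketch only shows getting and releasing whole chunks, not the block management a real malloc builds on top.

    /* ask the kernel for a chunk of zero-filled memory with no backing file */
    #include <stddef.h>
    #include <sys/mman.h>

    static void *chunk_alloc(size_t size)
    {
        void *p = mmap(NULL, size,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS,   /* anonymous: no fd */
                       -1, 0);
        return (p == MAP_FAILED) ? NULL : p;
    }

    static void chunk_free(void *p, size_t size)
    {
        munmap(p, size);   /* returned to the OS immediately, in any order */
    }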

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Malcolm XML posted:

anonymous mmap which is also not standard but everyone supports it b/c it makes sense


we need region tracking and poo poo, for once modern c++ actually gets that poo poo right

somewhere tony hoare is crying

Sapozhnik
Jan 2, 2005

Nap Ghost
the erlang approach (from the almost nothing i know of it) makes a lot of sense: have a bunch of message-passing logical processes with small isolated heaps, and you can just GC those small heaps independently. of course, the semantics around passing large buffers from process to process probably gets interesting rather quickly.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Mr Dog posted:

the erlang approach (from the almost nothing i know of it) makes a lot of sense: have a bunch of message-passing logical processes with small isolated heaps, and you can just GC those small heaps independently. of course, the semantics around passing large buffers from process to process probably gets interesting rather quickly.

The good way is to have yet another guardian process handle access to/from your big ole buffer

At least that makes sense to me. Actors guard resources.

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
im the big ol' buffer

MononcQc
May 29, 2007

The Erlang approach to GC is conceptually simple, but in practice a bit more complex than that. By default you do have that "har har processes share nothing and can be GC'ed individually" thing, but stuff goes way deeper.

The way things work, you have a hierarchical memory allocation thing going on:


[allocator hierarchy diagram] (from my own S3 bucket, don't probate)

You have two main allocators, and a bunch of sub-allocators (numbered 1-9). There will be one instance of each sub-allocator per scheduler (and you should have one scheduler per core by default), plus one instance to be used by linked-in drivers using async threads.

Each of these sub-allocators will request memory from mseg_alloc and sys_alloc depending on the use case, and in two possible ways: as a multiblock carrier or as a single block carrier. I'll get into each sub-allocator's use case and the difference between single and multiblock carriers just in a minute, but I have to explain mseg_alloc and sys_alloc.

sys_alloc is basically your Erlang equivalent of malloc for the entire VM. It's the one asking for memory from the OS and redistributing it internally. mseg_alloc can mmap a larger chunk of memory and behave a bit like a cache. Sub-allocators will ask for memory from mseg_alloc or sys_alloc and return it there whenever it's no longer needed. The main allocators should keep the space held for a while in case another part of the system needs it, or return it to the OS.

Multiblock carriers are your common area to get memory from an Erlang process' point of view. Whenever you need to allocate a piece of data that is less than 8MB (by default, but tweakable), it will go in a multiblock carrier. The multiblock carrier can be shared by multiple terms and pieces of data within a sub-allocator to get more cache efficiency (usually, a carrier utilization below 80% is judged problematic).

Whenever the item to be allocated is greater than the single block carrier threshold (sbct), the allocator switches this allocation into a single block carrier (sbcs). This basically fetches a larger chunk of memory for a single large piece of data. For efficiency reasons, a sub-allocator will always try to fetch stuff from mseg_alloc first, and only once a certain threshold is tripped will it go to sys_alloc.

This ends up looking a bit like this for each group of sub-allocators:



There is also some clever logic that can be done so that if one scheduler's allocator is overworked and one of them is underworked, the various sub-allocators can migrate mostly-free chunks of memory from one scheduler to another one.

Anyway, the 9 sub-allocators are:
  1. temp_alloc: does temporary allocations for short use cases (such as data living within a single C function call).
  2. eheap_alloc: heap data, used for things such as the Erlang processes' heaps.
  3. binary_alloc: the allocator used for reference counted binaries (> 64 bytes binaries are ref counted and global).
  4. ets_alloc: ETS tables store their data in an isolated part of memory that isn't garbage collected, but allocated and deallocated as long as terms are being stored in tables. They're a small in-memory DB that allows destructive updates.
  5. driver_alloc: used to store driver data in particular, which doesn't keep drivers that generate Erlang terms from using other allocators. The driver data allocated here contains locks/mutexes, options, Erlang ports (file descriptors), etc.
  6. sl_alloc: short-lived memory blocks will be stored there, and include items such as some of the VM's scheduling information or small buffers used for some data types' handling.
  7. ll_alloc: long-lived allocations will be in there. Examples include Erlang code itself and the atom table.
  8. fix_alloc: allocator used for frequently used fixed-size blocks of memory. One example of data used there is the internal processes' C struct, used internally by the VM.
  9. std_alloc: catch-all allocator for whatever didn't fit the previous categories. The process registry for named processes is there.

So whenever you think of 'Erlang has isolated processes', that's mostly true (eheap_alloc), but stuff routinely lives out of there. The memory in each of these sub-allocators also has its own fancy-pants strategy:
  1. Best fit (bf)
  2. Address order best fit (aobf)
  3. Address order first fit (aoff)
  4. Address order first fit carrier best fit (aoffcbf)
  5. Address order first fit carrier address order best fit (aoffcaobf)
  6. Good fit (gf)
  7. A fit (af)
These basically decide the kind of search algorithm used to locate the next best free multiblock carrier to allocate memory to, followed by how to pick a block within that carrier to allocate memory to, in order to do tradeoffs in CPU, run time, memory compactness, chances of fragmentation, etc. based on data size, how regular it is, or whatever. By default, each type of sub-allocator has its own strategy picked, and people rarely need to tweak these, but it happened to me at some point and it's pretty neat to change one option for one type of sub-allocator and see memory go down 60% after large usage spikes.

I can go in details for these strategies if asked, but otherwise I'll keep going.

So for your run-of-the-mill garbage collection, you have to know how an Erlang process is laid out. It basically has this piece of memory that can be imagined as one box:

[                  ]

On one end you have the heap, and on the other, you have the stack:

[heap |     | stack]

In practice there's more data (you have an old heap and a new heap, for generational GC, and also a virtual binary heap, to account for the space of reference-counted binaries on a specific sub-allocator not used by the process -- binary_alloc vs. eheap_alloc):

[heap   ||    stack]

More and more of the space is allocated until either the stack or the heap can't fit in anymore. This triggers a minor GC. The minor GC moves the data that can be kept into the old heap, GCs the rest, and may end up reallocating for more space.

After a given number of minor GCs and/or reallocations, a full-sweep GC is performed, which inspects both the new and old heaps, frees up more space, and so on. When a process dies, both the stack and heap are taken out at once. Reference-counted binaries have their counts decreased, and if a counter hits 0, they vanish.

When that happens, over 80% of the time, the only thing that happens is that the memory is marked as available in the sub-allocator and can be taken back by new processes or other ones that may need to be resized. Only after having this memory unused -- and the multiblock carrier unused also -- is it returned to mseg_alloc or sys_alloc, which may or may not keep it for a while longer.
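
To make the "heap on one end, stack on the other" trigger concrete, here is a schematic in C of the layout described above; this is a made-up illustration of the idea, not actual BEAM code.

    /* per-process block: heap grows up from one end, stack down from the
       other, and a minor GC fires when an allocation would make them meet */
    #include <stddef.h>

    typedef struct {
        char  *block;       /* the single [heap | ... | stack] box */
        size_t block_size;
        size_t heap_top;    /* grows upward from 0 */
        size_t stack_top;   /* grows downward from block_size */
    } proc_mem;

    extern void minor_gc(proc_mem *p);   /* move live data to the old heap,
                                            possibly resize the block */

    void *heap_alloc(proc_mem *p, size_t need)
    {
        if (p->heap_top + need > p->stack_top)
            minor_gc(p);                 /* heap would run into the stack */
        if (p->heap_top + need > p->stack_top)
            return NULL;                 /* GC/resize still not enough */
        void *ptr = p->block + p->heap_top;
        p->heap_top += need;
        return ptr;
    }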

There are quite a few clever optimizations around for more stuff. For example, a process' mailbox is a double queue: an internal one, and an external (lock-free, iirc) one. When a process reads its mailbox, it locks both queues, copies data from the external one into its internal one, unlocks the external queue, and keeps going. The external queue is not in the same memory space as the internal one either, meaning that GC doesn't keep other processes in other schedulers from sending messages around.

So yeah, when people say Erlang's memory model is simple, it means it's conceptually simple, but there's a lot of clever engineering going on in there. 99% of Erlang developers will never need to know about all these things, which is fairly nice, but it's still available when production issues knock on your door.

MononcQc fucked around with this message at 15:32 on Apr 7, 2014


MononcQc
May 29, 2007

Also here's a post on Chicken Scheme's garbage collector and how it impacts the compilation strategy while we're at it: http://www.more-magic.net/posts/internals-gc.html
