p-lang thread: (now (have you (problems two)))



Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

The "Linux problem" is solved by RHEL, which bundles a bunch of old libraries and says "build and certify your apps for these and we'll do all the hard work of maintaining these libraries we didn't write". There's also the LSB, which says the same thing, and is just a fancy standardization of whatever the latest RHEL includes.

# ? Jun 11, 2014 01:55


pseudorandom name: May 6, 2007

GrumpyDoctor posted:

lol

This is how glibc's sprintf works right now.

OTOH, glibc only has one sprintf.

# ? Jun 11, 2014 01:59

Subjunctive: Sep 12, 2006; ✨sparkle and shine✨

Mr Dog posted:

i wish ms wouldn't refer to their libc as a crt though argh that's not what a crt is god drat it, a crt is something that sets up the stack and prepares argc and argv (ok and envp but who gives a gently caress about that) for main

it does that, I thought (though some of that is statically in the binary as well I imagine?)

I don't know why you would call something "libc" on a Windows platform. is there some canonical "C runtime" definition that excludes implementation of library routines for some important reason?

# ? Jun 11, 2014 02:01

hackbunny: Jul 22, 2007; I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

suffix posted:

microsoft

proclick

chmods please posted:

i thought it did or is it just winmain or what.

no, it's statically linked in the executable, an object file called crt0.obj

I forget what the actual entry point for Win32 is conventionally called (boo, shame, boo), but it's implemented by crt0.obj, which sets up stuff and calls the user-defined main (or WinMain. or wmain. or wWinMain)

main vs WinMain makes no difference whatsoever, the code is the same and it can detect which entry point to call. the only difference between main and WinMain is that, if the linker finds a symbol called _main, it sets the executable's subsystem to "Win32 console", if it finds a symbol called _WinMain@16 (stdcall calling convention) it sets it to "Win32". this is just the default and it can be overridden with linker option /SUBSYSTEM (so you can have a text-mode program that starts from a WinMain and vice-versa)

WinMain used to actually matter in the bad old days when Windows was if not incompatible, at least extremely hostile to ANSI C and the arguments to WinMain actually were vital information provided by the kernel itself, while argc and argv were a relatively alien concept. if you look at the source code of crt0, you'll see that the kernel no longer passes any argument to the entry point (lie, it passes a pointer to the process's PEB, the grand central data structure with pointers to the heap, environment variables, command line, etc., but its layout is technically a secret at the Win32 layer crt0 operates at), and the arguments to WinMain are queried with public APIs you can use yourself (the arguments to main come from the libc's __argc and __argv exports)

(the w variants use wide strings, i.e. UTF-16)

GrumpyDoctor posted:

lol

BSD libc is like that too. comes in handy for implementing snprintf on top of fprintf for platforms that lack it (e.g. Symbian S50)

# ? Jun 11, 2014 02:08

Quebec Bagnet: Apr 28, 2009; mess with the honk, you get the bonk; Lipstick Apathy

hackbunny posted:

no, it's statically linked in the executable, an object file called crt0.obj

I forget what the actual entry point for Win32 is conventionally called (boo, shame, boo), but it's implemented by crt0.obj, which sets up stuff and calls the user-defined main (or WinMain. or wmain. or wWinMain)

main vs WinMain makes no difference whatsoever, the code is the same and it can detect which entry point to call. the only difference between main and WinMain is that, if the linker finds a symbol called _main, it sets the executable's subsystem to "Win32 console", if it finds a symbol called _WinMain@16 (stdcall calling convention) it sets it to "Win32". this is just the default and it can be overridden with linker option /SUBSYSTEM (so you can have a text-mode program that starts from a WinMain and vice-versa)

WinMain used to actually matter in the bad old days when Windows was if not incompatible, at least extremely hostile to ANSI C and the arguments to WinMain actually were vital information provided by the kernel itself, while argc and argv were a relatively alien concept. if you look at the source code of crt0, you'll see that the kernel no longer passes any argument to the entry point (lie, it passes a pointer to the process's PEB, the grand central data structure with pointers to the heap, environment variables, command line, etc., but its layout is technically a secret at the Win32 layer crt0 operates at), and the arguments to WinMain are queried with public APIs you can use yourself (the arguments to main come from the libc's __argc and __argv exports)

yeah that's it, i haven't done native windows programming in a while. i went and looked and i think it's (w(in))mainCRTStartup? idk enough about PEs to find out what the actual assigned entry point is but it looks like there's a different crt0.obj for every combination of unicode and console subsystem? i assume there's a completely different kind of fun for native mode programs

# ? Jun 11, 2014 03:15

Subjunctive: Sep 12, 2006; ✨sparkle and shine✨

hackbunny posted:

solid post

# ? Jun 11, 2014 03:17

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Microsoft posted:

In order to unify these different CRTs, we have split the CRT into three pieces:

lol

# ? Jun 11, 2014 10:13

hackbunny: Jul 22, 2007; I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

chmods please posted:

yeah that's it, i haven't done native windows programming in a while. i went and looked and i think it's (w(in))mainCRTStartup? idk enough about PEs to find out what the actual assigned entry point is but it looks like there's a different crt0.obj for every combination of unicode and console subsystem? i assume there's a completely different kind of fun for native mode programs

oooh you're right I think. it's the linker that detects which user-level entry point was used (main/wmain/WinMain/wWinMain), from that the subsystem, and from the subsystem which raw entry point to use (mainCRTStartup/wmainCRTStartup/etc.). christ I'm rusty

there is no real OS magic in mainCRTStartup, it's all compiler magic. it's where C and C++ static constructors are called * **, it contains the __try/__except block that turns unhandled exceptions into ANSI C signals and a ton of other little things

the real fun is in C++/CLI executables. they are mixed native/managed executables (what a terrible idea), so you have a raw entry point in native code that queries the executable's CLI metadata to locate the user-level entry point in managed code and call it. this is because they never made a managed CRT, so even C++/CLI programs have to use the native one and end up as an ungodly mix of native and managed code

* you know about C++ constructors I guess; the C constructors have no dedicated syntax, it's all linker fuckery encoded in proprietary #pragmas and magic segment names. mostly used by the CRT to initialize its subsystems, I guess because of the lack of an init-once primitive or because the initialize-on-first-use was tried and found lacking

** I was about to say that the PE format has no native concept of constructors to be run before the entry point, unlike ELF, but that's not strictly true, you have TLS constructors. they technically are for initializing __thread-storage variables with a non-trivial constructor, but in practice they are passed the same arguments as DllMain, which is extremely interesting as you can get DLL_THREAD_ATTACH/DLL_THREAD_DETACH events in the main EXE, making _beginthread/_endthread redundant ***. I wonder why Microsoft never leveraged TLS constructors to do this: the old loader limitation where you couldn't dynamically load a DLL with a TLS segment doesn't even apply to an EXE. TLS constructors were even available before FLS ****

*** managing your threads with _beginthread/_endthread used to ensure that a CRT statically linked in an EXE can properly clean up thread-local data, because an EXE technically doesn't get the DLL_THREAD_ATTACH/DLL_THREAD_DETACH notifications a DLL would, and wouldn't know when to deallocate a thread's data ****

**** this was all made redundant (I think, I have no idea if the CRT uses it) by FLS (Fiber-Local Storage), a new TLS mechanism that integrates with fibers (user-scheduled sub-threads) but can be used without fibers. FLS, like POSIX thread-specific data but unlike Win32 TLS, lets you specify a destructor function for each slot you allocate. it was "only" introduced in Windows Server 2003, but that's irrelevant now that Windows XP is unsupported

# ? Jun 11, 2014 10:23

Sapozhnik: Jan 2, 2005; Nap Ghost

hackbunny posted:

** I was about to say that the PE format has no native concept of constructors to be run before the entry point, unlike ELF, but that's not strictly true, you have TLS constructors. they technically are for initializing __thread-storage variables with a non-trivial constructor, but in practice they are passed the same arguments as DllMain, which is extremely interesting as you can get DLL_THREAD_ATTACH/DLL_THREAD_DETACH events in the main EXE, making _beginthread/_endthread redundant ***. I wonder why Microsoft never leveraged TLS constructors to do this: the old loader limitation where you couldn't dynamically load a DLL with a TLS segment doesn't even apply to an EXE. TLS constructors were even available before FLS ****

Huh. I wrote an entire PE loader myself and never knew about this.

Mind you that's one of the things I like about PE: it's very easy to turn an on-disk image into an executing image. Though traversing in-memory IATs and such to populate them is kind of a pain.

# ? Jun 11, 2014 10:59

hackbunny: Jul 22, 2007; I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

Mr Dog posted:

Huh. I wrote an entire PE loader myself and never knew about this.

I wrote a PE obfuscator and I had to add support for TLS constructors because Borland compilers actually used them (and actually were still relevant), and they run before the entry point so they have to be available and de-obfuscated much sooner

they were so convenient I ended up adding TLS constructors to all obfuscated executables. they work exactly like DllMain (except they return void i.e. they can't veto process/thread creation), including the fact they are serialized under the loader lock

Mr Dog posted:

Mind you that's one of the things I like about PE: it's very easy to turn an on-disk image into an executing image. Though traversing in-memory IATs and such to populate them is kind of a pain.

I've written both a PE and an ELF mapper and I have to say ELF is much better from that point of view

ELF has a lower level concept than named sections (program segments), leaving named sections free to have crazy non-page alignments and odd sizes, plus it supports longer segment names (PE technically does too - you can set a segment name to an index into a non-resident symbol table. but you have to be a loving lunatic to actually generate COFF symbol tables for your PEs and rely on them at runtime). this means compilers, runtime libraries and operating systems can extend the format as they wish, with specially-named segments that delimit data/code with a special meaning. in PE the basic mapping unit is the segment, so delimiting special areas (IAT, resources, etc.) is done with directories, a statically allocated array of 16 offset-size tuples that Microsoft has exclusive control of (and long exhausted). extensions to the PE format are typically done by exporting a dummy symbol (see: IsTNT, Borland32) for boolean flags or with special resources (see: the assembly manifest atrocity) for more complex data

ELF program segments can be non-loadable too, meaning they won't be mapped in memory. there is no standard way to do this in PE

ELF also supports non-contiguous mappings, which, granted, are abused for things like classic UNIX executables, but also come in handy for core dumps. unlike the Windows environment where you have a format for executables and a separate format for crash dumps, you can use ELF for both. thanks to non-loadable segments you can even store non-memory data in an ELF crash dump, like a dump of the file table or per-thread register values. Microsoft had to devise a whole new format ("minidump") for that

but then, ELF is not strictly compatible with the Windows kernel anyway, because ELF assumes you can map any number of pages at any offset, even over existing ones, which mmap lets you do and Windows NtAllocateVirtualMemory absolutely doesn't

e: due to the extensive use of 32-bit file offsets and RVAs, PEs are also limited to 4GB in size, both on disk and in memory. another reason they aren't viable as a general mapping description format that works for crash dumps too

hackbunny fucked around with this message at 15:09 on Jun 11, 2014

# ? Jun 11, 2014 12:12

MononcQc: May 29, 2007

After tef linked to them on twitter, I've tried playing with LRW/HRW/Rendezvous Hashing, which is a different way to get something similar to a circular hash (as in consistent hashing), but based on node IPs and a weighted hash.

In a regular hash ring, we use some hash algorithm and wrap it around a circle where bytes are bucketed to specific sets of nodes, which means either never resizing the circle (or else re-hashing and redistributing everything), or using vnodes like Riak does (which lets you control the resizing and eliminate hot spots).

That quickly gets to be tricky, and LRW/HRW is a very nifty simpler algorithm that can yield similar results, without the complexity. The trick is to hash the key of an object together with the IP of each node you can contact, which gives you one weight per node. You then sort the nodes by weight, and can take the top N of them to replicate data there.

What's cool is that the hashing is stable, and if you take away one node, only keys that had it as the lowest (or highest) weight need to be moved around. If you had a replication factor, you may not need to move anything just yet (rewriting on further keys may eventually move everything fine). When you add a new node, you can let things go until they're replicated, or force a re-hashing of most items -- they'll mostly all re-hash to the same destination, but maybe 1/|Nodes| of them will require being moved around.
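To make the weighting and the "only the removed node's keys move" property concrete, here's a minimal Python sketch. It is not the Erlang implementation mentioned in this post: the node addresses, the choice of SHA-256 as the weight function, and the helper names `weight`, `rank_nodes` and `owners` are all assumptions made for illustration.

```python
import hashlib

def weight(node: str, key: str) -> int:
    """Hash the node identifier together with the key into an integer weight."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def rank_nodes(nodes, key):
    """Return the nodes sorted from highest to lowest weight for this key."""
    return sorted(nodes, key=lambda n: weight(n, key), reverse=True)

def owners(nodes, key, n=1):
    """The top-n nodes are where the key (and its replicas) should live."""
    return rank_nodes(nodes, key)[:n]

# Hypothetical cluster; the addresses are placeholders.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
KEYS = [f"object-{i}" for i in range(1000)]

before = {k: owners(NODES, k)[0] for k in KEYS}
survivors = [n for n in NODES if n != "10.0.0.3"]
after = {k: owners(survivors, k)[0] for k in KEYS}

# Only keys whose top-weighted node was the removed one change owner,
# because the relative order of the surviving nodes is untouched.
moved = [k for k in KEYS if before[k] != after[k]]
assert all(before[k] == "10.0.0.3" for k in moved)
```

Since every (node, key) pair is hashed independently, there is no ring to resize; the cost is that each lookup hashes the key once per node.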

I've put together a short implementation in Erlang: https://github.com/ferd/lrw (and a more general one than in the paper)

This is another super simple yet surprisingly powerful algorithm. Need more of these.

# ? Jun 11, 2014 14:04

Shaggar: Apr 26, 2006

what happens if some clients don't get notification of the new host and how do you control that notification?

# ? Jun 11, 2014 16:05

MononcQc: May 29, 2007

Shaggar posted:

what happens if some clients don't get notification of the new host and how do you control that notification?

My guess would be that if you want that kind of safer behavior, you could replicate the keys with a factor of N. For example, DynamoDB and Riak will use the hash ring, but write the entry to 2 or (ideally) 3 nodes in sequence there. If the first one is unavailable, it will try the other ones on the hash ring.

This could be done with the Rendez-Vous Hashing scheme by having the final weighted list of N nodes, writing to, say, 3 of them. Then you can tolerate a given amount of failures or missing data by going to the next greatest ones until you find it.
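A rough Python sketch of that "write to the top 3, read from the next one if the first is gone" idea. Everything here is hypothetical: the `Cluster` class is a toy in-memory stand-in, the node names are placeholders, and SHA-256 is just an assumed weight function.

```python
import hashlib

def weight(node, key):
    """Weight of a (node, key) pair; SHA-256 is an arbitrary choice here."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def replicas(nodes, key, n=3):
    """The n highest-weighted nodes for this key, best first."""
    ranked = sorted(nodes, key=lambda node: weight(node, key), reverse=True)
    return ranked[:n]

class Cluster:
    """Toy in-memory cluster: one dict per node, plus a set of dead nodes."""

    def __init__(self, nodes):
        self.stores = {node: {} for node in nodes}
        self.down = set()

    def write(self, key, value, n=3):
        # Write to every live node among the top n for this key.
        for node in replicas(list(self.stores), key, n):
            if node not in self.down:
                self.stores[node][key] = value

    def read(self, key, n=3):
        # Walk the ranked list and take the first live replica that has the key.
        for node in replicas(list(self.stores), key, n):
            if node not in self.down and key in self.stores[node]:
                return self.stores[node][key]
        raise KeyError(key)

cluster = Cluster(["node-a", "node-b", "node-c", "node-d", "node-e"])
cluster.write("user:42", "hello")
primary = replicas(list(cluster.stores), "user:42")[0]
cluster.down.add(primary)          # the top-weighted node fails
value = cluster.read("user:42")    # still served by the second-ranked node
```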

But for the more general question of 'how do clients know?', the paper goes into this description:

quote:

It is easy to see that HRW avoids replication, thus potentially giving a higher hit rate, as long as clients have a consistent list of servers for each cluster. There are two possible methods for achieving such consistency quickly.

First, we could require participation from servers themselves, and have them maintain consistent lists. In this case, HRW could be run at the servers, and one server could forward or redirect client requests to another server. This method, however, incurs additional latency.

Alternatively, we could require clients to arrive at consistent lists, and try to minimize the convergence time for clients (and providers in eager-provider domains) to reach consistency in their server lists. One option is for clients to periodically resolve the server list amongst themselves, in which case the convergence time is equal to the resolution period. In this case, the resolution period should be chosen such that the convergence time is acceptable. On the other hand, when server lists are obtained from a "push" name service, this convergence time is simply the time until all clients receive a new server list from the name service.

So if I get it all and can extrapolate some, options are:

- Have servers do internal routing
- Use a proxy/load-balancer where Rendez-Vous Hashing is the dispatching method based on whatever token
- Do the equivalent of DNS lookups and return a list of hosts that clients can use at intervals
- Push notifications (with the caveat you had)
- You could also mix push/pull to sync nicer and eventually correct missed notifications

If you combine that stuff with possible higher replication rate policies (say writing to many, reading from one or more, depending on the type of load you have), then you can probably get something fairly reliable without needing to replicate it everywhere all the time.

# ? Jun 11, 2014 16:29

Shaggar: Apr 26, 2006

yeah i was thinking removing a node or having a node die is pretty easy to handle by replicating its objects to the appropriate next node for that object based on the same alg the clients are using, but if your clients are out of sync with a new node entering the cluster you could have clients w/ old data looking for/updating objects at the wrong node.

i guess if the node could detect that its the wrong place for the object it could try to fix it. either by sending it to the right place, redirecting the client to the right place, or by telling the client to go pound sand and update its list of nodes. imo the detection is gonna add enough overhead already so just tell the client to fix it.

the pro move would be to use SRV records cause then you could put the weights right in there which makes adding/removing nodes easy and lets you do traditional location based dns.

to exit the cluster gracefully, a node sets its weight to 0, stops accepting requests, and then makes sure everything it had is replicated off to the proper secondary before shutting down. if you do the replication regularly its probably already up to date. clients that see the new 0 weight stop using it immediately and get the next node, clients out of sync who try to hit it get rejected and use the next node.

to add a node, have it fire up, add itself to srv with 0 weight, ask clustermates for all objects it would own at the weight it plans to go in at, once all those objects are received, set weight to whatever it should be. then if a request is made to a node that no longer owns the object, punt the client back w/ an error that its node list is out of date. client updates its list and finds the new server.
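The "punt the client back so it refreshes its list" part could look something like this in Python. It's only a sketch under assumptions: `Directory` stands in for the SRV lookup, `StaleNodeList` is a made-up error, and the SRV weights and the weight-0 handoff phase are left out entirely.

```python
import hashlib

def weight(node, key):
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def owner(nodes, key):
    """Highest-weighted node for this key."""
    return max(nodes, key=lambda node: weight(node, key))

class StaleNodeList(Exception):
    """Raised by a node that is asked for a key it does not own."""

class Directory:
    """Stand-in for the SRV lookup: the authoritative list of live nodes."""
    def __init__(self, nodes):
        self.nodes = list(nodes)

class Node:
    def __init__(self, name, directory):
        self.name = name
        self.directory = directory
        self.data = {}

    def get(self, key):
        # The node runs the same algorithm as the clients to check ownership.
        if owner(self.directory.nodes, key) != self.name:
            raise StaleNodeList(key)
        return self.data.get(key)

class Client:
    def __init__(self, directory, servers):
        self.directory = directory
        self.servers = servers              # name -> Node
        self.known = list(directory.nodes)  # cached copy, may go stale

    def get(self, key):
        try:
            return self.servers[owner(self.known, key)].get(key)
        except StaleNodeList:
            self.known = list(self.directory.nodes)  # refresh, then retry once
            return self.servers[owner(self.known, key)].get(key)

directory = Directory(["n1", "n2", "n3"])
servers = {name: Node(name, directory) for name in directory.nodes}
client = Client(directory, servers)

# A new node joins after the client cached its list.
directory.nodes.append("n4")
servers["n4"] = Node("n4", directory)
key = next(f"k{i}" for i in range(10_000) if owner(directory.nodes, f"k{i}") == "n4")
servers["n4"].data[key] = "moved here"

result = client.get(key)  # first attempt is rejected, the retry succeeds
```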

# ? Jun 11, 2014 16:53

Notorious b.s.d.: Jan 25, 2003; by Reene

PE is derived from COFF
ELF was designed to replace COFF

it's not an accident that ELF is easier to use

# ? Jun 11, 2014 17:13

MononcQc: May 29, 2007

Shaggar posted:

yeah i was thinking removing a node or having a node die is pretty easy to handle by replicating its objects to the appropriate next node for that object based on the same alg the clients are using, but if your clients are out of sync with a new node entering the cluster you could have clients w/ old data looking for/updating objects at the wrong node.

i guess if the node could detect that its the wrong place for the object it could try to fix it. either by sending it to the right place, redirecting the client to the right place, or by telling the client to go pound sand and update its list of nodes. imo the detection is gonna add enough overhead already so just tell the client to fix it.

Yeah that all works. There's also use-case-specific stuff there that you can add. I.e. for a cache server, it doesn't matter if data is replicated in all kinds of wonky places because it will eventually time out and fix itself, as long as clients are trusted to eventually readjust. The authoritative data can always be found from somewhere else.

Using it as a scheme for a key/value store likely requires you to be more careful. I like the idea of telling clients to go pound sand, but the redirection-based one is pretty clever because it lets the clients update on a per-need basis; if you have very infrequent changes in topology or a relatively large number of nodes, you can minimize how often things need updating. If you're writing very-frequently, it might become cheaper to frequently update your list, though.

Also as much as I hate sticky sessions, it sounds like a decent algorithm to have load-balancing done that way, with automated migration from endpoint A to new endpoint D once it's added to the cluster, slowly propagating resource usage. There's still a shitload of stuff that can go wrong, but that's more based on sticky sessions being terrible than anything else.

Shaggar posted:

the pro move would be to use SRV records cause then you could put the weights right in there which makes adding/removing nodes easy and lets you do traditional location based dns.

to exit the cluster gracefully, a node sets its weight to 0, stops accepting requests, and then makes sure everything it had is replicated off to the proper secondary before shutting down. if you do the replication regularly its probably already up to date. clients that see the new 0 weight stop using it immediately and get the next node, clients out of sync who try to hit it get rejected and use the next node.

to add a node, have it fire up, add itself to srv with 0 weight, ask clustermates for all objects it would own at the weight it plans to go in at, once all those objects are received, set weight to whatever it should be. then if a request is made to a node that no longer owns the object, punt the client back w/ an error that its node list is out of date. client updates its list and finds the new server.

That does sound pretty nifty as an approach, especially for planned replacements. It requires you to really have and want to control things for the entire stack at once (the client needs to know about that ad-hoc protocol, the people maintaining servers must be able to toy with DNS, TTLs have to be respected everywhere, etc.), and you still need to handle catastrophic out-of-protocol failures, but it lets you distinguish between 'retirement' and 'crashes/netsplits' which is pretty cool.

MononcQc fucked around with this message at 17:27 on Jun 11, 2014

# ? Jun 11, 2014 17:24

Quebec Bagnet: Apr 28, 2009; mess with the honk, you get the bonk; Lipstick Apathy

hackbunny posted:

the real fun is in C++/CLI executables. they are mixed native/managed executables (what a terrible idea), so you have a raw entry point in native code that queries the executable's CLI metadata to locate the user-level entry point in managed code and call it. this is because they never made a managed CRT, so even C++/CLI programs have to use the native one and end up as an ungodly mix of native and managed code

don't all managed executables have a native stub which starts up the managed code with ICLRRuntimeHost? or did that go away?

quote:

** I was about to say that the PE format has no native concept of constructors to be run before the entry point, unlike ELF, but that's not strictly true, you have TLS constructors. they technically are for initializing __thread-storage variables with a non-trivial constructor, but in practice they are passed the same arguments as DllMain, which is extremely interesting as you can get DLL_THREAD_ATTACH/DLL_THREAD_DETACH events in the main EXE, making _beginthread/_endthread redundant ***. I wonder why Microsoft never leveraged TLS constructors to do this: the old loader limitation where you couldn't dynamically load a DLL with a TLS segment doesn't even apply to an EXE. TLS constructors were even available before FLS ****

you mean DllMain as a TLS constructor? i wonder if it's related to initialization order, is that not guaranteed? possibly could interfere with COM as well since it has its own threading concerns

# ? Jun 11, 2014 17:42

Shaggar: Apr 26, 2006

MononcQc posted:

Yeah that all works. There's also use-case-specific stuff there that you can add. I.e. for a cache server, it doesn't matter if data is replicated in all kinds of wonky places because it will eventually time out and fix itself, as long as clients are trusted to eventually readjust. The authoritative data can always be found from somewhere else.

Using it as a scheme for a key/value store likely requires you to be more careful. I like the idea of telling clients to go pound sand, but the redirection-based one is pretty clever because it lets the clients update on a per-need basis; if you have very infrequent changes in topology or a relatively large number of nodes, you can minimize how often things need updating. If you're writing very-frequently, it might become cheaper to frequently update your list, though.

Also as much as I hate sticky sessions, it sounds like a decent algorithm to have load-balancing done that way, with automated migration from endpoint A to new endpoint D once it's added to the cluster, slowly propagating resource usage. There's still a shitload of stuff that can go wrong, but that's more based on sticky sessions being terrible than anything else.

That does sound pretty nifty as an approach, especially for planned replacements. It requires you to really have and want to control things for the entire stack at once (the client needs to know about that ad-hoc protocol, the people maintaining servers must be able to toy with DNS, TTLs have to be respected everywhere, etc.), and you still need to handle catastrophic out-of-protocol failures, but it lets you distinguish between 'retirement' and 'crashes/netsplits' which is pretty cool.

its really a huge bummer that http doesn't use srv records cause you could do so much great stuff. really the minimum client side protocol would be that if you get denied by the server, refresh your list of servers. altho maybe you do that for timeouts too? idk. i like the idea of the server weight in the hash tho

# ? Jun 11, 2014 18:21

Notorious b.s.d.: Jan 25, 2003; by Reene

Shaggar posted:

really the minimum client side protocol would be that if you get denied by the server, refresh your list of servers.

This is not how DNS works

# ? Jun 11, 2014 18:28

Shaggar: Apr 26, 2006

no, its how this protocol we're talking about would work

# ? Jun 11, 2014 18:32

Shaggar: Apr 26, 2006

well yeah i guess if dns is setup right then you wouldn't cache it internally

# ? Jun 11, 2014 18:36

Notorious b.s.d.: Jan 25, 2003; by Reene

clients can choose to cache or not. and they will sometimes impose their own minimum ttls, defying the standard.

you have no way to know whether you're talking to an end-user desktop or an intermediate caching resolver.

# ? Jun 11, 2014 18:37

Shaggar: Apr 26, 2006

or maybe you use your own internal dns caching that uses the domain's nameservers directly so you don't need to wait for the ttl if you need a fresh list.

# ? Jun 11, 2014 18:38

Blotto Skorzany: Nov 7, 2008; He's a PSoC, loose and runnin' / came the whisper from each lip / And he's here to do some business with / the bad ADC on his chip / bad ADC on his chiiiiip

cache invalidation blows goat dicks

# ? Jun 11, 2014 18:39

Shaggar: Apr 26, 2006

continually failing because the data in the cache is wrong is worse

# ? Jun 11, 2014 18:41

Shaggar: Apr 26, 2006

altho really a redirect instead of punting the client is probably fine

# ? Jun 11, 2014 18:42

Subjunctive: Sep 12, 2006; ✨sparkle and shine✨

Shaggar posted:

continually failing because the data in the cache is wrong is worse

skip-cache-on-error is the sort of thing that often leads to inadvertent DoS situations; have to manage backoff pretty carefully. edit: we once melted the Android push notification service for a few hours because we used the backoff example pattern from their docs, which unfortunately added 2 instead of multiplying by 2.
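To see why "+2" instead of "*2" matters so much, here's a small Python comparison. The 1-second base delay and the ten attempts are assumed numbers for illustration, not the values from the docs mentioned above.

```python
def additive_delays(base, attempts, step=2):
    """The buggy pattern: each retry waits only `step` seconds longer."""
    return [base + step * i for i in range(attempts)]

def exponential_delays(base, attempts, factor=2, cap=None):
    """The intended pattern: each retry waits `factor` times longer."""
    delays = [base * factor ** i for i in range(attempts)]
    return [min(d, cap) for d in delays] if cap is not None else delays

additive = additive_delays(1, 10)        # 1, 3, 5, ... seconds
exponential = exponential_delays(1, 10)  # 1, 2, 4, ... seconds

# After ten failures the additive client is still retrying every 19 seconds,
# while the exponential one has backed off to 512 seconds between attempts.
print(additive[-1], exponential[-1])
```

Multiply that gap by every client retrying at once and the additive version keeps hammering a service that is already struggling. Real clients usually add random jitter and a cap on top of the doubling.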

but yeah, phil karlton wasn't kidding about cache invalidation

# ? Jun 11, 2014 18:49

Notorious b.s.d.: Jan 25, 2003; by Reene

Otto Skorzeny posted:

cache invalidation blows goat dicks

there are two hard problems in computer science: naming things, cache invalidation, and off-by-one errors.

# ? Jun 11, 2014 18:53

MononcQc: May 29, 2007

https://bugs.php.net/bug.php?id=66608

quote:

Description:
------------
The test script I provided _should_ print out "1", "2", "3", "4", and "5" in order and then it should exit gracefully.

When I run the test script under PHP 5.6.0alpha1 (downloaded and built today from http://downloads.php.net/tyrael/php-5.6.0alpha1.tar.gz), it prints out "1", "2", and "3", and then it gets stuck in an infinite loop that keeps printing out "4" over and over again. It never prints out "5" and the php process never exits.

I can consistently reproduce this behavior on my machine. I'm running CentOS Linux 6.3 on a 64-bit Intel Xeon processor, but I think this bug should be easily reproducible on most other platforms and processors.

I attached gdb to the php process to investigate, and it appears that the Zend engine is getting stuck in an infinite loop that keeps jumping between in execute_ex(), ZEND_JMP_SPEC_HANDLER(), ZEND_FAST_RET_SPEC_HANDLER(), and ZEND_ECHO_SPEC_CONST_HANDLER() from "zend_vm_execute.h".

I can also reproduce this bug with various builds of PHP 5.5.0.

Test script:
---------------
PHP code:
function bar() {
  try {
    echo "1\n";
  } finally {
    try {
      throw new Exception("");
    } catch (Exception $ab) {
      echo "2\n";
    } finally {
      echo "3\n";
    }
    echo "4\n";
  }
  echo "5\n";
}
bar();

Expected result:
----------------
1
2
3
4
5

Actual result:
--------------
1
2
3
4
4
4
4
4
4
4
4
... the php process keeps printing "4" over and over forever ...
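For comparison, here is the same control flow transliterated to Python. This is not part of the bug report, just an illustration of the semantics the reporter expects: the inner exception is caught, both `finally` blocks run once, and execution falls through to the end.

```python
def bar(out):
    try:
        out.append("1")
    finally:
        try:
            raise Exception("")
        except Exception:
            out.append("2")   # inner exception is handled here
        finally:
            out.append("3")   # inner finally runs exactly once
        out.append("4")       # rest of the outer finally block
    out.append("5")           # normal exit from the outer try/finally

lines = []
bar(lines)
print("\n".join(lines))  # prints 1 through 5, one per line
```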

# ? Jun 11, 2014 18:54

Nomnom Cookie: Aug 30, 2009

tbf that's an alpha. to be unfair how did their test suite not catch this, try/catch/finally inside a try/finally isn't a super complicated construct

# ? Jun 11, 2014 19:13

Nomnom Cookie: Aug 30, 2009

oh wait i just saw 5.5 has it too. lol

# ? Jun 11, 2014 19:14

Notorious b.s.d.: Jan 25, 2003; by Reene

Kevin Mitnick P.E. posted:

tbf that's an alpha. to be unfair how did their test suite not catch this, try/catch/finally inside a try/finally isn't a super complicated construct

if anyone understood the bug, it would be closed and there would be a test.

# ? Jun 11, 2014 19:14

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

Kevin Mitnick P.E. posted:

to be unfair how did their test suite not catch this

http://gcov.php.net/viewer.php?version=PHP_HEAD&func=tests

# ? Jun 11, 2014 19:17

MononcQc: May 29, 2007

In other JS news (crosspost from security thread):

# ? Jun 11, 2014 19:22

Nomnom Cookie: Aug 30, 2009

Suspicious Dish posted:

http://gcov.php.net/viewer.php?version=PHP_HEAD&func=tests

http://gcov.php.net/viewer.php?version=PHP_HEAD&func=graph&mode=Year

# ? Jun 11, 2014 19:31

spongeh: Mar 22, 2009; BREADAGRAM OF PROTECTION

Kevin Mitnick P.E. posted:

oh wait i just saw 5.5 has it too. lol

the best part about php is how the language is able to keep going instead of just crashing out at the first error it finds. tbc was right.

# ? Jun 11, 2014 19:59

hackbunny: Jul 22, 2007; I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

chmods please posted:

don't all managed executables have a native stub which starts up the managed code with ICLRRuntimeHost? or did that go away?

it was a hack for Windows 2000 and maybe Windows XP too. the native stub couldn't do anything but call _CorExeMain/whatever the equivalent for DLLs was, the assembly verifier checks that. From Rotor's source code, I remember they actually disassembled the stub to ensure all it did was jump immediately to the mscoree!_CorExeMain import

later versions of Windows know about managed executables and skip the stub completely, loading mscoree.dll automatically and starting the initial thread from _CorExeMain

chmods please posted:

you mean DllMain as a TLS constructor? i wonder if it's related to initialization order, is that not guaranteed? possibly could interfere with COM as well since it has its own threading concerns

I mean a TLS constructor as a DllMain for EXEs. the CRT even secretly came with an undocumented TLS constructors segment that you could hitch your constructor to, using the same linker/#pragma magic * you use to link in a C constructor/destructor, so they actually kind of support it. in fact, I wonder if they now support __thread storage C++ objects with non-trivial constructors/destructors

COM is completely orthogonal to the CRT so that's not that. I guess some obscure backwards compatibility issue. like maybe Win32s (a limited Win32 implementation for old 16-bit Windows) didn't implement TLS constructors?

* if you're wondering what kind of magic is that: if a segment name contains a dollar sign, it's actually a combination of segment$subsegment. the linker takes all of those subsegments, groups them by segment name, sorts them by subsegment name, and merges them together in a single segment. let's say there's a segment .magic you want to merge your stuff into: you declare static data allocated in a section named, let's say, .magic$XXX. somewhere else, in another compilation unit/object file, there's a magic_begin symbol that points to a segment named .magic$AAA; somewhere else, a magic_end symbol points to .magic$ZZZ. the linker merges .magic$AAA, .magic$XXX and .magic$ZZZ, in that order, into a segment named .magic. whatever code consumes the segment's data can locate the segment by the magic_begin and magic_end symbols

in the part of the CRT that's always statically linked, there's subsegments like this for C constructors, C destructors, C++ static constructors, C++ static destructors, TLS constructors, TLS destructors, and I think even two for C++/CLI static constructors/destructors (no idea what the hell they contain, whether pointers to native stubs or handles to CLI metadata). they are arrays of function pointers which makes it easy to add stuff to them and to skip subsegment alignment padding (padding will show up in the array as null pointers)
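a rough sketch of the merge described above, in Python rather than anything the real linker or CRT runs. the .magic names come from the example in the footnote; merge_sections, init_logging and init_cache are made up for illustration, and None stands in for a null pointer (begin/end markers and alignment padding)

```python
from collections import defaultdict


def merge_sections(contributions):
    """Group 'segment$subsegment' contributions by segment, sort each group
    by subsegment name, and concatenate the entries into one table."""
    groups = defaultdict(list)
    for name, entries in contributions:
        segment, _, subsegment = name.partition("$")
        groups[segment].append((subsegment, entries))
    merged = {}
    for segment, parts in groups.items():
        # Stable sort: contributions with the same subsegment keep input order.
        parts.sort(key=lambda part: part[0])
        merged[segment] = [entry for _, entries in parts for entry in entries]
    return merged


calls = []


def init_logging():  # hypothetical constructor
    calls.append("init_logging")


def init_cache():  # hypothetical constructor
    calls.append("init_cache")


# Contributions from different "object files", deliberately out of order.
contributions = [
    (".magic$XXX", [init_logging]),
    (".magic$ZZZ", [None]),              # where magic_end would point
    (".magic$XXX", [None, init_cache]),  # None models alignment padding
    (".magic$AAA", [None]),              # where magic_begin would point
]

table = merge_sections(contributions)[".magic"]

# The consumer walks the merged table and skips null entries.
for entry in table:
    if entry is not None:
        entry()

print(calls)
```

the point of the null check is that the consumer doesn't need to know how many object files contributed or how much padding ended up between them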

# ? Jun 11, 2014 21:24

karms: Jan 22, 2006; by Nyc_Tattoo; Yam Slacker

MononcQc posted:

In other JS news (crosspost from security thread):

# ? Jun 11, 2014 23:23

suffix: Jul 27, 2013; Wheeee!

MononcQc posted:

After tef linked to them on twitter, I've tried playing with LRW/HRW/Rendezvous Hashing, which is a different way to get something similar to a circular hash (with consistent hashing), but based on IPs and a weighted hash.

In a regular hash ring, we use some hash algorithm and wrap it around a circle where bytes are bucketed to specific sets of nodes, which means you either never resize the circle (or have to re-hash and redistribute everything), or use vnodes like Riak does (which lets you control the resizing and eliminate hot spots).

That quickly gets to be tricky, and LRW/HRW is a very nifty simpler algorithm that can yield similar results, without the complexity. The trick comes from hashing the key of an object, along with the IP of all nodes you can contact, and get a weight out of it. You then sort the nodes by weight, and can get the top N of them to replicate data there.

What's cool is that the hashing is stable, and if you take away one node, only keys that had it as the lowest (or highest) weight need to be moved around. If you had a replication factor, you may not need to move anything just yet (rewriting on further keys may eventually move everything fine). When you add a new node, you can let things go until they're replicated, or force a re-hashing of most items -- they'll mostly all re-hash to the same destination, but maybe 1/|Nodes| of them will require being moved around.
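A minimal sketch of the idea in Python, to make the stability property concrete. This is not the Erlang implementation linked below; the weight and pick_nodes names, the choice of SHA-256, and the example IPs are all assumptions made for illustration.

```python
import hashlib


def weight(node: str, key: str) -> int:
    """Hash the (node, key) pair into an integer weight."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_nodes(nodes, key, n=1):
    """Return the n nodes with the highest weight for this key."""
    return sorted(nodes, key=lambda node: weight(node, key), reverse=True)[:n]


nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
keys = [f"object-{i}" for i in range(1000)]

# Placement with all four nodes, then with one node taken away.
before = {k: pick_nodes(nodes, k)[0] for k in keys}
survivors = [node for node in nodes if node != "10.0.0.3"]
after = {k: pick_nodes(survivors, k)[0] for k in keys}

# Only keys whose top-weighted node was the removed one change owner.
moved = [k for k in keys if before[k] != after[k]]
assert all(before[k] == "10.0.0.3" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved")
```

Asking for pick_nodes(nodes, key, n=3) gives the replica set for a replication factor of 3; each node's weight for a key never depends on which other nodes exist, which is why removing one node leaves the relative order of the rest untouched.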

I've put together a short implementation in Erlang: https://github.com/ferd/lrw (and a more general one than in the paper)

This is another super simple yet surprisingly powerful algorithm. Need more of these.

oh, hey, this is cool

we use consistent hashing at work, and as far as i can tell you need a shitton of points to get some semblance of even distribution

i think this might do better

# ? Jun 12, 2014 00:21

syntaxrigger: Jul 7, 2011; Actually you owe me 6! But who's countin?

Spent all day fighting with Visual Studio to force Apache Cordova to play nice with WinJS, and it still is buggy. It would be awesome if there was a default project that incorporated the two off the bat because gently caress doing all of that poo poo manually. Missing my CLI. I love that they are going to integrate Apache Cordova with Visual Studio because Microsoft tooling generally seems a million times better than eclipse but gently caress. So annoying to use right now.

Ok done bitching

# ? Jun 13, 2014 00:40
