LOOK I AM A TURTLE
May 22, 2003

"I'm actually a tortoise."
Grimey Drawer

Sagacity posted:

It's odd that your unit tests didn't catch this though

Yeah, about that...

The funny thing is that IIRC I had good test coverage for most of the stuff I was doing related to these automatic payments, but the particular part of the code that contained the bug was interfacing with the older parts of the codebase, which was incredibly resistant to unit testing for a variety of reasons. The biggest issue in this case, apart from my stupidity, was a lack of end-to-end testing.


leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

NtotheTC posted:

you didn't work for wanadoo by any chance did you? i remember people's DAOC inventories vanishing and them basically having to go "uh, could you tell us what items you had and please be honest?"

No. But yes, it's a relatively common error.

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

LOOK I AM A TURTLE posted:

Yeah, about that...

The funny thing is that IIRC I had good test coverage for most of the stuff I was doing related to these automatic payments, but the particular part of the code that contained the bug was interfacing with the older parts of the codebase, which was incredibly resistant to unit testing for a variety of reasons. The biggest issue in this case, apart from my stupidity, was a lack of end-to-end testing.

I was expecting the test to have used the same incorrect collection.

Polio Vax Scene
Apr 5, 2009



Kazinsal posted:

I broke 911 calling for an office of a couple hundred people and didn't notice for three weeks until someone misdialed 911 and... it didn't work.

Always test your critical changes.

curious how you would test this that doesnt involve telling a 911 operator "hey just testing our system thanks! :) "

Volmarias
Dec 31, 2002

EMAIL... THE INTERNET... SEARCH ENGINES...

Polio Vax Scene posted:

curious how you would test this that doesnt involve telling a 911 operator "hey just testing our system thanks! :) "

It's literally what you do. "Hey, just testing to make sure outbound dialing goes through to 911 correctly. Everything is fine, thanks!"

Raymond T. Racing
Jun 11, 2019

Polio Vax Scene posted:

curious how you would test this that doesnt involve telling a 911 operator "hey just testing our system thanks! :) "

Pretty much.

I called the non emergency line, said I wanted to test our e911 information, they said go ahead, then I hung up and called 911. As soon as it connected: “this is a non-emergency test call of our e911 information, can I please get the address reported? call back number? Thanks”

spaced ninja
Apr 10, 2009


Toilet Rascal

Polio Vax Scene posted:

curious how you would test this that doesnt involve telling a 911 operator "hey just testing our system thanks! :) "

You’d just schedule a test with them. We had to do it twice a year at one place I worked.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Polio Vax Scene posted:

curious how you would test this that doesnt involve telling a 911 operator "hey just testing our system thanks! :) "

You can call up the local 911 facility (on their non-emergency number) and coordinate a time to be doing the test - during which time they'll route incoming calls from you to an automated responder instead of tying up an actual person.

If it's just a one-off call you can probably just dial 911 and then explain that it's not an emergency and you're just testing the system - but you wouldn't want to do that if you were, for example, making a test call from every single extension to ensure that the location information was being propagated correctly.

QuarkJets
Sep 8, 2008

Groovy is a very useful language, if someone gives you a groovy script you know right away that you should stop working with them

darthbob88
Oct 13, 2011

YOSPOS
My best fuckup, that actually went live, was a combination of screwing up CDNs and code splitting that cost us our biggest customer.

I was working for an e-commerce startup providing recommendation services to small/medium online stores. The client-side service, which I mostly worked with, loaded three JS files to do the job: one file with the client-specific configuration, one file with platform-specific functionality, and one with core cross-platform stuff. This kinda predated build tools, so those files got loaded separately rather than as a single bundle.

One customer requested a minor functionality, IIRC changing the name in our Google Analytics reporting from opaque stuff like "PDP1" to "product detail page". I put that function in the platform file, and had it reference a look-up table in the core file, so that it would be available if other platforms wanted the same functionality. I did some quick tests, and it worked fine, so I pushed it up to the CDN.

A couple days later, I got a call that we'd broken the client's site. The platform file with the function had updated, but the core file with the lookup table it referenced had not, so the platform file threw a null pointer error and broke the rest of the customer's page.

Lessons learned:
  • I fixed my code to more gracefully handle null errors.
  • I added a cache-buster to the loading URLs for our files, to guarantee that a customer never got a file more than an hour old. Kinda invalidated the reason for the CDN, but it beat losing another customer.
  • If I did it again, I'd also make sure to bundle the code together, so files updating at different times wasn't a risk.
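
A rough sketch of the first two lessons, written in Python for brevity even though the original was client-side JS. Everything named here (`cache_busted_url`, `page_label`, the `v` query parameter, the example CDN host) is made up for illustration and is not the startup's actual code. The idea is that the URL changes once per hour, so a cached copy can't be served for much longer than that, and that the label lookup falls back to the raw code when the table is missing instead of throwing.

```python
import time
from typing import Dict, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BUCKET_SECONDS = 3600  # one-hour buckets, matching "never more than an hour old"


def cache_busted_url(url: str, now: Optional[float] = None) -> str:
    """Append a query parameter whose value changes once per hour."""
    if now is None:
        now = time.time()
    bucket = int(now // BUCKET_SECONDS)
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("v", str(bucket)))  # "v" is an arbitrary, hypothetical name
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_label(code: str, lookup: Optional[Dict[str, str]]) -> str:
    """Translate an opaque page code, falling back to the code itself
    when the lookup table is missing or has no entry for it."""
    if not lookup:
        return code
    return lookup.get(code, code)
```

Because every file gets the same bucket value, the platform and core files roll over to a new URL at the same moment, which narrows the window for a version mismatch but does not remove it the way bundling would.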

My favorite error: Also at that e-commerce startup, I was working on installing our service on two websites at the same time, and accidentally switched the config files between them. They were on the same platform and had broadly the same layout for their recommendations, so the recs still mostly worked, apart from URLs. But for the 20-30 minutes it took me to fix it, a men's skincare site was loading recommendations for a sex toy site. "If you like this hazel-infused skin cream, might we also recommend a BIG OL' DILDO"

Ranzear
Jul 25, 2013

LOOK I AM A TURTLE posted:

incredibly resistant to unit testing

Stealing this verbiage for future use.

Carbon dioxide
Oct 9, 2012

Just encountered this animation on a banking website.


Ignore how weird the animation itself is for a second.

This is not an animated gif. Look at the inspect element to see what they did to make it move. :wtc:

Drastic Actions
Apr 7, 2009

FUCK YOU!
GET PUMPED!
Nap Ghost
Yeah, it's lottie

Kazinsal
Dec 13, 2011



The demos on that page use 12% of the cycle time on my RTX 3080, a nine hundred dollar (USD) graphics card.

God drat, the new web is terrifying.

ynohtna
Feb 16, 2007

backwoods compatible
Illegal Hen
I still semi-regularly encounter devs who are all "I really miss Flash, it made making cool things so easy!" before showing me overlong, scribbly jank-rear end animations that belong in a middle school.

No, not a single one of them had to suffer paying phone bills during the modem era, why do you ask? (:cloud:)

Polio Vax Scene
Apr 5, 2009



give me one good reason why your system, which stores a decimal value, does not accept my submitted int value.

Remulak
Jun 8, 2001
I can't count to four.
Yams Fan

Polio Vax Scene posted:

give me one good reason why your system, which stores a decimal value, does not accept my submitted int value.
Man if you aren’t gonna read the API docs then use the example form.
Enter the integer, then look on the right, there’s a pull-down you use to specify if that integer is a representation of the fraction or exponent, then a dialog box to describe how the integer was encoding the fraction (or exponent), a repeat of the controls for the exponent (or fraction) then a radio button to indicate if the base is 2 or 10.

QuarkJets
Sep 8, 2008

Monstrous, only base e should be permitted

Volmarias
Dec 31, 2002

EMAIL... THE INTERNET... SEARCH ENGINES...

darthbob88 posted:

I added a cache-buster to the loading URLs for our files, to guarantee that a customer never got a file more than an hour old. Kinda invalidated the reason for the CDN, but it beat losing another customer.

Note, be very careful when doing this, a typo of "chest-burster" is unfortunately common, and auto correct has just made things worse.

quote:

But for the 20-30 minutes it took me to fix it, a men's skincare site was loading recommendations for a sex toy site. "If you like this hazel-infused skin cream, might we also recommend a BIG OL' DILDO"

Issue closed, who are we to judge, very possibly WAI.

Xarn
Jun 26, 2015

darthbob88 posted:

But for the 20-30 minutes it took me to fix it, a men's skincare site was loading recommendations for a sex toy site. "If you like this hazel-infused skin cream, might we also recommend a BIG OL' DILDO"

Sounds more useful than the usual recommendation systems.

Carbon dioxide
Oct 9, 2012

https://cohost.org/cathoderaydude/post/1228730-taking-the-deepest-p

Read this whole article. It's worth it.

Kazinsal
Dec 13, 2011



Oh my god.

As a long-time osdev hobbyist this is one of the most terrifying things I have ever read and yet I can't help but want to meet the people who came up with this and ask them all sorts of questions about it.

Red Mike
Jul 11, 2011

quote:

See, the other disgusting thing that OSM did when the machine was first booted was to go into the e820 table, where the BIOS defines what memory is available to the system, and declare ~512MB of it as nonexistent (or "Address Range Reserved.") That means that when Windows begins booting, if the machine has 2GB of memory, it only sees 1.5GB, as if the other 512 wasn't even installed.

I normally love super ingenious hacks that let you get basically a modern thing running on outdated hardware/software. I don't love this, it reads like something made by a single trusted brilliant engineer who was a year away from retirement and knew they'd never really have to maintain this, just make it look like they could for long enough to leave the company.

Also as the article itself points out, it's not like the modern thing is all that modern anyway, the start-up time was still slow anyway.

Kazinsal
Dec 13, 2011


Red Mike posted:

I normally love super ingenious hacks that let you get basically a modern thing running on outdated hardware/software. I don't love this, it reads like something made by a single trusted brilliant engineer who was a year away from retirement and knew they'd never really have to maintain this, just make it look like they could for long enough to leave the company.

Also as the article itself points out, it's not like the modern thing is all that modern anyway, the start-up time was still slow anyway.

It's masterful and it works because you *have* to obey the ACPI Reserved spots in an E820 memory map. It's not completely explained in the post but "E820" comes from the traditional BIOS call from the mid-90s onwards that a bootloader or 16-bit kernel uses to retrieve the memory map: interrupt 0x15, AX=0xE820. You basically invoke the BIOS with that interrupt service routine and an offset (or 0 for "give me the beginning") over and over until it stops returning descriptions of blocks of memory. When ACPI showed up it basically extended that and standardized the type IDs it returns, and on UEFI the GetMemoryMap boot services function returns more or less the E820 map with extra little tags to identify what reserved blocks of memory are UEFI reclaimable etc.
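
To make the "call it over and over" part concrete, here's a toy simulation of that loop in Python. It is only a sketch: `fake_int15_e820` and `FAKE_FIRMWARE_TABLE` are invented stand-ins, a real bootloader does this in 16-bit code through registers, and on real hardware the continuation value is opaque rather than a simple index. The type IDs are the ones ACPI standardized (1 usable, 2 reserved, 3 ACPI reclaimable, 4 ACPI NVS).

```python
from typing import List, Tuple

Entry = Tuple[int, int, int]  # (base address, length in bytes, type ID)

# Type IDs as standardized by ACPI for the address-range map.
USABLE, RESERVED, ACPI_RECLAIMABLE, ACPI_NVS = 1, 2, 3, 4

# Hypothetical firmware table, purely for illustration.
FAKE_FIRMWARE_TABLE: List[Entry] = [
    (0x00000000, 0x0009F400, USABLE),
    (0x0009F400, 0x00000C00, RESERVED),
    (0x00100000, 0xBFDE0000, USABLE),
]


def fake_int15_e820(continuation: int) -> Tuple[Entry, int]:
    """Stand-in for INT 0x15, AX=0xE820.

    Returns one map entry plus the continuation value to pass next time.
    A returned continuation of 0 means that was the last entry.
    """
    entry = FAKE_FIRMWARE_TABLE[continuation]
    next_value = continuation + 1
    if next_value == len(FAKE_FIRMWARE_TABLE):
        next_value = 0
    return entry, next_value


def read_memory_map() -> List[Entry]:
    """Start at 0 and keep asking until the firmware says there is no more."""
    entries: List[Entry] = []
    continuation = 0
    while True:
        entry, continuation = fake_int15_e820(continuation)
        entries.append(entry)
        if continuation == 0:
            break
    return entries
```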

Any operating system written after 1994 or so understands E820 memory maps, and even since the beginning of the Multiboot protocol, GRUB has collected the E820 map from the firmware and passed it on to the kernel it loads (if requested) pretty much unchanged, but in one big blob instead of requiring the OS to request each sequential block in the memory map. One of the big things about it is that it has a couple types called ACPI Reclaimable and Reserved. ACPI Reclaimable is memory that the firmware plopped some tables needed for system initialization into and that the firmware and system management mode code will never need, so once the system is online the kernel can just use those chunks as regular memory. Reserved memory is always assumed to be owned by the firmware, so you just straight up aren't allowed to use it. This is usually used for memory that the firmware may be using in system management mode, and if a kernel starts blasting through that then the system will probably just straight up crash as soon as the firmware needs to take control for whatever reason (someone pressed the power button, a legacy device that's being emulated in SMM is being accessed, etc).

Here's what an E820 map looks like from the dmesg of one of my Linux VMs running on ESXi:

code:
[    0.000000] kernel: BIOS-provided physical RAM map:
[    0.000000] kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000000dc000-0x00000000000fffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000bfedffff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bfee0000-0x00000000bfefefff] ACPI data
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bfeff000-0x00000000bfefffff] ACPI NVS
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bff00000-0x00000000bfffffff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fffe0000-0x00000000ffffffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x0000000100000000-0x000000023fffffff] usable
Notice that there are gaps in the memory layout -- nothing exists between 0xC0000000 and 0xEFFFFFFF for example. This is why you HAVE to obey what your E820 map says. There will be gaps in your memory layout. There will be ROM in places you wouldn't expect ROM to live. There will be chunks of your RAM taken up by the firmware that you will never get back and you must accept that. You are only allowed to use anything that's marked as Usable or as Reclaimable, and the Reclaimable space is only usable after you've finished booting up all your cores and telling ACPI that it no longer has direct control of a handful of ancillary system functions like various aspects of the power button. The reason this hack works is because the OS must accept what the hack says because the OS has no way of knowing this hack exists. As far as it's concerned, the reserved space in the memory map that the hack has written into the E820 table could be a gigantic memory-mapped window into some PCIe device. It doesn't care. It's marked as reserved, it's reserved, and the OS cannot use that memory, period.
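
If you want to poke at a map like this yourself, a small Python sketch along these lines will parse the dmesg lines above, add up the usable memory, and list the holes. The helper names (`parse_e820`, `usable_bytes`, `find_gaps`) are made up for this example, and the regex assumes the exact `BIOS-e820: [mem start-end] type` line format shown in the dump.

```python
import re
from typing import List, Tuple

Region = Tuple[int, int, str]  # (start, inclusive end, type string)

DMESG = """\
[    0.000000] kernel: BIOS-provided physical RAM map:
[    0.000000] kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000000dc000-0x00000000000fffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000bfedffff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bfee0000-0x00000000bfefefff] ACPI data
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bfeff000-0x00000000bfefffff] ACPI NVS
[    0.000000] kernel: BIOS-e820: [mem 0x00000000bff00000-0x00000000bfffffff] usable
[    0.000000] kernel: BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x00000000fffe0000-0x00000000ffffffff] reserved
[    0.000000] kernel: BIOS-e820: [mem 0x0000000100000000-0x000000023fffffff] usable
"""

LINE_RE = re.compile(r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\] (.+)$")


def parse_e820(text: str) -> List[Region]:
    """Pull (start, end, type) out of every BIOS-e820 line, sorted by address."""
    regions: List[Region] = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            regions.append((start, end, match.group(3).strip()))
    return sorted(regions)


def usable_bytes(regions: List[Region]) -> int:
    """Total size of everything marked usable (end addresses are inclusive)."""
    return sum(end - start + 1 for start, end, kind in regions if kind == "usable")


def find_gaps(regions: List[Region]) -> List[Tuple[int, int]]:
    """Address ranges that no entry describes at all."""
    gaps: List[Tuple[int, int]] = []
    for prev, nxt in zip(regions, regions[1:]):
        prev_end, next_start = prev[1], nxt[0]
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps
```

Running `find_gaps(parse_e820(DMESG))` on this dump includes the 0xC0000000 to 0xEFFFFFFF hole mentioned above.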

This is an absolutely batshit insane hack and I both hate and love it.

Jen heir rick
Aug 4, 2004
when a woman says something's not funny, you better not laugh your ass off

Kazinsal posted:


This is an absolutely batshit insane hack and I both hate and love it.

Great post. I don't know much about hardware stuff. Why is this such a batshit insane hack?

Kazinsal
Dec 13, 2011


Jen heir rick posted:

Great post. I don't know much about hardware stuff. Why is this such a batshit insane hack?

Basically there aren't any formal specifications for what x86 firmware may and may not do but there are specifications for what's required to allow standards-compliant systems to function, and this abuses that lack to do some seriously messed up poo poo. The firmware is supposed to present an accurate memory map of the system to anything that boots on it. The fact that it's bullshitting the memory map to specific systems based on a whole bunch of proprietary WTF isn't technically illegal, just thoroughly frowned upon. In this case the firmware is hiding the Linux memory from the Windows system and vice versa under the guise of "the firmware claims this block of memory for its own use", which isn't really illegal by the spec, it's just the equivalent of walking up to Saint Peter and pulling out the old testament and spending the next four hours rules lawyering god.
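
As a toy model of what "claiming a block for the firmware's own use" does to the map, here's a short Python sketch. The function name `hide_top_of_memory` is invented, and carving the block off the top of the highest usable region is an assumption made for illustration; the article only says roughly 512MB gets declared reserved, not where. Regions here use exclusive end addresses to keep the arithmetic simple.

```python
from typing import List, Tuple

Region = Tuple[int, int, str]  # (start, exclusive end, type string)

MIB = 1024 * 1024


def hide_top_of_memory(regions: List[Region], size: int) -> List[Region]:
    """Return a copy of the map with the top `size` bytes of the highest
    usable region relabelled as reserved, so a well-behaved OS never touches it."""
    usable = [i for i, region in enumerate(regions) if region[2] == "usable"]
    if not usable:
        raise ValueError("no usable region to carve from")
    idx = max(usable, key=lambda i: regions[i][1])
    start, end, _ = regions[idx]
    if end - start <= size:
        raise ValueError("highest usable region is too small")
    split = end - size
    result = list(regions)
    result[idx:idx + 1] = [(start, split, "usable"), (split, end, "reserved")]
    return result


def visible_bytes(regions: List[Region]) -> int:
    """How much memory an OS that obeys the map believes it has."""
    return sum(end - start for start, end, kind in regions if kind == "usable")
```

With a single 2GB usable region and a 512MB carve-out, `visible_bytes` drops to 1.5GB, which is the effect described in the article: the OS has no way to tell this apart from a legitimate firmware reservation.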

smackfu
Jun 7, 2004

Always sad when very clever hacks end up in DOA products.

Jeffrey of YOSPOS
Dec 22, 2005

GET LOSE, YOU CAN'T COMPARE WITH MY POWERS
One running operating system is writing to the other one's filesystem journal while the other operating system thinks it is asleep? That is......fully insane. Using ACPI as your hypervisor is kind of a delightful hack but the filesystem part makes it not fun any more.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
It's the sort of thing where one implementation can do something clever assuming everyone else follows the standards, but if it becomes widespread and other systems start trying to interact with it (e.g. by assuming a 512MB reservation must be one particular vendor's hack, and so it can poke into that memory to do something clever) then everything becomes a huge mess.

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Jen heir rick posted:

Great post. I don't know much about hardware stuff. Why is this such a batshit insane hack?

The BIOS is interrupting ACPI sleep requests and turning the machine back and forth between Jekyll and Hyde, in the name of power efficiency.

shame on an IGA
Apr 8, 2005

Jabor posted:

It's the sort of thing where one implementation can do something clever assuming everyone else follows the standards, but if it becomes widespread and other systems start trying to interact with it (e.g. by assuming a 512KB reservation must be one particular vendor's hack, and so it can poke into that memory to do something clever) then everything becomes a huge mess.

welcome to MS-DOS, we've missed you

Xarn
Jun 26, 2015

This is amazing, and the insanity reminds me of the protected -> real -> protected mode jump hack Windows used for compatibility with old drivers.

Qwertycoatl
Dec 31, 2008

Probably on a system like that without really enough memory, just giving the extra 512MB to Windows would have resulted in a far better product than that insane thing

omeg
Sep 3, 2012

That's some premium cursed poo poo and I love it.

Lime
Jul 20, 2004

Wild to learn that Severance was based on a netbook from 2009

redleader
Aug 18, 2005

Engage according to operational parameters
the most disturbing thing about that post is that it made me realize i've been reading acpi as apci

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
the post was pretty interesting, as someone who doesn't know much about how operating systems work at a low level. but the constant suggestion that one should have an emotional reaction to it was a bit irritating*. I mean all the "ye gods how awful it is that they are doing this! how gross! and now let me tell you about what they do with the file system" stuff. it's all just computer bullshit you don't have to try to make it juicy by talking about how it's the computer equivalent of HP lovecraft or something.

* the author of the piece is less guilty of this than all the commenters reacting to it

nielsm
Jun 1, 2009



That's a really cool hack, and I'm not even sure I would classify it as "insane". No more than the well-accepted EMM386 (and compatible) memory managers from the days of DOS.
That thing calling itself a "memory manager"? It actually switches the CPU into paged protected mode and wraps the entire running DOS system into a virtual-x86 session. That's arguably just as involved if not even more blind-tying a running system and replacing the entire world around it.

Kazinsal
Dec 13, 2011


nielsm posted:

That's a really cool hack, and I'm not even sure I would classify it as "insane". No more than the well-accepted EMM386 (and compatible) memory managers from the days of DOS.
That thing calling itself a "memory manager"? It actually switches the CPU into paged protected mode and wraps the entire running DOS system into a virtual-x86 session. That's arguably just as involved if not even more blind-tying a running system and replacing the entire world around it.

Disagree. That's just called a hypervisor.


nielsm
Jun 1, 2009



Kazinsal posted:

Disagree. That's just called a hypervisor.

A normal hypervisor doesn't install itself underneath an already running system, unless it's malware. A normal boot process loads the hypervisor, then loads the OS. With EMM you boot the OS, then pull the rug and switch in a new layer underneath the kernel.
