Coding Horrors: You can gather all your technical debt into one easy framework!

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Coding Horrors: You can gather all your technical debt into one easy framework!

«‹›1503 »

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Python definitely has a problem where they underestimated the long-term maintenance costs of a large standard library, but that would have been an issue even if python 3 never existed.

# ? Nov 22, 2019 19:52

Adbot: ADBOT LOVES YOU

# ? Jun 8, 2024 08:17

NtotheTC: Dec 31, 2007

It's a case of fighting one battle at a time I think. 3 years ago the majority of people weren't even on python3 and getting people to switch was the main driving force so I suspect a lot of things got put in the "to do later" column

# ? Nov 22, 2019 20:01

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

I mean, it's weird how "proper Unicode support" was the big impetus for Python 3 and they completely failed on that regard. If the experts making Python 3 can't get the standard library properly Unicode-clean, what chance do you have?

you all know that I've long been an opponent of Python 3's "proper Unicode support" because I don't think the string model they implemented meaningful or correct, and it caused a bunch of churn for no reason, made things harder, and they've ended up putting back most of the things they removed.

Randomly corrupting ZIP files because "filenames should be Unicode" (the format never had that guarantee) is bad.

# ? Nov 22, 2019 20:36

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. �Bertrand Russell

Yes, the main problem with that article wasn't the content. It's main problem was overselling the importance of the content with a clickbait headline.

# ? Nov 22, 2019 21:15

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

It's been 11 years and we're still trying to convince people it's worth it to bother switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

# ? Nov 22, 2019 22:08

NtotheTC: Dec 31, 2007

It's definitely worth it to switch to python3. It has been for ages. The only people left to convince are people like the luddite in the article that can't understand why "thing that worked OK" isn't around forever

# ? Nov 22, 2019 22:13

QuarkJets: Sep 8, 2008

Suspicious Dish posted:

It's been 11 years and we're still trying to convince people it's worth it to bother switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

Python 3 being better is what I usually see as the advertised reason to switch, and it's true; Python 3 is actually just better, in a lot of ways. But using a better and faster language isn't enough incentive for some people, so you have to also point out that you're going to cut off support soon (and even that won't be enough to convince everyone)

# ? Nov 22, 2019 22:33

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. �Bertrand Russell

Suspicious Dish posted:

It's been 11 years and we're still trying to convince people it's worth it to bother switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

People weren't not switching to Python 3 because of unicode stuff, people were not switching to Python 3 because python 3 wasn't clearly and obviously better for the effort involved in the switch.

# ? Nov 22, 2019 22:58

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

High migration cost because of the Unicode changes is a big reason people don't switch, and as you can see, even the standard library developers couldn't find a way to make the new string model work well in a lot of scenarios.

I think that's a drastic failure on the part of the language designers. If the standard library can't do something "correctly", perhaps you defined "correctly" wrong. Python 3's string model is a blight on an otherwise great language, and still most of what makes migration difficult.

# ? Nov 22, 2019 23:14

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. �Bertrand Russell

Suspicious Dish posted:

High migration cost because of the Unicode changes is a big reason people don't switch,

I mean we can keep asserting the opposite at each other, but that's not too productive. All I can say is that in my experience unicode issues are not a large portion of the reason people put off switching.

Like, I agree unicode in python 3 has a lot of nonsense. I've heard your rants about it over the years and I don't recall disagreeing with you on the facts, it's just not been a major factor for people IME.

Hell, most of the time people wrote off doing the migration before they even got far enough in to even realize there was a problem with unicode.

# ? Nov 22, 2019 23:27

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

Yeah, it's possible that I've had a more unique experience, because a lot of what I wrote was GNOME code, which was lots of 1) local daemon servers using custom binary protocols (D-Bus), 2) file mangling. I had a very bad experience porting that over to Python 3, and at the time, I remember running into a lot of missing features or "you don't need that"-s from core developers.

I know Mercurial is having a lot of trouble with Unicode in their migration, and it was a big pain for Twisted as well, but I don't know too much about the wider community.

# ? Nov 22, 2019 23:32

QuarkJets: Sep 8, 2008

Suspicious Dish posted:

High migration cost because of the Unicode changes is a big reason people don't switch, and as you can see, even the standard library developers couldn't find a way to make the new string model work well in a lot of scenarios.

I think that's a drastic failure on the part of the language designers. If the standard library can't do something "correctly", perhaps you defined "correctly" wrong. Python 3's string model is a blight on an otherwise great language, and still most of what makes migration difficult.

In my experience dealing with people who are hesitant to switch, unicode has never even been brought up. YMMV

# ? Nov 22, 2019 23:42

Xarn: Jun 26, 2015

My experience at job-2 was that migrating to Python 3 fixed a long-standing issue with unicode handling in data our web service was receiving. :shrug:

# ? Nov 22, 2019 23:59

NtotheTC: Dec 31, 2007

The biggest barrier to adoption that saw with python3 was important third party libraries taking ages to migrate. As soon as momentum built and the big frameworks switched over everything went pretty quickly.

# ? Nov 23, 2019 00:18

ultrafilter: Aug 23, 2007; It's okay if you have any questions.

Yeah. I thought about switching a long time ago but that was before the scipy stack was compatible, and that was a dealbreaker for me.

# ? Nov 23, 2019 01:57

UraniumAnchor: May 21, 2006; Not a walrus.

I recently had to get a small-medium codebase ported from 2 to 3 and the biggest problems I had were things that were byte strings before getting double-escaped when dumped to a text stream, such as a csv export.

By the time I pulled the trigger the dozen or so dependencies it has all worked just fine with python 3, so that wasn't a barrier at all, thankfully.

# ? Nov 23, 2019 02:08

Ruggan: Feb 20, 2007; WHAT THAT SMELL LIKE?!

NtotheTC posted:

The biggest barrier to adoption that saw with python3 was important third party libraries taking ages to migrate. As soon as momentum built and the big frameworks switched over everything went pretty quickly.

That�s what I saw too. Most people I ran into cared mainly that they couldn�t continue using their libs if they switched.

# ? Nov 23, 2019 03:05

rjmccall: Sep 7, 2007; no worries friend; Fun Shoe

Yeah, it sounds like Python 3 has deep conceptual problems with things like filename encodings, but that stuff just doesn�t matter for the vast majority of people: you need to have both (1) wacky filenames that aren�t legal in Unicode and (2) an inability to fix the problem by just cleaning up your files with some other tool. The fact is that most people treat Python as the write-maybe-twice scripting language that it is. A lot of deep conceptual problems are remarkably easy to work around if you don�t slip into High Dudgeon at the idea of someone else putting work on you.

# ? Nov 23, 2019 03:28

FrantzX: Jan 28, 2007

I am starting to be of the belief that the assumption that char (a code unit of some encoding) == byte (an unsigned 8-bit integer value) is one of the two biggest mistakes ever made in the design of programming languages.

The other is that sizeof(int) is platform dependent.

# ? Nov 23, 2019 04:55

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

If your codebase is large enough that porting to python 3 is a major project then it's too large for a dynamically-typed language anyway.

# ? Nov 23, 2019 05:18

rjmccall: Sep 7, 2007; no worries friend; Fun Shoe

I think it�s a sensible choice for a systems-oriented language to provide a bunch of fixed-width integer types but focus the language/library around a target-determined type that can express any number of objects that you can have simultaneously in your program. If you�re going to make a principled argument that semantics are more important than performance considerations, I feel that you�re really arguing for bigints as the default integer type.

The real problem in C is that there�s a lot of pressure towards making int an at-most-32-bits type rather than that �any number of objects� type, so everything else is a huge mess. (Also it doesn�t want to assume that integer sizes are multiples of 8.)

# ? Nov 23, 2019 05:22

FrantzX: Jan 28, 2007

I can see why a platform dependent integer type is needed, pointers for example. However, it shouldn't be the default int type. Going thru APIs like win32 or opengl shows how much effort people go thru nailing down int sizes.

If I had a time machine, this is how integers should work in C and derivatives:

byte / sbyte (maybe with a alias of int8 / sint8 for consistency)
int16 / uint16
int32 / uint32
int64 / uint64
word / sword - a platform dependent integer type that is the native word size for the running system.

Also:

quote:

target-determined type that can express any number of objects that you can have simultaneously in your program.

Isn't this size_t?

# ? Nov 23, 2019 05:55

Adhemar: Jan 21, 2004; Kellner, da ist ein scheussliches Biest in meiner Suppe.

Plorkyeran posted:

If your codebase is large enough that porting to python 3 is a major project then it's too large for a dynamically-typed language anyway.

# ? Nov 23, 2019 05:55

Bruegels Fuckbooks: Sep 14, 2004; Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

python is less retarded than powershell, perl, vbscript, and javascript before ecmascript 5, but i actually prefer programming in modern javascript to python - and i motherfucking hate javascript.

# ? Nov 23, 2019 06:05

rjmccall: Sep 7, 2007; no worries friend; Fun Shoe

FrantzX posted:

I can see why a platform dependent integer type is needed, pointers for example. However, it shouldn't be the default int type. Going thru APIs like win32 or opengl shows how much effort people go thru nailing down int sizes.

A word-sized type is almost always justifiably future-proof for a system API. Smaller types are usually unnecessary micro-optimizations outside of situations where a specifically-size type is externally specified (e.g. UTF-8 code units). The only real counter-examples for types that need to go larger are in the interfaces to systems with obviously greater addressing capabilities, like file sizes/offsets or network addresses. Again, the big problem with C is that it�s tempting to use int, which is usually not a good choice.

FrantzX posted:

Isn't this size_t?

Well, I think using an unsigned type would be a mistake, but otherwise yes. If int was ssize_t, I think the whole thing would hang together much better.

# ? Nov 23, 2019 06:33

redleader: Aug 18, 2005; Engage according to operational parameters

why/how is python uniquely terrible wrt unicode/bytes compared to most (all?) other mainstream languages?

# ? Nov 23, 2019 07:38

Ola: Jul 19, 2004

Between all of those un-upgradable Python 2 projects, I doubt humanity is losing benign AI or a cancer cure

# ? Nov 23, 2019 07:42

Foxfire_: Nov 8, 2010

Python3's unicode/bytes filenaming problems aren't python specific; they're mostly fallout from trying to be cross platform and complete in a universe where unix filenames are alien unicorns compared to how they work on windows/macs/sane people's brains.

A filename on unix is not text. It is an arbitrary sequence of any bytes except for 0x00 [reserved for terminator] and [0x2F reserved for separating directories].

Things that are perfectly fine unix filenames:
- Byte sequences that don't decode into valid UTF-8 (e.g. 0x80, 0x01 isn't the right number of bytes to form a UTF-8 codepoint, but it's a legal filename)
- Byte sequences that don't decode into printable ASCII (e.g. "<newline>, <bell>, <bell>, <backspace>, <linefeed>" is a legal filename)
- Byte sequences that decode into shell bombs (try doing 'rm *' in a directory that contains a file named '-rf /' for fun and profit!)

Anything that renders a filename typically either assumes an encoding or takes it from a system-wide setting. But two systems won't necessarily agree about how to display any particular filename (if it is displayable at all).

Python needs the bytes interfaces since you don't want it to explode when it gets a non-unicodable filename, and if it's going to be a useful scripting language you don't want it to be unable to manipulate some files, but people also want to treat filenames as text, since 99.999% of the time they are and that's how people actually think of them.

# ? Nov 23, 2019 09:45

b0lt: Apr 29, 2005

Foxfire_ posted:

A filename on unix is not text. It is an arbitrary sequence of any bytes except for 0x00 [reserved for terminator] and [0x2F reserved for separating directories].

OS X, a literal UNIX, has far more insane filename behavior than that. APFS, their new file system requires filenames to be a (currently, so no using newly assigned code points until a kernel update) valid unicode string and treats it (UTF-8 encoded) as an opaque string after validation. HFS+, their ancient rusty pile of garbage, performed its own special snowflake unicode decomposition that wasn't equivalent to any of the officially defined unicode normalization forms.

You run into similar problems on every other unixy system whenever you mount a filesystem with a different definition of what consists a valid filename (for example, the EFI-mandated FAT system partition).

# ? Nov 23, 2019 10:45

Athas: Aug 6, 2007; fuck that joker

One particularly amusing locale thing is locale-aware IO functions. For example, Haskell's functions for reading text from files will pick up the encoding to use from the locale, and use e.g. ASCII when in an ASCII locale, UTF-8 in a UTF-8 locale, and presumably insanity when on Windows. In isolation, this seems reasonable enough.

Of course, with the popularity of minimal container instances, people are really fond of using the C locale, which is pure ASCII, so my Haskell programs end up exploding with an encoding error when asked to read a file containing UTF-8. My solution was to change my code to switch all encoding/decoding to UTF-8, always. If one of my users is passing in a file that is not valid UTF-8, then I'm perfectly ready to say that it's them that is buggy.

# ? Nov 23, 2019 11:25

Foxfire_: Nov 8, 2010

I'm not super familiar with APFS, so I did some googling and apparently they fixed it's initially stupid behavior.

Now it stores utf-8, preserving the input normalization, but uses a normalization insensitive comparison for locating files. (This is stolen from what ZFS) does.

This seems like the least insane thing possible. It doesn't support distinct files created on some other filesystem named with the same Unicode text in different normalizations, but anyone who tries that deserves what they get.

Posix technically only requires filenames to support Roman alphanumeric+some common symbols, so it's still compliant, just not what unixes have traditionally done

(also, I was wrong and windows is also bad. It's paths are arbitrary sequences of int16s, they don't need to be valid utf-16, and the same text encoded differently isn't the same path)

# ? Nov 23, 2019 11:51

comedyblissoption: Mar 15, 2006

I mean the fact that windows, unix, and OSX filenames are all not strict UTF-8 suggests that you shouldn't use strict UTF-8 for filename handling and you can blame python 3 for the resultant state of affairs.

# ? Nov 23, 2019 12:56

Falcorum: Oct 21, 2010

Thermopyle posted:

Also a lot of stuff was accepted into the standard library that shouldn't have been

Let me tell you a story about C++, 2D graphics, audio, and web views... :v:

# ? Nov 24, 2019 02:01

Dylan16807: May 12, 2010

Foxfire_ posted:

Things that are perfectly fine unix filenames:
- Byte sequences that don't decode into valid UTF-8 (e.g. 0x80, 0x01 isn't the right number of bytes to form a UTF-8 codepoint, but it's a legal filename)
- Byte sequences that don't decode into printable ASCII (e.g. "<newline>, <bell>, <bell>, <backspace>, <linefeed>" is a legal filename)
- Byte sequences that decode into shell bombs (try doing 'rm *' in a directory that contains a file named '-rf /' for fun and profit!)

All of these are the same on windows. Except that it's invalid UTF-16 rather than invalid UTF-8.

Edit: I mean, shell globbing works differently but that's not a filesystem thing.

Dylan16807 fucked around with this message at 03:01 on Nov 24, 2019

# ? Nov 24, 2019 02:53

Nude: Nov 16, 2014; I have no idea what I'm doing.

darthbob88 posted:

I just ran it through this sandbox, and yeah, "j hit 17".

code:

<?php
        //Enter your code here, enjoy!
for ($i = 0, $j = 50; $i< 100; $i++) {
    while ($j--) {
        if ($j == 17) goto end;
    }
}
echo "i = $i";
end: 
    echo "j hit 17";

Was messing around with this and does someone have a good reason why:

PHP code:

<?php
$j = 2;
for (;;) {
    while ($j--) {
        if ($j == -10) {
            break 2;
        }
    }
}
echo "j is $j"; // j is -10
    
$j = 2;
while($j--) {
    if ($j == -10) {
        break;
    }
}
echo "j is $j"; // j is -1

I get while(0) ends the loop. But why when nested in the for loop is this avoided?

Nude fucked around with this message at 03:19 on Nov 24, 2019

# ? Nov 24, 2019 03:13

NiceAaron: Oct 19, 2003; Devote your hearts to the cause~

Nude posted:

Was messing around with this and does someone have a good reason why:
PHP code:
<?php
$j = 2;
for (;;) {
    while ($j--) {
        if ($j == -10) {
            break 2;
        }
    }
}
echo "j is $j"; // j is -10
    
$j = 2;
while($j--) {
    if ($j == -10) {
        break;
    }
}
echo "j is $j"; // j is -1
I get while(0) ends the loop. But why when nested in the for loop is this avoided?

In the first block of code, the while loop is inside of an endless for loop so when the while loop exits (without hitting the "break 2") it just gets restarted. The first execution of the while loop ends when $j-- evaluates to 0, so $j is already -1 when the second execution of the while loop starts. (Which then continues until it hits the "$j == -10" condition and breaks out of all of the loops.)

In the second block of code, there is no outer for loop, so the while loop only executes once.

# ? Nov 24, 2019 03:45

SupSuper: Apr 8, 2009; At the Heart of the city is an Alien horror, so vile and so powerful that not even death can claim it.

Nude posted:

I get while(0) ends the loop. But why when nested in the for loop is this avoided?

The while "$j--" condition is always run even if the while body isn't.
By wrapping it in an infinite "for" loop, you ensure $j will keep going into the negatives (which is non-zero).

If you put the "$j--" in the body instead, then $j would never go below 0 (and you would get an infinite loop):

php:

<?
for (;;) {
    while ($j) {
        $j--;
        if ($j == -10) {
            break 2;
        }
    }
}
?>

# ? Nov 24, 2019 03:46

Nude: Nov 16, 2014; I have no idea what I'm doing.

Ah okay, ty, that actually makes sense.

# ? Nov 24, 2019 04:00

repiv: Aug 13, 2009

https://blocksandfiles.com/2019/11/25/hpe-issues-firmware-fix-to-to-stop-ssd-failure

Signed 16-bit is enough to store the total-hours-run counter, right? :downs:

# ? Nov 26, 2019 15:36

Adbot: ADBOT LOVES YOU

# ? Jun 8, 2024 08:17

Doc Hawkins: Jun 15, 2010; Dashing? But I'm not even moving!

the current target of my mild ire is nullthrows, a completely unnecessary external dependency that doesn't even do its job well

no points for guessing that it's javascript

code:

function nullthrows(x, message) {
  if (x != null) {
    return x;
  }
  var error = new Error(message !== undefined ? message : 'Got unexpected ' + x);
  error.framesToPop = 1; // Skip nullthrows's own stack frame.
  throw error;
}

that's it

that's the whole package

# ? Nov 26, 2019 18:33

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Coding Horrors: You can gather all your technical debt into one easy framework!

«‹›1503 »