Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
Python definitely has a problem where they underestimated the long-term maintenance costs of a large standard library, but that would have been an issue even if python 3 never existed.

NtotheTC
Dec 31, 2007


It's a case of fighting one battle at a time I think. 3 years ago the majority of people weren't even on python3 and getting people to switch was the main driving force so I suspect a lot of things got put in the "to do later" column

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
I mean, it's weird how "proper Unicode support" was the big impetus for Python 3 and they completely failed on that regard. If the experts making Python 3 can't get the standard library properly Unicode-clean, what chance do you have?

You all know that I've long been an opponent of Python 3's "proper Unicode support" because I don't think the string model they implemented is meaningful or correct. It caused a bunch of churn for no reason, made things harder, and they've ended up putting back most of the things they removed.

Randomly corrupting ZIP files because "filenames should be Unicode" (the format never had that guarantee) is bad.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Yes, the main problem with that article wasn't the content. Its main problem was overselling the importance of the content with a clickbait headline.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
It's been 11 years and we're still trying to convince people it's worth bothering to switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

NtotheTC
Dec 31, 2007


It's definitely worth it to switch to python3. It has been for ages. The only people left to convince are people like the luddite in the article that can't understand why "thing that worked OK" isn't around forever

QuarkJets
Sep 8, 2008

Suspicious Dish posted:

It's been 11 years and we're still trying to convince people it's worth bothering to switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

Python 3 being better is what I usually see as the advertised reason to switch, and it's true; Python 3 is actually just better, in a lot of ways. But using a better and faster language isn't enough incentive for some people, so you have to also point out that you're going to cut off support soon (and even that won't be enough to convince everyone)

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Suspicious Dish posted:

It's been 11 years and we're still trying to convince people it's worth bothering to switch to Python 3, with the latest incentive not being that Python 3 is better, but that Python 2 is end-of-life and won't receive security updates. I'd say that counts as an "Incredible Disaster".

People weren't avoiding Python 3 because of unicode stuff; they were avoiding it because python 3 wasn't clearly and obviously better for the effort involved in the switch.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
High migration cost because of the Unicode changes is a big reason people don't switch, and as you can see, even the standard library developers couldn't find a way to make the new string model work well in a lot of scenarios.

I think that's a drastic failure on the part of the language designers. If the standard library can't do something "correctly", perhaps you defined "correctly" wrong. Python 3's string model is a blight on an otherwise great language, and still most of what makes migration difficult.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Suspicious Dish posted:

High migration cost because of the Unicode changes is a big reason people don't switch,

I mean we can keep asserting the opposite at each other, but that's not too productive. All I can say is that in my experience unicode issues are not a large portion of the reason people put off switching.

Like, I agree unicode in python 3 has a lot of nonsense. I've heard your rants about it over the years and I don't recall disagreeing with you on the facts, it's just not been a major factor for people IME.

Hell, most of the time people wrote off doing the migration before they even got far enough in to even realize there was a problem with unicode.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Yeah, it's possible that I've had a more unique experience, because a lot of what I wrote was GNOME code, which was lots of 1) local daemon servers using custom binary protocols (D-Bus), 2) file mangling. I had a very bad experience porting that over to Python 3, and at the time, I remember running into a lot of missing features or "you don't need that"-s from core developers.

I know Mercurial is having a lot of trouble with Unicode in their migration, and it was a big pain for Twisted as well, but I don't know too much about the wider community.

QuarkJets
Sep 8, 2008

Suspicious Dish posted:

High migration cost because of the Unicode changes is a big reason people don't switch, and as you can see, even the standard library developers couldn't find a way to make the new string model work well in a lot of scenarios.

I think that's a drastic failure on the part of the language designers. If the standard library can't do something "correctly", perhaps you defined "correctly" wrong. Python 3's string model is a blight on an otherwise great language, and still most of what makes migration difficult.

In my experience dealing with people who are hesitant to switch, unicode has never even been brought up. YMMV

Xarn
Jun 26, 2015
My experience at job-2 was that migrating to Python 3 fixed a long-standing issue with unicode handling in data our web service was receiving. :shrug:

NtotheTC
Dec 31, 2007


The biggest barrier to adoption that I saw with python3 was important third party libraries taking ages to migrate. As soon as momentum built and the big frameworks switched over everything went pretty quickly.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Yeah. I thought about switching a long time ago but that was before the scipy stack was compatible, and that was a dealbreaker for me.

UraniumAnchor
May 21, 2006

Not a walrus.
I recently had to get a small-medium codebase ported from 2 to 3 and the biggest problems I had were things that were byte strings before getting double-escaped when dumped to a text stream, such as a csv export.

By the time I pulled the trigger the dozen or so dependencies it has all worked just fine with python 3, so that wasn't a barrier at all, thankfully.
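
One way this kind of escaping problem can show up, sketched in Python 3 with the standard `csv` module. The row contents and the `export_row` helper are made up for illustration; this is not the codebase described above. The point is that `csv.writer` calls `str()` on anything that isn't already text, so a leftover byte string is written as its repr.

```python
import csv
import io


def export_row(row):
    """Write one row to an in-memory text stream and return the CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()


# Hypothetical row: a value that was a plain str in Python 2 code
# but is a bytes object after porting.
row = [b"caf\xc3\xa9", 3]

# The bytes object is stringified, so the b'...' repr and its
# backslash escapes leak into the export.
naive = export_row(row)

# Decoding explicitly before writing gives the intended text.
fixed = export_row(
    [v.decode("utf-8") if isinstance(v, bytes) else v for v in row]
)

print(repr(naive))
print(repr(fixed))
```

Decoding at the boundary where the bytes come in, rather than at export time, is usually the less painful place to do it, but that depends on the codebase.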

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


NtotheTC posted:

The biggest barrier to adoption that I saw with python3 was important third party libraries taking ages to migrate. As soon as momentum built and the big frameworks switched over everything went pretty quickly.

That’s what I saw too. Most people I ran into cared mainly that they couldn’t continue using their libs if they switched.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Yeah, it sounds like Python 3 has deep conceptual problems with things like filename encodings, but that stuff just doesn’t matter for the vast majority of people: you need to have both (1) wacky filenames that aren’t legal in Unicode and (2) an inability to fix the problem by just cleaning up your files with some other tool. The fact is that most people treat Python as the write-maybe-twice scripting language that it is. A lot of deep conceptual problems are remarkably easy to work around if you don’t slip into High Dudgeon at the idea of someone else putting work on you.

FrantzX
Jan 28, 2007
I am starting to be of the belief that the assumption that char (a code unit of some encoding) == byte (an unsigned 8-bit integer value) is one of the two biggest mistakes ever made in the design of programming languages.

The other is that sizeof(int) is platform dependent.

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
If your codebase is large enough that porting to python 3 is a major project then it's too large for a dynamically-typed language anyway.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
I think it’s a sensible choice for a systems-oriented language to provide a bunch of fixed-width integer types but focus the language/library around a target-determined type that can express any number of objects that you can have simultaneously in your program. If you’re going to make a principled argument that semantics are more important than performance considerations, I feel that you’re really arguing for bigints as the default integer type.

The real problem in C is that there’s a lot of pressure towards making int an at-most-32-bits type rather than that “any number of objects” type, so everything else is a huge mess. (Also it doesn’t want to assume that integer sizes are multiples of 8.)

FrantzX
Jan 28, 2007
I can see why a platform dependent integer type is needed, pointers for example. However, it shouldn't be the default int type. Going thru APIs like win32 or opengl shows how much effort people go thru nailing down int sizes.

If I had a time machine, this is how integers should work in C and derivatives:

byte / sbyte (maybe with an alias of int8 / sint8 for consistency)
int16 / uint16
int32 / uint32
int64 / uint64
word / sword - a platform dependent integer type that is the native word size for the running system.

Also:

quote:

target-determined type that can express any number of objects that you can have simultaneously in your program.
Isn't this size_t?
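
For anyone who wants to see the fixed-width versus platform-dependent split on their own machine, here is a small sketch using Python's `ctypes`, which exposes the C types of the platform it runs on. The dictionary names are only for illustration. The fixed-width sizes are the same everywhere; the sizes in the second table depend on the platform and ABI.

```python
import ctypes

# Fixed-width types: the size is part of the name.
fixed_width_sizes = {
    "int8": ctypes.sizeof(ctypes.c_int8),
    "int16": ctypes.sizeof(ctypes.c_int16),
    "int32": ctypes.sizeof(ctypes.c_int32),
    "int64": ctypes.sizeof(ctypes.c_int64),
}

# Platform-dependent types: sizes vary between targets
# (for example, long is 4 bytes on 64-bit Windows and 8 on 64-bit Linux).
platform_sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
    "void*": ctypes.sizeof(ctypes.c_void_p),
}

for name, size in {**fixed_width_sizes, **platform_sizes}.items():
    print(f"{name:8s} {size} bytes")
```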

Adhemar
Jan 21, 2004

Kellner, da ist ein scheussliches Biest in meiner Suppe.

Plorkyeran posted:

If your codebase is large enough that porting to python 3 is a major project then it's too large for a dynamically-typed language anyway.

:agreed:

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.
python is less retarded than powershell, perl, vbscript, and javascript before ecmascript 5, but i actually prefer programming in modern javascript to python - and i motherfucking hate javascript.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

FrantzX posted:

I can see why a platform dependent integer type is needed, pointers for example. However, it shouldn't be the default int type. Going thru APIs like win32 or opengl shows how much effort people go thru nailing down int sizes.

A word-sized type is almost always justifiably future-proof for a system API. Smaller types are usually unnecessary micro-optimizations outside of situations where a specifically sized type is externally specified (e.g. UTF-8 code units). The only real counter-examples for types that need to go larger are in the interfaces to systems with obviously greater addressing capabilities, like file sizes/offsets or network addresses. Again, the big problem with C is that it’s tempting to use int, which is usually not a good choice.

FrantzX posted:

Isn't this size_t?

Well, I think using an unsigned type would be a mistake, but otherwise yes. If int was ssize_t, I think the whole thing would hang together much better.

redleader
Aug 18, 2005

Engage according to operational parameters
why/how is python uniquely terrible wrt unicode/bytes compared to most (all?) other mainstream languages?

Ola
Jul 19, 2004

Between all of those un-upgradable Python 2 projects, I doubt humanity is losing benign AI or a cancer cure

Foxfire_
Nov 8, 2010

Python3's unicode/bytes filenaming problems aren't python specific; they're mostly fallout from trying to be cross platform and complete in a universe where unix filenames are alien unicorns compared to how they work on windows/macs/sane people's brains.

A filename on unix is not text. It is an arbitrary sequence of any bytes except for 0x00 [reserved for terminator] and 0x2F [reserved for separating directories].

Things that are perfectly fine unix filenames:
- Byte sequences that don't decode into valid UTF-8 (e.g. 0x80, 0x01 isn't the right number of bytes to form a UTF-8 codepoint, but it's a legal filename)
- Byte sequences that don't decode into printable ASCII (e.g. "<newline>, <bell>, <bell>, <backspace>, <linefeed>" is a legal filename)
- Byte sequences that decode into shell bombs (try doing 'rm *' in a directory that contains a file named '-rf' for fun and profit!)

Anything that renders a filename typically either assumes an encoding or takes it from a system-wide setting. But two systems won't necessarily agree about how to display any particular filename (if it is displayable at all).

Python needs the bytes interfaces since you don't want it to explode when it gets a non-unicodable filename, and if it's going to be a useful scripting language you don't want it to be unable to manipulate some files, but people also want to treat filenames as text, since 99.999% of the time they are and that's how people actually think of them.
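
A minimal sketch of that tension in Python 3, using the example bytes from the list above. The helper name `decodes_strictly` is made up. The codec is spelled out as UTF-8 here to keep the result the same on every machine; `os.fsdecode` and `os.fsencode` apply the same `surrogateescape` idea on POSIX systems, but with whatever the filesystem encoding happens to be.

```python
# A legal unix filename that is not valid UTF-8.
raw_name = b"\x80\x01"


def decodes_strictly(name: bytes) -> bool:
    """Return True if the bytes are valid UTF-8 text."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Strict decoding refuses the name outright.
print(decodes_strictly(raw_name))

# surrogateescape smuggles each undecodable byte through as a lone
# surrogate code point, so the name can be carried around as a str...
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# ...and turned back into exactly the original bytes later.
round_tripped = as_text.encode("utf-8", errors="surrogateescape")

print(repr(as_text))
print(round_tripped == raw_name)
```

The catch is that `as_text` is not well-formed Unicode, so it will raise if something later tries to encode it strictly, for example when printing it or writing it into a file.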

b0lt
Apr 29, 2005

Foxfire_ posted:

A filename on unix is not text. It is an arbitrary sequence of any bytes except for 0x00 [reserved for terminator] and 0x2F [reserved for separating directories].

OS X, a literal UNIX, has far more insane filename behavior than that. APFS, their new file system, requires filenames to be a valid Unicode string (valid as of the current kernel, so no using newly assigned code points until a kernel update) and treats the UTF-8 encoded name as an opaque string after validation. HFS+, their ancient rusty pile of garbage, performed its own special snowflake Unicode decomposition that wasn't equivalent to any of the officially defined Unicode normalization forms.

You run into similar problems on every other unixy system whenever you mount a filesystem with a different definition of what constitutes a valid filename (for example, the EFI-mandated FAT system partition).

Athas
Aug 6, 2007

fuck that joker
One particularly amusing locale thing is locale-aware IO functions. For example, Haskell's functions for reading text from files will pick up the encoding to use from the locale, and use e.g. ASCII when in an ASCII locale, UTF-8 in a UTF-8 locale, and presumably insanity when on Windows. In isolation, this seems reasonable enough.

Of course, with the popularity of minimal container instances, people are really fond of using the C locale, which is pure ASCII, so my Haskell programs end up exploding with an encoding error when asked to read a file containing UTF-8. My solution was to change my code to switch all encoding/decoding to UTF-8, always. If one of my users is passing in a file that is not valid UTF-8, then I'm perfectly ready to say that it's them that is buggy.
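
The same trap exists outside Haskell. As a rough Python analogue (not the Haskell code described above), the sketch below writes a UTF-8 file and reads it back twice: once with an ASCII codec, standing in for what a locale-derived default can end up being in a minimal container, and once with UTF-8 passed explicitly. The file name, the sample text and the `read_text` helper are made up.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a whole file as text with an explicitly chosen encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write("na\u00efve caf\u00e9\n".encode("utf-8"))

    # Stand-in for an ASCII locale default: decoding blows up.
    try:
        read_text(path, "ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # Always passing the encoding makes the result independent of locale.
    utf8_text = read_text(path, "utf-8")

print(ascii_failed)
print(utf8_text, end="")
```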

Foxfire_
Nov 8, 2010

I'm not super familiar with APFS, so I did some googling and apparently they fixed its initially stupid behavior.

Now it stores utf-8, preserving the input normalization, but uses a normalization insensitive comparison for locating files. (This is stolen from what ZFS does.)

This seems like the least insane thing possible. It doesn't support distinct files created on some other filesystem named with the same Unicode text in different normalizations, but anyone who tries that deserves what they get.

POSIX technically only requires filenames to support Roman alphanumerics plus some common symbols, so it's still compliant, just not what unixes have traditionally done.

(also, I was wrong and windows is also bad. Its paths are arbitrary sequences of int16s, they don't need to be valid utf-16, and the same text encoded differently isn't the same path)
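
To make "same Unicode text in different normalizations" concrete, here is a small Python sketch using the standard `unicodedata` module. The `same_name` helper is made up, and comparing via NFC is just one way to get a normalization-insensitive comparison; it is not a claim about how APFS or ZFS implement theirs.

```python
import unicodedata

# The same visible text in two normalization forms:
# precomposed e-acute versus "e" followed by a combining acute accent.
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to NFC (illustrative only)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# As stored, the two names are different strings and different bytes.
print(precomposed == decomposed)
print(precomposed.encode("utf-8"), decomposed.encode("utf-8"))

# A normalization-insensitive lookup treats them as the same file name.
print(same_name(precomposed, decomposed))
```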

comedyblissoption
Mar 15, 2006

I mean the fact that windows, unix, and OSX filenames are all not strict UTF-8 suggests that you shouldn't use strict UTF-8 for filename handling and you can blame python 3 for the resultant state of affairs.

Falcorum
Oct 21, 2010

Thermopyle posted:

Also a lot of stuff was accepted into the standard library that shouldn't have been

Let me tell you a story about C++, 2D graphics, audio, and web views... :v:

Dylan16807
May 12, 2010

Foxfire_ posted:

Things that are perfectly fine unix filenames:
- Byte sequences that don't decode into valid UTF-8 (e.g. 0x80, 0x01 isn't the right number of bytes to form a UTF-8 codepoint, but it's a legal filename)
- Byte sequences that don't decode into printable ASCII (e.g. "<newline>, <bell>, <bell>, <backspace>, <linefeed>" is a legal filename)
- Byte sequences that decode into shell bombs (try doing 'rm *' in a directory that contains a file named '-rf' for fun and profit!)
All of these are the same on windows. Except that it's invalid UTF-16 rather than invalid UTF-8.

Edit: I mean, shell globbing works differently but that's not a filesystem thing.
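
The UTF-16 version of the problem can be sketched in Python like this. The name is made up: a single unpaired high surrogate, which is a perfectly good 16-bit unit but not valid UTF-16 text. The `surrogatepass` error handler is used here only to show the raw code unit passing through unchanged.

```python
# One unpaired high surrogate: a legal 16-bit unit, not valid UTF-16 text.
lone_surrogate = "\ud800"

# Strict encoding refuses it.
try:
    lone_surrogate.encode("utf-16-le")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# surrogatepass writes the code unit through as-is (little-endian).
raw_units = lone_surrogate.encode("utf-16-le", errors="surrogatepass")

# The same handler is needed to get the string back from those bytes.
back = raw_units.decode("utf-16-le", errors="surrogatepass")

print(strict_ok)
print(raw_units)
print(back == lone_surrogate)
```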

Dylan16807 fucked around with this message at 03:01 on Nov 24, 2019

Nude
Nov 16, 2014

I have no idea what I'm doing.

darthbob88 posted:

I just ran it through this sandbox, and yeah, "j hit 17".
code:
<?php
// Enter your code here, enjoy!
for ($i = 0, $j = 50; $i < 100; $i++) {
    while ($j--) {
        if ($j == 17) goto end;
    }
}
echo "i = $i";
end:
    echo "j hit 17";

Was messing around with this and does someone have a good reason why:
PHP code:
<?php
$j = 2;
for (;;) {
    while ($j--) {
        if ($j == -10) {
            break 2;
        }
    }
}
echo "j is $j"; // j is -10
    
$j = 2;
while($j--) {
    if ($j == -10) {
        break;
    }
}
echo "j is $j"; // j is -1
I get while(0) ends the loop. But why when nested in the for loop is this avoided?

Nude fucked around with this message at 03:19 on Nov 24, 2019

NiceAaron
Oct 19, 2003

Devote your hearts to the cause~

Nude posted:

Was messing around with this and does someone have a good reason why:
PHP code:
<?php
$j = 2;
for (;;) {
    while ($j--) {
        if ($j == -10) {
            break 2;
        }
    }
}
echo "j is $j"; // j is -10
    
$j = 2;
while($j--) {
    if ($j == -10) {
        break;
    }
}
echo "j is $j"; // j is -1
I get while(0) ends the loop. But why when nested in the for loop is this avoided?

In the first block of code, the while loop is inside of an endless for loop so when the while loop exits (without hitting the "break 2") it just gets restarted. The first execution of the while loop ends when $j-- evaluates to 0, so $j is already -1 when the second execution of the while loop starts. (Which then continues until it hits the "$j == -10" condition and breaks out of all of the loops.)

In the second block of code, there is no outer for loop, so the while loop only executes once.
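
The same trace can be reproduced outside PHP. Below is a rough Python translation of the two snippets, with the post-decrement written out by hand since Python has no `$j--`. The function names and the `old` variable are made up for the sketch.

```python
def nested_loops():
    """Mirror of the first snippet: while ($j--) inside for (;;)."""
    j = 2
    while True:                # for (;;)
        while True:            # while ($j--)
            old = j            # the condition tests the old value...
            j -= 1             # ...and the decrement always happens
            if not old:
                break          # condition was 0: leave the while loop only
            if j == -10:
                return j       # break 2: leave both loops


def single_loop():
    """Mirror of the second snippet: the while loop on its own."""
    j = 2
    while True:
        old = j
        j -= 1
        if not old:
            break              # first time the condition is 0, we are done
        if j == -10:
            break
    return j


print(nested_loops())
print(single_loop())
```

In `nested_loops` the inner loop first exits with `j` at -1; the outer loop restarts it, and from then on every tested value is negative and therefore truthy, so it keeps going until the -10 check fires.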

SupSuper
Apr 8, 2009

At the Heart of the city is an Alien horror, so vile and so powerful that not even death can claim it.

Nude posted:

I get while(0) ends the loop. But why when nested in the for loop is this avoided?
The while "$j--" condition is always run even if the while body isn't.
By wrapping it in an infinite "for" loop, you ensure $j will keep going into the negatives (which is non-zero).

If you put the "$j--" in the body instead, then $j would never go below 0 (and you would get an infinite loop):
php:
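<?php
$j = 2; // same starting value as the earlier examples
for (;;) {
    while ($j) {
        $j--; // the decrement only runs while $j is non-zero, so $j stops at 0
        if ($j == -10) {
            break 2; // never reached
        }
    }
}
?>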

Nude
Nov 16, 2014

I have no idea what I'm doing.
Ah okay, ty, that actually makes sense.

repiv
Aug 13, 2009

https://blocksandfiles.com/2019/11/25/hpe-issues-firmware-fix-to-to-stop-ssd-failure

Signed 16-bit is enough to store the total-hours-run counter, right? :downs:
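
Some back-of-the-envelope arithmetic on that, as a Python sketch. It assumes a counter that goes up once per powered-on hour and wraps like a C `int16_t`; it says nothing about what the affected firmware actually does internally.

```python
import ctypes

INT16_MAX = 2**15 - 1  # largest value a signed 16-bit integer can hold

# How long a once-per-hour counter lasts before it runs out of room.
hours_at_limit = INT16_MAX
years_at_limit = hours_at_limit / 24 / 365.25

# One more hour and a two's-complement int16 wraps to the most
# negative value. ctypes does no overflow checking, so it shows the wrap.
wrapped = ctypes.c_int16(INT16_MAX + 1).value

print(hours_at_limit, "hours is about", round(years_at_limit, 2), "years")
print("next tick reads as", wrapped)
```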

Doc Hawkins
Jun 15, 2010

Dashing? But I'm not even moving!


the current target of my mild ire is nullthrows, a completely unnecessary external dependency that doesn't even do its job well

no points for guessing that it's javascript

code:
function nullthrows(x, message) {
  if (x != null) {
    return x;
  }
  var error = new Error(message !== undefined ? message : 'Got unexpected ' + x);
  error.framesToPop = 1; // Skip nullthrows's own stack frame.
  throw error;
}
that's it

that's the whole package
