|
Pie Colony posted: that's because CR isn't a line break
it is on Mac and Apple II thanks stebe
|
# ? Jun 15, 2017 18:16 |
|
|
# ? Jun 1, 2024 03:27 |
|
Anyone got any tips for dealing with lovely json in python? I've got a bunch of scraped json data that needs to be put into dataframes, but there are errant commas, apostrophes and quotation marks in there (gently caress people and their stupid names and company) that keep breaking the standard parsing libraries and it's driving me insane
|
# ? Jun 15, 2017 19:01 |
|
sounds malformed imo
|
# ? Jun 15, 2017 19:08 |
|
ideally you should get the other party to fix their json
whether or not you can do that (but ESPECIALLY if you can't do that and need to come up with some loose frankenparser hack), put it all down in writing and make sure SOMEONE knows that you're being sent corrupted data and that any complications or delays resulting are the other party's fault
|
# ? Jun 15, 2017 19:45 |
|
NihilCredo posted: ideally you should get the other party to fix their json
lol im a grad student, no one values my time, not even me
|
# ? Jun 15, 2017 19:51 |
|
vodkat posted: Anyone got any tips for dealing with lovely json in python?
BUT JSON IS SO GOOD AND MUCH BETTER THAN XML!! HOW CAN THIS BE???L?
|
# ? Jun 15, 2017 21:15 |
|
write a custom json parser to deal specifically with your lovely json, it's not hard
|
# ? Jun 15, 2017 21:18 |
|
Shaggar posted: BUT JSON IS SO GOOD AND MUCH BETTER THAN XML!! HOW CAN THIS BE???L?
How do you deal with corrupted or invalid XML, typically?
|
# ? Jun 15, 2017 21:19 |
|
Regex that poo poo into the ground of course
|
# ? Jun 15, 2017 21:29 |
|
vodkat posted: lol im a grad student, no one values my time, not even me
is it possible at all to work with the source to fix their bug
do they make it obvious they're using something specifically lovely and hand rolled
|
# ? Jun 15, 2017 22:12 |
|
Hi,
This is the guy using your stuff. It looks like your JSON is not valid according to several known libraries and parsers. Maybe we can work together to fix up your broken poo poo?? What language are you using
Love, the grad student with a heart of gold
|
# ? Jun 15, 2017 22:14 |
|
qhat posted: write a custom json parser to deal specifically with your lovely json, it's not hard
necrotic posted: Regex that poo poo into the ground of course
this is what i've done and it seems to be working for now. still a lovely and tedious way to waste away my day
|
# ? Jun 15, 2017 22:41 |
|
Doom Mathematic posted: How do you deal with corrupted or invalid XML, typically?
i have someone unironically trying to get me to generate a checksum of an xml file because 'if we download it and corrupt it we could maybe get one character wrong somewhere in the body of the file but it would still be valid'
i mean come on, if you're so worried that ftp is gonna corrupt your download how do you know the checksum isn't corrupt too?
|
# ? Jun 15, 2017 23:01 |
|
there's... nothing wrong with asking for a checksum???
|
# ? Jun 15, 2017 23:31 |
|
JewKiller 3000 posted: there's... nothing wrong with asking for a checksum???
if we have some subtle corruption in the file then we would probably corrupt the checksum too because we'd have corrupted it at source. actual point to point ftp corruption is a non issue as far as i am concerned
it's like the auditors that want to watch me run a db query because they want to go line by line through the output to prove that it's the same as downloading it from the UI. like sure, you can do that but don't kid yourself that you've done anything worthwhile with your time.
|
# ? Jun 15, 2017 23:52 |
|
i don't really see the problem? if both the file and the checksum are corrupted, they (probably) won't match, so just retry
use a proper hash function instead of a crappy checksum to get rid of the 'probably' qualifier
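for what it's worth, a minimal sketch of the hash approach in python, assuming the sender publishes a SHA-256 hex digest next to the file. `sha256_of_stream` and `matches` are made-up helper names for illustration, not anything from the setup described above:

```python
import hashlib
import hmac
import io

def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so a large file never sits fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches(stream, expected_hex):
    """True if the stream's SHA-256 equals the published digest (hypothetical helper)."""
    return hmac.compare_digest(sha256_of_stream(stream), expected_hex.strip().lower())

# Stand-in for an open file; with a real download you would pass open(path, "rb").
downloaded = io.BytesIO(b"<order><amount>100</amount></order>")
print(sha256_of_stream(downloaded))
```

if the digest doesn't match, retry the download; a mismatch doesn't tell you which of the two got mangled, only that something did.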
|
# ? Jun 15, 2017 23:55 |
|
Yeah if you care about the integrity over network use a checksum.
|
# ? Jun 16, 2017 00:00 |
|
holy loving poo poo c++ std::regex is irredeemably terrible in every way
|
# ? Jun 16, 2017 00:04 |
|
vodkat posted: lol im a grad student, no one values my time, not even me
if I was doing this I'd first try to write code to sanitize the input into parseable JSON, and only if that proved to be difficult would I try to parse the lovely JSON myself
I call this principle "separation of baling wire"
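a rough sketch of the sanitize-then-parse idea in python. it only handles one kind of damage (trailing commas before `}` or `]`) and assumes the string literals themselves are intact; stray apostrophes and quotes inside names would need their own rules that depend on the actual data. `strip_trailing_commas` and `load_dirty` are hypothetical names:

```python
import json
import re

# Match either a whole JSON string literal or a comma that sits directly
# before a closing brace/bracket. Matching strings first means commas
# inside names like "Smith, Jones," are never touched.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|,(?=\s*[}\]])')

def strip_trailing_commas(text):
    """Drop trailing commas outside string literals; leave everything else alone."""
    return _TOKEN.sub(lambda m: m.group(0) if m.group(0).startswith('"') else "", text)

def load_dirty(text):
    """Try the strict parser first, and only sanitize if that fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return json.loads(strip_trailing_commas(text))

record = load_dirty('{"name": "Smith, Jones,]", "ids": [1, 2,],}')
print(record)
```

keeping the cleanup separate from the parse means the standard library still does the real parsing, and each new kind of breakage gets its own small, testable fix.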
|
# ? Jun 16, 2017 00:11 |
|
redleader posted: i don't really see the problem? if both the file and the checksum are corrupted, they (probably) won't match, so just retry
my point to them was that a partial download was the only failure case and as it's xml that would always be a malformed file and fail to parse so why bother with a checksum because if you're that paranoid about random bits flipping or whatever just give up now
though i guess while they're doing this the offshore devs can't be loving anything else up so there is that
|
# ? Jun 16, 2017 00:21 |
|
Soricidus posted: holy loving poo poo c++ std::regex is irredeemably terrible in every way
I like RE2 for regexes in c++
|
# ? Jun 16, 2017 00:42 |
|
Powerful Two-Hander posted: i have someone unironically trying to get me to generate a checksum of an xml file because 'if we download it and corrupt it we could maybe get one character wrong somewhere in the body of the file but it would still be valid'
sign it instead.
|
# ? Jun 16, 2017 00:50 |
|
Powerful Two-Hander posted: my point to them was that a partial download was the only failure case and as it's xml that would always be a malformed file and fail to parse so why bother with a checksum because if you're that paranoid about random bits flipping or whatever just give up now
i'm not familiar with the failure modes of ftp, but if you're only worried about partial downloads then normal xml parsing will do
intuitively it seems like you're more likely to run into random bit flips over the network than from local storage or memory, but idk about any of that poo poo. if so then a checksum or whatever might help a bit
it's probably one of those can't hurt, won't help situations
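a quick sketch of the two failure cases being argued about, using python's stdlib parser. the sample document and the `parses` helper are invented for illustration:

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """True if the text is well-formed XML according to ElementTree."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<order><amount>100</amount></order>"
truncated = good[:20]                 # partial download: the tags never close
flipped = good.replace("100", "900")  # one character wrong in the body

for label, text in [("good", good), ("truncated", truncated), ("flipped", flipped)]:
    print(label, parses(text))
```

the truncated copy fails to parse, which is the partial-download case. the flipped copy parses fine, which is the case only a hash or checksum comparison would notice.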
|
# ? Jun 16, 2017 02:11 |
|
necrotic posted:Regex that poo poo into the ground of course
|
# ? Jun 16, 2017 03:19 |
|
",".join(list) python can be so loving terrible sometimes
|
# ? Jun 16, 2017 03:59 |
|
qhat posted: ",".join(list)
I used to hate this so much (and the only reason I don't still hate it is that I don't write much Python any more).
|
# ? Jun 16, 2017 04:16 |
|
Star War Sex Parrot posted:cls: there's something really cool/terrifying about injecting data and/or code into memory and then overwriting a function's return address to execute that new code, all from an input string
|
# ? Jun 16, 2017 04:27 |
|
qhat posted: ",".join(list)
i was really surprised at how quickly i became stockholm-syndromed to this particular idiom
|
# ? Jun 16, 2017 04:32 |
|
qhat posted: ",".join(list)
yeah there needs to be a list.join(','). like bitch you know what i want give it to me
|
# ? Jun 16, 2017 04:33 |
|
there should be one, and preferably only one, obvious* way to do it
* to guido, and no one else
|
# ? Jun 16, 2017 04:33 |
|
JewKiller 3000 posted: there should be one, and preferably only one, obvious* way to do it
dont quote the zen of python
|
# ? Jun 16, 2017 04:34 |
|
qhat posted: ",".join(list)
python is terrible all of the time
|
# ? Jun 16, 2017 04:42 |
|
Shaggar posted: python is terrible all of the time
the more i think about this the truer it becomes
|
# ? Jun 16, 2017 05:00 |
|
i think by far the worst thing about python is the people who are so called experts at python like someone asks a question about how to do something weird in python on SO or something and gets ambushed by a half dozen cunts trying to impose their best practice autism on the poster
|
# ? Jun 16, 2017 05:04 |
|
that's core to the Linux/p-lang culture.
|
# ? Jun 16, 2017 05:11 |
|
That's pretty much core to any language ever in the history of computer programming.
|
# ? Jun 16, 2017 05:39 |
|
Skim Milk posted: yeah there needs to be a list.join(','). like bitch you know what i want give it to me
"thing".join("other thing") would be impossible if it went both ways though
edit- here's what happens for those of you who haven't been braindamaged by python yet
code:
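(the code block from this post didn't survive, so what follows is a stand-in sketch and not the original snippet. str.join accepts any iterable of strings, and a string is itself an iterable of one-character strings, so the call is already legal as written:)

```python
sep = "thing"
result = sep.join("other thing")

# Each character of "other thing" is treated as a separate item,
# with "thing" inserted between neighbours.
print(result)

# Same result as joining an explicit list of the characters.
assert result == sep.join(list("other thing"))

# 11 characters plus 10 separators of 5 characters each.
assert len(result) == 11 + 10 * 5
```

so if lists (and every other iterable, strings included) also grew a .join(separator) method, "thing".join("other thing") would have two equally valid readings.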
|
# ? Jun 16, 2017 05:51 |
|
the worst thing about python is that it looks so easy that it's almost impossible to keep a beginner from trying it and wasting their brain dijkstra-basic-style
|
# ? Jun 16, 2017 05:51 |
|
John Big Booty posted: That's pretty much core to any language ever in the history of computer programming.
not for c# or java
|
# ? Jun 16, 2017 05:56 |
|
Python is really good for little things and really bad for big things p much
|
# ? Jun 16, 2017 05:56 |