canis minor
May 4, 2011

QuarkJets posted:

The performance hit for using CSV is that the files are really big, because they're ASCII text. That's okay if you don't have much data or are training on one computer, but a lot of training is done with supercomputers, and all of the nodes you're using need access to the data, so your performance is dependent on how much data needs to be moved around. Distributed file systems like GPFS and HDFS try to make that process as straightforward as possible for the user, but there are still performance ramifications under the hood.

The performance hit for using SQL is similar but different, depending on implementation. If the nodes all need to query a central database for data, then that is a bottleneck. If you just use the db to create a CSV (or even a more compact format) and then use that, then that's a bottleneck. If you're just using the db to log changes to CSV files then that seems pretty innocuous to me, but I wonder if something like sticking the CSVs into a git repo might be better
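To put a rough number on the size point in the quoted post, here is a minimal sketch that writes the same values once as CSV text and once as packed 8-byte doubles. The row count, column count, and value range are made up for illustration; real ratios depend on the data and on how many digits get written.

```python
import csv
import io
import random
import struct

random.seed(0)
# Hypothetical dataset: 10,000 rows of 8 floats each.
rows = [[random.uniform(-1000.0, 1000.0) for _ in range(8)] for _ in range(10_000)]

# CSV: every float is written out as its decimal text representation.
text = io.StringIO()
csv.writer(text).writerows(rows)
csv_bytes = len(text.getvalue().encode("ascii"))

# Binary: every float is a fixed 8-byte IEEE 754 double.
packed = b"".join(struct.pack("<8d", *row) for row in rows)
bin_bytes = len(packed)

print(f"csv: {csv_bytes} bytes, binary: {bin_bytes} bytes, "
      f"ratio: {csv_bytes / bin_bytes:.2f}")
```

The text version comes out noticeably larger because each value needs many digit characters plus a separator, which is the extra data that has to be moved between nodes.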

Excellent, thank you very much - so the problem is the bottleneck of accessing a database vs. the filesystem - that makes complete sense.

Yes, a separate DB will be used to log changes to the CSVs - I guess you could use a repo, but I think you might end up branching on how the algorithm works, so that might be the reasoning there - to categorize datasets in a sane, describable manner.

streetlamp
May 7, 2007

Danny likes his party hat
He does not like his banana hat

iospace
Jan 19, 2038



... why.

I mean it works, but why.

return0
Apr 11, 2007

The horror is that it could be package-private but isn’t.

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



return0 posted:

The horror is that it could be package-private but isn’t.

Meh, easier to unit test that way. I do not actually believe there are tests for that, though.

csammis
Aug 26, 2003

Mental Institution

areEqual :colbert:

HappyHippo
Nov 19, 2003
Do you have an Air Miles Card?

Incredible. Where did you find this?

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


HappyHippo posted:

Incredible. Where did you find this?

I saw it on reddit a couple of days ago.

VikingofRock
Aug 24, 2008




Nth Doctor posted:

I saw it on reddit a couple of days ago.

My favorite thing out of that thread was this great regex:

Python code:

import re

def prime(n):
  return not re.match(r'^.?$|^(..+)\1+$', 'p' * n)

xtal
Jan 9, 2011

by Fluffdaddy

VikingofRock posted:

My favorite thing out of that thread was this great regex:

Python code:
import re

def prime(n):
  return not re.match(r'^.?$|^(..+)\1+$', 'p' * n)

I spent a moment trying to figure out how that worked, before realizing I preferred not knowing.

TooMuchAbstraction
Oct 14, 2012

I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.
Fun Shoe
It looks like it's subverting the regex system to do division on strings that look like 'pppppppppppppp' where the number of 'p's in the string equals the number being tested. It probably involves backwards lookahead or whatever that term is called.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

xtal posted:

I spent a moment trying to figure out how that worked

very slowly, I suspect

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

TooMuchAbstraction posted:

It looks like it's subverting the regex system to do division on strings that look like 'pppppppppppppp' where the number of 'p's in the string equals the number being tested. It probably involves backwards lookahead or whatever that term is called.

the \1 is a backreference

(..+) is "two or more characters"
(..+)\1+ is "some sequence of two or more characters, repeated two or more times"

if that disjunct matches on a string of length n then it implies that the string's length is composite.

if the string consists of a specific character repeated n times (which it does here), if that disjunct fails to match then it means that n is either prime, 0, or 1.
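A small sketch of the backreference part on its own, to make the "repeated block" idea concrete. The helper name found_divisor is made up for illustration; it just reports how long the captured group was when the composite branch matches.

```python
import re

# Only the second alternative of the original regex: a block of two or more
# characters, followed by one or more exact repeats of that same block.
COMPOSITE = re.compile(r'^(..+)\1+$')

def found_divisor(n):
    """Return the length of the repeated block the regex found, or None."""
    m = COMPOSITE.match('p' * n)
    return len(m.group(1)) if m else None

for n in (4, 9, 15, 17):
    print(n, found_divisor(n))
```

Because the group is greedy, the block length it settles on is the largest proper divisor of n (5 for 15, 3 for 9), and there is no match at all for a prime like 17.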

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

Hammerite posted:

the \1 is a backreference

(..+) is "two or more characters"
(..+)\1+ is "some sequence of two or more characters, repeated two or more times"

if that disjunct matches on a string of length n then it implies that the string's length is composite.

if the string consists of a specific character repeated n times (which it does here), if that disjunct fails to match then it means that n is either prime, 0, or 1.

So in short:
1. Make a string of P's whose length equals the number being tested.
2. Pass that string to a regular expression that matches if the whole string is a sequence of two or more characters repeated two or more times (effectively doing trial division)
3. If the string doesn't match, it's prime.

VikingofRock
Aug 24, 2008




Bruegels Fuckbooks posted:

So in short:
1. Make a string of P's whose length equals the number being tested.
2. Pass that string to a regular expression that matches if the whole string is a sequence of two or more characters repeated two or more times (effectively doing trial division)
3. If the string doesn't match, it's prime.

Yup. The stuff before the | is just special casing 0 and 1.
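For anyone who wants to check the summary above, here is a rough sketch that tests the first alternative by itself and then compares the regex against plain trial division. The prime_by_division helper and the range of 200 are arbitrary choices for the comparison, not anything from the original thread.

```python
import re

def prime(n):
    return not re.match(r'^.?$|^(..+)\1+$', 'p' * n)

def prime_by_division(n):
    """Ordinary trial division, used here only as a reference."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

# The first alternative, ^.?$, matches only the empty string and one character,
# which is what rules out 0 and 1.
assert re.match(r'^.?$', '') and re.match(r'^.?$', 'p')
assert not re.match(r'^.?$', 'pp')

# Both versions agree on a small range.
assert all(prime(n) == prime_by_division(n) for n in range(200))

print([n for n in range(30) if prime(n)])
```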

Vanadium
Jan 8, 2005

I love how concise that is. Just needs a sufficiently smart regex engine.

Foxfire_
Nov 8, 2010

csv is probably the best long term storage format if you want a dataset to still be useable in 20 years.

HDF5 or a database will probably be long abandoned by then and their file formats are complicated and hard to reverse engineer from the data. You can go from unknown old datafile -> it's csv -> it's this particular flavor of csv -> here's a program that reads it fairly easily
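The "figure out which flavor of csv it is" step can be sketched with the standard library's csv.Sniffer. The sample text below is a made-up stand-in for an unknown old datafile, and Sniffer is a heuristic, so treat this as an illustration rather than a guarantee that every file can be identified this way.

```python
import csv
import io

# Made-up "mystery" file: semicolon-delimited, with quoted text fields
# that themselves contain semicolons.
mystery = 'id;name;value\n1;"Smith; J.";3.5\n2;"Doe; A.";4.25\n'

# Guess the dialect (delimiter, quote character) from the sample text.
dialect = csv.Sniffer().sniff(mystery)

# Read the data back using the guessed dialect.
rows = list(csv.reader(io.StringIO(mystery), dialect))

print(repr(dialect.delimiter), repr(dialect.quotechar))
for row in rows:
    print(row)
```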

SardonicTyrant
Feb 26, 2016

BTICH IM A NEWT
熱くなれ夢みた明日を
必ずいつかつかまえる
走り出せ振り向くことなく
&



I've actually used that specific try/catch trick for a test framework. Although it wasn't...that.

xtal
Jan 9, 2011

by Fluffdaddy

SardonicTyrant posted:

I've actually used that specific try/catch trick for a test framework. Although it wasn't...that.

https://github.com/seattlerb/minitest/blob/e6bc4485730403faff6966c1671cf5de72b2d233/lib/minitest/assertions.rb#L321

That's how basically anyone does it

coca bojangles
Nov 2, 2011

Superhuman hindsight

streetlamp posted:

uh why would someone write a little JS snippet like this

code:
document.write("<scr" + "ipt language='JavaScr" + "ipt' type='text\/javascr" + "ipt' src='https://notmyworkplace.com/cgi-bin/cgiwrap/getcaldetail.pl?ID="+pair[1]+"'><\/scr" + "ipt>");

That's how the MySpace worm worked iirc.

https://samy.pl/myspace/tech.html

Khorne
May 1, 2002

Suspicious Dish posted:

Old browsers were really bad and would parse <script>"<script>"</script> as a nested script tag, which is sort of weird. The text/javascript replacement is either there to get around some extremely basic filter / block, or it comes from a misunderstanding of what was going on with the "scr" + "ipt" thing. Note that this hack hasn't been necessary since IE5.

I saw this advocated in a 2018 article.

Khorne fucked around with this message at 03:08 on Sep 7, 2018

Gun Metal Cray
Apr 27, 2005

Pillbug
https://twitter.com/bitnk/status/935494635379716098

Soricidus
Oct 21, 2010
freedom-hating statist shill

xss counts as rce now? Or did they do more than pop an alert

Ola
Jul 19, 2004

Soricidus posted:

xss counts as rce now? Or did they do more than pop an alert

Even if the book is a success and they write a followup, I think any serious exploit will make for an unwieldy book title. It's a real life Bobby Tables type of thing, that's awesome.

hailthefish
Oct 24, 2010

I'm just disappointed they didn't write it under the pen name "Heinz'); DROP TABLE Authors; --"

boo_radley
Dec 30, 2005

Politeness costs nothing
Making the rounds on Twitter today
https://mobile.twitter.com/highmeh/status/1037765408764383232

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


https://twitter.com/emilymhorsman/status/1037499651925127168

C++. My God.

Meat Beat Agent
Aug 5, 2007

felonious assault with a sproinging boner
An operator that returns either of two operands completely unchanged, when passed two lvalues, will return an lvalue. That's not really that much of a horror (nor specific to C++)

(e: to clarify, the above only actually works because a and b are already both lvalues on their own, this isn't some magical property of the ternary operator)

FlapYoJacks
Feb 12, 2009

1) That’s C.
2) That seems fine.

redleader
Aug 18, 2005

Engage according to operational parameters
Do I detect a hint of C++ Stockholm syndrome?

Zopotantor
Feb 24, 2013

...und ist er drin dann lassen wir ihn niemals wieder raus...

ratbert90 posted:

1) That’s C.

It's not if you use a C++ compiler, like they did.

b0lt
Apr 29, 2005

Meat Beat Agent posted:

An operator that returns either of two operands completely unchanged, when passed two lvalues, will return an lvalue. That's not really that much of a horror (nor specific to C++)

(e: to clarify, the above only actually works because a and b are already both lvalues on their own, this isn't some magical property of the ternary operator)

It's somewhat magical, the ternary operator only returns an l-value if the result expressions are l-values of the same type. This will fail to compile, for example:

C++ code:
void foo(bool x) {
    int a;
    char b;
    (x ? a : b) = 0;
}

redleader
Aug 18, 2005

Engage according to operational parameters

b0lt posted:

It's somewhat magical, the ternary operator only returns an l-value if the result expressions are l-values of the same type. This will fail to compile, for example:

C++ code:
void foo(bool x) {
    int a;
    char b;
    (x ? a : b) = 0;
}

pure, utter garbage. i'm writing up a proposal to fix this glaring oversight for the c++ standards committee as we speak.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

it should fail to compile even if it were on the right hand side

Soricidus
Oct 21, 2010
freedom-hating statist shill
i did it, i wrote the world's first Sufficiently Smart Compiler

code:
#include <stdio.h>

int main(void)
{
    fputs("no\n", stderr);
    return 1;
}

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

redleader posted:

Do I detect a hint of C++ Stockholm syndrome?

Why do you think it would suggest stockholm syndrome? Is there something unreasonable going on in that code?

If it were ambiguous what the code might mean then it would be unreasonable. As it is, there's nothing ambiguous there. It's clear if it were legal code what its meaning ought to be... and as it happens, it is legal code and has that meaning. What's the problem?

FlapYoJacks
Feb 12, 2009
Just turn the ternary into an if/else statement and it looks perfectly normal.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
The result of the conditional operator is always an r-value in C. In fact, I think that’s true in every C-family language except C++. C++’s behavior is nonetheless sensible, not because it means that you can assign to a conditional l-value but because it means there aren’t crazy performance gotchas when using the conditional operator in a more traditional way with operands that happen to have non-trivial type. If C++ forced the result of ?: to be an r-value, c = (cond ? a : b); would have to copy the source value to a temporary before it called the copy-assignment operator.

Assigning to a conditional operator is home to a lovely bit of still-unimplemented behavior in Clang, by the way: note that the operands can be any sort of l-value, including a bit-field.

Xarn
Jun 26, 2015
Using ternary operator to assign to a bit field you say? I think I have a new piece of code to scare people with. :v:

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

b0lt posted:

It's somewhat magical, the ternary operator only returns an l-value if the result expressions are l-values of the same type. This will fail to compile, for example:

C++ code:
void foo(bool x) {
    int a;
    char b;
    (x ? a : b) = 0;
}

Here b is being promoted to an int and thus that expression isn't an lvalue. If type conversions produced an lvalue you'd be left with an int& that actually refers to a char.
