  • Locked thread
gonadic io
Feb 16, 2011

>>=

cinci zoo sniper posted:

it's me, i'm the programmer this thread was made for

we are all that programmer

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

cinci zoo sniper posted:

the issue i have with my way of testing is that it is not, uhh, automated enough. i sit down with a function, figure out math for creating an input, then i create it and feed it in. how my imagination perceived testing is that i have a thing that comes with a bunch of things on its own, feeds them all, and looks for discrepancies - the result there, as far as i get it atm, is that i basically use my own function, directly or written for the second time, to gauge its accuracy, which seems to defeat the purpose of testing. function itself wouldn't know it's broken, would it?

i probably just dont understand unit testing correctly. ive never bothered to seriously read about it, and my internal difference between "unit testing" and "testing" is that in the former i test piece by piece, not a vs z

most people don't understand unit testing, so it's not like you're alone in that

one big complication is that there are two fundamentally different types of testing, but people just lump them all in as a "unit test" without clarifying which one they mean.

- regression testing is all about making it possible to change code, without inadvertently breaking functionality. a good regression test starts failing if the code changes in such a way that something is now broken ("has regressed"), while continuing to pass if the change doesn't affect the functionality tested by that particular test.
- specification testing is all about ensuring that the code being written matches some form of prior specification as to what it should be doing. often you can use the same bank of tests to test multiple different implementations that claim to meet the specification.

for regression testing, it's totally reasonable to run the code, see what value it's actually outputting in a relevant scenario, and just write a test to ensure that it keeps outputting that same value in that particular scenario. if you've chosen a bad scenario your test might end up being a bad test (in that it fails all the time even when the changes don't actually break anything), but it will still be a valid regression test.

on the other hand, if you're doing specification testing, that's completely backwards and you'd be right to be suspicious of it. what you're supposed to do to write specification tests is to create a scenario, look at the specification to see what the output should be, and write the test to make sure that's the case. this is a place where the test-driven-development model of writing your tests before you even start writing the actual code can be useful.
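a rough sketch of that difference in python, for illustration only. `mean_abs_dev` is a made-up function, not anything from this thread; the "specification" is just the textbook definition of mean absolute deviation, and the pinned regression value is assumed to be whatever the code returned when it was last known to be good. note that the two tests look the same; what differs is where the expected number came from.

```python
def mean_abs_dev(xs):
    """Hypothetical function under test: mean absolute deviation from the mean."""
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def test_specification():
    # Expected value worked out by hand from the definition, not from the code:
    # data [1, 2, 3, 6] -> mean 3 -> deviations 2, 1, 0, 3 -> average 1.5
    assert mean_abs_dev([1, 2, 3, 6]) == 1.5


def test_regression():
    # Expected value recorded from a previous run of the code itself.
    # It says nothing about correctness; it only pins current behaviour
    # so that a later refactor cannot change it silently.
    pinned_output = 20.0
    assert mean_abs_dev([10, 20, 60]) == pinned_output


if __name__ == "__main__":
    test_specification()
    test_regression()
    print("ok")
```

if a refactor changes the output, the regression test fails whether or not the old value was right; the specification test only fails if the code stops matching the definition.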

Arcsech
Aug 5, 2008

cinci zoo sniper posted:

i was checking github repos earlier today and while on paper it seemed fantastic, r.net hasn't been developed for 2 years, and r provider for a year - that isn't inspiring too much confidence

this is possible, I haven't touched f# for about 1.5 years so

cinci zoo sniper
Mar 15, 2013




MALE SHOEGAZE posted:

unit testing is just about testing bits of code in isolation. mathematical functions make this really simple because math functions are (in theory at least, i'm sure things are wildly different in practice) "pure" and deterministic: a single input always maps to a single output.

So, to effectively unit test a math function, you should just be able to define a table of expected inputs and expected outputs, and then feed that table into a test that executes the function with the given input and assert that it matches the expected output. That's it. That's unit testing.
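A minimal sketch of that table idea in Python, assuming a made-up pure function `hypot2` (a hypothetical name, not from anyone's code here). The expected values come from well-known Pythagorean triples rather than from running the function.

```python
import math


def hypot2(a, b):
    """Hypothetical pure function under test: length of the hypotenuse."""
    return math.sqrt(a * a + b * b)


# Each row is (inputs, expected output).
CASES = [
    ((3, 4), 5.0),
    ((5, 12), 13.0),
    ((0, 0), 0.0),    # edge case: degenerate triangle
    ((-3, 4), 5.0),   # edge case: negative input
]


def test_table():
    for args, expected in CASES:
        got = hypot2(*args)
        # Compare with a tolerance because the result is a float.
        assert math.isclose(got, expected, abs_tol=1e-12), (args, got, expected)


if __name__ == "__main__":
    test_table()
    print("ok")
```

Adding a test case is then just adding a row to `CASES`.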

This is more complex and difficult to do in non-functional code because frequently the unit under test will rely on some implicit global state in order to return the desired output, so in order to unit test effectively, you have to setup your global state. This is why dependency injection is popular: It makes your global state an explicit parameter to the unit being tested, which allows you to unit test more easily.
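A small sketch of what that looks like in Python, using the system clock as the "implicit global state". Both `greeting_implicit` and `greeting` are invented for this example; the only assumption is that passing the clock in as a parameter counts as a (very lightweight) form of dependency injection.

```python
import datetime


def greeting_implicit():
    """Hard to unit test: the result depends on hidden state (the system clock)."""
    hour = datetime.datetime.now().hour
    return "good morning" if hour < 12 else "good afternoon"


def greeting(now=datetime.datetime.now):
    """Same logic, but the clock is an explicit parameter with a sane default."""
    hour = now().hour
    return "good morning" if hour < 12 else "good afternoon"


def test_greeting():
    # The test injects fake clocks, so the result no longer depends on
    # what time the test suite happens to run.
    def morning():
        return datetime.datetime(2017, 7, 10, 9, 0)

    def evening():
        return datetime.datetime(2017, 7, 10, 17, 31)

    assert greeting(now=morning) == "good morning"
    assert greeting(now=evening) == "good afternoon"


if __name__ == "__main__":
    test_greeting()
    print("ok")
```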

while the last paragraph partially went over my head, the first two seem to correspond to what i have been doing. just in a typical for myself manner, i was, in an entirely unnecessary overcomplication, banging my head against procedural generation of expected outputs for functions which are "unique output generators", i.e. not different implementations of functions found in well-known and trusted math libraries that could be benchmarked against

cinci zoo sniper
Mar 15, 2013




Jabor posted:

most people don't understand unit testing, so it's not like you're alone in that

one big complication is that there are two fundamentally different types of testing, but people just lump them all in as a "unit test" without clarifying which one they mean.

- regression testing is all about making it possible to change code, without inadvertently breaking functionality. a good regression test starts failing if the code changes in such a way that something is now broken ("has regressed"), while continuing to pass if the change doesn't affect the functionality tested by that particular test.
- specification testing is all about ensuring that the code being written matches some form of prior specification as to what it should be doing. often you can use the same bank of tests to test multiple different implementations that claim to meet the specification.

for regression testing, it's totally reasonable to run the code, see what value it's actually outputting in a relevant scenario, and just write a test to ensure that it keeps outputting that same value in that particular scenario. if you've chosen a bad scenario your test might end up being a bad test (in that it fails all the time even when the changes don't actually break anything), but it will still be a valid regression test.

on the other hand, if you're doing specification testing, that's completely backwards and you'd be right to be suspicious of it. what you're supposed to do to write specification tests is to create a scenario, look at the specification to see what the output should be, and write the test to make sure that's the case. this is a place where the test-driven-development model of writing your tests before you even start writing the actual code can be useful.

:tipshat: this is legitimately useful too, albeit i need more time to digest the distinction between specification and regression tests, since at a glance, in my case they might be very close, especially for pure math functions, if not the same - unless its "content" vs "form". as for other functions, i now know how i can test my io things - just need to write a couple encoders and im good to go since both inputs and outputs can be mathematically defined as well

MononcQc
May 29, 2007

my dude can i interest you in property based testing

VikingofRock
Aug 24, 2008




Xarn posted:

I was phoneposting so didn't want to write code, but the canonical use is std::swap

C++ code:
// deep in templated code, so we have no idea what type we are working on

using std::swap; // bring std::swap into the local scope as a fallback
swap(a, b); // use unqualified swap, so ADL is used to find the "proper" swap function
            // This means that we also look through the namespaces of a's and b's types for a swap that takes a and b as arguments
            // If there is none, overload resolution falls back to std::swap, which takes any movable and/or copyable arguments.
For your code it would be
C++ code:
template<typename Container, typename Init, typename F>
auto foldl(const Container& container, const Init& initial, const F& binary_op) {
    using std::cbegin;
    using std::cend;
    return foldl(
        cbegin(container),
        cend(container),
        initial,
        binary_op
    );
}
so if container is a special type that has its own whatever::cbegin function defined, you use that, instead of forcing std::cbegin.

This is super cool and I hadn't fully grasped the implications of ADL until I saw this post. This actually solves a problem that's been in my code for my thesis for a little over a year, where I had my own classes with to_string() defined, and I wanted to use them in conjunction with std::to_string(), such that calls would use my class's to_string() if it existed and std::to_string() otherwise. Previously I had been using template magic to solve this, but needless to say your way is much cleaner and more readable. Thanks!

cinci zoo sniper
Mar 15, 2013




the only major thing i have no idea how to test as a "proper programmer" then (not to be mistaken for me running ahead of the horses) is the matplotlib plots, but im not sure its worth the hassle. although, i could probably write up some visual decoder or use some arcane mpl api to reverse engineer data off the plot and evaluate it against inputs + "transforms". transforming in general should be done outside of plots, imo, but i have some plot-specific math encapsulated within the plotting functions, e.g. "zoom up to this and that, or apply this "filter" to data".
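one possible way to sidestep the plot-decoding problem, sketched under assumptions: pull the plot-specific math into a pure function and keep the plotting function as a thin wrapper. `zoom_window`, `plot_zoomed` and `FakeAxes` are all hypothetical names invented for this sketch; `ax` is assumed to be anything with a matplotlib-style `plot(x, y)` method.

```python
def zoom_window(xs, ys, x_min, x_max):
    """Hypothetical plot-specific transform: keep points with x in [x_min, x_max]."""
    kept = [(x, y) for x, y in zip(xs, ys) if x_min <= x <= x_max]
    return [x for x, _ in kept], [y for _, y in kept]


def plot_zoomed(ax, xs, ys, x_min, x_max):
    """Thin wrapper: all the math lives in zoom_window, this only draws."""
    zx, zy = zoom_window(xs, ys, x_min, x_max)
    ax.plot(zx, zy)


class FakeAxes:
    """Stand-in for a matplotlib Axes that just records what it was asked to draw."""

    def __init__(self):
        self.calls = []

    def plot(self, x, y):
        self.calls.append((x, y))


def test_zoom_window():
    xs = [0, 1, 2, 3, 4]
    ys = [10, 11, 12, 13, 14]
    assert zoom_window(xs, ys, 1, 3) == ([1, 2, 3], [11, 12, 13])
    assert zoom_window(xs, ys, 5, 9) == ([], [])  # nothing in range


def test_plot_zoomed_draws_transformed_data():
    fake = FakeAxes()
    plot_zoomed(fake, [0, 1, 2, 3, 4], [10, 11, 12, 13, 14], 1, 3)
    assert fake.calls == [([1, 2, 3], [11, 12, 13])]


if __name__ == "__main__":
    test_zoom_window()
    test_plot_zoomed_draws_transformed_data()
    print("ok")
```

if reading data back off a real plot is still wanted, matplotlib line objects do expose `get_xdata()` and `get_ydata()`, but with the math split out like this the wrapper has very little left to get wrong.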

MononcQc
May 29, 2007

MononcQc posted:

my dude can i interest you in property based testing

look it's me, i wrote another god drat free book in my free time http://propertesting.com/

(it is still not fully complete, hasn't gone through review, etc. for the standard disclaimer of a thing you just dump out there)

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

cinci zoo sniper posted:

while the last paragraph partially went over my head, the first two seem to correspond to what i have been doing. just in a typical for myself manner, i was, in an entirely unnecessary overcomplication, banging my head against procedural generation of expected outputs for functions which are "unique output generators", i.e. not different implementations of functions found in well-known and trusted math libraries that could be benchmarked against

yeah generated inputs are a thing that get used in different testing methodologies (see property based testing: http://www.scalatest.org/user_guide/property_based_testing) but less so for normal unit testing.

you just want to define inputs/outputs for as many "classes" of input as possible (edge cases, etc). you're not going to capture all cases on your first attempt (probably), and that's fine: you add new test cases when you encounter/fix bugs.

DONT THREAD ON ME fucked around with this message at 17:31 on Jul 10, 2017

cinci zoo sniper
Mar 15, 2013




MononcQc posted:

my dude can i interest you in property based testing

sort of yes but sort of jesus this is a lot of stuff to take in already and im afraid to just take piecemeal bites here and there without some structured approach to all of it

MononcQc
May 29, 2007

cinci zoo sniper posted:

sort of yes but sort of jesus this is a lot of stuff to take in already and im afraid to just take piecemeal bites here and there without some structured approach to all of it

yeah. I think what could be worthwhile is to think of the general approach of property-based testing without adopting and marrying a framework.

http://propertesting.com/book_what_is_a_property.html is a pretty language-agnostic description I tried to make. It sounds like it may play better into the kind of checking you're looking to do over your code.
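To give a feel for the approach without any framework, here is a hand-rolled sketch in Python. It is an illustration only, not taken from the book: `cumsum` is a made-up function under test, and the "generator" is just the standard `random` module. Real libraries (Hypothesis, in Python's case) handle generation and shrinking for you.

```python
import random


def cumsum(xs):
    """Hypothetical function under test: running total of a list of ints."""
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out


def random_int_list(rng, max_len=20):
    """Toy generator: a random-length list of random ints."""
    return [rng.randint(-1000, 1000) for _ in range(rng.randint(0, max_len))]


def check_cumsum_properties(trials=500, seed=0):
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    for _ in range(trials):
        xs = random_int_list(rng)
        ys = cumsum(xs)
        # Property 1: the output has the same length as the input.
        assert len(ys) == len(xs), xs
        # Property 2: the last running total equals the plain sum.
        assert (ys[-1] if ys else 0) == sum(xs), xs
        # Property 3: differencing the output gives the input back.
        diffs = [b - a for a, b in zip([0] + ys, ys)]
        assert diffs == xs, xs
    return trials


if __name__ == "__main__":
    print("checked", check_cumsum_properties(), "random cases")
```

None of the three properties hard-codes an expected output; each states a rule that has to hold for every input, which is the part that carries over to functions with no trusted reference implementation.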

cinci zoo sniper
Mar 15, 2013




MALE SHOEGAZE posted:

yeah generated inputs are a thing that get used in different testing methodologies (see property based testing: http://www.scalatest.org/user_guide/property_based_testing) but less so for unit testing.

you just want to define inputs/outputs for as many "classes" of input as possible (edge cases, etc). you're not going to capture all cases on your first attempt (probably), and that's fine: you add new test cases when you encounter/fix bugs.


time to write issue #46

e: im still terrible its actual issue #74 of the current iteration of issue tracker, just the #46th open one

cinci zoo sniper fucked around with this message at 17:35 on Jul 10, 2017

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

MononcQc posted:

look it's me, i wrote another god drat free book in my free time http://propertesting.com/

(it is still not fully complete, hasn't gone through review, etc. for the standard disclaimer of a thing you just dump out there)

nice, i've been wanting to get a better understanding of property based testing

cinci zoo sniper
Mar 15, 2013




MononcQc posted:

look it's me, i wrote another god drat free book in my free time http://propertesting.com/

(it is still not fully complete, hasn't gone through review, etc. for the standard disclaimer of a thing you just dump out there)

MononcQc posted:

yeah. I think what could be worthwhile is to think of the general approach of property-based testing without adopting and marrying a framework.

http://propertesting.com/book_what_is_a_property.html is a pretty language-agnostic description I tried to make. It sounds like it may play better into the kind of checking you're looking to do over your code.

i'll go over it with coffee before i get to writing tests, cheers!

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

writing tests at all already puts you in the top... 20%? 10%? of programmers i think

cinci zoo sniper
Mar 15, 2013




NihilCredo posted:

writing tests at all already puts you in the top... 20%? 10%? of programmers i think

wait until you hear about my hand-crafted documentation because the autogenerated one is not user friendly enough to be anywhere near the top of priorities. probably the only thing good about that project, but eh, its something

MononcQc
May 29, 2007

MALE SHOEGAZE posted:

nice, i've been wanting to get a better understanding of property based testing

it's going to be a bit disappointing for people using haskell-variants of property-based testing. The way the properties are built there is that types are generators, and shrinking guides how you reduce problem cases to reproducible ones. You have to be very, very involved in your shrinking.

By comparison, the dynamic langs that have it (clojure, erlang, python) have a combinator-based approach where the rules declared in creating the description of the data generators are automatically used in shrinking. They tend to be waaaay more flexible in terms of what inputs you can represent in your code. So probably the biggest benefit (for someone using scalacheck, haskell quickcheck, and whatnot) will be in the general tips and tricks between code samples since they may not apply exactly otherwise.
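For anyone who hasn't seen shrinking before, a toy sketch of the idea in Python. This is not how any of the frameworks mentioned actually implement it; `buggy_max`, `prop_holds` and `shrink` are invented for the example, and the shrinker is a naive greedy one (drop elements, then pull values toward zero).

```python
import random


def buggy_max(xs):
    """Deliberately broken hypothetical function: wrong for all-negative input."""
    best = 0  # bug: should start from xs[0]
    for x in xs:
        if x > best:
            best = x
    return best


def prop_holds(xs):
    """Property: for non-empty input, the result equals the built-in max."""
    return not xs or buggy_max(xs) == max(xs)


def shrink(xs, still_fails):
    """Greedy toy shrinker: keep any simplification that still fails."""
    changed = True
    while changed:
        changed = False
        # First try dropping one element at a time.
        for i in range(len(xs)):
            candidate = xs[:i] + xs[i + 1:]
            if still_fails(candidate):
                xs, changed = candidate, True
                break
        else:
            # Nothing could be dropped: try halving one value toward zero.
            for i, x in enumerate(xs):
                smaller = x // 2 if x > 0 else -(-x // 2)
                candidate = xs[:i] + [smaller] + xs[i + 1:]
                if smaller != x and still_fails(candidate):
                    xs, changed = candidate, True
                    break
    return xs


def find_and_shrink(seed=0, trials=1000):
    """Search random inputs for a failure, then shrink the first one found."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(1, 10))]
        if not prop_holds(xs):
            return xs, shrink(xs, lambda c: not prop_holds(c))
    return None, None


if __name__ == "__main__":
    original, minimal = find_and_shrink()
    print("failing input:", original, "shrunk to:", minimal)
```

The random failing input can be any list of negative numbers; the shrinker reduces it to a single small negative value, which makes the actual bug (the `best = 0` starting point) much easier to spot.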

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

MononcQc posted:

look it's me, i wrote another god drat free book in my free time http://propertesting.com/

(it is still not fully complete, hasn't gone through review, etc. for the standard disclaimer of a thing you just dump out there)

holy poo poo secret book unlocked! :eyepop:

cinci zoo sniper
Mar 15, 2013




i will still need to fill in autogen documentation eventually, but gently caress me if i like sphinx

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

cinci zoo sniper posted:

my question is where can i fit it into an existing ~le data science~ workflow, and what is the net benefit gained. im not a compsci person so im not much familiar with all the fancy theory terms (or .net or ocaml for that matter), so what i gather/assume so far is that its an "actually fast c# environment python with strong types but with [questionably quantifiable/qualitative] libraries"

you're not gonna beat python and r when it comes to libraries for data science. when people use stuff like f# or scala, it's generally because they prefer a better/safer language and are willing to put up with either inferior libraries or some interop effort.

but if you're already comfortable with python and r you have little reason to switch. f# has decent visualization and formatting tools (both native and compile-to-js), but that's about it.

the one thing that might be interestingly unique about f# data science is MBrace. it's a tool to make your ~big data crunching~ code run in ~the butt~ (AWS and Azure) that leverages computation expressions to make the offloading code retardedly simple, like:

code:
let jobs =  
    [ for i in 1 .. 10 -> 
         cloud { 
            let primes = Sieve.getPrimes 100000000
            return sprintf "calculated %d primes %A on machine '%s'" primes.Length primes Environment.MachineName 
         }
        |> cluster.CreateProcess ]

let jobResults = 
    [ for job in jobs -> job.Result ]

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:
Anyone ever write any mutation tests? I still remember learning it in school and thought they were kind of cool but I've never written or used them.

cinci zoo sniper
Mar 15, 2013




NihilCredo posted:

you're not gonna beat python and r when it comes to libraries for data science. when people use stuff like f# or scala, it's generally because they prefer a better/safer language and are willing to put up with either inferior libraries or some interop effort.

but if you're already comfortable with python and r you have little reason to switch. f# has decent visualization and formatting tools (both native and compile-to-js), but that's about it.

the one thing that might be interestingly unique about f# data science is MBrace. it's a tool to make your ~big data crunching~ code run in ~the butt~ (AWS and Azure) that leverages computation expressions to make the offloading code retardedly simple, like:

code:
let jobs =  
    [ for i in 1 .. 10 -> 
         cloud { 
            let primes = Sieve.getPrimes 100000000
            return sprintf "calculated %d primes %A on machine '%s'" primes.Length primes Environment.MachineName 
         }
        |> cluster.CreateProcess ]

let jobResults = 
    [ for job in jobs -> job.Result ]

huh. that is what i am wary of, hardly justifiable library selection. will think how important safety is for job projects, might opt for some interop then but otherwise probably not. as for the cloud, no, the data doesn't leave anywhere for any external vendor of any sort, so that's of no use

Doc Hawkins
Jun 15, 2010

Dashing? But I'm not even moving!


ThePeavstenator posted:

Anyone ever write any mutation tests? I still remember learning it in school and thought they were kind of cool but I've never written or used them.

I don't know what it would mean to write them, but I have used mutant. It's too slow to run pre-commit or pre-merge across a whole non-trivial project, but you can invoke it against a particular method or class you want to be absolutely sure is thoroughly tested.

Luigi Thirty
Apr 30, 2006

Emergency confection port.

yay i got my game logging into game center

now i just need to add turn-based multiplayer :getin:

network effect means i should probably make it free w/banner ads and a no-ads upgrade so i won't make millions oh well

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

Doc Hawkins posted:

I don't know what it would mean to write them, but I have used mutant. It's too slow to run pre-commit or pre-merge across a whole non-trivial project, but you can invoke it against a particular method or class you want to be absolutely sure is thoroughly tested.

Sorry I meant "write unit tests and then run mutants through them". I used PIT in school but I was wondering if mutation testing was even a thing that I should even bother keeping up with for actual jobs.
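For anyone who hasn't seen it, a hand-made illustration of what "running mutants through your tests" means, in Python. Everything here is hypothetical: `is_adult` and its mutant are invented, and the mutant is written by hand, whereas a real tool generates and runs many of them automatically.

```python
def is_adult(age):
    """Hypothetical function under test."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-made mutant: '>=' flipped to '>'."""
    return age > 18


def weak_suite(f):
    """Passes for the original, but never probes the boundary."""
    return f(30) is True and f(5) is False


def strong_suite(f):
    """Adds the boundary case, which is exactly where the mutant differs."""
    return weak_suite(f) and f(18) is True


def report():
    """A suite 'kills' a mutant if it fails when run against the mutant."""
    return {
        "weak suite kills mutant": not weak_suite(is_adult_mutant),
        "strong suite kills mutant": not strong_suite(is_adult_mutant),
    }


if __name__ == "__main__":
    for name, killed in report().items():
        print(name, "->", killed)
```

Both suites pass against the original, and the weak one would likely show full line coverage, but only the strong one notices the mutation. A surviving mutant points at a test gap that a coverage report alone would not show.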

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
i thought I was doing pretty good with rust but I'm just getting destroyed by the borrow checker in the toy interpreter i'm working on. it was going fine up until the point where I needed to create bindings and now it's a mess.

i think my best advice for writing rust code is to do a vertical prototype before you do anything else because trying to add the right lifetimes after the fact is extremely difficult if not impossible -- non-trivial lifetime issues can't just be fixed by providing the correct annotations, you need to restructure your whole program.

Shaggar
Apr 26, 2006
my best advice for writing rust code is don't.

Arcsech
Aug 5, 2008

Shaggar posted:

my best advice for writing rust code is don't.

shaggar was wrong

gonadic io
Feb 16, 2011

>>=

MALE SHOEGAZE posted:

i thought I was doing pretty good with rust but I'm just getting destroyed by the borrow checker in the toy interpreter i'm working on. it was going fine up until the point where I needed to create bindings and now it's a mess.

i think my best advice for writing rust code is to do a vertical prototype before you do anything else because trying to add the right lifetimes after the fact is extremely difficult if not impossible -- non-trivial lifetime issues can't just be fixed by providing the correct annotations, you need to restructure your whole program.

post your code (clone everything onto the heap)

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

gonadic io posted:

post your code (clone everything onto the heap)

Working:
https://github.com/daviswahl/monkey_rs/blob/master/src/evaluator.rs

Lifetime explosion:
https://github.com/daviswahl/monkey_rs/blob/functions/src/evaluator.rs

I haven't even started trying to correct the lifetime issues because I need to go back and think about how I actually want to model ownership here. Basically I walk the AST producing Objects. The objects are ultimately returned by the top level "parse" method, so ownership is given to the caller. However, I also need to be able to bind objects into the environment hash. It makes the most sense to give the environment ownership of the objects, because that's where the binding lives.

However, as I said, my program returns ownership of the evaluated object, so there's clearly a conflict there. Now that I type it out it sounds obvious that my evaluator needs to return a reference instead of returning ownership. However if I do that, I won't be able to return a reference to anything that isn't bound into the environment hash, because it would immediately go out of scope.

VikingofRock
Aug 24, 2008




MALE SHOEGAZE posted:

Working:
https://github.com/daviswahl/monkey_rs/blob/master/src/evaluator.rs

Lifetime explosion:
https://github.com/daviswahl/monkey_rs/blob/functions/src/evaluator.rs

I haven't even started trying to correct the lifetime issues because I need to go back and think about how I actually want to model ownership here. Basically I walk the AST producing Objects. The objects are ultimately returned by the top level "parse" method, so ownership is given to the caller. However, I also need to be able to bind objects into the environment hash. It makes the most sense to give the environment ownership of the objects, because that's where the binding lives.

However, as I said, my program returns ownership of the evaluated object, so there's clearly a conflict there. Now that I type it out it sounds obvious that my evaluator needs to return a reference instead of returning ownership. However if I do that, I won't be able to return a reference to anything that isn't bound into the environment hash, because it would immediately go out of scope.

IMO the simplest thing to do here is to wrap all the data in Rc / Arc (+ RefCell if you need to modify the contained data). If that's too expensive, then you need to carefully consider what is responsible for allocating / deallocating the data--this is the owner of the data. If you can convince Rust that the owner of the data will outlive the other thing (which it had better!), then you can have the non-owner contain references to the owner. If for some reason this doesn't work, then your next options are (1) have the non-owner contain some sort of key used to look up and retrieve items in the owned data, or (2) use raw pointers. The last option is not really so bad--you can contain all the unsafety in the access methods, and then provide a safe interface by returning raw_ptr.as_ref().expect("aw crud that pointer was null").

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

ThePeavstenator posted:

Sorry I meant "write unit tests and then run mutants through them". I used PIT in school but I was wondering if mutation testing was even a thing that I should even bother keeping up with for actual jobs.

mutation testing is totally awesome, but something that only .1% of projects will have any use for. there's no point in using some fancy tool to find code which is insufficiently tested when you can just look at your code coverage report and see that 30% of your code is never run at all.

Beamed
Nov 26, 2010

Then you have a responsibility that no man has ever faced. You have your fear which could become reality, and you have Godzilla, which is reality.


Shaggar posted:

my best advice for writing rust code is don't.

the day has finally come where Shaggar was wrong

why do you hate Rust Shaggar

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder

VikingofRock posted:

IMO the simplest thing to do here is to wrap all the data in Rc / Arc (+ RefCell if you need to modify the contained data). If that's too expensive, then you need to carefully consider what is responsible for allocating / deallocating the data--this is the owner of the data. If you can convince Rust that the owner of the data will outlive the other thing (which it had better!), then you can have the non-owner contain references to the owner. If for some reason this doesn't work, then your next options are (1) have the non-owner contain some sort of key used to look up and retrieve items in the owned data, or (2) use raw pointers. The last option is not really so bad--you can contain all the unsafety in the access methods, and then provide a safe interface by returning raw_ptr.as_ref().expect("aw crud that pointer was null").

Thanks, I'll give Rc a shot next time I work on this. I'm not really comfortable enough yet with lifetimes to understand their limits, so I wasn't sure if Rc would just be cheating.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Shaggar posted:

my best advice for writing code is don't.

swr

Shaggar
Apr 26, 2006

Beamed posted:

the day has finally come where Shaggar was wrong

why do you hate Rust Shaggar

its open sores fad lang trash.

Slurps Mad Rips
Jan 25, 2009

Bwaltow!

VikingofRock posted:

I will implement foldl/foldr however I drat well please THANK YOU VERY MUCH
I highly await this thread's C++ gurus telling me everything I did wrong

you actually did it right, its just that std::accumulate exists and you reimplemented it :v:

Mao Zedong Thot
Oct 16, 2008


Shaggar posted:

its open sores fad lang trash.

that the availability of source code even factors into your opinion of a language is just so loving twee lmao

necrotic
Aug 2, 2005
I owe my brother big time for this!
you must not have seen shaggar post before
