Pollyanna
Mar 5, 2005

Milk's on them.


What sort of techniques are good for handling data structures in FP? I've already noticed that things like recursion and tail-call optimization need you to pass in something like an accumulator, your previous value, and your current value, but it's never quite sunk in.


Athas
Aug 6, 2007

fuck that joker

Pollyanna posted:

What sort of techniques are good for handling data structures in FP? I've already noticed that things like recursion and tail-call optimization need you to pass in something like an accumulator, your previous value, and your current value, but it's never quite sunk in.

I don't think there is any answer to this question that lies between "use recursion" and "study algebraic recursion schemes". I suggest reading Purely Functional Data Structures, which is an awesome demonstration of how to engineer simple high-performance data structures in both lazy and strict languages. You can skip the theory if it's not your cup of tea.
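To give a flavour of the two endpoints on something tiny (a sketch; foldr is the canonical recursion scheme for lists, and the names here are mine):
code:
-- Plain recursion: you write the traversal yourself.
total :: [Int] -> Int
total []     = 0
total (x:xs) = x + total xs

-- Recursion scheme: the traversal is factored out into foldr (a
-- catamorphism); you only supply the combining step and the base case.
total' :: [Int] -> Int
total' = foldr (+) 0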

Unfortunately, in a cruel twist of fate, the data structures you usually want to use in a purely functional language (recursive structures) tend to be terrible for parallelisation, despite the fact that the code itself has no side effects, and thus should not be too hard to run in parallel. I have yet to see a really convincing parallel functional language - NESL is probably the best I have seen, but it's old.

Athas fucked around with this message at 20:50 on Aug 3, 2015

MononcQc
May 29, 2007

Pollyanna posted:

What sort of techniques are good for handling data structures in FP? I've already noticed that things like recursion and tail-call optimization need you to pass in something like an accumulator, your previous value, and your current value, but it's never quite sunk in.
A bit of rambling ahead.

So that's quite a topic, and there's a lot of stuff to optimize for. The big two are always memory and CPU. Tail recursion with an accumulator goes for the memory side: the idea is to avoid growing the stack with pending operations, and to instead do them progressively, one at a time, with a constant-depth stack.
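For instance, in Haskell (a sketch; the function names are mine):
code:
-- Naive reverse: each step leaves a pending (++) on the stack, and
-- the repeated appends make it O(n^2) overall.
revNaive :: [a] -> [a]
revNaive []     = []
revNaive (x:xs) = revNaive xs ++ [x]

-- Tail-recursive reverse: the accumulator carries the work done so far
-- (the reversed prefix), the recursive call is the last thing the
-- function does, and the stack stays constant-depth. O(n).
revAcc :: [a] -> [a]
revAcc = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (x:acc) xs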

But more generally, the big issue with functional data structures is that a mutable model tends to use an O(1)-access data structure as its major building block: an array where each element can be modified individually and accessed 'instantly'. Pointers and references help in the same way, because you can keep a view into any part of a data structure (like the two ends of a linked list) to get fast append and prepend operations.

Those will be at the core of a lot of dynamic algorithms and of simple data structures like buffers of all kinds, and they let you shave off large complexity factors by using a pointer or reference to cache a given part of a data structure and access it instantly. The cost, obviously, is that dangling pointers or references to individual elements are risky: either they need to be copied explicitly (and deeply), or they must never be held for too long, unless you feel like risking funny mutations here and there. So you can deep copy, but that's super expensive.

To cope with that problem, functional languages like Haskell or Erlang introduce sharing. Let's say I have a tree T:

code:
      d
    /   \
   b     g
 /  \   /  \
a    c f    h
Now let's say my code does:
code:
Old = T, % regular assignment
New = insert(e, Old)
Now here, in a language like Erlang or Haskell, I can use both Old and New as distinct trees safely: keep them around as caches, roll back to them, whatever. To avoid being super expensive when you update the tree, the nodes the two trees have in common are shared, and the rest of the new tree points into the old one, giving you something a bit like this in memory:
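(A sketch: primes mark freshly allocated nodes; b and h are the very same nodes in memory as in the Old tree, which is untouched and still rooted at d.)
code:
New = d'
     /  \
    b    g'
        /  \
       f'   h
      /
     e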
Basically, the entire unchanged subtree starting at b is shared between the Old and New trees, and only the path that changed gets rebuilt. The h node is also shared, because it had no reason to change. The f node had to be copied and rewritten so that its children could change, but note that even if the content of f is large, it doesn't have to be copied literally: if my node is {Current, Left, Right}, I need to allocate a new node, but it can still point to the existing 'Current' value from the older tree.

This is efficient and useful, and means each new element only costs O(log n) freshly allocated nodes across trees, rather than a copy of the whole thing.
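In Haskell, that insert is plain path copying (a minimal sketch with an unbalanced search tree; the names are mine):
code:
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Only the nodes on the path down to the insertion point are
-- reallocated; every subtree we don't descend into is shared
-- as-is with the old tree.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
  | x < v     = Node (insert x l) v r   -- r is shared with the old tree
  | x > v     = Node l v (insert x r)   -- l is shared with the old tree
  | otherwise = t                       -- already there: share everything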

So what about arrays? Well, let's imagine an array of 10 elements: [0 1 2 3 4 5 6 7 8 9]. If those 10 elements are stored in the array directly, then every time I modify the array I have to copy 100% of it to avoid mutation. If I instead store 10 pointers, I can update one element by creating a new array where 9 pointers go to the same data as the old one, plus one new pointer:
code:
old     data     new
0   ->   a   <-   0
1   ->   b   <-   1
2   ->   c        2 --> f
3   ->   d   <-   3
4   ->   e   <-   4
...     ...       ...
Sooner or later (as the array grows), copying every pointer for each individual update becomes very expensive, and you end up allocating a lot of memory for nearly no changes. So what do you do? You gradually turn your arrays into trees: copying O(log n) pointers per update, tree-style, is a lot cheaper than copying all of them. It's a compromise.
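A toy version of that trade-off in Haskell (a sketch: a complete binary tree addressed by the bits of the index; real libraries use much wider branching, but the shape of the idea is the same):
code:
import Data.Bits (testBit)

data Arr a = Tip a | Branch (Arr a) (Arr a)

-- 'd' is the depth of the tree, so it holds 2^d elements.
get :: Int -> Int -> Arr a -> a
get _ _ (Tip x) = x
get d i (Branch l r)
  | testBit i (d - 1) = get (d - 1) i r
  | otherwise         = get (d - 1) i l

-- Setting slot i copies only the O(log n) nodes on the path to it;
-- every other subtree is shared with the old array.
set :: Int -> Int -> a -> Arr a -> Arr a
set _ _ x (Tip _) = Tip x
set d i x (Branch l r)
  | testBit i (d - 1) = Branch l (set (d - 1) i x r)
  | otherwise         = Branch (set (d - 1) i x l) r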

So that's the big secret of functional data structures, and immutable ones in general. All you have is trees and singly-linked lists (which are just very flat trees, by the way), and you make do with them. The vast majority of the work on functional data structures therefore goes into something like:
  1. Find a way to get your compiler or hardware to say 'this here is a safe mutation' and slip it in as a very clever optimization
  2. use some language extension (unsafe mode, FFI, etc.) to do your thing
  3. organize elements in the tree or list or heap in such a way that your common use case is O(1) and infrequent operations are more expensive, but not too bad

The compiler people (and the Rust people) will spend a lot of time on 1. The functional people (such as Chris Okasaki) will spend a lot of time on 3.

So you end up with the fancy Okasaki data structures and others:
  • leftist heaps for frequent access to a max/min element
  • double-ended queues (keep a list of 'IN' elements and a list of 'OUT' elements) for quick insertion and extraction at both ends
  • trees and heaps of various branching factors
  • representing sequences or pieces of data as tries instead of lists of elements
For example, if I want to test for the presence of an item in a structure, I could build a trie of hashes and get O(log_256 N) matching rather than the O(1) of a mutable hashmap, which, hey, turns out not to be that bad! I can even break the work up into many parts, build partial tries, and eventually merge them together, parallelizing part of my 'trie-building work'.
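A cheap way to play with that in Haskell (a sketch; Data.IntSet is a big-endian patricia trie under the hood, and hash comes from the hashable package -- both library choices are my assumption):
code:
import qualified Data.IntSet as IntSet
import Data.Hashable (hash)

-- Build one trie per chunk of work (chunks could be built in parallel)...
buildChunk :: [String] -> IntSet.IntSet
buildChunk = IntSet.fromList . map hash

-- ...then merging the partial tries is just a union.
buildAll :: [[String]] -> IntSet.IntSet
buildAll = foldr (IntSet.union . buildChunk) IntSet.empty

-- Hash collisions make this probabilistic (Bloom-filter-ish); store the
-- values themselves in a trie instead if that matters.
member :: String -> IntSet.IntSet -> Bool
member x = IntSet.member (hash x)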

There are plenty of sources for such data structures. What makes the Okasaki book (Purely Functional Data Structures) so interesting is that he's done the analysis in a way where the cost of an expensive operation can be considered a kind of 'debt' paid off over the lifetime of the data structure, so that each operation comes out O(1) if you spread the expensive ones over the lifetime of things. To illustrate, let's use the double-ended queue.

If I were to use a singly-linked list as a queue, I would have the following stuff:
code:
[6, 5, 4, 3, 2, 1]
where 1 is the oldest element. Fetching it would require me to rewrite the entire linked list, giving me O(1) insertion but O(N) extraction. One option would then be to split my thing into two lists:
code:
New           Old
[6, 5, 4, 3], [1, 2]
I pop the head off Old and it costs me O(1) (the first element of a list is O(1) to add or remove). I add '7' and it's O(1). So while both lists are non-empty, I have O(1) insertion and O(1) extraction. The trick is what happens when the Old list is empty:
code:
New              Old
[7, 6, 5, 4, 3], []
We reverse the New list (an O(n) operation) and turn it into the Old list:
code:
New Old
[], [3, 4, 5, 6, 7]
That means that my average case for popping an element is O(1), but my worst case is O(N) -- whenever I need to flip the New list into an Old one.

And that's where the amortization handwaving takes place: if reversing the list is O(N) and I pay that cost only once per element, then over a long time I can smudge its cost into each insertion or removal (which are O(1)). Because I pay 1/N of the reversal cost for each element, and only once, I can pretend that over the lifetime of the program the reversal costs O(1) per element. As such, I say that my amortized cost for the queue is O(1) for insertion and removal.
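In Haskell, the whole queue fits in a few lines (a sketch; the names are mine):
code:
data Queue a = Queue [a] [a]  -- the New (in) list and the Old (out) list

push :: a -> Queue a -> Queue a
push x (Queue new old) = Queue (x : new) old            -- always O(1)

pop :: Queue a -> Maybe (a, Queue a)
pop (Queue [] [])       = Nothing
pop (Queue new [])      = pop (Queue [] (reverse new))  -- the rare O(n) flip
pop (Queue new (x:old)) = Just (x, Queue new old)       -- the common O(1) case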

So functional data structures often end up having the right kind of tree to optimize the very common use case we have, and then make sure (as much as possible) that all other rarer and costlier operations only need to be done once per element in the data structure, letting us handwave it as a lower amortized cost.

...

There's one last category, which is "I want to represent circular references and doubly-linked data structures the way I can with pointers", and the answer to that is zippers. I'm a bit short on time for now but can expand on this if asked to.

xtal
Jan 9, 2011

by Fluffdaddy

Athas posted:

I don't think there is any answer to this question that lies between "use recursion" and "study algebraic recursion schemes". I suggest reading Purely Functional Data Structures, which is an awesome demonstration of how to engineer simple high-performance data structures in both lazy and strict languages. You can skip the theory if it's not your cup of tea.

Unfortunately, in a cruel twist of fate, the data structures you usually want to use in a purely functional language (recursive structures) tend to be terrible for parallelisation, despite the fact that the code itself has no side effects, and thus should not be too hard to run in parallel. I have yet to see a really convincing parallel functional language - NESL is probably the best I have seen, but it's old.

What do you mean by that? I've enjoyed writing concurrent code in Clojure, Haskell and Scala much more than other languages. Immutable data structures and pure functions make for dead simple parallelism.

In any case, I'm starting to see that the right way to do concurrency is multiple processes communicating through a message broker.

QuantumNinja
Mar 8, 2013

Trust me.
I pretend to be a ninja.

MononcQc posted:

There's one last category, which is "I want to represent circular references and doubly-linked data structures the way I can with pointers", and the answer to that is zippers. I'm a bit short on time for now but can expand on this if asked to.

For zippers:
http://learnyouahaskell.com/zippers
http://strictlypositive.org/diff.pdf
https://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf
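The core trick is tiny; for a plain list it's just (a sketch, partial functions and all, to show the shape):
code:
-- A zipper is (everything behind you, reversed) and (everything ahead);
-- the focused element is the head of the second list.
type ListZipper a = ([a], [a])

left, right :: ListZipper a -> ListZipper a
left  (b:bs, xs) = (bs, b:xs)  -- step backward in O(1)
right (bs, x:xs) = (x:bs, xs)  -- step forward in O(1)

modify :: (a -> a) -> ListZipper a -> ListZipper a
modify f (bs, x:xs) = (bs, f x : xs)  -- edit at the focus in O(1)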

xtal posted:

What do you mean by that? I've enjoyed writing concurrent code in Clojure, Haskell and Scala much more than other languages. Immutable data structures and pure functions make for dead simple parallelism.

In any case, I'm starting to see that the right way to do concurrency is multiple processes communicating through a message broker.

The rebuilding process in MononcQc's last post is part of the rub. If each process has a pointer to D at the top of the tree, then how do they each learn that D has changed to the new D because a single process needed to add something? They need a shared pointer, right? And if each thread needs to put things into this tree, they all need to update not just that shared pointer, but also account for the new nodes another thread may have generated during its own insertion. You end up needing a safety mechanism like an atomic lock on that reference; otherwise, process A may read node D, rebuild part of the tree, then yield to process B, which reads node D, rebuilds part of the tree, then yields; then process A updates D to be D_A and yields, and then process B updates D with D_B. D_A's updates are entirely lost, because the nodes built as part of D_A were only ever visible to process A and its update. An atomic lock on the structure will solve this, but that's a huge slowdown in the program. The other thing you can do is 'queue' updates into a single thread that has final say over everything, but even then, requesting values from D means the manager has to discharge all of the pending updates that might be relevant before answering the lookup.

In general, the parallelism is dead simple if it's in the form of a map or a fold: if you can discretize the work and combine it "later". If the threads all need to share and update data as the computation proceeds, though, you're in a lot of trouble. In fact, this is the entire motivation of LVars, which have been used to address these sorts of problems.
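For the map-shaped case, a Haskell sketch using the parallel package (compile with -threaded; the workload function is a stand-in):
code:
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Each list element is evaluated to normal form in its own spark;
-- nothing is shared or mutated, so no locks are involved.
results :: [Int]
results = parMap rdeepseq work [1 .. 1000]
  where
    work n = sum [1 .. n * 1000]  -- stand-in for an expensive pure function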

QuantumNinja fucked around with this message at 19:27 on Aug 3, 2015

MononcQc
May 29, 2007

I have written http://ferd.ca/yet-another-article-on-zippers.html explaining the design of zipper lists, zipper binary trees, and zipper forests (arbitrary trees); it has a few sources in there too :toot:

QuantumNinja posted:

The rebuilding process in MononcQc's last post is part of the rub. If each process has a pointer to D at the top of the tree, then how do they each learn that D has changed to the new D because a single process needed to add something? They need a shared pointer, right? And if each thread needs to put things into this tree, they all need to update not just that shared pointer, but also account for the new nodes another thread may have generated during its own insertion. You end up needing a safety mechanism like an atomic lock on that reference; otherwise, process A may read node D, rebuild part of the tree, then yield to process B, which reads node D, rebuilds part of the tree, then yields; then process A updates D to be D_A and yields, and then process B updates D with D_B. D_A's updates are entirely lost, because the nodes built as part of D_A were only ever visible to process A and its update. An atomic lock on the structure will solve this, but that's a huge slowdown in the program. The other thing you can do is 'queue' updates into a single thread that has final say over everything, but even then, requesting values from D means the manager has to discharge all of the pending updates that might be relevant before answering the lookup.

In general, the parallelism is dead simple if it's in the form of a map or a fold: if you can discretize the work and combine it "later". If the threads all need to share and update data as the computation proceeds, though, you're in a lot of trouble. In fact, this is the entire motivation of LVars, which have been used to address these sorts of problems.

That's one of many ways. The common thread is that you still do your parallelism from the point of view of a central authority, so yes:
  • using locks
  • using atomic operations / lock-free data structures
  • fan-in/fan-out of the work, with a central thread/process doing the writing and updating of elements
  • read-only references to a shared data structure. Want a newer one? Ask for a new copy; each copy is immutable anyway. Somewhat reminiscent of MVCC, which I believe Clojure provides
  • L-Vars ensuring only safe execution paths are taken
  • Software transactional memory
Then you have another big family of options, where each process is allowed to write/update its own vision of the data structure, and you try to have them converge (usually by some form of message-passing). This brings you into the realm of [strong] eventual consistency and other compromises:
  • lamport clocks
  • vector clocks
  • version vectors
  • CRDTs
  • using monotonic values to pick last-write-wins values (see the sketch after this list)
  • other similar variations
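To make the last-write-wins item concrete, a minimal Haskell sketch (a toy LWW register; the 'clock' is whatever monotonic stamp your writers can produce):
code:
-- Each replica keeps the value with its write stamp. Merge keeps the
-- newest write; it is commutative, associative, and idempotent, so
-- replicas converge regardless of message ordering or duplication.
-- (Real versions break stamp ties with a replica id so that every
-- side picks the same winner.)
data LWW a = LWW { stamp :: Int, val :: a }

merge :: LWW a -> LWW a -> LWW a
merge x y
  | stamp x >= stamp y = x
  | otherwise          = y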

This is still complex, but the advantage of most functional languages is that any breach of nothing-shared/immutability-everywhere will be the exception rather than the norm; you can't easily end up accidentally writing code that isn't thread safe. In a language like Haskell, the restrictions on global state also mean that, generally, the weird rear end interactions that cause problems are constrained in ways most languages won't match, and that's where a lot of the ease comes from.

Different languages do different stuff. Mercury or some Schemes will do automatic parallelism, Erlang has everything explicit (processes and message passing, some tables with shared global write spaces, though the data from them is copied to processes), Haskell has various options (say, software transactional memory), and so on.

giogadi
Oct 27, 2009

Great posts, everyone.

Tangential: I found Okasaki's book to be impenetrable. It reads like a Ph.D. thesis (because it is) that is more about flaunting his personal results than genuinely trying to teach the reader. I admit that I'm rusty in amortized complexity; would I have an easier time if I went back and studied that first? I'd love to see a reader-friendly, Haskell-based treatment of data structures that included graphs (although I know that's a whole other can of worms).

Just venting here because that drat book made me feel reeeeeeeal dumb.

MononcQc
May 29, 2007

The book shows a bunch of data structures; the amortization analysis really boils down to mathematical proofs of "this operation happens at most once per element, ever, therefore amortized!", and the use of laziness (with delay/force) is mostly so that if someone reuses the same data structure many times (say I re-pop the same element 500 times because I keep old references to the immutable data structure), memoization makes sure I never pay the cost more than once.
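In GHC you get the delay/force part for free, since every value is a memoized thunk (a sketch):
code:
-- 'flipped' is a suspended computation: reverse runs the first time
-- anything demands it, the result replaces the thunk in place, and
-- every later use is free -- even though it's an immutable value we
-- may keep a reference to for a long time.
example :: (Int, Int)
example = (head flipped, last flipped)
  where
    xs      = [1 .. 1000000] :: [Int]
    flipped = reverse xs  -- forced at most once, then shared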

You can look at the appendix (for the Haskell code) or just go over the ML code and reimplement it as is, and you'll get a bunch of useful data structures.

As a non-academic who does a bunch of functional programming, I'd say I've found a few main uses for the book:

- a dictionary of data structures I can steal and reimplement
- a way to see how I could rig my own data structures and think about their performance -- the book has tricks for things like how or when to rebalance
- a bunch of tricks that can be readapted from one data structure to the next, and gotchas about use cases (it is possibly faster to do a negative lookup ['thing doesn't exist'] in a trie than in a hash)

The big analysis bit is useful to wrap your head around, but being able to replicate it has not proven essential to me once I got a kind of intuitive feel for it.

QuantumNinja
Mar 8, 2013

Trust me.
I pretend to be a ninja.

MononcQc posted:

As a non-academic who does a bunch of functional programming, I'd say I've found a few main uses for the book:

- a dictionary of data structures I can steal and reimplement
- a way to see how I could rig my own data structures and think about their performance -- the book has tricks for things like how or when to rebalance
- a bunch of tricks that can be readapted from one data structure to the next, and gotchas about use cases (it is possibly faster to do a negative lookup ['thing doesn't exist'] in a trie than in a hash)

The big analysis bit is useful to wrap your head around, but being able to replicate it has not proven essential to me once I got a kind of intuitive feel for it.

Incidentally, the same thing could be said about CLRS, and I think that's the point: Okasaki's book is meant as a reference guide that lets you catch onto the patterns yourself. Treat it as such, and it's nice. Even as an academic, I think about the book this way.

sarehu
Apr 20, 2007

(call/cc call/cc)
I thought Okasaki's book was an exceptionally nice book to read from cover to cover.

Athas
Aug 6, 2007

fuck that joker

sarehu posted:

I thought Okasaki's book was an exceptionally nice book to read from cover to cover.

It is true that the proofs can be hairy (especially when done in two different ways), and the chapter on data-structural bootstrapping can be hard to understand, but these can be skipped. It's still worthwhile to read the implementations of the data structures and the high-level arguments for why they are efficient.

dirby
Sep 21, 2004


Helping goons with math

gonadic io posted:

Automatic memoization is extremely unintuitive and finicky in Haskell. If you need something to memoise, either store it in an explicit list and pass it around, or do the same kind of test as in that post. Or you could just use a library like memo that does it for you behind the scenes.
Thanks for this post. Haskell is really intuitive for me to code in for math and it's good to know what it takes to get memoization working right.

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
Totally new to Haskell (learning it for a class in school)

It seems pretty cool so far, but I'm stumped on what I thought was a pretty basic operation:

I have defined a data type like so:

data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun

and I need to create an instance of Show that will print out full names of the days like

> show Mon
"Monday"

My textbook says that defining the Day data type this way is an enum type, so I thought "well this is easy! I can just make a list of the strings of day names and use a day value as the index!"

Haskell does not work that way, apparently.

I thought I was going to write something like
code:
getDayName :: Day -> String
getDayName m = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]!!m

instance Show Day where
        show m = show (getDayName m)
but m needs to be converted to an int. Okay, that makes sense. But googling how to convert an enum type into an int makes it seem like this is a totally wrong thing to do and not the Haskell way of doing things.

am I barking up the wrong tree here?

dirby
Sep 21, 2004


Helping goons with math

Snak posted:

Totally new to Haskell (learning it for a class in school)

It seems pretty cool so far, but I'm stumped on what I thought was a pretty basic operation:

am I barking up the wrong tree here?
Your approach would require building a mapping to ints and this list, neither of which you need. Just cut out the middlemen: you want show Mon = "Monday", etc., so write those...

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer

dirby posted:

Your approach would require building a mapping to ints and this list, neither of which you need. Just cut out the middlemen: you want show Mon = "Monday", etc., so write those...

Wait, really? I should just make a statement for each possible value of Day? I thought that sounded like a poor solution that does not scale well...

dirby
Sep 21, 2004


Helping goons with math

Snak posted:

Wait, really? I should just make a statement for each possible value of Day? I thought that sounded like a poor solution that does not scale well...
If your type has more than, say, ten constructors, then you're probably designing your type wrong. That said, it's certainly possible to use enumerated-type stuff to do this, so if you specifically want help doing that, say so, I guess.

dirby fucked around with this message at 01:13 on Sep 17, 2015

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

Snak posted:

Wait, really? I should just make a statement for each possible value of Day? I thought that sounded like a poor solution that does not scale well...

Well, think about what you need to write. This is what you need:
code:
show :: Day -> String
There's a way to do it with a list like you were trying to, but what you've actually written amounts to this:
code:
show :: Int -> String
In a simple case like this, it's simpler to just match on each possible Day instead of involving Enum and list indexing.

fart simpson fucked around with this message at 01:19 on Sep 17, 2015

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer

dirby posted:

If your type has more than, say, ten constructors, then you're probably designing your type wrong. That said, it's certainly possible to use enumerated-type stuff to do this, so if you specifically want help doing that, say so, I guess.

nope, I just didn't want my professor to look at my code and say "well, you hardcoded every value like an idiot freshman" instead of using all the map stuff we talked about in class. But since it sounds like the map approach is more complicated and expensive here, the way you suggested, which I dismissed, is the better way.

edit:^ yes, because I thought there would be an easy way to get at the int value of an enum type, like there is in C.

edit2: But thanks, I got it working no problem now.

Snak fucked around with this message at 01:23 on Sep 17, 2015

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

Well, there is a way to do it. It'd involve another typeclass, called Enum: https://hackage.haskell.org/package/base-4.8.1.0/docs/Prelude.html#t:Enum

You would want to define an instance of Enum for your Day type. Or you could derive the instance in this case, but I don't know if you know deriving yet?

e: Since you already have it working, I'll show you a way of doing it more along the lines of how you originally approached the problem:
code:
data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun deriving (Enum)

instance Show Day where
  show day = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"] !! fromEnum day
In a simple case like this, I'd prefer to just hardcode in the pattern matches rather than involving Enum though.

fart simpson fucked around with this message at 01:32 on Sep 17, 2015

KaneTW
Dec 2, 2011

I can't think of anything better than:
code:
instance Show Day where
  show Mon = "Monday"
   ...
The solution using indexing is rather ugly and not really Haskelly.

I wouldn't derive Enum because
code:
> succ Sun
*** Exception: succ{Day}: tried to take `succ' of last tag in enumeration
makes no sense.

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

Oh yeah there's that too.

hannibal
Jul 27, 2001


Snak posted:

edit:^ yes, because I thought there would be an easy way to get at the int value of an enum type, like there is in C.

This is the assumption you have to drop - algebraic data types in Haskell are not like enums in C, usage of the Enum typeclass notwithstanding. There's no underlying integer that you can just reach in and grab. If anything, you can forget most of what you know from C/C++ when doing Haskell.

QuantumNinja
Mar 8, 2013

Trust me.
I pretend to be a ninja.
If you absolutely want to do list lookup-based showing, I think you're better off calling lookup:

code:
module Days where

import Data.Maybe

data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun
  deriving (Eq)

dayStrings :: [(Day, String)]
dayStrings = [ (Mon, "Monday")
             , (Tue, "Tuesday")
             , (Wed, "Wednesday")
             , (Thu, "Thursday")
             , (Fri, "Friday")
             , (Sat, "Saturday")
             , (Sun, "Sunday")
             ]

instance Show Day where
  show d = fromMaybe "" $ lookup d dayStrings
At the point where you've written this code, however, it's pretty drat clear that you should have just directly written the seven LHS/RHS pairs of the Show instance.

raminasi
Jan 25, 2005

a last drink with no ice
There's no way around hardcoding an explicit mapping for each Enum value somewhere.

Pollyanna
Mar 5, 2005

Milk's on them.


Might as well do it in a config file or something.

Sedro
Dec 31, 2008
For the most Robust, Scalable Enterprise Solution you should use a time service. http://www.timeapi.org/utc/mon?format=%25A

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

GrumpyDoctor posted:

There's no way around hardcoding an explicit mapping for each Enum value somewhere.

But maybe you can trick the compiler into doing it for you!

tazjin
Jul 24, 2015


GrumpyDoctor posted:

There's no way around hardcoding an explicit mapping for each Enum value somewhere.

Strictly speaking, in this case he could change the constructor names to the full weekday names and derive Show.
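i.e. (a sketch of that):
code:
data Day = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
  deriving (Show)

-- show Monday == "Monday", with no hand-written instance.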

Also, unless explicitly forbidden by the task you're doing, you should probably use the time library for time & date stuff.

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
Wow. Thanks for all the feedback. It was a really simple, basic assignment to demonstrate we knew how to make basic Haskell functions. I was only confused because I don't have a feel for "the Haskell way" and I was bringing a bunch of C baggage along for the ride.

tazjin
Jul 24, 2015


Snak posted:

Wow. Thanks for all the feedback. It was a really simple, basic assignment to demonstrate we knew how to make basic Haskell functions. I was only confused because I don't have a feel for "the Haskell way" and I was bringing a bunch of C baggage along for the ride.

Generally when you come across an explanation for concepts in Haskell and read sentences like "this is comparable to an enum in C", don't take them too seriously and don't make the assumption things will work the same way.

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
I'm back with more super-newbie questions about Haskell...


To use my previous example, let's say I have defined a type
code:
data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun deriving (Eq, Enum, Show)
Is there a way for me to iterate through the possible values of the Day type?

Like if I wanted to do a "for each day in (Mon, Tue, Wed, Thu, Fri, Sat, Sun)" type thing, is that possible to get just from the type itself, or do I need to hardcode a list?

Votlook
Aug 20, 2005
You can just hardcode it like this.

code:
days = [Mon..Sun]
You can use the [from..to] syntax here because Day is an instance of Enum.
For a more generic solution, make your enum type an instance of Bounded, like so:

code:
data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun deriving (Eq, Enum, Bounded, Show)
This provides implementations of minBound and maxBound (Mon and Sun in this case).
You can then use minBound and maxBound to define the list of all enum elements:

code:
enumValues = [minBound .. maxBound]
Note that this is a generic list for any type that is an instance of both Enum and Bounded, e.g.:

code:
enumValues :: (Bounded t, Enum t) => [t]
If Haskell can't infer the 't' (Day in your example), you'll need to provide the type, like so:

code:
enumValues :: [Day]
You can then use the map function to do stuff to each element, like so:

code:
map show (enumValues :: [Day])
This works for any type that implements Enum and Bounded, so

code:
data PrivatePart = Dick | Balls deriving (Eq, Enum, Bounded, Show)
map show (enumValues :: [PrivatePart])
will just show each private part.

Votlook fucked around with this message at 21:18 on Sep 28, 2015

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
awesome. That is so great. Thanks

UncleBlazer
Jan 27, 2011

Can anyone give me a simple example of monad transformers? I get my regular monads (Maybe, Either, IO, Reader, Writer, State), but transformers look important and I can't seem to find a decent explanation of them. I get that they stack monads, and that's it.

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
I have another horribly noobly question:

I know that defining variables inside the then clause of an if "statement" is a terrible and un-Haskell-y thing to do, but I don't know how else to accomplish what I'm trying to do. I have a great and effective Haskell expression that gets me a list of two elements. I need the difference of those two elements: I want the then clause to return the difference between the value at index 1 and the value at index 0. I cannot figure out how to do it.

So i thought "oh, I will make a variable" and it looks something like this
code:
foo :: ( a -> a ) -> Maybe Int
foo f = if <some basic condition>
		then
			Just (list!!1 - list!!0) where list =  <long expression that generates the list>
		else
			Nothing
And no matter what I do it says there's a missing else clause.

It would probably work if I just skipped the variable and put the whole expression that generates the list in both places I currently use the list variable, but that would be really ugly

edit: it does in fact work perfectly if I do that.
I was going to say that I can't generate the list outside of the then clause, because then it will search an infinitely iterated list for a term that may never appear, but because of lazy evaluation, can I safely do that? Because it won't evaluate until it's returned in the then clause anyway? That sounds horrible though.

Snak fucked around with this message at 00:19 on Sep 29, 2015

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

Snak posted:

I know that defining variables inside the then clause of an if "statement" is a terrible and un-Haskell-y thing to do, but I don't know how else to accomplish what I'm trying to do. I have a great and effective Haskell expression that gets me a list of two elements. I need the difference of those two elements: I want the then clause to return the difference between the value at index 1 and the value at index 0. I cannot figure out how to do it.

Either lift the where out of the if:
code:
foo f = if <some basic condition>
    then Just (list!!1 - list!!0)
    else Nothing
  where list = <long expression that generates the list>
or use a let expression:
code:
foo f = if <some basic condition>
    then let list = <long expression that generates the list>
      in Just (list!!1 - list!!0)
    else Nothing
Because of lazy evaluation, these are both exactly the same (the list will not be created unless it is actually required). They're also both approximately equivalent in terms of "looking like normal Haskell" and if you read other people's code you'll see lots of both. Pick whichever one you prefer.

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
Thanks. That also makes sense. I really like Haskell and functional programming so far, it's just. so. different.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Free OCaml course

Snak
Oct 10, 2005

I myself will carry you to the Gates of Valhalla...
You will ride eternal,
shiny and chrome.
Grimey Drawer
I am back with more last-minute homework-related questions.

In my class we have been using Haskell primarily to explore the grammar and syntax of languages. To that end, our homework uses a very simple "while language" we have defined in data types. This aspect of things is pretty intuitive to me, so I'm not having a lot of trouble there. The following is an excerpt from my homework, and is the subject of my current confusion:
(variables are declared with the "newint" statment)

quote:

Below is a code outline for this static check. (checkS s []) is True if s contains no undeclared variables. (check (NewInt x c) ds) records in ds that x is declared when it recursively checks c. Complete the definitions of checkS, checkE and checkB. Test your solution with check below. In particular, (check "newfib.wh") should be True and (check "fib.wh") should be False.
Specifically the sentence that I bolded. As usual, it is late on the due date, so if you aren't able to help me I completely understand.

I believe that "check" in (check (NewInt x c) ds) is supposed to be "checkS", because check only takes one parameter: fname. I assume that this must be the case. So checkS's second parameter is the place where the result of the check will be stored, which is why it is initially empty. The c in "NewInt x c" is the code block in which x is defined. Which makes sense; c is a statement, which means it can be a sequence of statements. So we can recursively checkS each of these statements. In order to determine... something.

I guess I'm having trouble understanding the fail condition here. A program that declares its variables with NewInt will have newint statements at the beginning of its program, and these statements are "hit" first by checkS, hence the need for recursion from that point, as described in the example. But what are we looking for? We're looking for variables that aren't declared by a newint. So what does that look like? Well, it's any variable that isn't currently inside the recursion started by its newint statement.

So that's what I understand. But I still have no idea how to implement it. Sorry for rambling, I wanted to describe my understanding as accurately as possible.

If my understanding is sort of correct, is there some kind of hint that you could give me to help me understand what the failure scenario of this test looks like?

For reference, here is the skeleton provided so you can see the structure of the implementation I am expected to use.
code:
checkS :: Stat -> [Ident] -> Bool
checkS c r = case c of
                 (Skip _)       -> undefined
                 (Assign x e _) -> undefined
                 (Seq cs)       -> undefined
                 (If b _ s1 s2) -> undefined
                 (While b _ c)  -> undefined
                 (NewInt i c)   -> undefined
I am not asking for anyone to give me the answer or help me cheat.


sarehu
Apr 20, 2007

(call/cc call/cc)
checkS takes a statement and the list of variables that are in scope. If you want to know whether a variable's in scope, see if it's in that list.
