  • Locked thread
Cryolite
Oct 2, 2006
sodium aluminum fluoride

KernelSlanders posted:

I use it for bigdataish tasks ... spark... large scale ML... NLP... data science... data exploration that I used to do in python/numpy/pandas...

I'm learning Scala coming from C# and this is exactly the type of stuff I'd like to eventually be doing too, bonus points if it's in Scala. Is your company by any chance in the Baltimore/DC area? :allears:

I really, really want to jump ship off of .NET and maybe into a position writing Scala next, but it looks like there aren't too many companies looking for it (or at least near me).

Cryolite
Oct 2, 2006
sodium aluminum fluoride

TE! posted:

can't pm you and you have no way to contact you in your profile.

Fixed. You can e-mail me at ubiquitous.croak@gmail.com.

Cryolite
Oct 2, 2006
sodium aluminum fluoride

Steve French posted:

I was hired to write scala having not written a line of it.

Are you in the Bay area, or a more normal city/place?

What did you do before? Lots of Java?

Cryolite
Oct 2, 2006
sodium aluminum fluoride

KernelSlanders posted:

It's not. It is somewhat unique to tech companies though. When an airline hires programmers they tend to do it through an HR department filled with people who print out forms to fax to people and they like simple rules like "ten years of python experience."

Steve French and sarehu, I agree with you completely, but I'm finding the above is true for at least some companies near me. I talked to a place recently that was excited I'm learning Scala (since, they told me, they have trouble finding people who know it), but I don't qualify because I haven't used JRuby on Rails. Not Ruby on Rails, which I do have some experience with, but specifically JRuby on Rails.

However, after talking with them a little bit, I think the common refrain is true: I wouldn't want to work there anyway!

Cryolite
Oct 2, 2006
sodium aluminum fluoride
Why isn't ArrayBuffer[String] an Iterable[String]?

Let's say I have an ArrayBuffer like this:

code:
import scala.collection.mutable.ArrayBuffer
val a = ArrayBuffer[String]("AAA", "BBB", "CCC")
// I can make a string of the elements with some separator like this:
a.mkString(", ") // "AAA, BBB, CCC"

Coming from C# I'm used to being able to do string.Join(", ", a) which takes an IEnumerable<String>. If I pass a List<String> it works because that's an IEnumerable<String>.

In Scala I have to import asJavaIterable to get an implicit conversion from ArrayBuffer[String] to java.lang.Iterable[String] so that the same thing works:

code:
import scala.collection.JavaConversions.asJavaIterable
String.join(", ", a)

Is this the only way to get this to work (for this overload of join)? Why was it built this way? Why couldn't ArrayBuffer[String] implement Iterable? Going through the inheritance hierarchy I see the Iterable trait in there, which after some confusion I realized is Scala's own Iterable and not the same as Java's Iterable interface.
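
For what it's worth, the two Iterables can also be bridged explicitly instead of through an implicit conversion. A minimal sketch, assuming Scala 2.11 to 2.13 where scala.collection.JavaConverters is available (2.13 deprecates it in favor of scala.jdk.CollectionConverters); the object and value names are made up for illustration:

```scala
import scala.collection.JavaConverters._
import scala.collection.mutable.ArrayBuffer

object IterableBridge {
  val a = ArrayBuffer[String]("AAA", "BBB", "CCC")

  // ArrayBuffer is a scala.collection.Iterable, so Scala's own API works directly.
  val viaScala: String = a.mkString(", ")

  // asJava wraps the buffer in a java.util.List view, which is a java.lang.Iterable,
  // so the Iterable overload of String.join accepts it.
  val viaJava: String = String.join(", ", a.asJava)
}
```

Both values come out as the same comma-separated string; the only difference is which side's Iterable the call expects.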

I can also get it to work doing this:

code:
String.join(", ", a: _*)

...which converts a to a CharSequence... varargs argument for the other overload of join. What is that : _* doing? Is that an operator?
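
As far as I can tell, : _* is not an operator. It is a type ascription that tells the compiler to pass a sequence as the repeated (varargs) arguments instead of as a single argument. A small sketch; joinAll is a hypothetical method, and a List is used so the same code compiles on newer Scala versions that want an immutable Seq here:

```scala
object VarargsSketch {
  // A hypothetical varargs method: inside the body, `parts` is just a Seq[String].
  def joinAll(sep: String, parts: String*): String = parts.mkString(sep)

  val words = List("AAA", "BBB", "CCC")

  // Passing the elements one by one.
  val oneByOne: String = joinAll(", ", "AAA", "BBB", "CCC")

  // `words: _*` expands the sequence into the repeated parameter.
  val expanded: String = joinAll(", ", words: _*)

  // The same ascription works for Java varargs such as String.join(CharSequence, CharSequence...).
  val viaJava: String = String.join(", ", words: _*)
}
```

Without the ascription, joinAll(", ", words) does not compile, because a List[String] is not a String.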

Cryolite
Oct 2, 2006
sodium aluminum fluoride

KernelSlanders posted:

I'm not sure why you feel String.join(",", a) is cleaner than a.mkString(",")

Oh no, I definitely like mkString more; I was just trying to take a technique I did know and use it as a chance to learn more about how things work. :)

Cryolite
Oct 2, 2006
sodium aluminum fluoride
How would you generate the first 10 even Fibonacci numbers in Scala?

I overheard someone say they use this as an interview problem so I did it in C# by generating an IEnumerable<int>, filtering to the even ones and then taking 10 like this:

code:
static IEnumerable<int> Fibonacci()
{
    int a = 0;
    int b = 1;
    while (true)
    {
        yield return a;
        var temp = b;
        b = a + b;
        a = temp;
    }
}

static void Main(string[] args)
{
    var nums = Fibonacci().Where(x => x % 2 == 0).Take(10);
}

Looking around online I found this way to do it, but it's recursive:

code:
def fib(a: Int = 0, b: Int = 1): Stream[Int] = Stream.cons(a, fib(b, a + b))
fib().filter(_ % 2 == 0).take(10).force

There's also this way which I guess is also recursive but I don't completely understand it:

code:
def fib2(): Stream[Int] = 0 #:: fib2.scanLeft(1)(_ + _)

fib2.filter(_ % 2 == 0).take(10).force

Is there an idiomatic way to do this in Scala like the C# example? There doesn't seem to be anything like yield return in Scala. The most similar way might be to create a new Iterator like in this example where hasNext would always be true and I could imperatively manage whatever state I want before returning a value in the next method. However, this doesn't seem like good Scala.

How would you do it?
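
One non-recursive option that stays close to the C# version is Iterator.iterate, which builds an infinite lazy iterator from a seed and a step function. A minimal sketch (EvenFibs and its method names are made up for illustration):

```scala
object EvenFibs {
  // Infinite, lazily evaluated Fibonacci sequence built from (current, next) pairs.
  def fibonacci: Iterator[Int] =
    Iterator.iterate((0, 1)) { case (a, b) => (b, a + b) }.map(_._1)

  // Int is enough here; the tenth even Fibonacci number is far below Int.MaxValue.
  def firstEven(n: Int): List[Int] =
    fibonacci.filter(_ % 2 == 0).take(n).toList
}
```

EvenFibs.firstEven(10) gives 0, 2, 8, 34, 144, 610, 2584, 10946, 46368, 196418. The pair plays the role of the a and b locals in the C# loop, and filter/take mirror Where/Take.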

Also, what exactly is the fib2 function above doing? I know that #:: creates a stream with a head of 0 and a tail of fib2.scanLeft(1)(_ + _), but if it recursively calls fib2 again won't that tail start with 0, and then the next call to fib2 starts with 0, and so on? It works, but it seems like it shouldn't to me.
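
One way to see why the tail does not start with 0: scanLeft always emits its seed first, so fib2.scanLeft(1)(_ + _) starts with 1 and only then begins consuming fib2 from its head. A small sketch on a finite List instead of a Stream (the names are illustrative):

```scala
object ScanLeftSketch {
  // The first few Fibonacci numbers, standing in for the front of the stream.
  val front = List(0, 1, 1, 2)

  // scanLeft emits the seed (1) first, then each running sum: 1+0, 1+1, 2+1, 3+2.
  val tail = front.scanLeft(1)(_ + _)

  // Prepending 0, as `0 #:: ...` does, gives back a longer Fibonacci prefix.
  val longer = 0 :: tail
}
```

So each element of the tail is the previous tail element plus the stream element one position behind it, which is exactly the Fibonacci recurrence.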

Cryolite
Oct 2, 2006
sodium aluminum fluoride
I wrote a small Scala app to try to find anagrams on Twitter. It uses Twitter4j to get a streaming sample of tweets and Slick with an embedded H2 database to save tweets and potential matches.

It's using a lot more memory than I expected and I'd like to learn why, if that's good or bad, or if I should even care. I'm coming from .NET and don't have very much experience with the JVM or JVM memory usage.

The code isn't special and is mostly copied from examples on the internet. However, after a few seconds the process starts using more than 2GB of memory. I didn't expect it to be that big. I used jmap to create a heap dump and jvisualvm to look at it. There's over a gig of just char[], byte[], and java.lang.String. From looking through the instances it looks like there are quite a few duplicates but otherwise there doesn't seem to be much out of the ordinary.

Is this normal or expected? CPU usage starts out around 40% but then seems to drop to a steady state of 5%. I'm handling about 50 tweets a second and saving about 5% of them to the database.

If I use -Xmx256m -Xms64m then the memory usage stays low but CPU usage stays high around 40% instead of 5%. Maybe this is due to all the extra garbage collecting it has to do for the heap to stay under 256MB?
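
One way to test the garbage-collection guess is to log heap usage and cumulative GC time from inside the app and compare runs with and without the -Xmx setting. A rough sketch using the standard java.lang.management beans; the object and method names are hypothetical:

```scala
import java.lang.management.ManagementFactory

object GcSnapshot {
  private val mb = 1024L * 1024L

  // Heap currently in use, in MB.
  def heapUsedMb: Long = {
    val rt = Runtime.getRuntime
    (rt.totalMemory - rt.freeMemory) / mb
  }

  // The ceiling the JVM will grow the heap to (what -Xmx controls), in MB.
  def heapMaxMb: Long = Runtime.getRuntime.maxMemory / mb

  // Total time spent in GC across all collectors since JVM start, in milliseconds.
  def gcTimeMs: Long = {
    val beans = ManagementFactory.getGarbageCollectorMXBeans
    var total = 0L
    var i = 0
    while (i < beans.size) {
      // getCollectionTime returns -1 when a collector does not report it.
      total += math.max(0L, beans.get(i).getCollectionTime)
      i += 1
    }
    total
  }
}
```

If gcTimeMs climbs much faster under -Xmx256m than with the default heap, that would support the idea that the extra CPU is going to collection.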

I had the thought of running this on a cheap Digital Ocean or Linode instance and maybe turning it into a bot but it seems to be a lot more resource-intensive than I imagined.

Does anyone have any tips or resources for researching/minimizing memory usage in Scala apps? I'm kind of tempted to try the same thing in .NET or Python to see how the performance compares.

Cryolite
Oct 2, 2006
sodium aluminum fluoride
I am using it in persistent mode (db is about 225MB of candidate tweets and found matches now). I tried using SQLite in the beginning but had a lot of trouble getting Slick to generate the schema DDL for it so I switched to H2, but it wouldn't be too hard just to write it myself if I needed to switch. However I'm starting to get some actual anagrams so if I put this on a server and have multiple processes trying to access the database (like a Play application to manage which tweets get posted if I actually turn this into a bot) I'm worried SQLite wouldn't be a good choice since only one process can update SQLite at a time. H2 has a server mode so I figured it was a better choice if that happened. Maybe this isn't a concern at all.

Cryolite
Oct 2, 2006
sodium aluminum fluoride
I'm an idiot. I was trying to perform a certain type of query about every 500ms to try to find anagrams and the query was taking over 1200ms. Since Slick is non-blocking and just returns futures, it was very easy to bombard the database with more queries than it could keep up with.

I added an index on one column and my queries went from 1200ms to <20ms. There's no problem now and it uses <1% CPU and only 480MB of memory after running for over a day.
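
A guard against this kind of pile-up, sketched under the assumption that the periodic job can simply skip a tick while the previous query is still running. It uses only scala.concurrent; SingleFlight and tryRun are made-up names, and query stands in for whatever returns the Slick future:

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical helper: allows at most one outstanding query at a time.
final class SingleFlight(ec: ExecutionContext) {
  private val busy = new AtomicBoolean(false)

  // Runs `query` only if the previous one has finished; otherwise returns None.
  def tryRun[A](query: => Future[A]): Option[Future[A]] =
    if (busy.compareAndSet(false, true)) {
      val running =
        try query
        catch { case e: Throwable => busy.set(false); throw e }
      // Release the guard when the future completes, whether it succeeded or failed.
      running.onComplete(_ => busy.set(false))(ec)
      Some(running)
    } else None
}
```

With a 500ms timer, each tick would call something like guard.tryRun(db.run(...)) and do nothing on None, so a slow query delays the next one instead of stacking up behind it.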

Cryolite
Oct 2, 2006
sodium aluminum fluoride
Has anyone here gotten Play projects in a multi-project SBT setup running in IntelliJ?

I'm trying to build an IntelliJ project that has some common code shared between a Play project and some other executables, so to figure out how to do this I'm trying to run some activator templates that have multi-project SBT setups with one or more Play projects and some common code shared between them. For example, this one and this one. However, whenever I try to run the Play apps through IntelliJ as shown here, I get a "No main class detected." error. I can't seem to find a solution to this by googling. Doing sbt, project x, ~run through the command line works, but debugging seems like a pain that way.

Does everyone do everything through the command line anyway? I really wish I could right-click and select "Run Play 2 App" but it only seems to work for the out-of-the-box single project template.

Cryolite fucked around with this message at 07:04 on Dec 10, 2015

Cryolite
Oct 2, 2006
sodium aluminum fluoride

Sedro posted:

You can do sbt x/run -jvm-debug 5005 then attach your remote debugger from Intellij.

Thanks, I eventually got this working and it works great.

I'm using Windows and had a lot of trouble initially getting it to listen. I ended up needing to do this to set SBT_OPTS before running sbt myproject/run. Then connecting from IntelliJ was no problem.

Now I can run and even debug multiple Play apps at a time. Neat!

Cryolite
Oct 2, 2006
sodium aluminum fluoride
Coming from C# where the List type is backed by an array (like ArrayList in Java) I know I was surprised to learn Scala's List is really a linked list. I expected a type named List to have the performance characteristics of an array-backed data structure.

For idiomatic Scala (immutable and functional) I guess it makes sense for it to be a linked list. It's just different and I have to change my thinking when going back and forth.
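
A quick sketch of the difference, with ArrayBuffer as the closest analogue to C#'s List<T> and Vector as the immutable option with fast indexing (the object and value names are just for illustration):

```scala
import scala.collection.mutable.ArrayBuffer

object ListVsArray {
  // Immutable singly linked list: prepending reuses the existing cells (constant time),
  // but indexed access walks the list from the head.
  val linked: List[Int] = 0 :: List(1, 2, 3)
  val third: Int = linked(2)

  // Mutable, array-backed buffer: constant-time indexing and amortized constant-time append.
  val buffer: ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
  buffer += 4

  // Immutable with effectively constant-time indexing and append.
  val vec: Vector[Int] = Vector(1, 2, 3) :+ 4
}
```

All three support xs(i), so code written against one compiles against the others; only the cost of each operation changes.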

Cryolite
Oct 2, 2006
sodium aluminum fluoride
I'm building some Twitter bots in Scala and thinking of trying to use akka for pub/sub so I only have one app sampling the Twitter stream publishing it to multiple subscribers of that stream. Is this a bad idea or should I use a real message queue?

I have a few ideas for Twitter bots and they mostly all involve consuming a stream of tweets. I’m halfway towards implementing one of them, but right now I have a single Scala app that both samples the Twitter stream and then processes/saves tweets. If I eventually get to the point of multiple bots I’d like to separate things out so I have only a single app that samples tweets and publishes them onto a queue of some kind, and then each app subscribes to that queue and acts on each tweet in its own way.

I’ve looked at zeromq and a few other message queues but it looks like akka might be really simple to use for this kind of thing. I don’t mind if everything has to be written in Scala/Java, so being able to write the publisher in Java and the subscribers in Python (or something like that) with zeromq isn't a selling point for me. It looks like I could publish tweets using an akka Event Bus and each bot would consume the event stream. This is all going to be on a single machine for now – some cheap Digital Ocean instance maybe – so I don’t need any of the distributed-ness that I think akka could provide. At least not for a while.

Has anyone used akka in this way? Is akka a bad fit for something like this? Should I stick to an actual queue like zeromq? Or if it’s OK that everything is in Scala would it actually be really simple to use akka for this?
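
To make the shape of what I'm after concrete, here is a single-JVM stand-in written in plain Scala. It is not akka's Event Bus API, just a sketch of one publisher fanning out to several subscribers; Tweet and TweetBus are hypothetical names:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical message type; the real app would carry Twitter4j statuses.
final case class Tweet(user: String, text: String)

// Minimal in-process pub/sub: every subscriber sees every published tweet.
final class TweetBus {
  private val subscribers = ArrayBuffer.empty[Tweet => Unit]

  def subscribe(handler: Tweet => Unit): Unit = synchronized {
    subscribers += handler
    ()
  }

  def publish(tweet: Tweet): Unit = {
    // Copy under the lock, then call handlers outside it so a slow bot cannot block subscribe.
    val snapshot = synchronized { subscribers.toList }
    snapshot.foreach(handler => handler(tweet))
  }
}
```

Each bot would call subscribe once with its own handler, and the sampler would call publish for every incoming tweet. Handlers run on the publisher's thread here, which is one of the things an actor-based version would change.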

Cryolite
Oct 2, 2006
sodium aluminum fluoride

Sedro posted:

Do you need persistence? Reliable delivery? Throttling/backpressure in case of slow consumers? If you don't need the features of a message queue product, don't use one. You can always add it later.

There's an example project using akka's distributed pub/sub and sources on github.

Thanks. I don't need those things, and that looks like a good starting point for what I need. I think I had looked at this specific example before but after looking a little more closely after your suggestion I think it'll work out.

I spent a few hours trying to get it working just now but can't seem to get my actors talking to each other across JVMs. Very frustrating. I oscillate back and forth between thinking Scala is great and thinking it's not for me because I'm a loving idiot who can't get anything working. I'll have to put this aside and try more later once I have a better understanding of akka. I'm mostly copying and pasting incantations without knowing what's going on.
