|
KernelSlanders posted:Has anyone had any success using Spark's DataFrame object? Am I missing something, or is the whole thing just a horrendously designed API that can't possibly be useful for anything? In pandas you can do df['c'] = df['a'] + df['b']. Is there a simple way to do that with Spark DataFrames? What about df['idx'] = df.id.map(lambda x: np.where(ids == x)[0][0])? Nope. You can try something like .withColumn(), I guess, but IIRC there's no straightforward way to do it.

How many other folks in here are doing Scala and Spark work? We launched a small test cluster recently, and my life is now figuring out how we leverage that plus our existing Vertica (please kill me) store. I'm also learning Scala now, because I've found functional programming in Python, my primary day-to-day language, a bit of a hassle, especially since most Spark docs, examples, and use cases are in Scala or, apparently, Clojure.
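For what it's worth, the pandas one-liner from the post does have a reasonably direct Spark counterpart. Here's a minimal sketch: the runnable part uses pandas (as in the post), and the rough PySpark equivalent via .withColumn() is shown in comments, assuming a SparkSession named spark is available (that name and setup are not from the post).

```python
# The pandas operation quoted in the post: add two columns element-wise.
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
df["c"] = df["a"] + df["b"]  # in-place-style column assignment

# The rough PySpark equivalent. Note that withColumn returns a NEW
# DataFrame rather than mutating in place (assumes `spark` is a
# SparkSession; untested sketch):
#   sdf = spark.createDataFrame(df)
#   sdf = sdf.withColumn("c", sdf["a"] + sdf["b"])
# The np.where index-lookup example has no one-liner analogue; in Spark
# that kind of row-to-index mapping is usually done with a join or a UDF.

print(df["c"].tolist())  # [11, 22, 33]
```

The key mental shift is that Spark DataFrames are immutable, so every column addition is a transformation that yields a new DataFrame instead of a pandas-style assignment.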
|
# ¿ Jul 26, 2015 06:48 |