|
hey yosposters, i didn't see a thread for BIG DATA. does anyone else here work with WEBSCALE BIG DATA? do you machine learn? are you ushering in the dominion of our superintelligent DEEP LEARNING AI OVERLORDS? or do you just work on recommendation engines for lovely ecommerce stores? this thread is for anyone who can't see the random forest for the decision trees. hopefully it will SPARK some discussion. is that big data in your pants, or are you just happy to see me?
|
# ? Mar 3, 2015 23:20 |
|
i'm gonna write a convolutional neural net (CNN) to learn to identify posts as bad as yours
|
# ? Mar 3, 2015 23:22 |
|
Glorgnole posted: i'm gonna write a convolutional neural net (CNN) to learn to identify posts as bad as yours
more like a restricted buttsman machine
|
# ? Mar 3, 2015 23:24 |
|
we have medium sized data but use all the big stuff. its neat.
|
# ? Mar 3, 2015 23:26 |
|
Jonny 290 posted: we have medium sized data but use all the big stuff.
big data tools are hot garbage though like if all your data could fit in a sql db u would want to do it b/c hadoop sucks hard
|
# ? Mar 3, 2015 23:30 |
|
the ladies tell me my data is quite large, op
|
# ? Mar 3, 2015 23:33 |
|
Malcolm XML posted: big data tools are hot garbage though
hence all of the dbs sitting on top of it now, e.g. hbase, cassandra, impala, yomamma, etc.
|
# ? Mar 3, 2015 23:34 |
|
yeah we put cassandra on top
|
# ? Mar 3, 2015 23:35 |
|
lol if u write map reduce jobs
|
# ? Mar 3, 2015 23:37 |
|
What's the definition of big data, too much to fit in an excel workbook?
|
# ? Mar 3, 2015 23:40 |
|
*grabs crotch*
|
# ? Mar 3, 2015 23:44 |
|
pointsofdata posted: What's the definition of big data, too much to fit in an excel workbook?
if you have to ask...
|
# ? Mar 4, 2015 00:08 |
|
anyone here do machine learning? i've been using scikit-learn at work, it's pretty frickin' awesome. especially combined with pandas. i know python is a plang and all, but it's really good at this kinda stuff.
|
# ? Mar 4, 2015 00:09 |
|
We just use Teradata OP
|
# ? Mar 4, 2015 01:00 |
|
i can't type "hadoop" without typing "hadpoop" and then deleting the extra p hth op
|
# ? Mar 4, 2015 01:10 |
|
http://molleindustria.org/files/BIG-DATA.html posted:BIG DATA EXCITES EVERYTHING
|
# ? Mar 4, 2015 01:10 |
|
what is this loving goddamn thread on about
|
# ? Mar 4, 2015 01:16 |
|
To teh
|
# ? Mar 4, 2015 01:19 |
|
I'd like to introduce you to my friend, I call him BIG DATA
|
# ? Mar 4, 2015 01:22 |
|
it's my dick.
|
# ? Mar 4, 2015 01:22 |
|
NOTORIOUS H.B.D.
|
# ? Mar 4, 2015 01:28 |
|
horton: *puts ear on thistle* who: "hadoopkin lmao"
|
# ? Mar 4, 2015 01:37 |
|
pointsofdata posted: What's the definition of big data, too much to fit in an excel workbook?
that is in fact the exact definition
|
# ? Mar 4, 2015 02:09 |
|
anyone here used infobright? my company is considering using it for data warehousing.
|
# ? Mar 4, 2015 03:46 |
|
purchase an exadata with oracle olap tia
|
# ? Mar 4, 2015 03:48 |
|
will someone explain in small words what a hadoop is thank you.
|
# ? Mar 4, 2015 03:52 |
|
i use json in an oracle 11g clob, so yes, i do big data
|
# ? Mar 4, 2015 03:55 |
|
yard salad posted:i use json in an oracle 11g clob, so yes, i do big data
|
# ? Mar 4, 2015 03:56 |
|
MALE SHOEGAZE posted: will someone explain in small words what a hadoop is thank you.
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
|
# ? Mar 4, 2015 04:02 |
|
yard salad posted: i use json in an oracle 11g clob, so yes, i do big data
lol
|
# ? Mar 4, 2015 04:04 |
|
MALE SHOEGAZE posted: will someone explain in small words what a hadoop is thank you.
hadoop is an umbrella that has a few different software projects:
yarn is a cluster manager system that distributes tasks among boxes in a cluster and schedules batch jobs
hdfs is a distributed filesystem that stores the data you want to query
mapreduce is a way of writing distributed/concurrent queries, but with a much shittier api than sql. "map" is basically the select phase of the query, and "reduce" is the aggregation part of the query. But you have to write it all in java (or scala or python or)
hive is a way of autogenerating mapreduce queries from sql and of imposing a relational schema on your lovely hdfs data
there's some other poo poo but that's basically it
DimpledChad fucked around with this message at 04:21 on Mar 4, 2015
|
# ? Mar 4, 2015 04:05 |
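for anyone who wants the map/reduce split from the post above in runnable form, here is a minimal single-process sketch in plain python. it is not hadoop's actual api: the `ROWS` data and the `map_phase`, `shuffle`, `reduce_phase` and `run_job` names are all made up for illustration, and a real cluster would run each phase across many boxes. the sql it stands in for would be roughly `SELECT store, SUM(amount) FROM sales GROUP BY store`.

```python
from collections import defaultdict

# Hypothetical toy "table" of (store, sale_amount) rows, standing in for
# whatever files would actually live on hdfs.
ROWS = [("nyc", 10), ("sf", 5), ("nyc", 7), ("sf", 1), ("la", 3)]


def map_phase(row):
    """The 'select' part: turn one input row into a (key, value) pair."""
    store, amount = row
    return (store, amount)


def shuffle(pairs):
    """Group values by key. A real framework does this between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """The aggregation part: collapse all values for one key into one result."""
    return key, sum(values)


def run_job(rows):
    """Run map, shuffle and reduce in order, all in one process."""
    pairs = map(map_phase, rows)
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


totals = run_job(ROWS)
print(totals)  # {'nyc': 17, 'sf': 6, 'la': 3}
```

the point of the split is that `map_phase` only ever sees one row and `reduce_phase` only ever sees one key, which is what lets a framework farm them out to different machines.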
|
DimpledChad posted: hadoop is an umbrella that has a few different software projects:
sounds like a series of problems that got solved a long time ago.
|
# ? Mar 4, 2015 04:08 |
|
Citizen Tayne posted: sounds like a series of problems that got solved a long time ago.
that's what you sound like, old man
|
# ? Mar 4, 2015 04:22 |
|
hadoop is also essentially an open source clone of google's mapreduce framework that they used to use to build their web indexes. google hasn't used mapreduce for that for a long time, though, they use streaming poo poo, more similar to apache spark and/or storm (two competing apache projects that basically do the same thing).
|
# ? Mar 4, 2015 04:30 |
|
DimpledChad posted: anyone here do machine learning? i've been using scikit-learn at work, it's pretty frickin' awesome. especially combined with pandas. i know python is a plang and all, but it's really good at this kinda stuff.
yeah
|
# ? Mar 4, 2015 05:06 |
|
Bloody posted: yeah
care to elaborate?
|
# ? Mar 4, 2015 16:36 |
|
i basically work in a data warehouse and drive a digital forklift
|
# ? Mar 4, 2015 16:38 |
|
bout to go on cyber-fmla because i threw out my electro-spine
|
# ? Mar 4, 2015 16:39 |
|
http://www.kchodorow.com/blog/2013/10/02/the-rise-of-big-data/
quote: The Rise of Big Data
|
# ? Mar 4, 2015 16:39 |
|
https://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
quote: They handed me a flash drive with all 600MB of their data on it (not a sample, everything). For reasons I can't understand, they were unhappy when my solution involved pandas.read_csv rather than Hadoop.
|
# ? Mar 4, 2015 16:45 |
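for reference, the `pandas.read_csv` route from that quote looks roughly like the sketch below. the column names and the inline `CSV_TEXT` are made up so the example runs on its own; with a real file you would pass the path to `pd.read_csv` instead of a `StringIO`. a few hundred MB of csv will usually load into memory on one ordinary machine, though that depends on the column types and how much RAM you have.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file on the flash drive.
# With a real file this would be something like pd.read_csv("sales.csv").
CSV_TEXT = "customer,amount\nalice,10\nbob,5\nalice,7\n"

# Load the whole dataset into memory as a DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# The same kind of group-and-sum a mapreduce job would do, in one line.
totals = df.groupby("customer")["amount"].sum()
print(totals)
```

no cluster, no job scheduler, and the aggregation is a single `groupby` call.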