|
Malcolm XML posted:what happens when ur special pet db just dies due to a hw failure? or the datacenter goes down? in the big systems, to a first approximation, total hardware failure doesn't occur. (in the very large systems, even the cpus and memory are redundant.) when it does happen, you have a secondary. HA in a single datacenter is very easy, even with databases. when the datacenter goes down, you're in a geographic failover. Malcolm XML posted:or _______? or it's been a few years and u now need a 1000TB DB and poo poo that's crazy expensive to buy and to run so what ? that's a good problem to have. throw money at it. by the time you have a 1000 TB database your IT budget is pretty fat Malcolm XML posted:or what if u hit the cpu limit on intel systems? or some other single system h/w limit? Notorious b.s.d. fucked around with this message at 19:32 on Jun 14, 2014 |
# ? Jun 14, 2014 19:28 |
|
incidentally 1000 TB oltp databases don't exist in the wild. they just don't. nobody's working set is that big. 1000 tb is the land of column-oriented stores and data warehousing. vertica and friends. you'll need that fat it budget when you want to start digging into that space.
|
# ? Jun 14, 2014 19:43 |
|
oh yeah one more thing: if i were vertica i'd be scared shitless of the thousand tiny hadoop vendors one of them sooner or later is going to make something better than hive and fuckloads cheaper than column-oriented stores or the various olap platforms big data is a dumb fad but somebody is gonna make those big data platforms work for software markets that actually exist
|
# ? Jun 14, 2014 19:52 |
|
so this is what its like... when spergs cry
|
# ? Jun 14, 2014 19:54 |
|
notorious bsd i don't think you've thought this through
|
# ? Jun 14, 2014 19:57 |
|
Hellsworn Barn posted:notorious bsd i don't think you've thought this through yeah talking to children like adults never pans out
|
# ? Jun 14, 2014 20:02 |
|
what if demand on your services is not constant (e.g. high demand one day a month)? what about network saturation? what about latency? what about durability?
|
# ? Jun 14, 2014 20:23 |
|
bsd pls stop talking about things u have no idea about it's much much easier to manage an ha service if it can run on large numbers of tiny machines because they are cheap and scale up and down as needed to fit the demand envelope we had a dc go down to a power cut we just spun up instances in the other ones in like 10 min and were fine our data needs are around 100tb a day and growing so we need something that can handle that much data times 30 for monthly analysis can't really buy petabytes of ram
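(back-of-envelope for the numbers in this post, as a rough sketch: it assumes decimal units and a flat 100 tb a day, ignoring the "and growing" part, so treat it as a floor and not a forecast)

```python
# Figures come from the post above; the unit conversion is an assumption.
TB_PER_DAY = 100   # daily ingest quoted in the post
DAYS = 30          # monthly analysis window quoted in the post
TB_PER_PB = 1000   # decimal units (assumed)

monthly_tb = TB_PER_DAY * DAYS
monthly_pb = monthly_tb / TB_PER_PB

print(f"{monthly_tb} TB = {monthly_pb:g} PB")
```

so a month of data at that rate is already in petabyte territory before any growth.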
|
# ? Jun 14, 2014 20:25 |
|
hive sucks tho
|
# ? Jun 14, 2014 20:26 |
|
Malcolm XML posted:our data needs are around 100tb a day and growing so we need something that can handle that much data times 30 for monthly analysis yeah there's no loving way you have an oltp workload on that. that's data warehousing. it's a completely different space w/ different vendors, different software, different budgets.
|
# ? Jun 14, 2014 20:33 |
|
Malcolm XML posted:hive sucks tho something we can all agree on
|
# ? Jun 14, 2014 20:33 |
|
Hellsworn Barn posted:what if demand on your services is not constant (e.g. high demand one day a month)? lol if you think you can answer these questions more effectively with a poorly-understood distributed setup running on someone else's hardware
|
# ? Jun 14, 2014 20:34 |
|
btw if your 100 tb a day is records of people clicking on poo poo, and not people actually buying poo poo, loving lol
|
# ? Jun 14, 2014 20:41 |
|
Notorious b.s.d. posted:yeah there's no loving way you have an oltp workload on that. that's data warehousing. it's a completely different space w/ different vendors, different software, different budgets. what if I told u the difference is arbitrary we run billions of transactions a day some of them are for product purchase transactions most are telemetry but all are important sorry bsd the world where oltp and olap can afford to be different is fading
|
# ? Jun 14, 2014 20:54 |
|
Malcolm XML posted:some of them are for product purchase transactions most are telemetry but all are important lol and you do monthly analysis double lol
|
# ? Jun 14, 2014 20:58 |
|
this truly is the thread for bsd
|
# ? Jun 14, 2014 21:09 |
|
Malcolm XML posted:hive sucks tho gotta post every time
|
# ? Jun 14, 2014 21:11 |
|
killa beelaphants coming atcha
|
# ? Jun 14, 2014 21:14 |
|
Notorious b.s.d. posted:lol if you think you can answer these questions more effectively with a poorly-understood distributed setup running on someone else's hardware who said anything about someone else's hardware?
|
# ? Jun 14, 2014 21:31 |
|
Notorious b.s.d. posted:lol if you think
|
# ? Jun 14, 2014 21:39 |
|
Notorious b.s.d. posted:lol ya we do the last 30 days rolling as well as any multiple of 5 mins it's p great to see the last year and then drill down as granular as u want
|
# ? Jun 15, 2014 00:25 |
|
yep row stores are easier to scale vertically and column stores are easier to scale horizontally
|
# ? Jun 15, 2014 01:18 |
|
telemetry should never be "important" herp derp eye tracking studies are too expensive let's spend a million dollars a month on freshly graduated "data scientists" and try to let a clustering algo extract meaning from our garbage data
|
# ? Jun 15, 2014 02:03 |
|
tef posted:yep row stores are easier to scale vertically and column stores are easier to scale horizontally and they sustain different workloads
|
# ? Jun 15, 2014 02:04 |
|
Malcolm XML posted:ya we do the last 30 days rolling as well as any multiple of 5 mins it's p great to see the last year and then drill down as granular as u want you've reinvented RRDs with only several million dollars in infrastructure and you run oltp workloads on it good job
|
# ? Jun 15, 2014 02:05 |
|
what happened in your life to make you like this
|
# ? Jun 15, 2014 02:08 |
|
Squinty Applebottom posted:what happened in your life to make you like this i worked in the technology industry for too long let this be a warning to you
|
# ? Jun 15, 2014 02:15 |
|
Notorious b.s.d. posted:you've reinvented RRDs with only several million dollars in infrastructure and you run oltp workloads on it its more like a couple hundred k and it handles more data than we could do normally, is run by like 5-10 people as opposed to a full blown datacenter IT team + enough consultants and poo poo to make ur eyes bleed Notorious b.s.d. posted:telemetry should never be "important" bsd i know ur gimmick is being a greybearded moron but for real u are a greybearded moron if "knowing more about your product" is unimportant eye tracking is expensive and we do it occasionally. small sample tests don't hold a candle to knowing the entire population data set and parameters as opposed to statistics tef posted:yep row stores are easier to scale vertically and column stores are easier to scale horizontally Notorious b.s.d. posted:and they sustain different workloads we r moving to a hybrid system since we have the problem of needing both high volume low latency transaction processing AND analysis, a situation that BSD simply cannot comprehend
|
# ? Jun 15, 2014 02:17 |
|
Notorious b.s.d. posted:you've reinvented RRDs with only several million dollars in infrastructure and you run oltp workloads on it we might have to i think ofc not paying ibm is always a win e: unless u mean round robin databases, circular buffers are literally trivial the issue is when u want to keep that data around rather than losing it while maintaining quick enough access Malcolm XML fucked around with this message at 02:22 on Jun 15, 2014 |
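(for anyone following along, a minimal sketch of the round robin idea, assuming 5 minute buckets like the ones mentioned upthread. the class and method names are made up for illustration and this is not how rrdtool or anyone's production system actually does it)

```python
class RoundRobinSeries:
    """Fixed-size ring of time buckets; reused slots overwrite old data."""

    def __init__(self, slots, step_seconds=300):
        self.slots = slots
        self.step = step_seconds
        self.values = [0] * slots
        self.stamps = [None] * slots  # bucket number each slot currently holds

    def add(self, timestamp, amount=1):
        bucket = timestamp // self.step
        i = bucket % self.slots
        if self.stamps[i] != bucket:
            # The slot holds a different bucket: overwrite it.
            # This is the point where the older data is lost.
            self.stamps[i] = bucket
            self.values[i] = 0
        self.values[i] += amount

    def total(self, now, window_seconds):
        """Sum the buckets inside the last window_seconds, ending at now."""
        newest = now // self.step
        oldest = newest - window_seconds // self.step + 1
        return sum(
            value
            for value, stamp in zip(self.values, self.stamps)
            if stamp is not None and oldest <= stamp <= newest
        )


series = RoundRobinSeries(slots=3, step_seconds=300)
series.add(0)
series.add(10)
series.add(300)
recent = series.total(now=300, window_seconds=600)
```

the ring itself is the easy part, as the post says. once a slot is reused the old bucket is gone, which is exactly the "keep that data around" problem.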
# ? Jun 15, 2014 02:18 |
|
Blinkz0rz posted:also let ops handle it slap an index on the table
|
# ? Jun 15, 2014 03:22 |
|
a columnstore db is like an index for your whole table
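(a toy sketch of what that means, assuming a made-up three row orders table. real column stores add compression, sorting and on-disk segments on top; this only shows the layout difference)

```python
# Hypothetical table: column names and values are invented for illustration.
rows = [
    {"order_id": 1, "customer": "a", "amount": 10},
    {"order_id": 2, "customer": "b", "amount": 25},
    {"order_id": 3, "customer": "a", "amount": 5},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: the aggregate has to visit every full row.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: the same aggregate reads a single column and nothing else.
total_from_columns = sum(columns["amount"])
```

same answer both ways, but the column layout only touches the one field the query cares about, which is why it suits scans and aggregates more than single-row lookups and updates.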
|
# ? Jun 15, 2014 03:23 |
|
Notorious b.s.d. posted:oh yeah one more thing: if i were vertica i'd be scared shitless of the thousand tiny hadoop vendors except that hadoop mapreduce is really lovely at doing anything in less than a minute. hdfs isn't suiting needs either. not that i think vertica is magic but hadoop isn't much of a base to build on for beating it
|
# ? Jun 15, 2014 03:45 |
|
you should just use sql server 2014 with columnstore indexes and in-memory optimized tables. stored procedures are now compiled down to native code so you can get the expressiveness of sql with the speed of c.
|
# ? Jun 15, 2014 03:46 |
|
Squinty Applebottom posted:you should just use sql server 2014 with columnstore indexes and in-memory optimized tables. stored procedures are now compiled down to native code so you can get the expressiveness of sql with the speed of c. is this a shaggar parachute account, I can't tell
|
# ? Jun 15, 2014 03:50 |
|
Notorious b.s.d. posted:big data is a dumb fad but somebody is gonna make those big data platforms work for software markets that actually exist Malcolm XML posted:sorry bsd the world where oltp and olap can afford to be different is fading
|
# ? Jun 15, 2014 04:06 |
|
Squinty Applebottom posted:a columnstore db is like an index for your whole table i dont understand the point of it (maybe i just dont get *big data* but for some reason i have a sneaking suspicion that that's not it)
|
# ? Jun 15, 2014 04:15 |
|
vertica says this about themselves
|
# ? Jun 15, 2014 04:24 |
|
oh yeah fuuuuck ive got something good for this thread, hold on to your butts
|
# ? Jun 15, 2014 04:24 |
|
ok so this is the site i have used at university to do stuff like sign a digital "i wont plagiarise" form every time we do an exercise, and sometimes we get results here. (sometimes we get results via email, sometimes they are uploaded to a moodle system, sometimes they are uploaded to a trac system lol, its hilariously inconsistent) look how bad this is. designed by a front end web developer with almost 9 years of apparent experience wtfffffffffffffffff
|
# ? Jun 15, 2014 04:48 |