  • Locked thread
Forums Terrorist
Dec 8, 2011

snype

nerdz
Oct 12, 2004


Complex, statistically improbable things are by their nature more difficult to explain than simple, statistically probable things.
Grimey Drawer
am i a terrible programmer for not doing tdd

Bloody
Mar 3, 2013

suffix posted:

is this :butt: or your own server?

you should figure out your replication strategy - are you going to have three beefy servers with full replicas, or have the data distributed on smaller machines?

what's the fastest you'll ever need to read all the data at once?

its my own computer

why would i have replicants

im going to do data :science: on it so the fastest ill ever need to read anything is well id like it to not be that slow

Bloody
Mar 3, 2013

nerdz posted:

am i a terrible programmer for not doing tdd

no

coffeetable
Feb 5, 2006

TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY

Bloody posted:

its my own computer
what data is this exactly

Valeyard
Mar 30, 2012


Grimey Drawer

Bloody posted:

its my own computer

why would i have replicants

im going to do data :science: on it so the fastest ill ever need to read anything is well id like it to not be that slow

wtf are you doing with that much data in your spare time

Bloody
Mar 3, 2013

coffeetable posted:

what data is this exactly

250 gigabytes of gzipped csv

Bloody
Mar 3, 2013

Valeyard posted:

wtf are you doing with that much data in your spare time

:spergin:

coffeetable
Feb 5, 2006

TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY

Bloody posted:

250 gigabytes of gzipped csv
no as in what's the data about

Bloody
Mar 3, 2013

:iiam:

coffeetable
Feb 5, 2006

TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY
because you can probably work with a random 1% of the rows and make your life a whole lot easier. assuming you define "random" well enough
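A minimal sketch of the 1% idea, assuming the data is one gzipped csv with a header row. `sample_rows` and `sample_gzipped_csv` are made-up names, not anything from the thread. It keeps each row independently with probability `rate`, so the file is streamed and never loaded whole. That is only one definition of "random": for time series you might want to sample whole series instead of individual rows.

```python
import csv
import gzip
import random


def sample_rows(lines, rate=0.01, seed=0):
    """Yield the header, then each data row with probability `rate`.

    `lines` is any iterable of text lines (an open file works).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to yield
    yield header
    for row in reader:
        if rng.random() < rate:
            yield row


def sample_gzipped_csv(path, rate=0.01, seed=0):
    """Stream a gzipped csv from disk through sample_rows."""
    with gzip.open(path, "rt", newline="") as f:
        yield from sample_rows(f, rate=rate, seed=seed)
```

Because it is a generator, the sample can be written straight back out with `csv.writer` without ever holding 250 gigabytes in memory.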

coffeetable fucked around with this message at 18:00 on Sep 27, 2014

coffeetable
Feb 5, 2006

TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY
also lol at being secretive about what you're working on. someone might steal your idea!!

MononcQc
May 29, 2007

nerdz posted:

am i a terrible programmer for not doing tdd

The advantages of TDD are:

- You think about what your code should do and expose as an interface before jumping in (write it with the user in mind, not the programmer)
- Your tests are more about what the code should do than crystallizing what crappy code you wrote is doing right now
- It's a shitload more boring to write tests after everything is done than to make them part of the design process

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

Bloody posted:

250 gigabytes of gzipped csv

is this the funny computer master corpus

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

MononcQc posted:

The advantages of TDD are:

- You think about what your code should do and expose as an interface before jumping in (write it with the user in mind, not the programmer)
- Your tests are more about what the code should do than crystallizing what crappy code you wrote is doing right now
- It's a shitload more boring to write tests after everything is done than to make them part of the design process

- You write your code in a more testable manner rather than building up a structure that's hard to interpose things into for testing or modification
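A rough sketch of what "more testable" can look like, in Python for brevity. `is_stale` and the heartbeat scenario are hypothetical, made up for illustration. The tests are the part you would write first; the `now` parameter exists because the tests needed a way to pin the clock.

```python
import unittest
from datetime import datetime, timedelta, timezone


def is_stale(last_seen, max_age, now=None):
    """Return True if `last_seen` is older than `max_age`.

    `now` is injectable so a test can pin the clock instead of patching it.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_seen > max_age


class IsStaleTest(unittest.TestCase):
    # Fixed reference time so the tests never depend on the real clock.
    NOW = datetime(2014, 9, 27, 18, 0, tzinfo=timezone.utc)

    def test_recent_heartbeat_is_not_stale(self):
        last_seen = self.NOW - timedelta(seconds=30)
        self.assertFalse(is_stale(last_seen, timedelta(minutes=1), now=self.NOW))

    def test_old_heartbeat_is_stale(self):
        last_seen = self.NOW - timedelta(minutes=5)
        self.assertTrue(is_stale(last_seen, timedelta(minutes=1), now=self.NOW))
```

Run it with `python -m unittest` against the file. The point is the shape: the dependency that is awkward to control (the clock) is a parameter, not something buried inside the function.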

Bloody
Mar 3, 2013

eschaton posted:

is this the funny computer master corpus

yep!


coffeetable posted:

because you can probably work with a random 1% of the rows and make your life a whole lot easier. assuming you define "random" well enough

nope!


coffeetable posted:

also lol at being secretive about what you're working on. someone might steal your idea!!

:rip:

Notorious b.s.d.
Jan 25, 2003

by Reene

Shaggar posted:

it doesn't pause the process. VMware runs the same instance on both hosts simultaneously and switches the I/O. its transparent to clients. the only restrictions are that you need the hosts to be on the same disk+network. vmware uses it for both HA (same instance runs on both hosts all the time) and for migrating guests.

[...]

either xenserver is garbage (which is possible because its opensores) or the flaw is so bad it affects running guests and migrating them to a clean host wont fix it

there's a much simpler reason.

reserving the ability to migrate VMs between hosts means spending more money on network infrastructure and storage than you otherwise would.

being migrated when a host fails is not part of amazon's SLA. why would they spend a single nickel on something they never promised to customers?

Notorious b.s.d.
Jan 25, 2003

by Reene
btw it is almost certain that AWS VMs are hosted on local storage

Shaggar
Apr 26, 2006

Bloody posted:

where's the data import tool live? is it from visual studio or

its installed as part of sql server management studio so it would be in a related sql server start menu folder.

Bloody posted:

also is 10 billion rows too many to be able to use half decently at all once its loaded

should i be dumping this data in a big data

it depends on ur server, the data quality, and ur schema. also if this is a local instance of sql express theres a 10gb db size limit.

Shaggar
Apr 26, 2006

Notorious b.s.d. posted:

there's a much simpler reason.

reserving the ability to migrate VMs between hosts means spending more money on network infrastructure and storage than you otherwise would.

being migrated when a host fails is not part of amazon's SLA. why would they spend a single nickel on something they never promised to customers?

yeah but I wouldn't buy cloud services from someone who might reboot their cloud at random.

Notorious b.s.d.
Jan 25, 2003

by Reene

Shaggar posted:

yeah but I wouldn't buy cloud services from someone who might reboot their cloud at random.

so, you don't buy cloud services then?

Notorious b.s.d.
Jan 25, 2003

by Reene
lol azure's SLA is utter poo poo

quote:

For Cloud Services, we guarantee that when you deploy two or more role instances in different fault and upgrade domains, your Internet facing roles will have external connectivity at least 99.95% of the time.

For all Internet facing Virtual Machines that have two or more instances deployed in the same Availability Set, we guarantee you will have external connectivity at least 99.95% of the time.

it's no worse than amazon or google or joyent. because all of their SLAs are poo poo.

"if you stand up a service as a geographically distributed entity, some of your poo poo will be up most of the time"

Notorious b.s.d.
Jan 25, 2003

by Reene
99.95% sounds not so bad but that's half a day of downtime a year

half a day of downtime in your highly available, multi-datacenter infrastructure lol

Valeyard
Mar 30, 2012


Grimey Drawer

Notorious b.s.d. posted:

99.95% sounds not so bad but that's half a day of downtime a year

half a day of downtime in your highly available, multi-datacenter infrastructure lol

you could also look at it as 12 hours spread over a year, which is about 2 minutes of downtime a day lol

Forums Terrorist
Dec 8, 2011

Why yes I will totally pay $texas for three nines of service

FamDav
Mar 29, 2008

Notorious b.s.d. posted:

99.95% sounds not so bad but that's half a day of downtime a year

half a day of downtime in your highly available, multi-datacenter infrastructure lol

its actually more like 4 1/2 hours.
but ok.
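The arithmetic, as a quick sketch. `allowed_downtime_hours` is a made-up helper, and it assumes a 365-day year with the SLA measured over the whole year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted by an availability percentage."""
    return period_hours * (100.0 - availability_percent) / 100.0


for pct in (99.0, 99.9, 99.95, 99.99):
    print(f"{pct}% -> {allowed_downtime_hours(pct):.2f} hours/year")
```

For 99.95% that works out to 8760 * 0.0005 = 4.38 hours a year, so roughly four and a half hours, not twelve.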

qntm
Jun 17, 2009

Notorious b.s.d. posted:

99.95% sounds not so bad but that's half a day of downtime a year

half a day of downtime in your highly available, multi-datacenter infrastructure lol

what do you get if you have no redundancy, like 97%?

Notorious b.s.d.
Jan 25, 2003

by Reene

qntm posted:

what do you get if you have no redundancy, like 97%?

no SLA at all

it can go down any time, for any period.

edit: this is how amazon defends its reboots to patch hosts. they have no obligation to you. when you spawn a VM there's no guarantees on performance, reliability, etc etc

Notorious b.s.d. fucked around with this message at 00:03 on Sep 28, 2014

Notorious b.s.d.
Jan 25, 2003

by Reene

FamDav posted:

its actually more like 4 1/2 hours.
but ok.

sorry, half a business day.

but remember, this is the SLA for everything being down. how much poo poo has to be broken that multiple datacenters are simultaneously unavailable?

under this SLA microsoft will refund you $0 and pay $0 in penalties in the event that they have a multi-datacenter failure for four and a half hours that takes your business down

under normal circumstances, like an entire datacenter going dark with no explanation, you also get $0. or amazon deciding to reboot your instances w/out permission to fix a bug.

cloud vendors don't have SLAs, they have agreements on the level of non-service

MononcQc
May 29, 2007

Notorious b.s.d. posted:

sorry, half a business day.

lol if your business isn't 24/7.


E: sorry forgot this is the safe zone/hideout

FamDav
Mar 29, 2008

MononcQc posted:

lol if your business isn't 24/7.

was going to say this. was also going to ask what you guys do wrt sla but then i found out you dont have one :)

quote:

E: sorry forgot this is the safe zone/hideout

doesnt count gas stymie should know better

suffix
Jul 27, 2013

Wheeee!

Bloody posted:

its my own computer

why would i have replicants

im going to do data :science: on it so the fastest ill ever need to read anything is well id like it to not be that slow

oh ok, then yeah lol go nuts with any sql server but mysql

for certain queries it can be faster to just read the data from csv or whatever,
but then you have to deal with merging and handling larger-than-memory stuff, so it's worth it to get the sql server to do it for you. make sure you have indices

if you want to run ad-hoc queries and have several beefy machines and don't care if you lose data it could be worth it to look into elasticsearch w/ kibana.
on just one server it will be a lot slower than a real sql server, but it's so easy to make a cluster you can literally do it by accident
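A small sketch of the "make sure you have indices" part. It uses sqlite only because it ships with Python, not because it is the right server for 10 billion rows. The `samples` table, its columns, and the index name are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (series_id INTEGER, ts INTEGER, value REAL)")

# 100 hypothetical series with 100 points each.
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    ((s, t, float(s * t)) for s in range(100) for t in range(100)),
)

# Composite index that matches the access pattern: one series, one time range.
conn.execute("CREATE INDEX idx_samples_series_ts ON samples (series_id, ts)")

query = "SELECT value FROM samples WHERE series_id = ? AND ts BETWEEN ? AND ?"
params = (42, 10, 20)

# Ask the planner how it will run the query; the last column is the description.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
rows = conn.execute(query, params).fetchall()

for step in plan:
    print(step[-1])
print(len(rows), "rows")
```

The plan output should mention the index rather than a full table scan. Other servers have their own equivalent of `EXPLAIN`, and checking it before running a query over the whole set is the habit worth keeping.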

MononcQc
May 29, 2007

FamDav posted:

was going to say this. was also going to ask what you guys do wrt sla but then i found out you dont have one :)

Yeah. There's only SLA with some customers that asked for one in a contract, but we track all uptime and incidents in https://status.heroku.com/uptime and https://status.heroku.com/

Notorious b.s.d.
Jan 25, 2003

by Reene
heroku's uptime count excludes applications that aren't in fully redundant configurations

but given that heroku is backed by aws we can't very well expect them to be more reliable than the underpinning

fritz
Jul 26, 2003


actually that's a little bit worrying, if your data is so heterogeneous that you cant do random sampling, are you going to be able to draw any meaningful conclusions at all

MononcQc
May 29, 2007

Notorious b.s.d. posted:

heroku's uptime count excludes applications that aren't in fully redundant configurations

but given that heroku is backed by aws we can't very well expect them to be more reliable than the underpinning

It's defined there: https://devcenter.heroku.com/articles/heroku-status#uptime-calculation

quote:

Heroku is a distributed platform spread across many different datacenters and components. During any given incident, it is rare for all applications running on the platform to be affected. For this reason, we report our uptime as an average derived from the number of affected applications.

[...]

We analyze data from a variety of logging and monitoring tools, and then use it to split each incident into segments. Each segment counts the number of affected apps for a designated period of time. Our measurement considers:
  1. The duration of each outage
  2. The percentage of running applications affected in each outage, (applications consisting only of a single idled web dyno are not considered).
  3. The total minutes of potential uptime in a month, and is calculated with the following equation:
code:
TM - SUM((OM1 * PA1) .. (OMn * PAn))
------------------------------------
                TM
TM: Total # of minutes in the month
OM: # of minutes spent in outage
PA: % of affected applications

the "single idled web dyno" that are excluded mean free applications that are automatically swapped out after 1h of inactivity and are not running at the time of the incident.

There is no running application excluded, and if your app has a single instance, runs for free, and is hit by an error, this is accounted for in the uptime metric.
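The quoted formula as a small sketch. `uptime_fraction` is a made-up name, and it assumes PA is given as a fraction between 0 and 1 rather than a percentage. The outage numbers below are invented for illustration.

```python
def uptime_fraction(total_minutes, outages):
    """Uptime as (TM - SUM(OMi * PAi)) / TM.

    total_minutes: TM, minutes in the month
    outages: iterable of (OM, PA) pairs, where OM is minutes in outage
             and PA is the fraction of running apps affected (0.0 to 1.0)
    """
    lost = sum(minutes * affected for minutes, affected in outages)
    return (total_minutes - lost) / total_minutes


# Invented example: a 30-day month with two partial outages.
minutes_in_month = 30 * 24 * 60  # 43200
example = uptime_fraction(minutes_in_month, [(60, 0.10), (30, 0.50)])
print(f"{example:.5%}")
```

The weighting is the part that matters: an hour-long outage hitting 10% of apps costs 6 weighted minutes, not 60.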

MononcQc fucked around with this message at 03:31 on Sep 28, 2014

Luigi Thirty
Apr 30, 2006

Emergency confection port.

tell me about the best way to do c#/.net unit testing

visual studio has a bunch of stuff built in that is p cool and easy to use but microsoft so there's probably some framework that fellates you while you work or something

Luigi Thirty fucked around with this message at 05:19 on Sep 28, 2014

Bloody
Mar 3, 2013

fritz posted:

actually that's a little bit worrying, if your data is so heterogeneous that you cant do random sampling, are you going to be able to draw any meaningful conclusions at all

Dunno yet! I don't want to make any assumptions yet. It's definitely a very hetero set in that it's a whole bunch of low dimensional but similar and potentially related independent time series glommed into a single format/collection (for better or worse)

I don't really expect to ever do much of anything on the entire set at once - I will be sampling from it for practically everything, just not randomly.

~Coxy
Dec 9, 2003

R.I.P. Inter-OS Sass - b.2000AD d.2003AD

Luigi Thirty posted:

tell me about the best way to do c#/.net unit testing

visual studio has a bunch of stuff built in that is p cool and easy to use but microsoft so there's probably some framework that fellates you while you work or something

get R#

then use either xunit or nunit. xunit is meant to be better but i'm not sure why exactly

use a mocking library, but don't go nuts with your mocks. if you're writing too much code to setup a mock then you can probably just write a fake class instead

don't mock out your database. some people say that this means it's not proper Unit Testing anymore but who gives a gently caress
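A sketch of the "just write a fake class" advice, in Python rather than C# to keep it short. `FakeMailer` and `notify_overdue` are hypothetical names made up for the example. The same shape carries over to xunit or nunit: a small hand-written class that implements the interface and records what happened.

```python
class FakeMailer:
    """Hand-written stand-in: records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


def notify_overdue(accounts, mailer):
    """Send one reminder per overdue account and return how many were sent.

    accounts: iterable of (name, days_overdue) pairs
    mailer: anything with a send(to, body) method
    """
    count = 0
    for name, days_overdue in accounts:
        if days_overdue > 0:
            mailer.send(name, f"{days_overdue} days overdue")
            count += 1
    return count


fake = FakeMailer()
sent_count = notify_overdue([("alice", 3), ("bob", 0)], fake)
```

After the call, the test just inspects `fake.sent`. There is no mock setup, no expectation DSL, and the fake can be reused across tests.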

BONGHITZ
Jan 1, 1970

c#f#r#

what is next im going crazy
