|
Maluco Marinero posted:Atlassian's Bitbucket have private repositories for up to 5 users, if you want somewhere to host your code repo for free and privately. Most of my stuff I just do through there. This is what I do now with Bitbucket. I added a GitHub repository as well, though, so I can share the source with potential employers without too much hassle. I really wouldn't care if it was a library or something, but it's my whole site. It'd be cool if someone wanted to look at how to do stuff like logging in with Steam's OpenID (which was a pain), or getting useful data out of the Steam API's JSON (most of the API is horribly documented); I just don't want them literally taking what I've got and putting it on the internet as is. The code probably isn't cool enough to steal anyway, and if someone does steal it I guess I can take it as a compliment. In reality I'd much rather get a good development job. Hopefully I can get a code review of what I've got here in CoC and see if it's even any good. Thanks all for the replies, the help is appreciated.
|
# ? Apr 14, 2013 18:35 |
|
When we created the wookwook server whitelist system/pubbie kicking bot for bad company 2, and then ported it to battlefield 3, I maintained the code on github through v1 - v3.5. While I didn't write much of the code, it was really helpful to show my boss and say, "look at my project management skills! this was so useful we had five or six strangers fork the code and even pushed some bugfixes! I didn't write much, but I commented the majority of the code" etc etc. Got me the job anyways. 18 months after the project effectively died, I had two different people contact me wanting updates so that they could run the software on their server community. We released everything under the wtfpl license.
|
# ? Apr 14, 2013 19:10 |
|
Hadlock posted:When we created the wookwook server whitelist system/pubbie kicking bot for bad company 2, and then ported it to battlefield 3, I maintained the code on github through v1 - v3.5. While I didn't write much of the code, it was really helpful to show my boss and say, "look at my project management skills! this was so useful we had five or six strangers fork the code and even pushed some bugfixes! I didn't write much, but I commented the majority of the code" etc etc. Got me the job anyways. 18 months after the project effectively died, I had two different people contact me wanting updates so that they could run the software on their server community. WTFPL doesn't have a warranty disclaimer, so I wouldn't use it. What happens if someone uses it for some obscene reason (in a medical device?), something goes wrong (death?), and they blame your software? Just go with the MIT license; it more or less states "do what you want". And as a bonus, it won't scare off potential employers in the future.
|
# ? Apr 15, 2013 02:04 |
|
Gounads posted:WTFPL doesn't have a warranty disclaimer, I wouldn't use it. What happens if someone uses it for some obscene reason (in a medical device?), something goes wrong (death?), and blames your software? I'll bite. What happens?
|
# ? Apr 15, 2013 03:06 |
|
pokeyman posted:I'll bite. What happens? They sue you because they're idiots and their lawyers advise that you are not taking legal issues seriously and may be an easy way to recoup some of their losses. After spending tens of thousands of dollars on legal fees, you probably "win" and do not have to pay them any additional money. Getting sued sucks enough that it's reasonable to make low-effort decisions such as license choice with "how likely is this to make me look like a potential target" in mind.
|
# ? Apr 15, 2013 06:40 |
|
My friend actually writes software for medical devices. From the stories he told me, there's no way my code would make it through the auditing process at the FDA.
|
# ? Apr 15, 2013 07:00 |
|
ShoulderDaemon posted:They sue you because they're idiots and their lawyers advise that you are not taking legal issues seriously and may be an easy way to recoup some of their losses. I'd like to read about such a case. Do you have a link to a write up, blog post, or judgement?
|
# ? Apr 15, 2013 07:17 |
|
pokeyman posted:I'd like to read about such a case. Do you have a link to a write up, blog post, or judgement? My attitude towards being lax on legal terms is because I know personally four people who have been sued as a result of their published open-source software. That said, none of them were sued because they lacked a warranty clause; they were all sued for much, much stupider reasons. In no particular order:
The person being sued by "hackers are stealing my computer files"-dude got away the cheapest, spending only a few hundred dollars to successfully defend himself. One person (the trademark guy) settled for less than $1000 out of court; I believe he could have successfully defended himself, but it would have cost more. The other two both wound up spending something like $2000-$4000 to successfully defend themselves. None of them "lost" in any legal sense, but they all still lost time and money and had to deal with the stress of the situation. I don't believe any of them have written their stories; as far as I know, they don't have blogs. I can ask them. There wasn't anything particularly exciting about any of the cases, it was just a waste of time and money for everyone involved. Basically, in the United States, you can get sued by any random idiot for any reason or no reason at all, and while there may not be any merit to the case and it might be thrown out of court, that doesn't mean the cost to you is zero.
|
# ? Apr 15, 2013 09:08 |
|
Thanks for writing that up. You mention that none of these incidents occurred because of a missing warranty clause. Were any of them helped by such a clause, either in their defence or in their settlement negotiation? I've read lots of free legal advice on predominantly American websites but am not American myself, so I don't often hear any meaningful anecdotes or data behind the advice.
|
# ? Apr 15, 2013 09:31 |
|
Hadlock posted:My friend actually writes software for medical devices. From the stories he told me, there's no way my code would make it through the auditing process at the FDA. You can write poo poo code and have it end up in medical devices. The FDA is looking primarily for documentation - you have to have a quality system, and the steps of the design process have to be documented. You have to have processes, and you have to follow those processes in the course of designing the product. This generally leads to a lot of forms and paperwork. The FDA is looking for the 'existence' of a process, and that you 'follow' the process, not whether or not the code is any good. At the high level, you'll have your design, then your software, then your design verification - but it's not as if the auditors are actually testing your product; they're just making sure that a) you have a process, b) the forms corresponding with the process have been filled out, and c) your company appears compliant with the process. The design verification stage can be passed with "does this component do what it says it is supposed to do" type testing - there's no imperative (suggestions, but no imperative) that you do, say, a code review, or unit testing, negative testing, security testing, or anything really but functional testing under simulated/actual device conditions. You may recognize this as 'the bare minimum necessary to get a product out the door if you expect it to work.'
|
# ? Apr 15, 2013 12:59 |
|
At work we're looking at getting an (I assume fairly simple) app or script to automatically re-establish comms after losing a connection on a piece of broadcasting software on a laptop that's closed up and carried in a backpack. Have I given enough information to get an idea of what's needed - what languages might be suitable etc.? Is this likely to be possible and/or easy?
|
# ? Apr 15, 2013 17:13 |
|
I'm trying to figure out the best way to handle some data. Part of the output of a program I've written is essentially four integer arrays, each ~2-10 million elements long. I also have a web-based viewer of the output, where the server needs to be able to read, format, and return any arbitrary interval of this data to the client. I've gone through many iterations of the server. Originally it was just a giant text file, which necessitated reading into memory: my server needed to either keep any potentially retrieved dataset in memory or take the time to read and parse each dataset as it was requested, which led to really long page load times, and it was difficult to get Python to let go of the data so it didn't eat all the server memory. I tried splitting it into smaller files, but it just got complicated reading the right file (or two files, if the interval was spaced wrong) and it still loaded slowly. I currently have all of the data saved as compressed HDF5 datasets, which has the advantage of not being stored in memory and not taking up as much space on disk, but the way HDF5 manages memory and cache is kind of opaque and I still have a lot more memory usage than I would expect. Additionally, I had previously been running this on Heroku and Appfog, which had no problem loading the text files from S3 - convenient because it was trivial in cost - but obviously you can't mount an HDF5 dataset from anywhere but a local filesystem. I know next to nothing about databases, but this just seems like the wrong kind of data to put into an SQL database. Would they actually handle tables consisting of millions of integers, and queries asking for intervals, decently well, or is there some better system for this kind of task? Should I just stick with HDF5 and buy a real server or VPS?
|
# ? Apr 16, 2013 18:05 |
|
If you don't think a database is meant to store and retrieve millions of values, what exactly do you think they are for?
|
# ? Apr 16, 2013 18:43 |
|
OnceIWasAnOstrich posted:I'm trying to figure out the best way to handle some data. Part of the output of a program I've written is essentially four ~2-10million long integer arrays. I also have a web-based viewer of the output of the data where the server needs to be able to read, format and return any arbitrary interval of this data to the client. Use a database.
|
# ? Apr 16, 2013 18:49 |
|
Zhentar posted:If you don't think a database is meant to store and retrieve millions of values, what exactly do you think they are for? Good point, I just worry about how efficiently it will store thousands of these datasets because uncompressed they start taking up quite a lot of space and I was thinking database storage would only expand that. In my head it just seemed like storing every integer separately was drastically inflating the amount of overhead I'd have but I guess that isn't true.
|
# ? Apr 16, 2013 19:03 |
|
OnceIWasAnOstrich posted:Good point, I just worry about how efficiently it will store thousands of these datasets because uncompressed they start taking up quite a lot of space and I was thinking database storage would only expand that. In my head it just seemed like storing every integer separately was drastically inflating the amount of overhead I'd have but I guess that isn't true. 40 million rows is not really a lot of data. For work I have one table in one database that has 62 million rows. Storage could be a problem, but only you will know that. If you are querying the data, an RDBMS is really handy. Can you recalculate the data on the fly? If you created it once, can you do it again in real time?
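For what it's worth, a minimal sketch of the RDBMS route using Python's built-in sqlite3 (the table and column names are made up, and value = position * 2 stands in for real expression data):

```python
import sqlite3

# In-memory database for illustration; a real deployment would use a file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expression (dataset_id INTEGER, position INTEGER, value INTEGER)"
)
# An index on (dataset_id, position) is what makes interval queries fast.
conn.execute("CREATE INDEX idx_pos ON expression (dataset_id, position)")

# Insert a small fake dataset: value = position * 2.
rows = [(1, pos, pos * 2) for pos in range(10000)]
conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)

# Grab an arbitrary interval with a range query.
cur = conn.execute(
    "SELECT value FROM expression "
    "WHERE dataset_id = ? AND position BETWEEN ? AND ? ORDER BY position",
    (1, 100, 104),
)
print([v for (v,) in cur])  # [200, 202, 204, 206, 208]
```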
|
# ? Apr 16, 2013 19:11 |
|
ineptmule posted:At work we're looking at getting an (I assume fairly simple) app or script to automatically re-establish comms after losing a connection on a piece of broadcasting software on a laptop that's closed up and carried in a backpack. Operating system? For Windows, some combination of PowerShell and/or AutoIt. Bash/cron on Linux. It's pretty routine to monitor and restart a service via a script.
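A hypothetical Python sketch of the monitor-and-restart loop for Windows (the process name and relaunch command are placeholders; PowerShell or AutoIt would do the same job natively):

```python
import subprocess
import time

PROCESS_NAME = "broadcaster.exe"  # placeholder: the real software's process name
RESTART_CMD = ["broadcaster.exe", "--reconnect"]  # placeholder relaunch command

def is_running(name, task_list):
    """True if `name` appears in the output of Windows' tasklist command."""
    return name.lower() in task_list.lower()

def watchdog(poll_seconds=10, max_polls=None):
    """Poll the task list and relaunch the program whenever it disappears."""
    polls = 0
    while max_polls is None or polls < max_polls:
        listing = subprocess.run(["tasklist"], capture_output=True, text=True).stdout
        if not is_running(PROCESS_NAME, listing):
            subprocess.Popen(RESTART_CMD)  # relaunch; the app re-establishes comms
        time.sleep(poll_seconds)
        polls += 1
```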
|
# ? Apr 16, 2013 19:20 |
|
gariig posted:40 million rows is not really a lot of data. For work I have one table in one database that has 62 million rows. Storage could be a problem but only you will know that. If you are querying data using a RDBMS is really handy. Not only that, it's integers, so it'll top out at 160MB (or 320MB with doubles) which in most environments isn't really a ton of data.
|
# ? Apr 16, 2013 19:27 |
|
It sounds like you have a giant set of 40 million integers, and you want to request ranges (give me the 22nd integer through the 2,543rd integer). I doubt a SQL database would be good at storing and querying that. I mean, you can certainly try it out, and you have lots of options if it doesn't work out. Something like a simple file format that stores each integer as a 32-bit binary integer, so you'd simply mmap the file, go to the right offset (index * sizeof(int)), then stream out the range, sounds "good enough" to me, but you lose some flexibility if you need to do anything more complicated in the future. I'm curious about what all these integers are. A bit more context about what they mean might let us make a better choice.
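A rough Python rendering of that mmap idea (the file name and the 0..999 test data are just for illustration):

```python
import mmap
import struct

# Write a small file of unsigned 32-bit integers (0..999) as a stand-in dataset.
with open("data.bin", "wb") as f:
    f.write(struct.pack("<1000I", *range(1000)))

def read_range(path, start, end):
    """Return the integers at indices [start, end) without reading the whole file."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        raw = mm[start * 4 : end * 4]  # 4 == sizeof(uint32)
        mm.close()
        return list(struct.unpack("<%dI" % (end - start), raw))

print(read_range("data.bin", 22, 27))  # [22, 23, 24, 25, 26]
```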
|
# ? Apr 16, 2013 20:06 |
|
No, this data can't be recalculated; it's actually the input to the program, but it needs to be kept for output as well. The actual data is RNA expression at each individual nucleotide in a genome, and I am using this expression data along with the sequence information to discover various interesting poo poo. The output plots the data on a simple x-y line plot with area, so I need to be able to grab any given interval easily. Directly accessing the file isn't something I'd thought about because I'm doing everything in Python, but it seems really easy. 160MB is fine, except I already have dozens and may eventually have thousands of these datasets, which was why the HDF5 was nice: it got my 160MB datasets down to around 20 with LZMA. My problem was I was thinking of this as four big arrays, and then going to millions of individual datapoints seemed alarming, even though I guess it isn't. If dumping this into an RDBMS won't inflate that dramatically, it shouldn't be much worse than just having uncompressed flat files, although I think I can mmap and seek around a gzipped file fairly easily in Python. edit:vvv I thought about that but it seemed like the worst possible solution: if I store things individually there will be crazy overhead, and it seems to just keep a giant pile of key:values, so grabbing an interval would just be grabbing hundreds of completely random values from different places. I think either using mmap for small fast data access (if I don't care about flexibility, which might really suck) or just going full-on and using an RDBMS are probably my best options. OnceIWasAnOstrich fucked around with this message at 20:26 on Apr 16, 2013 |
# ? Apr 16, 2013 20:11 |
|
Have you considered using a key-value store? But yeah, I'd suggest trying a regular database first as well. On SQL Server at least you can pretty easily enable compression for your table/schema if space ever becomes an issue.
|
# ? Apr 16, 2013 20:22 |
If you do want/need to save space, but still have the datapoints indexable, what you could do is split it across multiple files, e.g. each file containing exactly 100k or 1M values, and compress those individually. Then when you need a range of them, you can calculate the set of files containing the data required for the range and decompress just those parts. That way you avoid having to decompress the entire dataset to seek somewhere into the middle.
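A quick sketch of that chunking scheme in Python (the chunk size and file-naming convention are arbitrary choices):

```python
import gzip
import struct

CHUNK = 100_000  # values per chunk file

def write_chunks(values, prefix):
    """Split `values` into fixed-size chunks and gzip each chunk to its own file."""
    for i in range(0, len(values), CHUNK):
        part = values[i:i + CHUNK]
        with gzip.open("%s.%06d.gz" % (prefix, i // CHUNK), "wb") as f:
            f.write(struct.pack("<%dI" % len(part), *part))

def read_range(prefix, start, end):
    """Decompress only the chunk files overlapping [start, end)."""
    out = []
    for chunk_no in range(start // CHUNK, (end - 1) // CHUNK + 1):
        with gzip.open("%s.%06d.gz" % (prefix, chunk_no), "rb") as f:
            data = f.read()
        vals = struct.unpack("<%dI" % (len(data) // 4), data)
        lo = max(start - chunk_no * CHUNK, 0)
        hi = min(end - chunk_no * CHUNK, len(vals))
        out.extend(vals[lo:hi])
    return out

write_chunks(list(range(250_000)), "dataset")
print(read_range("dataset", 99_998, 100_002))  # [99998, 99999, 100000, 100001]
```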
|
|
# ? Apr 16, 2013 20:24 |
|
mobby_6kl posted:Have you considered using a key-value store? How could that be any better than a simple bare-bones binary file full of integers? The keys are literally consecutive integers. OnceIWasAnOstrich posted:160MB is fine, except I already have dozens and may have thousands of these datasets which was why the HDF5 was nice, it got my 160MB datasets down to around 20 with lzma. My problem was I was thinking of this as four big arrays and then going to millions of individual datapoints seemed alarming even though I guess it isn't. If dumping this into an RDBMS won't inflate that dramatically it shouldn't be much worse than just having uncompressed flat files, although I think I can mmap and seek around a gzipped file fairly easily in Python. Ah, this is Python. That might hurt things a bit. It might be worth it to step up and make a simple C server, but I don't know your use case well enough or what kinds of goals you need to meet. I hope you do realize that you're effectively building a simple database. Navigating around a compressed stream is hard. A few simple compression formats don't use that much context-sensitive data, so you might be able to whip something up. There's something to be said for making your own naive "compression" (I wouldn't even call it that) system where you have a "dictionary" full of integer values and a bunch of shorts that index into the dictionary, if you have fewer than 65536 distinct integer values. I don't know what the range of your data is. nielsm posted:If you do want/need to save space, but still have the datapoints indexable, what you could do is split it across multiple files, e.g. each file containing exactly 100k or 1M values, and compress those individually. Then when you need a range of them, you can calculate the set of files containing the data required for the range and decompress just those parts. That way you avoid having to decompress the entire dataset to seek somewhere into the middle. The smaller you get, the less gain you get from compression, as the bigger the dictionaries are and the less available context there is. There's lots of possibilities here, but I don't know what tradeoffs you need or want to make. It sounds like a decently hard problem, though. Suspicious Dish fucked around with this message at 20:56 on Apr 16, 2013 |
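The dictionary idea might look something like this (a toy sketch; array.array("H") gives the 16-bit indexes, and it assumes fewer than 65536 distinct values):

```python
import array

def dict_encode(values):
    """Map each distinct value to a 16-bit index; needs < 65536 distinct values."""
    dictionary = sorted(set(values))
    assert len(dictionary) < 65536
    index = {v: i for i, v in enumerate(dictionary)}
    codes = array.array("H", (index[v] for v in values))  # 2 bytes per value
    return dictionary, codes

def dict_decode(dictionary, codes, start, end):
    """Recover the original values for indices [start, end)."""
    return [dictionary[c] for c in codes[start:end]]

dictionary, codes = dict_encode([5, 5, 900000, 5, 42, 900000])
print(dict_decode(dictionary, codes, 1, 5))  # [5, 900000, 5, 42]
```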
# ? Apr 16, 2013 20:54 |
|
Suspicious Dish posted:Navigating around a compressed stream is hard. A few simple compression formats don't use that much context-sensitive data, so you might be able to whip up something. I'm sure I'm causing some terrible performance issues by using all of these wrappers, and someone should tell me how bad this is, but Python seems to make this trivial. I wrote a file containing 0..9999 as a series of unsigned 32-bit integers. Python code:
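(The snippet didn't survive the archive; a reconstruction of what it likely did, assuming the struct and gzip modules:)

```python
import gzip
import struct

# Write 0..9999 as unsigned 32-bit little-endian integers, gzip-compressed.
with gzip.open("ints.gz", "wb") as f:
    for i in range(10000):
        f.write(struct.pack("<I", i))

# gzip file objects support seek(); the library transparently decompresses
# up to the requested offset, which is why seeks get slow on large files.
with gzip.open("ints.gz", "rb") as f:
    f.seek(5000 * 4)
    print(struct.unpack("<I", f.read(4))[0])  # 5000
```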
OnceIWasAnOstrich fucked around with this message at 22:53 on Apr 16, 2013 |
# ? Apr 16, 2013 21:10 |
|
OnceIWasAnOstrich posted:No this data can't be calculated, its actually the input to the program but needs to be kept for output as well. The actual data is RNA expression at each individual nucleotide in a genome, and I am using this expression data along with the sequence information to discover various interesting poo poo. The output plots the data on a simple x-y line plot with area, so I need to be able to grab any given interval easily. Directly accessing the file isn't something I'd thought about because I'm doing everything in Python, but it seems really easy. Have you looked at what other programs that process similar data use? If there's commercially available software that does stuff with the same kind of data, and it requires an Oracle license, then you know that storing this kind of thing in a database can work (though they might find clever/non-intuitive ways to store it). Even better, if there's open source software that works with the same kind of data, you can poke around its code and see what kind of data structures or database access it uses.
|
# ? Apr 16, 2013 21:43 |
|
OnceIWasAnOstrich posted:edit: The performance problem is that when you scale to such a large size seeking in the gzip takes about a second. I fairly easily did it using bgzip, but this is basically just rewriting what HDF5 does, so what am I even doing. Going to just try it and see whether a database is better than HDF5.
|
# ? Apr 17, 2013 00:50 |
|
I'm playing around with Vagrant and it's beginning to drive me crazy. Trying to use the chef_solo provisioner none of my `chef.add_recipe "foo"` declarations get added to the `run_list` in `dna.json` (in fact, there is no `run_list` key in `dna.json` at all). I feel like I'm missing something obvious.
|
# ? Apr 17, 2013 07:52 |
|
I'm deserializing some data from a file and building an object graph. Some entities in the data carry references to others. While reading an entity with references, I select-or-insert the "target" entities. However, the nature of some of the relationships can only be determined after certain properties on an entity have been set. Since I've just created the target entity, it doesn't have those properties yet; they will only show up when I get to the object proper in the data. So the relationship cannot be initialized unless the data happens to be ordered so properties with sensitive relationships come last (which the data format does not guarantee). Are there any good methods to help with that?
|
# ? Apr 17, 2013 11:08 |
|
I'm assuming the problem isn't trivial because you're not rolling your own deserialization? (I'm not trying to be an rear end or anything - doing this in two passes just seems like an obvious answer.) e: You don't happen to be loading EnergyPlus input files, do you? Because this, exactly, is a problem when loading them. raminasi fucked around with this message at 11:14 on Apr 17, 2013 |
# ? Apr 17, 2013 11:11 |
|
GrumpyDoctor posted:I'm assuming the problem isn't trivial because you're not rolling your own deserialization? (I'm not trying to be an rear end or anything - doing this in two passes just seems like an obvious answer.) Yeah, this is homebaked. I actually did two passes for a while, but it was clumsy and got refactored out; I've only now encountered files that are sensitive to ordering. I'm trying to avoid putting too much hardcoded logic into the deserializer, but I suppose I can put a method on the entity superclass that returns a bool isSensitive, then just put those off until no nonsensitives remain.
|
# ? Apr 17, 2013 11:52 |
|
the talent deficit posted:I'm playing around with Vagrant and it's beginning to drive me crazy. Trying to use the chef_solo provisioner none of my `chef.add_recipe "foo"` declarations get added to the `run_list` in `dna.json` (in fact, there is no `run_list` key in `dna.json` at all). I feel like I'm missing something obvious. I've been working on a vagrant build for LAMP stack development so maybe my example will help. https://github.com/DBell-Feins/vagrant-php-master Let me know if you have any specific questions about how I'm doing something in particular.
|
# ? Apr 17, 2013 13:28 |
|
I'm working on some batch files to automate some stuff and I had a thought. One requires user input. Is there a way to check the user input against a text file with the acceptable input, and if it's not correct, report an error? Say the text file has: dog, cat, fish, horse. If the user puts in "dag", it will error and ask the user to re-enter the info; if it's correct, it continues. Is this possible?
|
# ? Apr 17, 2013 15:03 |
The Automator posted:I'm working on some batch files to automate some stuff and I had a thought. You should be able to use the "find" command to search your text file for the entered data, then check the errorlevel returned to see if it was found or not.
|
|
# ? Apr 17, 2013 15:12 |
|
Hadlock posted:Operating system? For windows some combination of powershell and/or AutoIT. Bash/Cron on Linux. It's pretty routine to monitor and restart a service via a script. Thanks, it's for Windows. I'll look into those options.
|
# ? Apr 17, 2013 16:44 |
|
Carthag posted:Yeah, this is homebaked. Well, that only works if you know you'll never have any circular references. The two passes I was referring to were "load all objects, but leave references unassigned" and "wire up references." The specific solution you use will depend on whatever language you're working with. (When I do this in F#, for example, I make the reference resolution lazy.)
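In Python the two passes might look like this (hypothetical flat records with string ids standing in for the real data format):

```python
# Hypothetical input: "a" refers to "b" before "b" has been defined.
records = [
    {"id": "a", "ref": "b"},
    {"id": "b", "ref": None},
]

class Entity:
    def __init__(self, id):
        self.id = id
        self.ref = None

# Pass 1: create every entity, remembering references only as raw ids.
entities = {r["id"]: Entity(r["id"]) for r in records}
pending = {r["id"]: r["ref"] for r in records}

# Pass 2: now that everything exists, wire the references up.
for entity_id, ref_id in pending.items():
    if ref_id is not None:
        entities[entity_id].ref = entities[ref_id]

print(entities["a"].ref.id)  # b
```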
|
# ? Apr 17, 2013 18:55 |
|
GrumpyDoctor posted:Well, that only works if you know you'll never have any circular references. The two passes I was referring to were "load all objects, but leave references unassigned" and "wire up references." The specific solution you use will depend on whatever language you're working with. (When I do this in F#, for example, I make the reference resolution lazy.) I actually used callbacks (kinda - Obj-C blocks). Whenever I encountered a ref, I'd send a block (containing code to do the wiring) to a central manager that would notice when the other entity showed up and run the block. I guess I was just looking for better ways to do it because it's hell to debug things like that. I'll think on it some more, maybe find a cleaner way to do it so the code doesn't jump around so much.
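A Python rendering of that deferred-callback pattern (all names here are invented for illustration):

```python
class WiringManager:
    """Run registered callbacks when the entity they wait for finally shows up."""

    def __init__(self):
        self.entities = {}
        self.waiting = {}  # entity id -> list of callbacks to run on arrival

    def when_available(self, entity_id, callback):
        if entity_id in self.entities:
            callback(self.entities[entity_id])  # already loaded: wire immediately
        else:
            self.waiting.setdefault(entity_id, []).append(callback)

    def register(self, entity_id, entity):
        self.entities[entity_id] = entity
        for callback in self.waiting.pop(entity_id, []):
            callback(entity)  # deferred wiring runs now

manager = WiringManager()
result = {}
manager.when_available("b", lambda e: result.update(target=e))  # "b" not loaded yet
manager.register("b", {"name": "entity b"})                     # wiring runs here
print(result["target"]["name"])  # entity b
```

This makes the control flow explicit, but as noted above it can still be hard to debug because the wiring code runs far from where it was written.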
|
# ? Apr 17, 2013 19:25 |
|
Is there a reason you're connecting objects as soon as possible rather than waiting until they're all loaded?
|
# ? Apr 17, 2013 22:00 |
|
I've been slowly teaching myself programming. As I learn and play with different languages, I sometimes trip over the little differences between one language or another, often just differences in syntax but not always. So as quick reference for whatever language I'm playing with, I'd like to see the same basic program in several languages, one that is relatively simple in purpose but complex enough to include several common things. Not just printing "hello world" and not just looping to print a sequence of squares - I'd like a piece of code that runs through several common activities, a general reference for the way each language handles the basics. Things like loops, functions, conditional statements, maybe file i/o, etc. Basically I'm looking for something like "The quick brown fox jumps over the lazy dog" but for programming languages. That may sound silly, I know. I'm aware that such a program may be terrible in terms of overall efficiency or elegance. I'm just wondering if there's anything like that out there to use as a quick reference and orient myself to the little (and big) differences between languages.
|
# ? Apr 18, 2013 06:45 |
|
heap posted:So as quick reference for whatever language I'm playing with, I'd like to see the same basic program in several languages, one that is relatively simple in purpose but complex enough to include several common things. Not just printing "hello world" and not just looping to print a sequence of squares - I'd like a piece of code that runs through several common activities, a general reference for the way each language handles the basics. Things like loops, functions, conditional statements, maybe file i/o, etc. http://rosettacode.org/wiki/Rosetta_Code
|
# ? Apr 18, 2013 06:51 |
|
GrumpyDoctor posted:Is there a reason you're connecting objects as soon as possible rather than waiting until they're all loaded? Not really. I'm back to two-passing it for now, though. heap posted:So as quick reference for whatever language I'm playing with, I'd like to see the same basic program in several languages http://en.wikibooks.org/wiki/Algorithm_Implementation That one sometimes has a bunch of implementations of different algorithms for you to compare. For example their page on Levenshtein distance has it in 31 languages.
|
# ? Apr 18, 2013 06:52 |