|
Bloody posted:idk why you'd ever use git from a command line every Git GUI I've ever used eventually gives up and tells you to run certain commands manually, or just gives up entirely
|
# ? Jul 13, 2016 15:44 |
|
HoboMan posted:git Suits My Needs and is super good as long as you remember what the command is. i have to google how to remove a tag every time i once found a tag called "rm". still makes me chuckle sadly
|
# ? Jul 13, 2016 15:44 |
|
Power Ambient posted:for me its because every gui has loving sucked also i have a sweet mechanical kb so typing is very good same and same. somehow i've become the resident git expert at work. i think it's because i know how to rebase.
|
# ? Jul 13, 2016 15:51 |
|
qntm posted:every Git GUI I've ever used eventually gives up and tells you to run certain commands manually, or just gives up entirely stop breaking poo poo so badly then i guess
|
# ? Jul 13, 2016 15:54 |
|
I use sourcetree and it is needs-suiting
|
# ? Jul 13, 2016 15:55 |
|
GameCube posted:lol this might be it. God dammit lol plz keep us updated i assumed that your thing was different because my thing happened in ruby which is prone to errors like that
|
# ? Jul 13, 2016 16:03 |
|
also the go 'debugger' repl is so lol
|
# ? Jul 13, 2016 16:07 |
|
GameCube posted:lol this might be it. God dammit lol this is shameful wouldn't it be great if http actually already had a dedicated status code for a uri that's too long? no, surely the protocol's designers would never think of doing that.
|
# ? Jul 13, 2016 16:17 |
|
wait, this is a thing? how long is "too long"? this might gently caress me down the road
|
# ? Jul 13, 2016 16:21 |
|
jony neuemonic posted:same and same. lol same. I also know about the awesome power of "git reflog" so I can magically restore everyone's broken branches when they inevitably gently caress up a rebase.
|
# ? Jul 13, 2016 16:36 |
|
afaik there's no hard-defined url length limit in clients or servers, or all those sites that work by base64ing user-generated content in a query parameter wouldn't work? it's just something you have to configure on your server end (nginx: http://nginx.org/en/docs/http/ngx_http_core_module.html#large_client_header_buffers, gunicorn: http://docs.gunicorn.org/en/latest/settings.html?highlight=limit_request_line#limit-request-line)
|
# ? Jul 13, 2016 16:41 |
|
Mr Dog posted:lol this is shameful status 414 only works if the server complains, not if (probably) the client silently truncates the URL and sends it to the server, which correctly says it is a 404 if it could support longer.
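The two failure modes described above can be told apart with a small sketch. This is only an illustration: `MAX_REQUEST_LINE` and `status_for` are made-up names, and the 4094 limit is an assumed value, since real servers make it configurable.

```python
from http import HTTPStatus

# Assumed limit for illustration only; real servers expose this as a setting.
MAX_REQUEST_LINE = 4094


def status_for(request_line: str, known_paths: set) -> HTTPStatus:
    """Pick the status a (hypothetical) server would send for a raw request line."""
    if len(request_line) > MAX_REQUEST_LINE:
        # The server received the whole oversized line, so it can complain with 414.
        return HTTPStatus.REQUEST_URI_TOO_LONG
    method, target, _version = request_line.split(" ", 2)
    path = target.split("?", 1)[0]
    # A URL truncated by the client is short enough to be accepted, but it no
    # longer names a known resource, so the server answers 404.
    return HTTPStatus.OK if path in known_paths else HTTPStatus.NOT_FOUND


if __name__ == "__main__":
    long_path = "/items/" + "a" * 5000
    known = {long_path}
    print(status_for(f"GET {long_path} HTTP/1.1", known))
    print(status_for(f"GET {long_path[:2000]} HTTP/1.1", known))
```

The point of the sketch is that the server can only return 414 when the oversized request actually reaches it; a client-side truncation looks like an ordinary request for a different resource.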
|
# ? Jul 13, 2016 16:41 |
|
sourcetree is good for git. if im doing anything besides git add, git commit, git push, i usually do it in sourcetree. Zemyla posted:Why is using snapshots instead of changesets a good idea? it's a lot faster to do blames and poo poo
|
# ? Jul 13, 2016 17:09 |
|
Hey I'm trying to work with matching names between two quite large databases and I was wondering if anyone here had some tips. Firstly, are there any packages for python that will make this sort of thing more painless, e.g. for removing all of the edge case prefixes and postfixes that people love to enter for no reason? And secondly, what's the best way to handle slight differences in names between the databases, for example inconsistent use of middle names, last name/first names etc? I've seen some stack exchange answers suggesting fuzzy matching them but I'm not sure what the best way to implement this is. It seems like this would be the sort of thing that people must run into all the time, but as a p. lovely programmer I'm not really sure what I should be doing.
|
# ? Jul 13, 2016 17:59 |
|
levenshtein distance can be a decent metric for fuzzy string matching
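For reference, a minimal sketch of Levenshtein distance in plain Python (standard dynamic-programming version, no third-party package assumed). The function name `levenshtein` is just a label chosen here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # drop ca
                current[j - 1] + 1,      # add cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(levenshtein("Bob Smith", "Bob Smyth"))
```

A lower number means the strings are closer; 0 means identical. The comparison is case-sensitive, so lowercasing both sides first is usually worth doing for names.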
|
# ? Jul 13, 2016 18:28 |
|
i used fuzzywuzzy for one little project and it seemed to work ok
|
# ? Jul 13, 2016 18:44 |
|
vodkat posted:Hey I'm trying to work with matching names between two quite large databases and I was wondering if anyone here had some tips. there really isn't a one size fits all solution. cleaning and homogenizing data from different sources is always a huge pain. the "best" solution usually depends on what kind of errors are ok for whatever you're doing. in some cases false matches have to be avoided at all costs, so you just keep perfect matches and discard everything else. in other cases you know that one db has a tendency to have some prefixes or suffixes on names, so you just build up a list of the most common ones, do a first filter pass to remove those, and then do a perfect match between the result and the other db. or if false matches are ok with you, then yeah, doing some fuzzy matchings between the dbs and not giving much of a gently caress about it beyond that can work. there's also the issue of what "quite large database" means, because dealing with a few gigabytes versus something in the hundreds of terabytes range requires different approaches. if you're dealing with names though i'm guessing it's probably the former
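The "filter pass, then perfect match" approach could look roughly like this. It is a sketch only: the `PREFIXES` and `SUFFIXES` sets are hypothetical placeholders, and the real lists would have to be built from whatever junk is most common in the actual data.

```python
import re

# Hypothetical lists; build the real ones from the most common junk in your data.
PREFIXES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "phd", "md"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and strip known prefixes and suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[0] in PREFIXES:
        tokens.pop(0)
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def exact_matches(names_a, names_b):
    """Pair names from the two sources whose normalized forms are identical."""
    index = {}
    for name in names_b:
        index.setdefault(normalize(name), []).append(name)
    return [(a, b) for a in names_a for b in index.get(normalize(a), [])]


if __name__ == "__main__":
    print(exact_matches(["Mr. Bob Smith", "Alice Jones"],
                        ["Bob Smith Jr.", "Alicia Jones"]))
```

Indexing one side in a dict keeps the exact-match pass roughly linear in the number of rows, so it stays cheap even before any fuzzy matching is attempted on the leftovers.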
|
# ? Jul 13, 2016 18:53 |
|
qntm posted:
git branch branchname ?
|
# ? Jul 13, 2016 18:59 |
|
my stepdads beer posted:i have to google how to unstage every time, or view staged diffs "git status" tells you how to unstage: code:
|
# ? Jul 13, 2016 19:00 |
|
fritz posted:git branch branchname okay so now the question is why there are two commands which do the same thing and every tutorial recommends the stupidly-named one
|
# ? Jul 13, 2016 19:28 |
|
YeOldeButchere posted:there really isn't a one size fits all solution The database is just short of a gig which I guess is pretty small fry for most of the people here but as an academic and very lovely programmer it's starting to test my knowledge and abilities quite a bit. Having looked at fuzzywuzzy it seems like that might be what I need to use but how do you define when a match is good enough? is it a matter of simply plugging in numbers and seeing what the result is or is there a better way than trial and error testing?
|
# ? Jul 13, 2016 19:30 |
|
why is HEAD always written in all caps, and please tell me it's case sensitive because that would be the most unixy thing ever
|
# ? Jul 13, 2016 19:30 |
|
qntm posted:okay so now the question is why there are two commands which do the same thing and every tutorial recommends the stupidly-named one because gently caress you, that's why NihilCredo posted:why is HEAD always written in all caps, and please tell me it's case sensitive because that would be the most unixy thing ever because gently caress you, that's why
|
# ? Jul 13, 2016 19:36 |
|
vodkat posted:Having looked at fuzzywuzzy it seems like that might be what I need to use but how do you define when a match is good enough? is it a matter of simply plugging in numbers and seeing what the result is or is there a better way than trial and error testing? that is exactly how i used it to check if user input is in a list. if the input isnt in the list i have fuzzywuzzy check the input against the list and rip out strings over a certain value. i just fudged around with the value until common spelling errors were giving back what i thought they should from the list.
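A rough sketch of that "exact lookup first, fuzzy fallback with a tunable cutoff" pattern. Note the swap: this uses the standard library's `difflib` as a stand-in rather than fuzzywuzzy itself, so the score scale (0.0 to 1.0) and the `cutoff=0.8` default are assumptions to be fudged the same way.

```python
import difflib


def lookup(user_input, choices, cutoff=0.8):
    """Return the input if it is a known choice, otherwise the closest
    choices whose similarity score is at least `cutoff`."""
    if user_input in choices:
        return [user_input]
    # get_close_matches returns at most n candidates, best first.
    return difflib.get_close_matches(user_input, choices, n=3, cutoff=cutoff)


if __name__ == "__main__":
    languages = ["python", "ruby", "golang"]
    print(lookup("python", languages))
    print(lookup("pyhton", languages))
    print(lookup("zzzz", languages))
```

Raising the cutoff trades missed typos for fewer wrong suggestions, which is the knob being fudged in the post above.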
|
# ? Jul 13, 2016 19:38 |
qntm posted:okay so now the question is why there are two commands which do the same thing and every tutorial recommends the stupidly-named one Those commands don't actually do the same thing. git branch foo creates a branch called foo but doesn't switch to it, whereas git checkout -b foo creates a branch called foo and switches to that branch.
|
|
# ? Jul 13, 2016 19:55 |
|
VikingofRock posted:Those commands don't actually do the same thing. git branch foo creates a branch called foo but doesn't switch to it, whereas git checkout -b foo creates a branch called foo and switches to that branch. then there should be a variant of git branch which also switches to the newly-created branch, putting it on git checkout makes no sense
|
# ? Jul 13, 2016 20:00 |
Presumably the tutorials use git checkout -b foo because it's one less command you have to type, and git made the combo command "git checkout -b" instead of "git branch -c" because they thought the checkout was the more important half of the operation. Or maybe they just want checkout to do literally everything. disclaimer: git branch -c or the like might be a thing, but I'm on my phone so I can't check
|
|
# ? Jul 13, 2016 20:02 |
|
all of this git talk is really confusing to me because i think we have our own tooling on top of whatever svn already does. when i make a "branch", i get a branch on the server and then all of that is checked out into a folder named after the branch on my computer and that's where i do my work. then i commit that to the branch on the server. when it's ready to go to trunk, our internal tool merges my branch into a local copy of trunk on my computer. then i commit that to trunk on the server. how does that translate to gitspeak?
|
# ? Jul 13, 2016 20:40 |
|
my git workflow code:
|
# ? Jul 13, 2016 20:47 |
|
vodkat posted:The database is just short of a gig which I guess is pretty small fry for most of the people here but as an academic and very lovely programmer it's starting to test my knowledge and abilities quite a bit. since it fits in memory you can do more or less whatever you want with it, so that's good. if you do go with fuzzy matching stuff, then yeah, there's no way around the fact that you'll have to define some arbitrary threshold as to what constitutes a match and what doesn't. most of the time it gets chosen through a very technical empirical process called "loving around with it until it looks good enough". i mean, doing that usually means writing a bit of code to figure out basic stats like how many matches a given threshold gives you or how many rows in one db match to more than one row in the other (which will skyrocket if your threshold is too lenient), but that sort of stuff is exactly why cleaning up data always sucks. you should try to do as much processing to make things homogeneous before you try the fuzzy matching, though. the stuff like removing common prefixes or suffixes should be easy enough to do if you have a lot of that, and it will help with the fuzzy matching afterwards. for example if you're using edit distance (the levenshtein distance bloody mentioned earlier), then "Mr. Bob Smith" with "Mr. " removed would match right away with "Bob Smith" instead of requiring a threshold of 4 to account for the deletion of the prefix, which would also make it match with "Mr. Bob Smithwick" (4 character insertions) or "Mr. John Smith" (2 character replacements and 1 insertion), neither of which are what you want
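The "bit of code to figure out basic stats" might be sketched like this: for each candidate threshold, count how many rows find any match and how many find more than one. The names `edit_distance` and `sweep` are made up for the sketch, and it compares every pair, which is an assumption that only holds up for small samples.

```python
def edit_distance(a, b):
    """Levenshtein distance, kept compact so the sketch is self-contained."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        diag, row[0] = row[0], i
        for j, cb in enumerate(b, start=1):
            # row[j] is the value from the previous row, row[j - 1] from this one.
            diag, row[j] = row[j], min(row[j] + 1, row[j - 1] + 1, diag + (ca != cb))
    return row[-1]


def sweep(names_a, names_b, thresholds):
    """Map each threshold to (rows of A with a match, rows of A with several)."""
    stats = {}
    for t in thresholds:
        matched = ambiguous = 0
        for a in names_a:
            hits = sum(1 for b in names_b if edit_distance(a, b) <= t)
            matched += hits >= 1
            ambiguous += hits > 1
        stats[t] = (matched, ambiguous)
    return stats


if __name__ == "__main__":
    a = ["Bob Smith"]
    b = ["Mr. Bob Smith", "Mr. Bob Smithwick", "Bob Smyth"]
    for threshold, counts in sweep(a, b, [0, 1, 4, 8]).items():
        print(threshold, counts)
```

Watching where the second number starts climbing is one way to spot a threshold that has become too lenient. On the full data an all-pairs loop would be slow, so running this on a sample, or only within groups sharing something like a first letter, is probably more practical.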
|
# ? Jul 13, 2016 20:55 |
|
Progressive JPEG posted:lol if every commit isn't just swearing with increasing intensity
|
# ? Jul 13, 2016 21:18 |
|
YeOldeButchere posted:since it fits in memory then you can do more or less whatever you want with it, so that's good set a threshold by scoring a ton of poo poo against random strings, calculate the standard deviation, and multiply by three
|
# ? Jul 13, 2016 21:25 |
|
remember to make sure your random strings have uniform distribution!
|
# ? Jul 13, 2016 21:36 |
|
i used some semi-inappropriate word as a temporary variable name while i was testing something and then immediately regretted it when i forgot to remove it and committed it. i caught it in review and removed it. that was the first time i forgot to remove a temporary testing variable like that. it was also the first time i used something that was not just "qqqqq" as the name. immediately it bit me.
|
# ? Jul 13, 2016 21:41 |
|
Wheany posted:i used some semi-inappropriate word as a temporary variable name while i was testing something and then immediately regretted it when i forgot to remove it and committed it. i caught it in review and removed it. lol, i came here to post that i just sent a poo and fart filled test framework for code review, woops
|
# ? Jul 13, 2016 21:49 |
|
HoboMan posted:remember to make sure your random strings have uniform distribution! hmm actually shouldn't their distribution match like typical letter distribution?
|
# ? Jul 13, 2016 21:59 |
|
that's what they want you to think
|
# ? Jul 13, 2016 22:01 |
|
YeOldeButchere posted:since it fits in memory then you can do more or less whatever you want with it, so that's good also if the number of close matches is small enough you can possibly resolve them manually. assuming this is an operation you only want to do once.
|
# ? Jul 13, 2016 22:36 |
|
Bloody posted:hmm actually shouldn't their distribution match like typical letter distribution? probably ok, be sure to find the character probability of your set and then make a distribution skewed to match that probability (including average string length)! at least i think for the matching problem you want the probability of your set and not the general occurrence probability
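One possible reading of the random-string idea, as a sketch: draw random strings whose lengths and character frequencies follow the real names, score real names against them, and put the threshold a few standard deviations above the noise. Several things here are assumptions: the posts only say "multiply by three", so adding the mean is this sketch's interpretation; `difflib` similarity (0.0 to 1.0) stands in for whatever scorer is actually used; and `random_like` and `noise_threshold` are made-up names.

```python
import difflib
import random
import statistics
from collections import Counter


def random_like(names, chars, weights, rng):
    """A random string with a length borrowed from a real name and
    characters drawn according to the observed character frequencies."""
    length = len(rng.choice(names))
    return "".join(rng.choices(chars, weights=weights, k=length))


def noise_threshold(names, samples=2000, seed=0):
    """Mean plus three standard deviations of the similarity between
    real names and random strings (assumed interpretation, see above)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    counts = Counter("".join(names))
    chars, weights = zip(*counts.items())
    scores = [
        difflib.SequenceMatcher(
            None, rng.choice(names), random_like(names, chars, weights, rng)
        ).ratio()
        for _ in range(samples)
    ]
    return statistics.mean(scores) + 3 * statistics.pstdev(scores)


if __name__ == "__main__":
    sample = ["bob smith", "john smith", "alice jones", "maria garcia", "wei chen"]
    print(noise_threshold(sample))
```

Scores above the resulting value are unlikely to come from chance overlap alone, which gives a starting point for the threshold rather than a final answer.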
|
# ? Jul 13, 2016 22:43 |
|
I'm gonna open source some hobby code I wrote a while back and the poo poo I was writing even five years ago is goddamn embarrassing. And it's all there in the Git history for people to point and laugh at. At least I'm in the right thread!
|
# ? Jul 13, 2016 22:44 |