Version Control Questions Megathread (SVN / git / whatever else)

MononcQc: May 29, 2007

One architect where I work floated an idea on the corner of a table a few weeks ago: replacing the database back-end of a web service for text and HTML documents with either git or Mercurial.

Past the loss of full-text search (which could possibly be done anyway by decoupling stuff) and the fact it is hackish to no end, the idea didn't sound that bad; you get versioning for your documents, can distribute and sync them, can use permissions, can edit them even if offline, etc.

Basically all the advantages of a DVCS, but applied to text documents. With caches and the behaviour of a key-value store, he judged decent speeds could be achieved for anything RESTful, except maybe for document creation and conflict handling.
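To make the "versioned key-value store" idea concrete, here's a minimal in-memory sketch. It is not git or Mercurial, and the `VersionedStore` class and its method names are made up for illustration; it only shows the shape of the interface a RESTful front-end might sit on top of, assuming every save creates a new immutable version.

```python
import hashlib


class VersionedStore:
    """Toy in-memory stand-in for a VCS-backed document store (not git or hg)."""

    def __init__(self):
        self._objects = {}   # content hash -> document bytes
        self._history = {}   # document key -> list of content hashes, oldest first

    def put(self, key, text):
        """Store a new version of `key` and return its content hash."""
        data = text.encode("utf-8")
        digest = hashlib.sha1(data).hexdigest()
        self._objects[digest] = data
        self._history.setdefault(key, []).append(digest)
        return digest

    def get(self, key, version=-1):
        """Return the text of `key` at `version` (default: the latest one)."""
        digest = self._history[key][version]
        return self._objects[digest].decode("utf-8")

    def versions(self, key):
        """Return all content hashes recorded for `key`, oldest first."""
        return list(self._history[key])


store = VersionedStore()
store.put("docs/en/index.html", "<h1>Hello</h1>")
store.put("docs/en/index.html", "<h1>Hello, world</h1>")
print(store.get("docs/en/index.html"))      # latest version
print(store.get("docs/en/index.html", 0))   # first version
```

Reads are plain dictionary lookups, which is where the key-value speed argument comes from. Creation and conflicts are exactly the parts this toy skips.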

There are obvious downsides to this concept, and people do use all kinds of databases instead of version control for a reason. However, there are some clear upsides to it, which would push me to prototype something with a similar system for fun at home.

Has anyone ever heard of a similar project being done? I'm looking for success and/or atrocious failure stories.

# ¿ Apr 16, 2009 20:33

MononcQc: May 29, 2007

deimos posted:

Why not use a document database like CouchDB or ThruDB? ThruDB seems specially oriented toward what you want.

I'm not looking for anything in particular, but just at the possibility of doing it "Because I can."

There would be all kinds of scaling complications like sharding, directory sizes and access times (mostly what services like Akamai must go through), etc., which would make it impractical for anything but a small personal blog or something similar.

The whole idea is still interesting as far as toying goes, in my opinion.

# ¿ Apr 16, 2009 21:06

MononcQc: May 29, 2007

A question about Mercurial here.

I have a repo with many levels, serving more than one purpose, that looks something like this:

code:

base
 !- docs
    !- en/
    !- fr/
 !- sites
    !- some site/
       !- static
       !- src
    !- another site/
       !- static
       !- frontend
       !- backend
          !- src/
          !- tests/

Say I want to only change the stuff from sites/some site/src/.../whatever.ext. Is there any way I could check out/clone only a single subdirectory of the whole repo? SVN or CVS would make this pretty easy, but it doesn't seem possible in Mercurial. Is there a way to do it, or is my best option to try and migrate to another DVCS that would let me do it?

EDIT: guys from #mercurial on freenode redirected me to http://mercurial.selenic.com/wiki/subrepos which should do the job.

MononcQc fucked around with this message at 19:51 on Jul 5, 2009

# ¿ Jul 5, 2009 19:41

MononcQc: May 29, 2007

crazyfish posted:

I'm curious as to why you would want to use a single repo for multiple sites. Is there some common shared codebase or other set of dependencies between them? If there isn't, I would venture to say that you should probably just use multiple repos.

Not my choice of structure.

# ¿ Jul 12, 2009 02:49

MononcQc: May 29, 2007

http://hginit.com
Actual nice guide to mercurial for SVN users by Joel Spolsky.

# ¿ Feb 24, 2010 18:43

MononcQc: May 29, 2007

I'm getting tired enough of git evangelism that I might go to CVSNT out of spite. Or maybe Fossil.

# ¿ May 2, 2011 16:23

MononcQc: May 29, 2007

I would absolutely love it if when using a given open source library or application, I could see that the person in charge keeps on pushing broken and retarded changes to the public, rather than having a pristine history that someone spent days refining to look good and just be surprised by all of the dumb poo poo when I actually get to use it and try to upgrade.

# ¿ Apr 3, 2013 13:43

MononcQc: May 29, 2007

Git from the bottom up [PDF] is the only git book that ever made sense to me for getting the philosophy behind the tool.

Git is a leaky abstraction with a bad interface over its underlying format, and all efforts to hide this fact away end up being more confusing than anything else. Going in and understanding the underlying format itself is, as far as I can tell, the best way to ever feel comfortable with git and what it does to your repositories.

People have to stop acting as if the fact that git was distributed or not SVN was the problem. Mercurial is easy to grasp. Git is harder because the way people interact with it is bad and tries to imitate what you'd do with SVN while failing at it. The git people inserted this facade in front of it that badly hides its internals, and every power user ignores it, while others are left staring at it, trying not to see the guts behind while that's precisely what you need to get.

# ¿ Jul 8, 2013 14:04

MononcQc: May 29, 2007

Gazpacho posted:

MononcQc, I have no frigging idea what you're talking about.

I think git has a shitty interface that is unintuitive and annoying. CVS(NT), SVN, or even hg support similar sets of operations, and transitioning from one to the other is usually easy, without needing to drop previously acquired knowledge. Git, on the other hand, uses similar terminology to represent fundamentally different operations (some examples). I find it useful to try and forget what I already know when dealing with git, whereas the other source control tools I've used let me reuse former knowledge far more easily. That's purely my opinion, though.

---

Git's way of doing things is also intimately related to how it represents commits as linked lists/trees of diffs/patches that can be applied, and not knowing about them severely limits you -- squashing, reordering commits, rebasing, cherry-picking, and whatnot are things you are expected to do, and they all rely on that representation.

This is probably related to how git allows you to change history: To change history, you need to understand how git represents data, in a way similar to how you need to know how to manipulate pointers in a data structure.

This is something you do not need when using SVN or mercurial, where you could happily imagine the tool works by taking a full snapshot of the entire project for every commit. You could also imagine them working the same way git works internally and disabling mutability, or you could also imagine them tracking individual files and attaching them to commits, if it felt comfortable, and use these tools fine 99% of the time. The rest is just optimization and implementation details you do not need to worry about.

Basically, CVS, Mercurial, and SVN let you ignore their internal representation if you don't want to hear about it. The user may have a mental model of how things work that isn't the same as reality, and that is not a problem, and very likely never will be, because the interfaces these programs present properly abstract these details away. You can usually carry that mental model across each of these tools with minimal overhead and keep being productive.

When you get to use git, though, that mental model has to go if it's not the one git uses already. To use a lot of major functionality in git, you need to understand how it represents changes, and how this representation ties in with the git vocabulary. The upside of it is that you usually get more flexibility out of the box by understanding this.

I can't exactly remember which mental model I had when I came to git, but I remember it was the wrong one. Until I managed to figure it out (through 'Git from the bottom up'), it was always a huge, huge pain to deal with commands that relied on an internal representation I felt I should not need to know about to use them.

And I kept feeling I should not need to know about this internal representation because people kept telling me git was easy and simple and "here's the command you need to use it's simple dammit". In hindsight, you definitely want to know more about the details. You can't stick to the interface in git the way you could in SVN, CVSNT, Mercurial, or whatever without walling yourself off from a major part of the features, no matter what people say.

MononcQc fucked around with this message at 18:11 on Jul 8, 2013

# ¿ Jul 8, 2013 18:07

MononcQc: May 29, 2007

Lysidas posted:

You lost me here, because this is flat out wrong. Git commits have pointers to full snapshots of the project content. Tools like rebase treat the differences between commits as diffs/patches, but Git internally does not store data in this way. The only tool that I know of that operates in this manner is darcs, but I haven't kept up with VCS news since starting to use Git.

I agree overall, though, that fundamentally understanding Git's data model is required to use it in any real capacity.

Yeah, I was overly vague, but went with a short way to mention it. I'm not actually aware of how git stores things on disk, but only of the tree/blob/commit representation as described in the "Repository: Directory content tracking" chapter of Git from the Bottom Up.
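For what it's worth, the blob part of that representation is simple enough to sketch. As far as I understand it, in a default (SHA-1) repository a blob's ID is the hash of a short header followed by the raw file contents, with no filename and no diff involved. The function name `git_blob_id` below is made up for illustration; the result should match what `git hash-object` prints for a file with the same contents.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git gives a blob holding `content`."""
    # Header is the object type, a space, the size in bytes, then a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

Two files with identical contents get the same ID regardless of their names, which is why the model reads as snapshots of content rather than patches.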

# ¿ Jul 8, 2013 19:49

MononcQc: May 29, 2007

I just git diff | vim - and work from there :sigh:

# ¿ Feb 14, 2014 16:36

MononcQc: May 29, 2007

down with slavery posted:

You can't point at advanced commands like stash and act like that's a problem, it's just another tool you can use within the git realm to manage your codebase. You never have to use it. Same with the other usages of git reset.

"I don't understand the problem. You can still pick files fine, you just don't have to use any of the other options."

# ¿ Feb 20, 2014 13:52

MononcQc: May 29, 2007

I have never used a GUI for git, but the command line feels exactly like the image I posted. A million commands to do a million things with a thousand options to give a billion possibilities.

It's powerful alright. It's just neither elegant, intuitive, nor easy to use. You have to suck it up and go plow through it, invest the time to memorize all the little elements of it.

Git people telling you it's usable if you stick to the basic commands is like Vim people telling you vim is approachable if you just start it in INSERT mode and don't press <Esc>. It hides very little of the beast behind it and you know it's still there, with all of its unfriendliness.

# ¿ Feb 20, 2014 14:02

MononcQc: May 29, 2007

hg is a very obvious candidate here.

# ¿ Feb 20, 2014 14:08

MononcQc: May 29, 2007

I've been using git daily for work for the past few years and still think it has a shitty unfriendly interface that doesn't deserve being defended. :colbert:

I actually think it's easier to learn and use git once you accept this as a fact, and stop expecting to be able to use it from a higher level like any other source control system, and just learn it from the bottom up.

# ¿ Feb 21, 2014 05:00

MononcQc: May 29, 2007

Whenever I run a rebase I cp -r the entire repo to another directory so I can experiment without having to tell git "oh please undo this" in whatever incantation is required if I ever make a mistake, because it's actually simpler to do that than to use git to do it.
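The copy-then-experiment routine is roughly the following, sketched in Python rather than shell so it can run anywhere. The helper names `backup_repo` and `restore_repo` and the `.bak` suffix are made up for illustration, and the demo uses a throwaway directory standing in for a real repository.

```python
import shutil
import tempfile
from pathlib import Path


def backup_repo(repo_dir, backup_root):
    """Copy the whole working directory, .git included, roughly like `cp -r`."""
    repo_dir = Path(repo_dir)
    target = Path(backup_root) / (repo_dir.name + ".bak")
    shutil.copytree(repo_dir, target, symlinks=True)
    return target


def restore_repo(repo_dir, backup_dir):
    """Throw away the current state and put the backup copy in its place."""
    shutil.rmtree(repo_dir)
    shutil.copytree(backup_dir, repo_dir, symlinks=True)


# Demo on a throwaway directory standing in for a real repository.
workspace = Path(tempfile.mkdtemp())
repo = workspace / "myrepo"
(repo / ".git").mkdir(parents=True)
(repo / "file.txt").write_text("before rebase\n")

backup = backup_repo(repo, workspace)
(repo / "file.txt").write_text("botched rebase\n")   # simulate a mistake
restore_repo(repo, backup)
print((repo / "file.txt").read_text())
```

Since the copy includes the `.git` directory, branches and everything else come back along with the files.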

# ¿ Feb 21, 2014 13:13

MononcQc: May 29, 2007

All of the commands you have mentioned are longer to remember and more tedious to use than cp -r. For the number of tricky rebases I may be doing (and not just the regular rebase master on a private branch), that's not even worth my time.

Hence git has a shitty user interface. It's simpler and faster to work around it to get similar results than to work with it.

Edit: if I've needed to do a complex rebase or re-merge of multiple branches say 15 times in the last year, and running cp -r and rm -rf takes ~10s, it has taken me 2m30 to do that, and then use rebase as usual. That is over 57 minutes I haven't needed to spend reading another git book and then looking up or trying to remember what the reset command and hash to use are.

At this pace, I would need roughly 24 years of git usage to make reading the book worth it for this use case alone. I'm usually not too lost in git and won't need to look up references that much for regular operations and slightly more complex stuff.

I remember being stumped for a long while when trying to fetch and merge a remote branch with a different name than mine while excluding a few commits, but I think I ended up cherry-picking instead for this v :shobon:

MononcQc fucked around with this message at 13:47 on Feb 21, 2014

# ¿ Feb 21, 2014 13:38

MononcQc: May 29, 2007

Volmarias posted:

If you can't spend several hours to learn how to use what should be one of your fundamental tools, I shudder to think of the code you write.

Seriously, I'm gobsmacked. I don't know what to say if you don't actually want to learn how to get better and you'd rather just whine about how git is so terrible because it's too hard to learn... reset? Do you even know about reset?

Hell, do you even know you can do rebase --abort? It's right there on the command line if there's a conflict.

I understand these aspects of git and use them. I make use of rebase --abort when I'm doing a simple rebase that goes wrong, no problem. I've used the reflog a few times to save my rear end before. 99% of the time I'm fine using git and never hit a problem. There are more annoying corners that I do not venture into, or that are always harder to figure out. What I don't want is to spend more hours learning git than I already have.

One example I remember involved three branches, A, B, and C. Branch A was to receive half the changes from B and C, and a new fourth branch D was to receive the other half of B and C, starting at the initial point of A. B and C were then to be deleted. The commits to be kept were in no sequential order, although they didn't depend on each other, and some of them I wanted to squash together.

In less obvious cases like this where you're trying to untangle multiple histories and regroup them, I prefer to take a backup, go at it, and see if I can do it with a) rebases, b) interactive rebases, c) rebases + cherry-picks, d) cherry-picks only, if there aren't a million commits, or e) actually researching another solution, because who knows. Never mind trying different merge strategies.

If I get to e), I'll learn a different thing, but there's a point where I'm halfway through in poo poo (oh, half the branches are okay but now need to be brought back to their initial state) where I just don't feel like undoing the changes I've made to 2-3 branches anymore, and starting over from the initial repo is just simpler for me.

The assumption that not knowing or even just not liking all of the git commands in that context makes me a developer who doesn't try to self-improve is ridiculous. I know enough git to get around without an issue most of the time (and I can google for things I don't remember off the top of my head, like the syntax to force push to a remote having a different branch name than the local one, because why would you do that?), and have better things to do than dig deeper into git. Like, I don't know, learning domain-specific stuff relevant to what I'm programming, for example.

MononcQc fucked around with this message at 16:16 on Feb 21, 2014

# ¿ Feb 21, 2014 16:14

MononcQc: May 29, 2007

Mercurial has limited mutability and is somewhat safer in its approach: mutate locally, and stuff made public is immutable. It's basically what git users recommend as a convention, but baked into the tool.
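A toy model of that rule, to show the shape of it. This is not Mercurial's API: `ToyRepo` and `Changeset` are made up for illustration, and only the phase names "draft" and "public" are borrowed from Mercurial's phases. The assumption is that pushing is what makes a changeset public.

```python
class Changeset:
    def __init__(self, message):
        self.message = message
        self.phase = "draft"   # local only, still mutable


class ToyRepo:
    def __init__(self):
        self.changesets = []

    def commit(self, message):
        cs = Changeset(message)
        self.changesets.append(cs)
        return cs

    def amend(self, cs, message):
        # History rewriting is only allowed while the changeset is unpublished.
        if cs.phase == "public":
            raise ValueError("cannot rewrite a public changeset")
        cs.message = message

    def push(self):
        # Publishing flips every changeset to the immutable phase.
        for cs in self.changesets:
            cs.phase = "public"


repo = ToyRepo()
cs = repo.commit("typo fix")
repo.amend(cs, "Fix typo in README")   # fine: still a draft
repo.push()
try:
    repo.amend(cs, "rewrite after push")
except ValueError as err:
    print("refused:", err)
```

The check lives in the tool instead of in a team convention, which is the whole difference.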

# ¿ Feb 21, 2014 21:28
