|
Keep in mind that local git hooks such as pre-commit run client side, so they don’t enforce any real restrictions.
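A minimal sketch of why a client-side hook is not an enforcement mechanism, assuming a throwaway repository in a temp directory and a made-up pre-commit hook that rejects everything. The identity values and file names are placeholders, not anything from this thread.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .

# A hypothetical client-side hook that rejects every commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
echo "pre-commit: rejected" >&2
exit 1
EOF
chmod +x .git/hooks/pre-commit

echo hello > file.txt
git add file.txt

# A normal commit is stopped by the hook...
if git commit -q -m "blocked" 2>/dev/null; then blocked=no; else blocked=yes; fi

# ...but --no-verify skips the pre-commit hook entirely.
git commit -q --no-verify -m "bypassed"
commits=$(git rev-list --count HEAD)
echo "blocked=$blocked commits=$commits"
```

Anything that has to hold for everyone needs to be checked where users can't switch it off, for example on the hosting side or in CI.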
|
# ? Jun 29, 2023 13:19 |
|
Anyone here ever adopt git for a data analytics team? Trying to slowly push our team to adopt more SWE best practices while not overwhelming them, so it'd be interesting to hear people's adaptations of gitflow or whatever for data analytics. I'm talking here about maintaining both analytical products (jupyter notebook reports, visualizations etc) and small analytical data models (sharing reusable python transform and utility scripts for wrangling data). Can't version control our Tableau dashboards or anything (guess we could, but there'd be no point since the diffs aren't useful), but at least we can do our notebooks and scripts. We don't technically run any ETL or anything, but our data comes from all over the place so we're often doing lots of wrangling and cleaning, and because most folks don't have a dev or SWE background everyone's keeping their notebooks saved locally or swapping them via email. We use gitlab and have jupyterhub with the git extension, so I wanna write up a little SOP for a very barebones, stripped-down workflow using a central repo. Thinking here to just have a dev and a master branch, with each person creating feature branches and merging into dev for peer review, then into master as "production". Most folks will prob be using the Gitlab GUI for this, so I'm thinking to set up lots of templates and also sync it up with our tasking tickets in Jira. Also doing this out of a selfish desire to learn CI, gitlab runners and such, because our workflow for "publishing" reports involves a bunch of different manual buttonology and copy-paste duplication I think we could automate. Any tips on stripping a Gitlab project to its bare bones and/or a decent user guide for such a workflow I could bootstrap from? Oysters Autobio fucked around with this message at 18:52 on Aug 20, 2023 |
# ? Aug 20, 2023 18:15 |
|
Have you looked at Meltano?
|
# ? Aug 20, 2023 19:46 |
|
porkface posted:Have you looked at Meltano? I had previously but for some reason looked past it. Thanks for flagging, it looks like it actually would be a very good way to approach this in a prepackaged way.
|
# ? Aug 22, 2023 02:16 |
|
I use meltano for pretty much all our extract pipelines, plus you can normally adapt an existing tap for a specific scenario and add/remove what you don't need. We are also a PowerBI shop so the latest changes with .pbir files move us a step forward to being able to change control even more. Bring it on.
|
# ? Aug 31, 2023 22:11 |
|
Slimchandi posted:I use meltano for pretty much all our extract pipelines, plus you can normally adapt an existing tap for a specific scenario and add/remove what you don't need. What changes did PBI get that support version control? My big gripe with Tableau is it sucks for version control let alone templating.
|
# ? Sep 1, 2023 01:49 |
|
I have a rookie question, what's the best way to get any changes from the main branch into my dev branch? Basically I want to go from here: code:
To here: code:
|
# ? Oct 5, 2023 15:18 |
|
There are a couple of ways to do this but the picture you drew is a rebase (which is also the correct way).
|
# ? Oct 5, 2023 15:29 |
|
So going the rebase route, it won't merge any of my changes into the main branch? (I don't want my changes merged, yet)
|
# ? Oct 5, 2023 15:30 |
|
bobmarleysghost posted:So going the rebase route, it won't merge any of my changes into the main branch? (I don't want my changes merged, yet) No. Rebasing E on C means taking the deltas of each commit in the branch being rebased -- so D and E here -- and replaying those deltas on C. It's exactly meant to do the thing you're trying to do.
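A rough sketch of that, since the original pictures didn't survive. The layout here is an assumption: main has commits A, B, C, and a branch called mydev (a placeholder name) forked at B with commits D and E. Each commit just adds one small file so nothing conflicts.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"

# Helper: make a commit named $1 that adds the file $1.txt.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b mydev                # assumed fork point: B
commit D; commit E
git checkout -q main
commit C                                # main moves on without mydev

main_before=$(git rev-parse main)
git checkout -q mydev
git rebase -q main                      # replay D and E on top of C
main_after=$(git rev-parse main)

# mydev now contains C; main itself has not moved.
history=$(git log --format=%s mydev | xargs)
echo "mydev history (newest first): $history"
[ "$main_before" = "$main_after" ] && echo "main is unchanged"
```

Nothing lands on main until you explicitly merge or push there.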
|
# ? Oct 5, 2023 15:46 |
|
nvm, thanks for the help!
bobmarleysghost fucked around with this message at 18:57 on Oct 5, 2023 |
# ? Oct 5, 2023 16:18 |
|
I may be wrong but rebasing dev seems particularly dangerous. My recollection is rebasing is poo poo if you have a lot of branches or the code has been pulled to multiple locations. Pre rebase, most branches see the code as code:
|
# ? Oct 7, 2023 15:34 |
|
This is his own branch not a general dev branch. Rebasing shared branches like a general dev or the main should be kept minimal but go ham on your own branches. edit: and in that described case you also rebase X onto the new E after it was rebased.
|
# ? Oct 7, 2023 16:24 |
|
Yeah, you're right, now I remember. Doing it on your own branch is generally fine, but one time I had pulled someone else's feature branch so I could help them out, and when they rebased it meant I couldn't just pull in the branch; I ended up needing to blow up some stuff and start over. I think the rule we settled on is just don't rebase if your mydev branch has been pushed or branched off of.
|
# ? Oct 7, 2023 16:55 |
|
Yeah, rebasing rewrites commit history instead of just appending to it, and any history-rewriting operation becomes a huge pain in the rear end if anyone else has a copy of the branch.
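To make the "rewrites history" part concrete, here is a small sketch under the same assumed layout as the rebase being discussed (main: A, B, C; mydev forked at B with D and E). The branch teammate-copy is a made-up stand-in for someone else's copy of mydev taken before the rebase.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main

# Helper: make a commit named $1 that adds the file $1.txt.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b mydev
commit D; commit E
git checkout -q main
commit C

# Someone else grabs mydev before it gets rebased.
git branch teammate-copy mydev
old_tip=$(git rev-parse mydev)

git checkout -q mydev
git rebase -q main
new_tip=$(git rev-parse mydev)

# Same commit message, different commit ID: E was re-created, not moved.
echo "old E: $old_tip"
echo "new E: $new_tip"

# The old copy is no longer an ancestor of mydev, so it can't fast-forward.
if git merge-base --is-ancestor teammate-copy mydev; then
  fast_forward=yes
else
  fast_forward=no
fi
echo "teammate can fast-forward: $fast_forward"
```

That failed ancestry check is the situation where a plain pull stops being enough and the other person has to reset or rebase their copy.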
|
# ? Oct 7, 2023 19:22 |
|
I usually recommend merges instead of rebasing. That way, each commit is a record of the context in which it was written, and the merge commits themselves are a record of the conflict resolution steps - or a record affirming that there was no conflict. This may be because I started in Mercurial before Git.
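A sketch of the merge route for comparison, again assuming main has A, B, C and a placeholder branch mydev forked at B with D and E. The existing commits keep their IDs and a merge commit with two parents records where the histories joined.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main

# Helper: make a commit named $1 that adds the file $1.txt.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b mydev
commit D; commit E
git checkout -q main
commit C

git checkout -q mydev
e_before=$(git rev-parse mydev)

# Bring main's changes into mydev without rewriting D or E.
git merge -q -m "Merge main into mydev" main

first_parent=$(git rev-parse HEAD^1)    # still the original E
parent_count=$(( $(git show -s --format=%P HEAD | wc -w) ))
echo "merge commit has $parent_count parents"
git log --graph --oneline
```

Because D and E are untouched, anyone who already has a copy of mydev can still pull normally.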
|
# ? Oct 10, 2023 00:01 |
|
I managed to disconnect my Git repository from the Eclipse project. What's the best way to reconnect the two?
|
# ? Oct 10, 2023 22:49 |
|
What does git status on the command line say?
|
# ? Oct 10, 2023 23:04 |
|
spiritual bypass posted:What does git status on the command line say? I have no idea how to check that. How would I go about doing that? I've only really used the functionality that's accessible through the Eclipse GUI.
|
# ? Oct 10, 2023 23:15 |
|
I think he means type “git status” on the command line.
|
# ? Oct 10, 2023 23:20 |
|
Maigius posted:I have no idea how to check that. How would I go about doing that? I've only really used the functionality that's accessible through the Eclipse GUI. You're doing yourself a tremendous disservice by not learning at least some basic Git CLI commands.
|
# ? Oct 10, 2023 23:30 |
|
smackfu posted:I think he means type “git status” on the command line. Ok, when I do that while in the folder it shows me a giant list of files that had changed between local and remote. In Eclipse however, it's not showing the push or pull options and the project title went from project-name(branch) to just project-name.
|
# ? Oct 10, 2023 23:31 |
|
Ok, so at least that means git is still there and working
|
# ? Oct 11, 2023 01:06 |
|
At the very top of the git status, does it say what branch you're in? Or does it perhaps say "detached head"? Or maybe a merge or rebase in progress?
|
# ? Oct 11, 2023 06:22 |
|
nielsm posted:At the very top of the git status, does it say what branch you're in? Or does it perhaps say "detached head"? Or maybe a merge or rebase in progress? I still see the branch name, I was in the middle of a merge with a bunch of conflicts when I hit the disconnect option.
|
# ? Oct 11, 2023 16:25 |
|
Maigius posted:I still see the branch name, I was in the middle of a merge with a bunch of conflicts when I hit the disconnect option. It sounds like the Git part is fine, but Eclipse has set itself to ignore Git or forgot how to access git. Hope that tells you where to start looking. The git info is all stored in .git which is in the project directory, there isn't much subtle going on there, and it sounds like .git is fine or the command line would be giving you issues.
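A few command-line checks that answer the same questions without any IDE, shown as a sketch. The script builds a throwaway repo and deliberately leaves a merge half-finished to imitate the situation described; the branch and file names are made up.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# --- setup: a repo stuck in the middle of a conflicted merge ---
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
echo base > file.txt; git add file.txt; git commit -q -m base
git checkout -q -b feature
echo feature > file.txt; git commit -q -am feature
git checkout -q main
echo main > file.txt; git commit -q -am main
git checkout -q feature
git merge main >/dev/null 2>&1 || true   # stops with a conflict

# --- the actual checks ---
# Is there a repository here at all, and where is its .git directory?
git_dir=$(git rev-parse --git-dir)

# Which branch is checked out?
branch=$(git rev-parse --abbrev-ref HEAD)

# Is a merge in progress? MERGE_HEAD only exists during one.
if git rev-parse -q --verify MERGE_HEAD >/dev/null; then merging=yes; else merging=no; fi

# Which files still have unresolved conflicts?
conflicted=$(git diff --name-only --diff-filter=U)

echo "git dir: $git_dir, branch: $branch, merging: $merging"
echo "conflicted files: $conflicted"
git status
```

If these all behave, the repository is fine and the problem is on the IDE side.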
|
# ? Oct 11, 2023 16:28 |
|
StumblyWumbly posted:It sounds like the Git part is fine, but Eclipse has set itself to ignore Git or forgot how to access git. Hope that tells you where to start looking. I got Eclipse and Git talking again and most of the merging issues cleaned up. I 100% need to learn more about Git. Thanks everyone for your help.
|
# ? Oct 11, 2023 18:56 |
|
Learning Git isn't the easiest thing in the world, but there are two reasons you should:
1. It will pay dividends throughout your entire career of software development.
2. You only have to learn it once. It's not like learning a programming language that gets new features every couple of years; learn Git once, know it forever.
At least, that's been my experience.
|
# ? Oct 11, 2023 22:11 |
|
I will preface this by saying I do all my day-to-day git work in GitHub Desktop, but it's really useful to go through some tutorials to be able to do everything you'd normally do via the command line. Git has kept a focus on backwards compatibility, to the point where some of the Git workflows don't really make any sense, and so things like GitHub Desktop (or I'm assuming your Eclipse plugin) abstract them somewhat. On a day-to-day basis it's easier to just use that abstraction, but it's useful to know what's happening under the hood so that when stuff gets messed up, or you need to do something outside of that tool, you're not completely lost.
|
# ? Oct 11, 2023 22:15 |
|
Tequila Bob posted:Learning Git isn't the easiest thing in the world, but there are two reasons you should: I bet that's what they told people about why to learn svn.
|
# ? Oct 11, 2023 22:32 |
|
Vanadium posted:I bet that's what they told people about why to learn svn. Undeniably true! That said, if something replaces Git, I'll eat my words. (Happily, too - I don't think Git is the best VCS, though it is very good. I prefer Mercurial's branching and UI.)
|
# ? Oct 11, 2023 22:46 |
|
Vanadium posted:I bet that's what they told people about why to learn svn. No, I'm pretty sure no one ever said that about svn. It never had even a fraction of the dominance that git has now, and there was only a year between SVN 1.0 and git's first release. Even during SVN's heyday it was clear that one of darcs, git, or mercurial was going to replace it, and it was just a question of which one and when.
|
# ? Oct 11, 2023 23:43 |
|
That’s weird, I always thought SVN was an old and established piece of software by the time git appeared, but you’re right.
|
# ? Oct 12, 2023 00:41 |
|
CVS was that, SVN was trying to be a better CVS.
|
# ? Oct 12, 2023 00:45 |
|
Did CVS and SVN actually need a server like git, or could they just work with a shared network drive?
|
# ? Oct 12, 2023 00:57 |
|
Git doesn't need a server...
|
# ? Oct 12, 2023 01:06 |
|
smackfu posted:Did CVS and SVN actually need a server like git, or could they just work with a shared network drive? A shared drive in the context of CVS/SVN isn’t exactly much different than a central server. You still have a central server: it’s just for raw files instead of the SVN protocol. And then you get to trust that shared network drive can correctly and safely handle frequently accessed _and mutated_ files across all users. I wouldn’t be surprised to hear people did that, but running a server for the tools wasn’t exactly difficult. It’s easy to forget, too, that branching in those systems _sucked_. Literally full copies of the entire trunk into another folder was a “branch”. At least with a central server if you all worked off trunk your changes only made it to others if you committed them. If it was a shared drive? Good luck.
|
# ? Oct 12, 2023 01:12 |
|
smackfu posted:Did CVS and SVN actually need a server like git, or could they just work with a shared network drive? SVN does need a server; it doesn't store the diffs locally. One of its worst features is that checking out a branch entails re-downloading all the files in the repository from the server, so if you try the typical git thing of making new feature branches constantly you will want to tear your hair out within a day. The main benefit of central-server-based VCSes like these is the ability to acquire an exclusive write lock on files (some call this "checking out" the file, but that's confusing), so if you're loving around with some impossible-to-merge binary file you can ensure that nobody else touches it and you won't see merge conflict hell. For example, in game dev, Unreal Engine stores a lot of code in binary "blueprint" files. If the editor's version control integration is hooked up to an SVN or Perforce server, the editor will prompt you to lock a blueprint in the VCS whenever you modify it or have uncommitted changes. Using git for this is extremely painful since you won't know that two people are touching the same file until you hit an inevitable conflict, and since the merge tools for blueprints are nonexistent one dev or the other is going to have to re-make all their changes. RPATDO_LAMD fucked around with this message at 02:06 on Oct 12, 2023 |
# ? Oct 12, 2023 01:56 |
|
necrotic posted:A shared drive in the context of CVS/SVN isn’t exactly much different than a central server. You still have a central server: it’s just for raw files instead of the SVN protocol. This was probably in the early 2000s when I was using it at work, so shared Windows drives were easy and already existed, and running a dedicated server was a big deal.
|
# ? Oct 12, 2023 02:34 |
|
smackfu posted:Did CVS and SVN actually need a server like git, or could they just work with a shared network drive? Git doesn't need a server. That's the "distributed" part of DVCS and one of the immediately obvious wins (that it shares with hg and other DVCS), especially for casual "I just want to version some personal projects and not be a server janitor" use, versus stuff like P4 and SVN. SVN requires a server, and requires communicating with the server for many more operations than git does (because most of the history is stored on the server, unlike git which stores a complete local copy), which makes it brutally slow if you have a slow network connection or high latency to the server. I believe that modern SVN is somewhat better about this and caches more things locally, but it still needs a server backing it. (Last time I used it there were two server implementations: one that required a full Apache install with a bunch of special modules loaded, and a much easier-to-administer standalone server (svnserve) with a giant "don't loving use this it will destroy your data forever" warning on it. I think svnserve is now actually usable, but don't quote me on that.) CVS also requires a server, but I've never actually used it. So does P4, and for personal use it is actually way easier to janitor than SVN; I ran it for a few years until git came along and I dumpstered centralized version control in my personal life forever.
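A small sketch of the no-server setup: a bare repository in a plain directory acts as the shared remote, and two clones talk to it by file path only. The temp directory stands in for whatever folder you would actually use, and alice, bob and report.py are made-up names. Whether a particular network drive handles concurrent writes safely is a separate question this doesn't answer.

```shell
#!/bin/sh
set -e
# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)   # stands in for a shared folder

# The "central" repository is just a directory; no daemon is running.
git init -q --bare "$base/shared.git"
git --git-dir="$base/shared.git" symbolic-ref HEAD refs/heads/main

# First user clones by path (git warns that the repo is empty), commits, pushes.
git clone -q "$base/shared.git" "$base/alice"
cd "$base/alice"
git symbolic-ref HEAD refs/heads/main
echo "print('hello')" > report.py
git add report.py
git commit -q -m "first commit"
git push -q origin main

# Second user clones the same path and gets the full history locally.
git clone -q "$base/shared.git" "$base/bob"
bob_log=$(git -C "$base/bob" log --format=%s)
echo "bob sees: $bob_log"
```

For purely personal projects even the bare repository is optional, since every clone already carries the complete history.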
|
# ? Oct 12, 2023 02:48 |