Git. The WHAT and WHY Edition.
Lately I've seen a lot of new/aspiring developers talk about difficulties learning git. I was confused at first, because there are approximately a zillion git tutorials out there. But then I started really looking at what people were saying AND what all those tutorials contain. The problem is most tutorials start out with "here are a ton of git commands to memorize", and this is not where people are having problems. People are having problems with "what is git, why should I learn it, and what does it look like to use it in a real-life project?", and those questions are largely not answered in all those millions of tutorials. So at the risk of redundancy, I'm going to attempt to address these questions in Yet Another Article About Git.
First, I want to make it clear that if you want to be a programmer, any kind of programmer, you need to know git. You just DO.
This is the short version of the answer to "why should I learn it?". The medium-length answer is that git has become THE preeminent version control software since its release in 2005, and particularly since GitHub came about in 2008. This StackExchange question purports to show the change in number of repositories over time, with the last count being 72% git, if you click through to the source of the data. And in the last StackOverflow developer survey, "Of the professional developers who responded to the survey, almost 82% use GitHub as a collaborative tool". This means the number of respondents using git is even higher, because GitHub is just one option for using git (more on this later). For context, the second most popular technology on their survey, JavaScript, comes in at only 67.7%. So git is probably the single most widely used bit of tech for developers, above literally anything else. If you want to work on or use open source projects, if you want to work with someone else or FOR someone else, at some point somebody is going to point you to a git repo that you are going to need to be able to Do Stuff with.
The long answer is all the stuff about version control and tracking changes and how much easier it makes large projects with multiple people and why git is easier than the alternatives (no, git is not the only version control software, yes, I have used others including CVS, Subversion, and Perforce because I'm old enough that git didn't exist when I got started, and I still prefer git, as do many others judging from the overwhelming popularity of it and the incredible speed with which it took over), but when you're just getting started you're probably not worrying about any of that. The thing to focus on is that this is something that you need to know if you want to be a programmer. Git's extensive use by other programmers means that you are going to look really foolish in an interview or even a conversation with another programmer if you say you've been programming for X number of years and you don't know git, or if you're a CS student who thinks they're going to get a job out of school but "doesn't want to learn git" (which is an actual thing that one of our interns heard in recent years)... Just learn git, OK?
Now that you are hopefully convinced that knowing git is right up there in importance with knowing whatever other languages or tools you're currently working on (even for you, front-end web developers! Git should really be right on the top of that list with HTML, CSS, and JavaScript), I'll try for a simplified version of "what is git?" Git is software that lets you track changes to a collection of files (AKA a repository; it sounds fancy but it's literally just a collection of files) over time. This is not so important when you are just starting out and working on tiny personal projects, but the bigger the project and the more people working on it, the more vital it becomes. You tell the software to save the current state of the collection of files by adding and committing. Then if you screw up later you can roll back the code to a former commit; essentially this reloads the files to an earlier version when things were working. You can also create a branch, which acts like a separate copy of the collection of files; you can then manipulate those files without screwing with the original copy. Typically you do this when you're developing a new feature on an existing project, if you are doing something very experimental and you think it might break stuff, or if you're working with someone else and you'll both be making changes at the same time. The branch functions as a set of commits that you can easily keep or discard according to how things are going. There's much more to it, of course, but this is the basic stuff you need to understand to get started. All these things can be done all by yourself, on your own computer, without a GitHub or any other account.
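To make the paragraph above a little more concrete, here is a minimal sketch of what saving, rolling back, and branching can look like on your own machine. Everything in it is made up for illustration: the throwaway directory, the file name notes.txt, the branch name experiment, and the demo identity. It also uses git revert for the rollback, which is only one of several ways to get back to an earlier state.

```shell
set -e
# Work in a throwaway directory so nothing real is touched (hypothetical demo setup).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
# Commits need an identity; set one for this demo repository only.
git config user.name "Demo User"
git config user.email "demo@example.com"
# Remember the starting branch name (it may be "main" or "master").
original_branch="$(git symbolic-ref --short HEAD)"

# Save the current state of the files: add, then commit.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft"

# Make a change that turns out to be a mistake.
echo "a change I will regret" >> notes.txt
git add notes.txt
git commit -q -m "Make a regrettable change"

# Roll back: revert adds a new commit that undoes the last one.
git revert --no-edit HEAD > /dev/null

# Create a branch to try something risky without touching the original.
git checkout -q -b experiment
echo "risky idea" >> notes.txt
git add notes.txt
git commit -q -m "Try a risky idea"

# Decide the experiment was a bad idea: switch back and discard the branch.
git checkout -q "$original_branch"
git branch -D experiment
cat notes.txt
```

Note that none of this needs a network connection or an account anywhere.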
So that is git on a very basic level. What about GitHub? GitHub is by far the most popular place to store git repositories online, on someone else's server. It has a lot of other cool features, but for the purposes of learning git, that's all you really need to know. It is not the only place to store git repositories (GitLab and Atlassian's Bitbucket are the other popular commercial options, but you can also set up your own server, which many companies do), and it's not actually necessary to start using git. You can use git on your computer and it will track all those lovely changes and branches all on its own. The reason people use GitHub/GitLab/Bitbucket is primarily to 1. create an off-site backup of their work and 2. share their files so others can contribute to their project, or critique it, or just plain use it. Git itself, running on your computer, needs to be configured if you want to back up or share your work. This is what people mean when they talk about remotes. You will need to tell git where the files should go when you push (the command to get your changes saved on the server), or where they should come from when you pull (the command to retrieve changes from the server and save them on your local machine). Luckily, this setup step usually only needs to be done once, and then you are good. So you can do it and forget all about it until you get a new computer and/or job.
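Here is a rough sketch of the remote setup described above, runnable without any account. As an assumption for the demo, a bare repository in a temporary directory stands in for the server (GitHub, GitLab, or whatever you use), and the directory names work and teammate, the file README.txt, and the identities are all invented. With a real host you would paste its URL in place of the local path.

```shell
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# A bare repository plays the role of the server in this demo.
git init -q --bare server.git

# Your local repository, with one commit in it.
git init -q work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Add README"

# One-time setup: tell git where the remote lives, then push to it.
git remote add origin "$demo_dir/server.git"
branch="$(git symbolic-ref --short HEAD)"
git push -q -u origin "$branch"

# A teammate clones the project, changes something, and pushes.
cd "$demo_dir"
git clone -q server.git teammate
cd teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "a line from a teammate" >> README.txt
git commit -q -am "Add a line"
git push -q

# Back in your copy, pull retrieves their change from the server.
cd "$demo_dir/work"
git pull -q
cat README.txt
```

The git remote add and git push -u lines are the "do it once and forget it" part; after that, plain git push and git pull know where to go.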
The final thing I want to address about git is that, while git can get really, really complicated, and it can be intimidating to go to one of those tutorials and see huge lists of commands, the reality is that the vast majority of programmers are not actually git wizards, and they use a tiny fraction of all the available commands. As usual, XKCD has addressed this with a great comic:
So please, don't get intimidated by the enormity of the tool. 97% of the time the only things you will be doing are checking the status of your code, pulling, pushing, adding, committing, and branching. 6 commands. Really. Another 2 percent of the time you'll be creating pull requests (asking someone else to bring the changes you made in your branch of a project into the main branch of the project, which is usually done in a GUI anyway) and/or merging (doing the actual bringing in of changes made in one branch into another branch). The remaining 1 percent of the time, when you need to roll back a commit or discard your work before committing or start a brand new project (because in the real world projects are big long things, so you're not starting a new one every week) or when some other unusual situation comes up, you can use Google or a git cheat sheet or one of those cool mugs. To borrow Don Norman's concept, the knowledge exists in the world; you do NOT need to keep it in your head. The amazingly useful site Oh shit, git has the answers to how to get yourself out of many common messes. And of course, like the comic says, if all else fails save your work off elsewhere, delete the files on your local machine, and download a new copy from the server.
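For a feel of how small the everyday loop is, here is a sketch of checking status, adding, committing, branching, and merging in a scratch repository. The directory, the file names app.txt and feature.txt, the branch name feature, and the demo identity are all hypothetical, and pushing and pulling are left out so it runs with no server at all.

```shell
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
# Remember the starting branch name (it may be "main" or "master").
main_branch="$(git symbolic-ref --short HEAD)"

echo "app v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Branch off to work on a feature.
git checkout -q -b feature
echo "new feature" > feature.txt

# Check what has changed before saving it.
git status --short

git add feature.txt
git commit -q -m "Add feature"

# Merge: bring the feature branch's changes into the main branch.
git checkout -q "$main_branch"
git merge -q feature

# Nothing left uncommitted, and the feature file is now on the main branch.
git status --short
ls
```

On a shared project the merge step would more often happen through a pull request in the GUI, as described above.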
I'm a lead dev, I'm in charge of code reviews and keeping the codebase clean, all these repos that hold the code for our apps are my babies that I protect with my (professional) life, so I use git every day... But the above 100% applies to me. I've taken git classes that taught the more advanced features, I've read the docs... And I still have to Google them any time I actually need them, because it's so rare that I can't remember the commands off-hand. I don't even use merge from the command line anymore because we have our own GitLab instance and I like the GUI better due to the way it shows changes. I also start new projects from the GUI because at the same time I like to add a description and the project icon and set permissions and protect branches and so on, so I also don't use git init anymore (just a nice simple git clone <PASTE THE URL FROM GITLAB> after it's all set up). Since our team is so small I prefer to keep all the history from merging, so I don't even bother with rebasing.
I hope this simple overview and explanation will bring git down to earth for new developers. It's something you need to know, but it really isn't all that scary in practice. You can do it! I am planning to write another post soon about some real-world scenarios and the commands used, as well as explaining a couple of different branching models (which are essentially a way to organize/regulate how you use git within a group of developers), since this is another git area that seems to be under-explained. If you know of any other areas that are under-explained, please let me know! Git is such a ubiquitous tool, nobody should need to struggle learning or understanding it.