diff --git a/content/posts/git-basics.md b/content/posts/git-basics.md
new file mode 100644
index 0000000..4d1411c
--- /dev/null
+++ b/content/posts/git-basics.md
@@ -0,0 +1,339 @@
+---
+title: "Git Basics"
+date: 2020-12-07 18:54:31+0100
+draft: false # I don't care for draft mode, git has branches for that
+description: ""
+tags:
+  - git
+  - cli
+categories:
+  - programming
+series:
+  - Git basics
+favorite: false
+---
+
+[Git][git] is a distributed version control system. Originally written by
+[Linus Torvalds][linus] to be used with the development of the [Linux
+kernel][kernel], it has now become the go-to way to share work between multiple
+developers.
+
+In this article I will summarise what I feel to be the *next-step
+basics* of `git`, explaining each notion along the way.
+
+[git]: https://git-scm.com/
+[linus]: https://en.wikipedia.org/wiki/Linus_Torvalds
+[kernel]: https://www.kernel.org/linux.html
+
+I assume at least passing knowledge of `git`, and will therefore skip the
+justifications for using `git` instead of flinging tarballs at one another.
+I will also be skipping the explanation for the basic workflow of `git add`,
+`git commit`, and `git push`. You can consider this guide to be aimed at 3rd
+year students at EPITA, who have used `git` for a whole year to submit their
+projects but have not explored some of its more powerful features.
+
+## Starting out with branches and references
+
+To me, this is the most essential thing you need to remember when you are using
+`git`. It is part of what makes it special, and will be used throughout your
+career.
+
+### Why you should use branches in `git`
+
+What makes `git` so useful, and so powerful, is the fact that it was conceived
+from the ground up to operate in a decentralised manner, to accommodate the
+Linux kernel programming workflow.
+
+That model de facto means that branching must be a lightweight operation, and
+merging should not be a hassle.
+Indeed, as soon as you start having people work
+in parallel on a decentralised system, you end up creating "hidden branches":
+each person's development tree is a branch on its own.
+
+If you try merging branches that do not have any conflict, the operation is
+basically instantaneous: to take advantage of that fact I encourage you to use
+branches in your workflow when using `git`.
+
+### Where is my HEAD
+
+The notion of `HEAD` in `git` can seem strange. You might first have encountered
+it when checking out an older commit. `git status` helpfully tells you that you
+have been guillotined: `HEAD detached at 78f604b`.
+
+To make it short, `HEAD` is a reference pointing to the commit that you are
+currently working on top of. It usually points to a branch name (e.g. `main` or
+`master`), but can also point to a specific commit (such as in the `checkout`
+scenario I just mentioned).
+
+### Revisions
+
+Most of the commands I will show you need you to provide them with what `git`
+calls a revision. This usually means a way to specify a commit.
+
+There are multiple ways to specify a `revision`; you should know at least two
+of them: `refname` and `describeOutput`, which loosely correspond to branch
+names and git tags respectively. Note that `@` is a shortcut for referring to
+`HEAD`.
+
+You can also specify the `sha1` commit hash directly, or relative revisions.
+A relative revision allows you to select the parent of a specific commit,
+using the following revision specifiers:
+
+* `~`: select the first-parent commit
+* `^`: select the nth-parent commit (useful for merge commits)
+
+You can append numbers to those two specifiers; they differ in how they handle
+merges. If you are applying them to a merge commit, `~2` will give you the
+grand-parent of your commit, following the "*first parent*", whereas `^2` will
+give you the "*second parent*" of your commit.
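
To make the difference between `~` and `^` concrete, here is a minimal sketch
that builds a throwaway repository containing a single merge and resolves a few
revisions against it. The branch names (`main`, `feature`), the commit subjects
(`A`, `B`, `C`, `M`) and the committer identity are all made up for the
illustration.

```shell
set -eu

# Throwaway identity so that `git commit` works without any global config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch `main`

git commit -q --allow-empty -m "A"      # common ancestor
git checkout -q -b feature
git commit -q --allow-empty -m "B"      # only on `feature`
git checkout -q main
git commit -q --allow-empty -m "C"      # only on `main`
git merge -q --no-ff -m "M" feature     # merge commit with parents C and B

# Resolve revisions relative to the merge commit and keep their subjects.
first_parent=$(git log -1 --format=%s 'HEAD^1')
second_parent=$(git log -1 --format=%s 'HEAD^2')
grand_parent=$(git log -1 --format=%s 'HEAD~2')

echo "HEAD^1 is $first_parent"
echo "HEAD^2 is $second_parent"
echo "HEAD~2 is $grand_parent"
```

In this layout `HEAD^1` resolves to `C` (the branch you merged into), `HEAD^2`
to `B` (the branch you merged), and `HEAD~2` to `A`, since `~` only ever
follows first parents.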
+
+## History manipulation
+
+Once you start using `git` for non-trivial projects, using some of the
+practices that I aim to teach you, rewriting history will become your secret
+weapon for productivity.
+
+I have to insist on one point though, which is that rewriting history that was
+published and used by other people is often seen as a *faux-pas*, or worse! You
+should only use it on private branches, making sure to never rewrite published
+history unless absolutely necessary.
+
+### Picking cherries
+
+The easiest way to manipulate history is the `cherry-pick` command. It allows
+you to "*lift*" a commit from any other place in history, and plop it down in
+your current branch.
+
+This allows you, for example, to pick a commit which fixes a bug in another
+branch and apply it onto yours: simply pass that commit to `git cherry-pick`.
+
+It is however most likely not what you want to do if you later intend to merge
+your branch with the one you lifted the commit from. Both sets of commits will
+contain the exact same change, and `git` may not be able to merge them cleanly.
+In those cases, consider merging from a common branch whose purpose is applying
+the fix: `git` will then happily merge your branches later on without making a
+fuss.
+
+### All your rebase are belong to us
+
+This is probably the single best command in all of `git` in my mind. Having
+access to `git rebase` allows you to commit as you work, without caring about
+atomicity, commit messages, or even having working/compiling code.
+
+Rebasing allows you to make various changes to your branch's history:
+
+* Rewording a commit's message
+* Reordering commits
+* Removing commits
+* Squashing: merging a commit into another one
+
+This tool allows you to work on your own, commit early and commit often as you
+work on your changes, and keep a clean result before merging back into the main
+branch.
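
Of the operations listed above, removing a commit can also be done without the
interactive editor. The sketch below is one way to do it, assuming a throwaway
repository with three made-up commits. It uses `git rebase --onto` rather than
an interactive rebase, replaying everything after the unwanted commit onto that
commit's parent. The file names, commit subjects and identity are invented for
the example.

```shell
set -eu

# Throwaway identity so that `git commit` works without any global config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Three commits, each adding its own file so that replaying cannot conflict.
for name in foo useless bar; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name"
done

# Drop "Add useless" (HEAD~1): take the commits that come after it
# (HEAD~1..HEAD) and replay them on top of its parent (HEAD~2).
git rebase -q --onto HEAD~2 HEAD~1

subjects=$(git log --format=%s)
echo "$subjects"
```

After the rebase, the history only contains `Add bar` on top of `Add foo`, and
`useless.txt` is gone from the working tree. The interactive form remains more
convenient as soon as you want to combine several kinds of changes in one go.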
+
+#### Fixup, a practical example
+
+A specific kind of squashing which I use frequently is the notion of `fixup`s.
+Say you've committed a change (*A*), and later on notice that it is missing
+a part of the changeset. You can decide to commit that missing part (*A-bis*)
+and annotate it to mean that it is linked to *A*.
+
+Let's say you have this history:
+
+```none
+42sh$ git log --oneline
+* 787dd36 (HEAD -> master) Add README
+* 8d08529 Add baz
+* 7188fb1 Frobulate bar
+* 961d8fb Fix foo
+```
+
+And notice that you missed a change that belongs to `Add baz`. You can `add` it
+to your staged changes, and issue `commit --fixup @~`. This will create a
+commit named `fixup! Add baz`.
+
+```none
+42sh$ git log --oneline
+* 92912ee (HEAD -> master) fixup! Add baz
+* 787dd36 Add README
+* 8d08529 Add baz
+* 7188fb1 Frobulate bar
+* 961d8fb Fix foo
+```
+
+Rebasing with `-i --autosquash` then results in this interactive rebase
+screen, where the fixup commit has been moved right after its target:
+
+```none
+pick 961d8fb Fix foo
+pick 7188fb1 Frobulate bar
+pick 8d08529 Add baz
+fixup 92912ee fixup! Add baz
+pick 787dd36 Add README
+```
+
+After applying the rebase, you find yourself with the complete change inside
+`Add baz`, which can be confirmed with another `git log`:
+
+```none
+* 0174e54 (HEAD -> master) Add README
+* b0a47ae Add baz
+* 7188fb1 Frobulate bar
+* 961d8fb Fix foo
+```
+
+This is especially useful when you want to apply suggestions on a merge request
+after it was reviewed. You can keep a clean history without those pesky `Apply
+suggestion ...` commits being part of your history.
+
+### Lost commits and the reflog
+
+When doing this kind of history manipulation, you might end up making a mistake
+and lose a commit that was **very important**.
+
+Obviously, `git` has a way to save us in this situation.
+If we look at the man
+page for `git reflog`, we can read the following sentence:
+
+```none
+Reference logs, or "reflogs", record when the tips of branches and other
+references were updated in the local repository.
+```
+
+What does this mean exactly? Simply put, you can use it to check out a previous
+version of your repository, in the state it was in before you manipulated the
+history. Let's illustrate with a small example.
+
+#### Mapping lost commits: a practical example
+
+Let's say you have this repository state at the beginning.
+
+```none
+42sh$ git log --oneline
+* 524de22 (HEAD -> master) Documentation update
+* d60ddb5 USELESS COMMIT
+* e81b5fb Remove baz dependency
+* 44cea7d VERY IMPORTANT COMMIT
+* 58eb2d9 Use foo without bar
+* dab7792 Simplify frobulation
+```
+
+And decide to drop `d60ddb5` (**`USELESS COMMIT`**), but inadvertently drop
+`44cea7d` (**`VERY IMPORTANT COMMIT`**) at the same time. For this example,
+I simply `dropped` both commits in a `rebase` operation.
+
+I notice now that I am missing my **`VERY IMPORTANT COMMIT`** in my history:
+
+```none
+42sh$ git log --oneline
+* ec8508b (HEAD -> master) Documentation update
+* 3866067 Remove baz dependency
+* 58eb2d9 Use foo without bar
+* dab7792 Simplify frobulation
+```
+
+If I now try to see what happened to my `HEAD` reference using `reflog`,
+I can find the last update I did before starting my `rebase` to cancel the
+whole operation.
+
+```none
+42sh$ git reflog
+ec8508b (HEAD -> master) HEAD@{0}: rebase (finish): returning to refs/heads/master
+ec8508b (HEAD -> master) HEAD@{1}: rebase (pick): Documentation update
+3866067 HEAD@{2}: rebase (pick): Remove baz dependency
+58eb2d9 HEAD@{3}: rebase: fast-forward
+dab7792 HEAD@{4}: rebase: fast-forward
+612e6f5 HEAD@{5}: rebase (start): checkout 612e6f5a055280aac1d7608af2dd2443aed6875c
+524de22 HEAD@{6}: commit: Documentation update
+d60ddb5 HEAD@{7}: commit: USELESS COMMIT
+e81b5fb HEAD@{8}: commit: Remove baz dependency
+44cea7d HEAD@{9}: commit: VERY IMPORTANT COMMIT
+58eb2d9 HEAD@{10}: commit: Use foo without bar
+dab7792 HEAD@{11}: commit (initial): Simplify frobulation
+```
+
+By reading the `reflog`, I can see that my `rebase` started at `HEAD@{5}`
+(reads: *`HEAD`'s fifth prior value*). If I want to return to the state of my
+repository before starting that rebase, I can simply do
+`git checkout HEAD@{6}`, which will take me back to the state prior to the
+`rebase`.
+
+```none
+42sh$ git checkout HEAD@{6}  # Check out my `HEAD`'s 6th prior value
+42sh$ git log --oneline      # Are we back before the rebase?
+* 524de22 (HEAD) Documentation update
+* d60ddb5 USELESS COMMIT
+* e81b5fb Remove baz dependency
+* 44cea7d VERY IMPORTANT COMMIT
+* 58eb2d9 Use foo without bar
+* dab7792 Simplify frobulation
+```
+
+Now, I want to make sure that I have my `master` branch back to that state too,
+and not simply my disembodied `HEAD`.
+
+```none
+42sh$ git branch -f master  # Change where `master` is pointing at
+42sh$ git checkout master   # Check out `master` branch
+42sh$ git log --oneline     # Is everything in order?
+* 524de22 (HEAD -> master) Documentation update
+* d60ddb5 USELESS COMMIT
+* e81b5fb Remove baz dependency
+* 44cea7d VERY IMPORTANT COMMIT
+* 58eb2d9 Use foo without bar
+* dab7792 Simplify frobulation
+```
+
+And voilà! I can now try my `rebase` again, and be careful not to lose **`VERY
+IMPORTANT COMMIT`** this time.
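
As an alternative to the `checkout` followed by `branch -f` used above, the
same recovery can be done in one step with `git reset --hard` while the branch
is still checked out. This is only a minimal sketch under stated assumptions: it
works in a throwaway repository, and the commits are "lost" with a hard reset
instead of a rebase to keep the reflog short. The file name, commit subjects
and identity are invented for the example.

```shell
set -eu

# Throwaway identity so that `git commit` works without any global config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

before=$(git rev-parse HEAD)      # remember where the branch tip was

git reset -q --hard HEAD~2        # oops: the last two commits are "lost"
git --no-pager reflog             # HEAD@{1} is the value HEAD had just before

git reset -q --hard 'HEAD@{1}'    # move the branch (and HEAD) back there
after=$(git rev-parse HEAD)

git --no-pager log --oneline
```

Keep in mind that `reset --hard` also throws away uncommitted changes in the
working tree, so the `checkout` approach is the safer one when you only want to
look around first.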
+
+## Tips and tricks
+
+Here are some basic pieces of knowledge which don't really belong to any other
+section, but which I think need to be said.
+
+### The importance of small commits
+
+You might have noticed that people keep saying that commits should be kept
+**atomic**. What does that mean and why should it matter?
+
+Keeping commits atomic means that you should strive to commit your changes in
+the smallest unit of work possible. Instead of making one commit named *WIP: add
+stuff* at the end of the day, you should instead try to cut your work up into
+small units: `add tests for frobulator`, `account for foo in bar processing`,
+etc.
+
+This way of working has multiple things going for it once you start taking
+advantage of `git`'s power: you can more easily reason about a line of code by
+using `blame`, you can more easily squash bugs using `revert`, you can more
+easily review the changes in an MR and keep its scope narrow.
+
+One very useful command you can add to your tool belt is `git add -p`, which
+prompts you interactively for each patch in your working directory: you can
+easily choose which parts of your changes should end up in the same commit.
+
+### Miscellaneous commands
+
+Here's a list of commands that you should read up on, but that I won't be
+presenting further:
+
+* `git bisect`
+* `git rerere`
+* `git stash`
+* and more...
+
+## Going further
+
+I advise you to check out [Learn git branching][learn-branching] to practice a
+few of the notions I just wrote about, with a nice visualization of the commit
+graph to explain what you are doing along the way.
+
+Furthermore, the [Pro Git book][pro-git] is available online for free, and
+contains a lot of great content. You can read it whole, but I especially
+recommend checking out chapter 7 (*Git Tools*) and chapter 8 (*Git
+Configuration*). If you want to learn about the inner workings of `git` and how
+it stores the repository on your hard-drive, check out chapter 10 (*Git
+Internals*).
+
+[learn-branching]: https://learngitbranching.js.org/
+[pro-git]: https://www.git-scm.com/book/en/v2