page 1 of 15 · git · leandro facchinetti

Git
Lecture notes

Leandro Facchinetti ⟨[email protected]⟩
Object-Oriented Software Engineering
Johns Hopkins University
2016-09-19

The goal of this lecture is to answer three questions:

What does Git do?
How do you start using Git?
Where do you learn more about Git?

The intention is not to turn you into a specialist, nor to have you memorize commands, but to teach you the basic underlying concepts and how to perform the most used operations. After the lecture and reading these notes, you should know everything necessary to work on the group project.

Accompanying slides are available at: pl.cs.jhu.edu/oose/resources/git.m4v.

What does Git do?

Git solves the problem of version control. The name might be unfamiliar, but the problem is not: keeping track of the changes on a project as it evolves, and sharing and collaborating on it with other people. The version control problem arises whenever someone copies a file to modify it and compare versions, sends a project as an email attachment, or loses work due to a corrupted or accidentally deleted file. Git is a version control system, which solves the version control problem by tracking the project's history.

As more companies and free-software projects use Git, knowing it becomes a valuable skill. It can also improve your personal life: version diary entries, recipes, lecture notes or any kind of personal project. Not only will you never lose work to technical issues again, but you will also have a rich history of the progress and more freedom to experiment, exploring and comparing ideas on different versions of the project.

I use Git with almost everything I do on the computer: from keeping my favorite vegan recipes to the preparation of this lecture. The person below does not.

How do you start using Git?
There are two main ways to learn Git: install a Graphical User Interface (GUI) that acts as a front-end for Git, and click around; or learn how to use the Command-Line Interface (CLI) that comes with Git. For daily usage, a GUI can be more productive, but the lecture is based on the CLI because it is closer to the underlying concepts. Also, sometimes the GUI does not support an operation, so it is important to learn the CLI even if not using it most of the time.

The following sections cover the most common use cases for Git: first the underlying concepts, then practical examples on the command line. The use cases are divided into three overarching goals: (1) using Git alone on a local computer; (2) using Git with multiple remote computers; and (3) using Git for collaboration.

Side note: Most of my use of Git is through a GUI that comes as an extension to my text editor: Magit, in Emacs. If your preferred editor has Git support, give it a try. Otherwise, use a stand-alone Git GUI. See git-scm.com/downloads/guis.

An aside about GitHub

It is common for people to use the words Git and GitHub interchangeably, but they are not the same thing. Git is a tool and GitHub is a service provided by a company that facilitates the use of the tool. This class covers both Git and GitHub (the service the staff chose for the course), and it is important to know the difference.

Git is to email as GitHub is to Gmail. GitHub makes Git easier to use, but it is not essential. It is possible to send emails from providers other than Gmail, and it is possible to use Git without GitHub.

An aside about Git commands

Git operations are available through the git executable.
The format of the command lines resembles natural speech: start with the sentence “Git, please add the file cookies.txt to the index;” then note its essential parts (Git, add and cookies.txt); finally, remove the rest to write the command:

$ git add cookies.txt

In general, a Git command follows the pattern:

$ git verb objects-and-options …

Side note: By convention, command lines are prefixed with $, commentary with #, and verbs and objects-and-options are highlighted.

Local setup

One of Git's features is to keep track of who did which work. In order for it to do that, you have to identify yourself by running the following commands, which alter configuration files:

$ git config --global user.name "Bugs Bunny"
$ git config --global user.email "[email protected]"

There are files created by the operating system, text editors and other tools that should not be under version control, as they are not related to the project. For example, Apple's OS X creates .DS_Store files to store custom folder attributes. To teach Git that it must ignore such files:

$ echo ".DS_Store" >> ~/.gitignore_global
$ echo "*.text-editor-temp" >> ~/.gitignore_global
# …
$ git config --global core.excludesfile ~/.gitignore_global

First Git command

The single most useful Git command asks it to report what it thinks the world looks like. Run it after setup is complete:

$ git status
fatal: Not a git repository (or any of the parent directories): .git

The result is a fatal error: Git cannot find a repository. The next section explains what a repository is, and how to create one.

Repository

Start by thinking of the analogy that the operating system GUI makes: work on sheets of paper on the desktop and organize them in folders. Suppose it is necessary to keep track of the history of a project: in the physical world, one solution is to copy the papers after changes.
But that results in a lot of paper. To manage it, one could group the pieces that belong together on a paper tray and put the sheets in boxes, label the boxes and store them in a cabinet. To find the boxes later, keep index cards, similar to those in libraries. Finally, to distribute your documents, use a fax machine.

Side note: Installation procedures are different depending on the operating system. Go to the office hours to get individual assistance if you are having trouble installing Git on your machine.

Side note: It is important that you choose an email address that you will own forever. Institutional emails are bad choices because, after the affiliation ends, the email address could be reassigned and the new owner would gain credit for all work associated with it. This is more of an issue on contributions to public projects, but it is cumbersome to have multiple profiles and distinguish between personal work and institutional work. So, unless an institution insists on the use of their email address, avoid it.

Git extends the computer's file system with equivalents of a cabinet, paper tray and fax machine, and provides boxes, labels and index cards. All those analogies are covered in the following sections; for the moment, it suffices to know that Git calls the folder in which the project lives the working directory, and that the cabinet is the repository.

Figure: Git as an extension to the desktop metaphor. The working directory is the existing folder. The new elements are a cabinet, a paper tray, labeled boxes of changes, a Rolodex of index cards and a fax machine.

On the command line, create a new folder to contain a project and a new repository in it:

$ mkdir recipes
$ cd recipes/
$ git init
Initialized empty Git repository in …/recipes/.git/

Git created a hidden folder called .git in the project's directory. It is the cabinet: use Git commands to modify its contents, do not do it manually.
The status has changed:

$ git status
On branch master
Initial commit
nothing to commit (create/copy files and use "git add" to track)

Git is no longer complaining about the repository not existing, but the output mentions two unknown concepts: branches and commits. The next sections address those terms.

Side note: A hidden folder is one whose name starts with a dot. It receives the name because file browsers usually do not show it, but it is not special in any other way.

Fine points about repositories

It is up for debate where to draw the line when creating a repository. Should a project that is composed of a front-end and a back-end be in a single repository, separated in two directories, or in two repositories? Practical matters such as keeping changes in sync and using tools that integrate with version control come into play, and there is no right answer. To help with the decision, keep in mind that creating repositories is cheap and easy, so they may contain as little as a single file, if it stands on its own.

Commit

Continuing with the office metaphor, a typical workday looks like: work on documents, make copies (do not use the originals, to allow for history tracking) and organize them in a paper tray, put the group in a box, label the box with information that helps finding it later, store the box in the cabinet and make a note about it on the index card.

The Git workflow is similar: the paper tray is called index or staging area; the box is a commit; the box label is the commit message; and the cabinet is the repository. The index cards are references, the subject of a later section.

On the command line, start by doing some work and check the status:

$ echo 'Delicious recipe' > vegan-cookies.txt
$ git status
On branch master
Initial commit
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        vegan-cookies.txt
nothing added to commit but untracked files present (use "git add" to track)

Side note: There is a gross over-simplification in the metaphor. Storing whole copies of files in the boxes over and over would waste resources, because most of the content remains the same. So Git does not work with the concept of files, but that of changes. A change can be the addition or deletion of a line on a file, the creation of a whole new file, and so on. That is why the boxes are depicted with modifications, not files, in them. To recreate a point in the history, Git follows a sequence of boxes and replays their changes, either forwards or backwards.

Git is saying that the file vegan-cookies.txt is untracked, i.e., Git has not been introduced to the file and it has never been in the cabinet. Git also says what to do next:

$ git add vegan-cookies.txt
$ git status
On branch master
Initial commit
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file: vegan-cookies.txt

Now the file is on the paper tray, the staging area. Git teaches how to remove it from there, to unstage. But the changes are fine, so proceed to put them in the box that is going to the cabinet:

$ git commit
# Write commit message in text editor.
".git/COMMIT_EDITMSG" 10L, 250C written
[master (root-commit) cc5d34f] Add cookie recipe
 1 file changed, 1 insertion(+)
 create mode 100644 vegan-cookies.txt
$ git status
On branch master
nothing to commit, working directory clean

Side note: Commit (noun): the box with changes. Commit (verb): the act of creating a box with changes.

Side note: On most machines, the default text editor is Vim. It is possible to configure a different editor with:

$ git config --global core.editor "text-editor-executable"

Side note: Git repositories are append-only, i.e., once a commit is in the repository, it is there forever. However, it is possible to lose access to a commit. This is covered in a later section regarding references.

When issuing git commit, Git is going to open a text editor.
Use it to write a message describing the commit. The message composes the box label along with other information such as your name and email (configured on setup), the current time and which was the previous box on the chain. Once the text editor is closed, Git finishes the commit: it closes the box, puts it in the cabinet, updates the index cards, and gets ready to start over.

Side note: The fact that a commit stores information about which were the previous commits in the history is fundamental. This way, it is only necessary to have a reference to one commit, and from it the whole history can be retrieved by traversing each commit and following the information in it. Keep this in mind when reading the later section regarding references.

Side note: Do not confuse the index (the metaphorical paper tray, a Git concept also called staging area) with the index cards from the metaphor, which are Git references.

Fine points about commits

Good commit messages are important when looking for a particular commit in the history. The same way you would not keep your cabinet messy, you should inform the reader of the commit message not only what is in the commit, but also why it exists. Write about the motivating problem, how the changes solve it, what the alternative solutions would be and where to find more information. It is common for the commit to change a single line on the project and for its message to be several pages long.

Side note on the example that follows: the flax-seed meal mixture is called flaxeggs. The recipe is real, and it tastes great!

The convention for writing Git commit messages is to start with a title up to 50 characters long, leave an empty line and write prose wrapped at 78 characters. This format allows history-visualization tools to better show the repository contents. For example:

Add vegan cookie recipe

After several experiments, we discovered the best replacement for eggs in vegan cookies is flax-seed meal mixed in water.
Before that, we tried …

To allow for this level of commit-message quality, it is necessary to group the changes that belong together. They might be a single line or come from files across the whole project. Commit early, commit often, do not wait for many changes to accumulate.

Avoid committing code that does not compile or does not pass the test suite. This confuses the readers and renders unusable a Git feature that finds the commit that introduced a bug: git bisect.

Side note: Adding all the changes on a file to the index is not the only, nor the best, way to organize changes. It is possible to select line-by-line what goes in the commit. git add --interactive and git diff allow for this level of precision, but a GUI is better at the task.

These are high standards of commit quality. At first, focus on getting the basics right, then work the way up to following the rules. Start by committing all the time; when comfortable with Git, learn how to rewrite history and craft better and better commits.

Read history

The point of carefully keeping track of the project's history is to read it later. The simplest way of doing that is:

$ git log
commit cc5d34f5a53278aba79dd056ebd560d1db13da01
Author: Leandro Facchinetti <[email protected]>
Date:   Wed Sep 14 14:27:28 2016 -0400

    Add cookie recipe

Side note: Visualizing history is another task in which GUIs shine. Text alone is limited.

This command shows the latest commits, with their messages, authors and unique identifier strings. To see the details of a commit, including the changes that went into it:

$ git show cc5d34f5a53278aba79dd056ebd560d1db13da01
commit cc5d34f5a53278aba79dd056ebd560d1db13da01
Author: Leandro Facchinetti <[email protected]>
Date:   Wed Sep 14 14:27:28 2016 -0400

    Add cookie recipe

diff --git a/vegan-cookies.txt b/vegan-cookies.txt
new file mode 100644
index 0000000..faa3136
--- /dev/null
+++ b/vegan-cookies.txt
@@ -0,0 +1 @@
+Delicious recipe

Side note: The unique identifier string is also called SHA-1, after the hashing algorithm used to generate it. Any prefix of the unique identifier string that remains unique works as an identifier as well. This allows cc5d34f5a53278ab… to be abbreviated to cc5d34f5, for example.

Another approach to reading history is starting from a file and asking what modifications led to its current contents. This might help in finding a bug by pointing to the commit that introduced a suspicious line: the commit contains a time stamp, a message written by the author and the identity of the person. Because this might start fights, the name of the command is git blame:

$ git blame vegan-cookies.txt
^c8917a9 (Leandro Facchinetti 2016-09-14 14:27:28 -0400 1) Delicious recipe

Retrieve history

Once reading the history reveals a commit of interest, it is possible to open the box and put a copy of its contents on the desktop. That is, it is possible to go back in time on the project and have the working directory reflect what it was at the time of a commit:

$ git checkout cc5d34f5a53278aba79dd056ebd560d1db13da01
Note: checking out 'cc5d34f5a53278aba79dd056ebd560d1db13da01'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at cc5d34f... Cookie recipe

Side note: Because git checkout changes the working directory, it has to be clean, i.e., no files changed. Use git status to check and commit if necessary. If worried about maintaining a neat history, rewrite it later or learn about git stash.

Side note: It is also possible to git checkout file-name, in which case only that particular file is affected: it is restored to the condition it was in on the last commit.
This is helpful when finishing a quick experiment on a file and wanting to discard the changes right away, leaving no trace behind.

Now the files in the working directory are the same as they were at the time of this commit. But Git also output a warning about a detached HEAD, which is a bit gross. The next section explains why there is nothing to worry about.

Reference

A reference is a pointer to a commit, the index card in the desktop metaphor. HEAD is a special reference pointing to the commit that represents the state of the working directory. Git updates this reference on git checkout, git commit and many other operations. HEAD can point directly to a commit or do so indirectly, via a branch (more on branches in the next section). The former is the detached HEAD state.

If the plan is just to look at the files, or compile and run them, then being in detached HEAD state is fine. On the other hand, when modifying the files, it is better to have another reference pointing to the current commit. Otherwise the work might be lost upon the next git checkout, when HEAD points to another commit. One other kind of reference available in Git is the branch, the subject of the next section.

Side note: git checkout HEAD does nothing.

Branch

A branch is a named reference, thus a pointer to a commit as well. When a repository is created, there are no commits (the cabinet is empty), so there are no branches. Then, upon the first commit, Git automatically creates a branch called master, pointing to the first commit. From then on, whenever there is a new commit, Git advances the branch along with HEAD. To create a new branch:

$ git branch brownies
$ git branch
* (HEAD detached from cc5d34f)
  brownies
  master
$ git status
HEAD detached at cc5d34f
nothing to commit, working directory clean

The created branch points to the commit at which HEAD is pointing at the moment, but it does not automatically associate HEAD with it.
That requires an explicit git checkout:

$ git checkout brownies
Switched to branch 'brownies'

Side note: Alternatively, git checkout -b brownies creates a branch and associates HEAD with it in one command.

Side note: There is nothing special about the name master, it is just the default.

Side note: The idea that branches are just references (not copies of the whole project, for example) makes creating them easy and cheap. This was one of the features that set Git apart from other version control systems and led to its popularity.

From now on, when committing, the brownies branch is advanced, but master is maintained where it was:

$ echo 'Chocolate is vegan' >> vegan-brownies.txt
$ git add vegan-brownies.txt
$ git commit
# Write commit message in text editor.
[brownies 1b0657d] Add vegan brownies
 1 file changed, 1 insertion(+)
 create mode 100644 vegan-brownies.txt
$ ls
vegan-brownies.txt vegan-cookies.txt
$ git checkout master
Switched to branch 'master'
$ ls
vegan-cookies.txt
$ git checkout -b cookies
Switched to a new branch 'cookies'
$ echo 'Less flour' >> vegan-cookies.txt
$ git add vegan-cookies.txt
$ git commit
# Write commit message in text editor.
[cookies 60a0d3d] Fix cookies recipe---less flour
 1 file changed, 1 insertion(+)
$ cat vegan-cookies.txt
Delicious recipe
Less flour
$ git checkout master
Switched to branch 'master'
$ cat vegan-cookies.txt
Delicious recipe
$ git checkout brownies
Switched to branch 'brownies'
$ cat vegan-cookies.txt
Delicious recipe
$ ls
vegan-brownies.txt vegan-cookies.txt

Only when the brownies branch is checked out is the vegan-brownies.txt file available in the working directory. Similarly, only when the cookies branch is checked out is the fix to the cookies recipe available. This independence allows work to occur concurrently on different branches: test an idea on a branch, check another branch out, fix a bug on it, and so on. The history starts to look like a tree of commits, thus the name branch.
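The claim that a branch is only a reference can be checked directly. The following is a minimal sketch, assuming a scratch repository in a temporary directory; the identity values and the variable names are made up for the illustration. It creates a branch and compares the commits that the two branch names resolve to, then shows that HEAD only moves to the new branch after an explicit checkout.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the default branch name explicit
git config user.name "Bugs Bunny"         # assumed identity, local to this repository
git config user.email "bugs@example.com"

echo 'Delicious recipe' > vegan-cookies.txt
git add vegan-cookies.txt
git commit -q -m 'Add cookie recipe'

# A new branch is just a second name for the current commit.
git branch brownies
master_commit=$(git rev-parse master)
brownies_commit=$(git rev-parse brownies)
echo "master:   $master_commit"
echo "brownies: $brownies_commit"

# HEAD keeps following master until an explicit checkout.
head_before=$(git symbolic-ref --short HEAD)
git checkout -q brownies
head_after=$(git symbolic-ref --short HEAD)
echo "HEAD: $head_before -> $head_after"
```

Both names print the same identifier, which is why creating a branch costs next to nothing: no files are copied, only a pointer is written.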
Figure: A tree (sort of) beginning to form. In these illustrations, the commits point to their parents and the tree grows upwards as new commits happen.

Side note: If familiar with Directed Acyclic Graphs (DAG), then it helps to think of the repository history as one.

It is as easy to delete a branch as it is to create one:

$ git branch non-vegan-recipe
$ git branch -d non-vegan-recipe
Deleted branch non-vegan-recipe (was cc5d34f).

Besides the regular operations that update branches to point to other commits, it is possible to force a branch to point to a particular commit:

$ git checkout -b moved
Switched to a new branch 'moved'
$ git reset --hard 60a0d3d
HEAD is now at 60a0d3d Fix cookies recipe---less flour

Side note: git reset is the first example of a potentially destructive command. Double-check that the working directory is clean before running it.

Fine points about branches

Beware that, even though Git repositories are append-only, if there are no references to a commit or to one of its children (commits know their parents), there is no way to retrieve it. So, before running a dangerous command, such as the ones that rewrite history, covered in a later section, it is advisable to create a temporary branch and delete it later.

Side note: In case of emergency, to retrieve a commit that was recently checked out but has no other references, try git reflog.

Git is a flexible tool by design. This means it can adapt to different workflows, but it also means that it can be hard to find guidance on how to start. One of the aspects that can be confusing is what warrants the creation of a branch. The most common practice is to create a branch for each feature, bug fix, idea or exploration. Keep in mind that branches can be created from other branches arbitrarily, but avoid having complex workflows and branching structures that get in the way of the actual work.
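The advice about a temporary branch can be tried out safely. The sketch below assumes a scratch repository with two commits; the branch name backup and the identity values are hypothetical. It moves master back with git reset --hard, then uses the temporary branch to get the original commit back, and prints the last reflog entries for comparison.

```shell
#!/bin/sh
set -e

# Scratch repository with two commits (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Bugs Bunny"         # assumed identity
git config user.email "bugs@example.com"
echo 'Delicious recipe' > vegan-cookies.txt
git add vegan-cookies.txt
git commit -q -m 'Add cookie recipe'
echo 'Less flour' >> vegan-cookies.txt
git commit -q -a -m 'Fix cookies recipe'
tip=$(git rev-parse HEAD)

# Keep a reference to the current commit before the dangerous command.
git branch backup

# Potentially destructive: master now points to the first commit only.
git reset -q --hard HEAD~1
after_reset=$(git rev-parse HEAD)

# The second commit is still reachable through the temporary branch.
git reset -q --hard backup
recovered=$(git rev-parse HEAD)

# The reflog also recorded every position HEAD went through.
git reflog -n 3

# Clean up once the work is known to be safe.
git branch -q -d backup
```

Without the backup branch, the reflog would be the only remaining way to find the identifier of the second commit after the first reset.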
Side note: If the repository is huge (gigabytes in size) and storage space is a concern, there are garbage collection and compression routines that delete inaccessible commits forever.

Tag

A tag is a named immutable reference. It is similar to a branch, except that it forever points to the commit on which it was created. Tags are useful for software releases, for example. Creating a tag is similar to creating a branch:

$ git tag cookbook-1.0

Side note: Tags can be digitally signed with GPG to guarantee the provenance of the released code.

Merge

When the work on a branch is complete, it is time to merge it back into the main development line. The process can happen in one of two ways: either only references are updated, or new commits are necessary as well. The former happens when no work took place on the main line since the branch went off, i.e., the merged branch is a descendant of the merging branch. Git calls this fast-forward.

Figure: Fast-forward, a merge that happens by changing the commit pointed to by the master branch.

$ git checkout master
Switched to branch 'master'
$ git merge brownies
Updating cc5d34f..1b0657d
Fast-forward
 vegan-brownies.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 vegan-brownies.txt
$ git branch -d brownies
Deleted branch brownies (was 1b0657d).

Side note: After the merge, it is safe to delete the merged branch to keep the repository clean.

If some work happened since the branch went off, i.e., the merged branch is not a descendant of the merging branch, then there are changes on both sides. They need to be reconciled, which requires the creation of a new commit. There are two special characteristics to the commit resulting from a merge: it has two parents and contains the changes from both of them.

Figure: Merge that requires the creation of a new commit, which has two parents and the changes from both sides.

$ git merge cookies
".git/MERGE_MSG" 7L, 250C written
Merge made by the 'recursive' strategy.
 vegan-cookies.txt | 1 +
 1 file changed, 1 insertion(+)
$ git branch -d cookies
Deleted branch cookies (was 60a0d3d).

Fine points about merges

When the changes on both parent commits are around the same lines of the same files, Git is unable to automatically reconcile them. A conflict happens and manual intervention is required. git mergetool integrates with text editors to show the differences and allow the resolution.

Conflicts on some kinds of files can be hard to resolve; for example, binary files and big XML files generated by programming tools, e.g., iOS storyboard files from Xcode. The best strategy is to coordinate the work on those files so as to avoid conflicts arising in the first place.

Side note: Having to coordinate the work on a few files is annoying and defeats part of the purpose of using branches. But it is a necessary evil because these files, by their nature, are hard to handle. At least the problem is localized to a few files: most files on most projects are text-only and tractable.

The right moment to merge is another issue open to debate. Teams differ on their notions of ready: some only require working code, others insist on tests and documentation. The recommendation is to avoid long-running branches that progress independently of the main line of development. The results are merge conflicts and frustration.

Rewrite history

As previously stated, repositories are append-only, but developers are fallible, and it is common to have to rewrite some of the history. Git's solution is to create new commits with the modifications and update the references accordingly. In effect, it looks like history has changed.

It is possible to arbitrarily manipulate the repository's history tree, but there are two risks to consider: the first is to lose all references to original commits that are still useful; this can be mitigated by creating branches before rewriting.
The more serious concern is when working with other people: collaborators may have based their work on the commit before the change in history, which would void their commits. The solutions to this problem are to never rewrite commits that are visible to other people (see later sections about collaboration), or to coordinate the changes. Having separate branches for each task helps isolate the work and minimize the issues.

Side note: As is almost always the case with Git, there is a way to get out of the situation in which an ancestor commit changes. It involves git rebase, covered next.

The most useful history rewrite is to amend the last commit, either to modify its message, or to add or remove some changes. Use git commit --amend and Git rewrites the last commit instead of creating a new one.

The next most useful history rewrite is to change the commit on which a branch is based. This is necessary when working on a branch that is out-of-date in relation to the main development line; it is a solution to the long-running branch problem. The command to use is git rebase new-base.

Side note: Some history rewrite operations have to reconcile changes from multiple sources, so they are subject to conflicts, similar to merges. git mergetool is useful in this case as well.

Figure: git rebase brings the branch up to date with the main development line. A fast-forward can happen if the new base descends from the rebased branch, similar to what happens on git merge.

Finally, the last common kind of history rewrite is to construct an organized history out of a series of commits. During normal work, commits should happen early and often, which means committing broken code, failing tests, works in progress and ideas that do not make it to the end of the development cycle. It is not a history worth keeping around, and rewriting it is the purpose of git rebase --interactive base-branch. After running the command, the text editor pops up, showing each commit in the branch on a line and asking how to proceed.
It is possible to completely remove commits from the history, edit them, reorder them, or squash them together, that is, turn several commits into one.

Figure: git rebase --interactive allows for carefully crafted commits after development is complete.

Side note: It is common for maintainers of free-software projects to ask contributors to squash the commits together before accepting the code. This keeps the project's history clean and avoids people trying to claim more credit than they are due by artificially climbing up the chart of commits per contributor.

Rewriting history can be hard for Git beginners, so do not worry about it at first. Once past that stage, try to write small, simple notes to self in the commit messages during development and use them to craft high-quality commits when the work is ready to be merged into the main development line.

An aside about keys

The next section covers setup to work with Git on multiple machines. To keep privacy and control access to information, it is important that machines are able to identify each other over the network. They do that by using a mechanism analogous to the following scenario: suppose Alice has the opportunity to meet Bob once, and later, on a second meeting, she has to prove her identity. What Alice can do is to give Bob a padlock and keep the key. When they meet again, she opens the padlock with the key that only she owns.

In cryptography lingo, the key to the padlock is called the private key and the padlock itself is known as the public key. The private key, as the name implies, should be safely stored, away from other people. The public key can be copied and distributed freely. After all, what could attackers do with a locked padlock?
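The padlock scenario can be approximated on the command line. The following is only a sketch of the idea, assuming the openssl tool is installed; it is not the exact protocol that SSH runs, and the file names are made up for the illustration. Alice generates a key pair, Bob writes a challenge, Alice answers by signing it with the private key, and Bob checks the answer using only the public key.

```shell
#!/bin/sh
set -e

dir=$(mktemp -d)
cd "$dir"

# Alice creates the key (private) and the padlock (public).
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
    -out alice_private.pem 2>/dev/null
openssl pkey -in alice_private.pem -pubout -out alice_public.pem

# Bob, who only holds alice_public.pem, writes a fresh challenge.
echo "challenge-$(date +%s)" > challenge.txt

# Alice proves her identity by signing the challenge with the private key.
openssl dgst -sha256 -sign alice_private.pem -out response.sig challenge.txt

# Bob verifies the response with the public key alone.
result=$(openssl dgst -sha256 -verify alice_public.pem \
    -signature response.sig challenge.txt)
echo "$result"
```

The private key never leaves Alice's side in this exchange, which is the property the padlock analogy is meant to convey.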
Figure: The padlock and the key, the principle of how computers identify each other.

Side note: An alternative to holding a key is remembering a secret, a password, like that of a padlock that requires a number combination. In practice, this is uncomfortable and insecure, because the secret has to be typed in at every use and therefore tends to be short.

Remote setup

The first step is to create the pair of private and public keys (see the previous section for more on that):

$ ssh-keygen -t rsa -b 4096 -C "[email protected]"

The private key is the contents of the file ~/.ssh/id_rsa and the public key is the contents of the file ~/.ssh/id_rsa.pub, both created by the command above. Keep ~/.ssh/id_rsa safe and take note of ~/.ssh/id_rsa.pub, as it is necessary later on.

Now, decide what the remote is: it can be any machine accessible over the network via SSH. In this class, the staff chose GitHub as the remote: create an account at github.com and add the contents of ~/.ssh/id_rsa.pub to the list of SSH keys.

Side note: For the purposes of this discussion, think of GitHub as a GUI over a machine with SSH access and repositories created with git init --bare, i.e., repositories that lack the working directory, or a cabinet without a desktop.

Remote repository

Create a repository on the remote. On GitHub, click on the New repository button. For the group projects for the class, the staff creates the repository and it just shows up on your account, so skip this step. Then, grab the URL for the repository, which follows the pattern [email protected]:<user-or-organization>/<repository>.git.

This is the time to introduce the last element of the desktop metaphor: the fax machine. It is used to share commits over the network with other computers: it handles the authentication protocol based on the private and public keys and sends data. It also comes with a list of frequently-called numbers. To add GitHub's number, run:
    $ git remote add origin <remote-url>

In this command, remote add is a verb phrase, origin is the name added to speed-dial, and <remote-url> is the number. From that point on, it is possible to refer to <remote-url> by the name origin.

Side note: There is nothing special about the name origin. It is just the conventional name for the remote that is the authoritative source of truth for the project. When using git clone, Git creates a remote with that name (more on it later).

Side note: For free private repositories on GitHub and other goodies, sign up for the Student Developer Pack at education.github.com/pack. It is also possible to create a remote to host free private repositories by running git init --bare in an empty folder on any machine accessible via the network, for example those of the undergraduate network. The URL is going to be of the form ssh://<user>@ugradx.cs.jhu.edu:/home/<user>/<path-to-repository>.

Send commits

The key feature of distributed version control systems, Git among them, is that repositories are copied to remotes. This means that all nodes on the network have a complete copy of the entire history. Thus the fax machine analogy: both sender and receiver have access to the transferred document. The commits go over the network to the other computer, after which the local and remote repositories are equivalent. The command is:

    $ git push origin master
    Total 0 (delta 0), reused 0 (delta 0)
    To [email protected]:<user-or-organization>/<repository>.git
     * [new branch]      master -> master

Here, origin is where to send and master is the contents of the fax. Git is smart enough to figure out which commits it needs to send in order to bring the remote up to date with the local branch, avoiding repeated work.

[Figure: The local repository sending copies of commits to the remote. Note that the remote is a repository without a working directory. There is only a cabinet without a desktop.]
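The two steps above, adding a remote and pushing to it, can be rehearsed without a network connection or keys. The following is a minimal sketch, assuming a bare repository in a temporary directory stands in for GitHub. The directory names, the file notes.txt, and the identity "Example Student" are placeholders invented for this illustration.

```shell
# Rehearsal of "git remote add" and "git push" against a local bare repository.
set -e
workdir=$(mktemp -d)

# The remote: a bare repository, that is, a cabinet without a desktop.
git init --quiet --bare "$workdir/remote.git"

# The local repository, with its first branch named master.
git init --quiet "$workdir/local"
cd "$workdir/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"   # placeholder identity

# One commit to send.
echo "hello" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Add the remote to speed-dial under the conventional name origin.
git remote add origin "$workdir/remote.git"

# Send the commits on master to origin.
git push --quiet origin master

# Both repositories now point at the same commit.
local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$workdir/remote.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

The two printed hashes are identical, which is the sense in which the local and remote repositories are equivalent after a push.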
Receive commits

The converse of the above operation is asking the fax machine to call the other side and request updates, if any. This is how to retrieve collaborators' work. The commands are:

    $ git fetch origin
    remote: Counting objects: 4013, done.
    remote: Compressing objects: 100% (15/15), done.
    remote: Total 4013 (delta 7), reused 0 (delta 0), pack-reused 3998
    Receiving objects: 100% (4013/4013), 726.89 KiB | 0 bytes/s, done.
    Resolving deltas: 100% (2800/2800), done.
    From github.com:<user-or-organization>/<repository>
     * [new branch]      master     -> origin/master
    # …
    $ git checkout master
    Switched to branch 'master'
    $ git merge origin/master
    # …

Side note: Git can also work as a protocol to transfer projects around. Deployment tools (e.g., Heroku) receive the code to execute via Git, and there are package managers that work on top of it.

Note that git fetch brings the commits to the local machine, but does not automatically update the local branches. It does, however, automatically update references of the kind <remote>/<branch>, e.g., origin/master. The next step is to update the local branches according to those references, which requires either git merge or git rebase. The difference is that git merge might create a new merge commit, whereas git rebase rewrites history by replaying the local commits on top of the fetched ones. In most cases, a fast-forward happens and the two are equivalent.

To streamline the sequence of git fetch and git merge, there is the git pull command. The shortcut is the most convenient option, but keep in mind what is happening under the hood.

Side note: As is always the case with git merge or git rebase, conflicts that require manual resolution are possible outcomes. A configuration is available to make git pull behave as git fetch followed by git rebase. This avoids spurious merge commits when multiple people commit on the same branch.

New collaborators can start their repositories with:
    $ mkdir <project> && cd <project>
    $ git init
    $ git remote add origin <remote-address>
    $ git fetch origin
    $ git checkout master
    $ git rebase origin/master

The process happens frequently enough that Git provides the shortcut git clone <remote-address>.

Collaborate

As stated a few times thus far, Git is a flexible tool: many workflows exist around it, and teams should feel free to adapt it to what works best for them. The suggested workflow is one increasingly used in companies and free-software projects: push the commits to a branch on the remote and open a pull request. The pull request is a proposal to merge the contribution back into the main line of development, and the start of a conversation. Contributors review the code and comment on it, the developer changes the code some more, and so on.

[Figure: A network of collaborators. The arrows represent remotes, i.e., the source has the destination on speed-dial. Note how there is no central node: that is what puts the "distributed" in distributed version control system. Developers git push to and git pull from other developers, GitHub, deployment services such as Heroku, and anywhere else to which they have access.]

Even when working on a project alone or in a group that meets in person, it still makes sense to use the pull request workflow. It documents the progress at a higher level than commit messages.

GitHub also has a feature similar to pull requests, called Issues. They are exactly like pull requests, except that they contain no code: their sole purpose is to start a conversation. Issues are used for bug reports, feature requests and support, depending on the project. Using issues and pull requests is not a requirement for the group projects, but it helps the graders.

Side note: Before opening an issue or pull request, check the project's guidelines. Some projects use other tools to accept contributions and GitHub only to host the repository.

Where do you learn more about Git?
Besides the features already mentioned, Git can do a lot more: hook into events and run procedures, stash uncommitted changes for later, bisect the history looking for the commit that introduced a bug, arbitrarily rewrite the project's history, handle repository dependencies as submodules, and the list goes on. There are also other ways to collaborate: sending and applying patches, pull requests via email instead of GitHub, and more workflows defined by particular teams.

The authoritative source of information about Git is its manual, available via the man command and online. But, precisely because it is complete, the manual can be hard to navigate. The best way to become a Git expert is to read the Pro Git book, available for free at git-scm.com/book. When trying to solve a specific issue, try Stack Overflow, where the most popular questions are about Git, and GitHub's help at help.github.com.

For beginners, there are tutorials available online. Some work in the browser and do not require installing Git on the machine, for example try.github.io. Check out gitimmersion.com and codeschool.com/courses/try-git, too.

To learn more about writing Git commit messages, go to tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html and read the repository histories of Linux and Git, which are examples to follow. For examples not to follow, head to whatthecommit.com and refresh the browser a few times.

Conclusion

The lecture and these accompanying notes started by stating the problem of version control, proceeded to show how Git can solve it, and pointed to sources for learning more. They covered the basic underlying concepts and contained examples of the most useful operations. This knowledge should be enough for the group projects and for contributing to free-software projects.

Colophon

The lecture notes were composed by the author on OS X Pages and the accompanying slides on Keynote. The serif font is Iowan Old Style, designed by John Downer and released by Bitstream in 1990.
The sans-serif font is Source Sans Pro, an Open Font designed by Paul D. Hunt and released by Adobe Systems. The typewriter font is Source Code Pro, created as part of the Source Sans project. The page design is inspired by the works of Matthew Butterick and Edward Tufte. Colors for the slides come from the Solarized colorscheme, by Ethan Schoonover. The illustrations are from Lingo, by The Noun Project. ◼

"I'm an egotistical bastard, and I name all my projects after myself. First Linux, now Git." (Linus Torvalds, creator of Git)