This lesson introduces you to the basics of the Git version control system.
After completing this lesson, you should be comfortable...
git init, git add, git commit, git branch, and git merge
This tutorial assumes that ...
Scenario 1: You make a change to your code late at night and save it before stumbling into bed. When you wake up in the morning and test things, you realize something broke (but what?).
Scenario 2: You're working on a project with teammates and you keep emailing scripts back and forth (is crawler-final-final-v3-final.py really the latest version, or was it crawler-ulimate-final-marvin.py? What change did Marvin make again?).
Scenario 3: You use multiple development machines on different networks. A shared drive works when you have a stable internet connection, but you're hoping to unplug from email, escape to the beach, and finish a project without distractions. Or maybe you have a 10+ hour international flight ahead of you and the in-cabin wifi isn't working.
In situations like these, you may have gotten by using something like Google Drive or Dropbox and its rewind feature. Those are forms of version control systems, but they're not designed to manage code.
Still not sold? Keep in mind that industry jobs related to software almost always involve contributing to a shared code base, which necessitates the use of some kind of version control system.
At a high level, Git is a tool to record changes to some directory (a repository) over time and keep those changes in sync with remote "copies".
Git is a popular distributed version control system (DVCS) that can be used for both solo and team projects. When compared with centralized version control systems (CVCS), Git has a number of advantages:
For a detailed summary of differences between DVCS and CVCS, see this link.
Though Git is designed to manage code (large codebases), some use it to manage reports and even books!
With a version control system (VCS) such as Git, you can record groups of edits and "time travel" (Great Scott!) in your project's history.
Git can be installed on most operating systems. To install on Ubuntu, enter the following command in the terminal:
sudo apt-get -y install git
To test your installation on Ubuntu, enter the following command in the terminal:
git --version
You should see the installed version returned.
# set your name
git config --global user.name "Your Name"
# set your email
git config --global user.email "your@email.com"
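If you've installed VS Code and wish to use it as your preferred editor for Git, run the following:
git config --global core.editor "code -w"
NOTE: Don't forget to include -w. It tells VS Code to wait until you close the file, so Git doesn't continue before you've finished editing.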
Finally, check your settings:
git config --list
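To read back one setting at a time rather than the whole list, something like the following sketch may help. It assumes the values were set with --global as above; the fallback text is purely illustrative.

```shell
# Read individual settings; print a note instead if a value has not been set yet.
name=$(git config --global --get user.name 2>/dev/null || echo "(user.name not set)")
email=$(git config --global --get user.email 2>/dev/null || echo "(user.email not set)")
echo "name:  $name"
echo "email: $email"
```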
Let's take a look at some of the key concepts involved in Git...
A Git repository is a directory of files (project) with information about version history (i.e., who edited what when). The history is stored in a hidden .git directory and composed of commits.
You can think of Git as some kind of cosmic tree that stretches forward and backward into time. The trunk of that tree represents the "main" timeline. In Git, this timeline is considered to be a branch and it is usually called master. The repository may have other branches. Later in this tutorial, we'll look at the popular feature-branch workflow.
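Each file in a Git repository will be in one of the following states:
ignored: Git skips the file because it matches a pattern in the .gitignore file.
untracked: Git is not yet tracking the file. Start tracking it with git add <filename>.
modified: A tracked file has changed since the last commit. Stage the changes with git add <filename>.
staged: The changes are marked for inclusion in the next commit. Record them with git commit -m "<informative commit message here>".
committed: The changes are recorded in the repository's history.
Let's take a closer look at different file states. In order to do so, though, we first need to create a repository...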
You can create a new empty repository using git init.
First, navigate to the directory that you want to use to house your repository:
REPO=~/repos/git-tutorial
# quote the variable so paths containing spaces still work
mkdir -p "$REPO"
cd "$REPO"
Now initialize your repository:
git init
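To confirm that initialization worked, a quick check along these lines can be used. This is a minimal sketch that runs in a throwaway directory (the scratch variable is a hypothetical name) so it does not touch your real repositories.

```shell
# Throwaway location for the sketch (hypothetical; any empty directory works)
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# A freshly initialized repository contains a hidden .git directory ...
ls -d .git

# ... and Git reports that we are inside a working tree
inside=$(git rev-parse --is-inside-work-tree)
echo "inside work tree: $inside"
```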
Staging changes (git add)
With Git, changes are "saved" in two steps: staging and committing. Staging tells Git which edits to one or more files should be considered as a group, and committing tells Git what those changes represent.
echo "# Git good at using DVCS" > hello.md
You can of course create a file using your preferred code editor (ex. VS Code, Vim, Emacs, etc.).
git add hello.md
TIP: If you've made a bunch of changes to a file, but want to split them between multiple commits, use git add -p <filename> to interactively choose which changes to stage.
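Staging is selective: only what you add becomes part of the next commit. The sketch below illustrates this in a throwaway repository; notes.md is a hypothetical second file added only for contrast.

```shell
scratch=$(mktemp -d)   # throwaway directory for the sketch
cd "$scratch"
git init -q

echo "# Git good at using DVCS" > hello.md
echo "scratch notes" > notes.md      # hypothetical extra file we do NOT stage

git add hello.md

# List only the files that are currently staged for the next commit
staged=$(git diff --cached --name-only)
echo "staged: $staged"
```

Here notes.md stays untracked until you explicitly add it.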
Committing changes (git commit)
git commit -m "Added hello.md"
If your message is very short and only a single line, using -m to provide the message inline works. If not, you may want to use your preferred code editor for the task. In order to use your preferred editor, first ensure that you've configured Git with this information. For instance, to use VS Code for all commits, you could run the following command:
git config --global core.editor "code -w"
To use Vim, you would run the following version:
git config --global core.editor "vim"
If you want to use a particular editor only for the current repository, you would simply omit the --global flag in the previous command.
Running git commit will open the editor with a template commit message for you to complete.
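As a rough illustration of the repository-only setting, the sketch below sets core.editor without --global inside a throwaway repository and reads it back. The scratch directory is a hypothetical location.

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q

# No --global flag: the setting is written to this repository's .git/config only
git config core.editor "vim"

# Read back the repository-level value
local_editor=$(git config --local --get core.editor)
echo "editor for this repository: $local_editor"
```

A repository-level value takes precedence over the global one when both are set.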
Above all, commit messages should be informative and to the point. Remember, these are for the benefit of future you and any collaborators. Things may be perfectly clear when you're in the "zone" coding, but they might not be nearly so a day, month, or year later. Ideally, you'd want to be able to understand the changes made to a repository by inspecting the commit messages.
Here is an example of a bad commit message:
Added a file
What file? Sure, one could inspect the details of the change compared to the previous and/or following commits (diff), but why make things difficult for yourself and others?
Here is an example of a better commit message:
Added README.md
Much better. We now at least know what file was changed, but why did you add that file? What purpose does it serve in the project?
Here is an example of an even better commit message:
Added README.md

This file provides an introduction to the project (instructions for installation, running tests, and an overview of modules).
The first line in the commit message will be used for the summary. Think of it as a (short!) title. What follows after a blank line in the example above is a more detailed description. While this extended description is not always necessary, it is often useful.
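If you'd rather stay on the command line, one way to supply both a summary and a description is to pass -m twice; Git joins the values as separate paragraphs. A minimal sketch in a throwaway repository (the identity below is a placeholder):

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

touch README.md
git add README.md

# First -m is the summary line, second -m becomes the description
git commit -q -m "Added README.md" \
  -m "This file provides an introduction to the project."

subject=$(git log -1 --format=%s)   # summary only
body=$(git log -1 --format=%b)      # description only
echo "summary:     $subject"
echo "description: $body"
```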
Creating a repository (git init)
mkdir -p ~/repos/git-basics
cd ~/repos/git-basics
git init
touch README.md
Check the status of the repository:
git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
README.md
nothing added to commit but untracked files present (use "git add" to track)
git add README.md
Let's see how the status has changed:
git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: README.md
git commit -m "My first commit"
Let's see how the status has changed after committing:
git status
On branch master
nothing to commit, working tree clean
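For a more compact view of the same information, git status accepts a --short flag that prints a two-character code per file. The sketch below sets up several file states in a throwaway repository; the file names and identity are placeholders chosen for illustration.

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "Added .gitignore to skip log files"

echo "# notes" > README.md
git add README.md              # staged as a new file ...
echo "more" >> README.md       # ... then modified again after staging
touch todo.txt                 # never added: untracked
echo "debug output" > run.log  # matches *.log: ignored

# Codes: A = staged new file, M (second column) = modified since staging,
#        ?? = untracked, !! = ignored
states=$(git status --short --ignored)
echo "$states"
```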
You can see a summary of all commits so far using git log.
Use your up and down arrow keys to navigate forward and backward through history. Hit q to exit this view.
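When the history grows, a condensed listing is often easier to scan. A minimal sketch using the --oneline flag in a throwaway repository (file contents and identity are placeholders):

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

touch README.md
git add README.md
git commit -q -m "Added README.md"

echo "# Git basics" > README.md
git add README.md
git commit -q -m "Added a title to README.md"

# One line per commit: abbreviated hash followed by the summary
git log --oneline

count=$(git rev-list --count HEAD)   # number of commits reachable from HEAD
latest=$(git log -1 --format=%s)     # summary of the most recent commit
echo "$count commits, latest: $latest"
```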
You can undo a commit using the git reset command. This is especially useful if you decide you want to split a bunch of changes into a few smaller commits to explain everything clearly.
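As a sketch of that splitting scenario, the following undoes one oversized commit and replaces it with two smaller ones. It assumes the commit has not been shared with anyone yet; file names and identity are placeholders.

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

touch README.md
git add README.md
git commit -q -m "Added README.md"

# One commit that bundles two unrelated files
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Added stuff"

# Step the branch back one commit; the files stay on disk but are no longer committed
git reset -q HEAD~1

# Re-commit the same work as two focused commits
git add a.txt
git commit -q -m "Added a.txt"
git add b.txt
git commit -q -m "Added b.txt"

count=$(git rev-list --count HEAD)
latest=$(git log -1 --format=%s)
git log --format=%s
```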
After completing this tutorial, see the GitHub tutorial for an example involving remotes.
The feature-branch workflow is commonly used to address a specific task, such as developing a new feature (ex. extending a tokenizer to cover a specialized domain) or fixing a particular bug.
First, you branch off of a working version of your code in order to address your specific task by making and committing changes. Working on a branch allows you to make isolated changes without risk of breaking things on the master branch (which is expected to be stable and functioning).
After completing development of your feature, merge your changes back into the code from which you branched (ex. master).
Once you've successfully merged your changes, you can safely delete the feature branch. Those changes have become part of the commit history of the master branch.
We can create a branch and switch to it in a single step:
git checkout -b "new-feature"
Make and commit changes to your local branch, new-feature (ex. improving your tokenizer, fixing a particular bug, etc.).
# stage all changes in the current directory
git add .
git commit
Once you're satisfied and want to bring those changes into the "mainline" of your code, you would merge your changes:
# assuming master is the name of your "core" branch
git checkout master
git merge "new-feature"
Once your changes have been successfully merged, you can safely delete or "prune" the new-feature branch:
git branch -d "new-feature"
The -d flag stands for delete. Git refuses to delete a branch with unmerged changes unless you force it with -D.
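Putting the steps together, the whole workflow might look like the sketch below. It runs in a throwaway repository, looks up the name of the starting branch rather than assuming master, and uses a placeholder file (tokenizer.txt) and identity.

```shell
scratch=$(mktemp -d)   # throwaway repository for the sketch
cd "$scratch"
git init -q
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

echo "v1" > tokenizer.txt
git add tokenizer.txt
git commit -q -m "Added tokenizer notes"

# Remember the "core" branch name (master or main, depending on your Git setup)
mainline=$(git symbolic-ref --short HEAD)

# 1. Branch off and do the work in isolation
git checkout -q -b new-feature
echo "v2" >> tokenizer.txt
git add .
git commit -q -m "Extended tokenizer notes"

# 2. Merge the work back into the core branch
git checkout -q "$mainline"
git merge -q new-feature

# 3. Delete the merged feature branch
git branch -d new-feature

remaining=$(git branch --list new-feature)   # empty once the branch is gone
latest=$(git log -1 --format=%s)
echo "latest commit on $mainline: $latest"
```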
At first glance, Git may seem quite complicated. Stick with it, though. The payoff is worth the effort.
Git cheatsheet (quick reference) in your preferred language: https://github.github.com/training-kit/