Pepe Barbe edited this page Sep 17, 2016 · 10 revisions

Fundamentals

Traditional Version Control Systems (VCS), like SVN, have very simple semantics in which recorded changes are linear. In contrast, Git provides non-linear semantics and a user interface (UI) that is not very intuitive, which can create confusion when we try to apply the same model as a linear VCS. Therefore, it is important to understand some basic concepts about Git in order to better understand what the different operations are doing to the repository.

The Git commit tree

Git is a non-linear VCS: the information stored within a Git repository can be represented as a graph in which each node is the product of performing some operation on the graph. The database that stores the graph is immutable and append-only.

Example Git Tree

In the image above we see an example repository, displaying each commit alongside its corresponding node in the commit tree and its unique hash, which gives a way to address each node.

Git references

References (or "refs") exist to make interaction with the commit tree easier. A reference is just a human-readable label that contains a pointer to the right commit hash. Branches, tags, and remotes are all forms of references.

💡 References do not hold the information in the Git database; it is held within the commit tree, and as we mentioned previously, the commit tree is immutable. Sometimes we might operate on the Git repo and leave it in an undesirable place from which we will want to backtrack. It is important to remember that the old state is still inside the tree, and all we have to do is change the references to the right commit address. Later we will show a few techniques to achieve this.

Git provides a special reference named HEAD that refers to the address of the state currently checked out in the working directory (see below).
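
As a rough illustration of references being labels for commit hashes, the sketch below builds a throwaway repo in a temporary directory and compares what several refs resolve to. The file name, user identity, and the tag name v0.1 are made up for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m 'first commit'

# A branch name and HEAD resolve to the same commit hash
branch_hash=$(git rev-parse master)
head_hash=$(git rev-parse HEAD)
# HEAD is itself a symbolic reference to the current branch
head_target=$(git symbolic-ref HEAD)

# Creating a branch or a tag only adds another label for an existing commit
git branch my_branch
git tag v0.1
echo "master:    $branch_hash"
echo "my_branch: $(git rev-parse my_branch)"
echo "v0.1:      $(git rev-parse v0.1)"
echo "HEAD ->    $head_target"
```

All three labels print the same hash, since none of them stores any content of its own.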

Git repo state

A Git repo is composed of three areas of state that are independent but related:

  • Working Directory: the files checked out on disk. When a Git repository is cloned, what the user sees in the new directory are the files that the repository contains.
  • Staging Index: provides an intermediate space where the user can add changes from the working directory, without adding them to the commit tree.
  • Commit Tree: when changes in the staging index are ready, they are added to the commit tree and given a hash address.
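
The three areas above can be observed with git diff, which compares the working directory with the staging index, and git diff --cached, which compares the staging index with the last commit. This is a minimal sketch in a throwaway repo; the file name and user identity are assumptions for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"

echo 'version 1' > notes.txt
git add notes.txt
git commit -q -m 'add notes'

# Edit the file: only the working directory differs from the last commit
echo 'version 2 of the notes' > notes.txt
unstaged_before=$(git diff --name-only)            # working directory vs. index
staged_before=$(git diff --cached --name-only)     # index vs. last commit

# Stage the edit: the change moves into the staging index
git add notes.txt
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --cached --name-only)

# Commit: the staged change becomes a new node in the commit tree
git commit -q -m 'update notes'
status_after_commit=$(git status --porcelain)

echo "before add: unstaged='$unstaged_before' staged='$staged_before'"
echo "after add:  unstaged='$unstaged_after' staged='$staged_after'"
echo "after commit: status='$status_after_commit'"
```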

Basic Git operations

Clone

Creates a local copy of a repo. It is important to highlight that the local copy is complete and independent of its source. Git supports various protocols and can copy from the local filesystem or from remote servers over the network:

git clone [<options>] <repo> [<dir>]

If [<dir>] is missing, Git will create a new directory in the current working directory with the same name as the repo. The repo name depends on the protocol being used.

Useful examples:

# Local filesystem clone
git clone /Path/To/Git/Repo/Dir

# Remote HTTPS clone from GitHub
git clone https://github.com/radiasoft/devops.git

# Remote SSH clone from GitHub
git clone git@github.com:radiasoft/devops.git

git-clone man page.

Branch

Branches provide a way to track different sets of changes on the same repo without having to deal with possible conflicts arising from concurrent modifications to the same areas in the repo. As we explained previously, a branch is only a reference within the repo that points at the latest commit that has occurred within a branch of the commit tree.

In our example repo, in the image above, we start with two branches, my_branch and master, both initially pointing to the same address (2d52a68). After changes in each branch occur separately we see they diverged into addresses 243742d and 04d25ed respectively.

The git branch command creates a new branch or lists the available branches.

Useful examples:

# Create a new branch called 'branch_name' pointing
# to the same address as HEAD
git branch branch_name

# List local branches
git branch

# List remote branches
git branch -r

# Delete branch named 'branch_name'
git branch -d branch_name

# Rename branch 'branch_name' to 'new_branch_name'
git branch -m branch_name new_branch_name

# Make the current branch track the 'branch_name' branch
# within the 'radiasoft' remote
git branch -u radiasoft/branch_name

git-branch man page.

Checkout

Changes the HEAD reference to point to a different address, updating the working directory accordingly.

💡 Checkout can also be used to undo changes in the working directory.

git checkout [<options>] <branch>

Useful examples:

# Checkout the latest commit within the master branch
git checkout master

# Checkout an address within the git repo, and label it 
# as a branch called 'new_branch_name'
git checkout -b new_branch_name 2d52a68

# Force a checkout, throwing away local modifications
git checkout -f master

# Discard unstaged changes in file README.md
git checkout -- path/to/README.md

# Revert README.md to the state in branch `my_branch`
git checkout my_branch -- path/to/README.md

git-checkout man page.

Staging

Adds changes from the working directory to the staging index. The git add command provides the interface to do staging.

Useful examples:

# Add a new file to be tracked, or add new changes to 
# an already tracked file to the staging index
git add path/to/file

# Add to the staging index all the changes to already
# tracked files
git add -u

git-add man page.

Commit

Store changes within the commit tree.

💡 Changes can come either from what is in the staging index or directly from the working directory, depending on how the command is invoked.

💡 Each commit requires a commit message to document the changes being recorded. If a commit message is not provided on the command line, Git will provide an editor where the commit message can be written.

Useful examples:

# Commit the staging index
git commit

# Commit the staging index and provide a commit message
git commit -m 'this is my commit message'

# Commit all changes in tracked files
git commit -a

# Commit changes within a specific file
git commit /path/to/file

git-commit man page.

Remote

Provides an alias for an external repository to track.

💡 When a repo is cloned, Git assigns the name origin to the source repo by default.

Useful examples:

# Add a remote repo in the same filesystem as 'fs_remote'
git remote add fs_remote /path/to/repo

# Add a github repo as 'radiasoft'
git remote add radiasoft git@github.com:radiasoft/devops.git

# Rename remote from 'old_name' to 'new_name'
git remote rename old_name new_name

# Remove remote 'to_remove'
git remote remove to_remove

# List registered remotes with their URIs
git remote -v

git-remote man page.

Fetch

Synchronizes the local commit tree with a remote repository.

💡 Git is a Distributed VCS (DVCS), which means each copy is complete and independent of the others. Fetching retrieves any new changes contained in the remote repo and makes them available within the local commit tree.

💡 When a remote is fetched, all of its branches become available locally under the naming scheme remote_name/remote_branch_name.
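
The naming scheme can be seen with two throwaway repos on the local filesystem. In this sketch the directory names upstream and local are made up, and the remote is called origin because the second repo is created with git clone. Note that fetching moves origin/master but leaves the local master untouched.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo a > a.txt
git add a.txt
git commit -q -m 'add a'

# Clone over the local filesystem; the source becomes the remote 'origin'
git clone -q "$work/upstream" "$work/local"

# Record a new commit in the source repo after the clone was made
echo b > b.txt
git add b.txt
git commit -q -m 'add b'
upstream_hash=$(git rev-parse master)

cd "$work/local"
before=$(git rev-parse origin/master)     # what we knew about the remote
git fetch -q origin
after=$(git rev-parse origin/master)      # updated by the fetch
local_master=$(git rev-parse master)      # not moved by the fetch

echo "origin/master before fetch: $before"
echo "origin/master after fetch:  $after"
echo "local master:               $local_master"
```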

Useful examples:

# Fetch remote named 'radiasoft'
git fetch radiasoft

# Fetch all remotes
git fetch --all

# Fetch tags from remote
git fetch --tags

git-fetch man page.

Merge

Creates a new commit in the local tree that is the product of joining the state of HEAD and the state pointed to by another address or reference in the tree.

⚠️ Git will use some logic to try to complete the merge automatically, but sometimes the changes can confuse this logic. It is up to the user to fix the conflicts and complete the merge manually.

Useful examples:

# Merge local branch named 'my_branch' to `HEAD`
git merge my_branch

# Merge branch named 'their_branch' from remote
# `radiasoft` to `HEAD`
git merge radiasoft/their_branch

git-merge man page.

Pull

Performs a fetch and merge in one step.

Useful examples:

# Pull the remote tracking branch into the current branch
git pull

# Pull the 'branch_name' branch from the 'radiasoft'
# remote into the current branch
git pull radiasoft branch_name

git-pull man page.

Push

Send changes from the local branch to a remote repository.

Useful examples:

# Push to the remote tracking branch
git push

# Push and set as tracking branch
git push -u radiasoft destination_branch

# Push to the 'radiasoft' remote the contents of
# local branch 'my_branch' into the remote branch
# 'their_branch'
git push radiasoft my_branch:their_branch

# Delete remote branch named 'their_branch'
git push radiasoft :their_branch

git-push man page.

Example Workflows

See the following links for some example operations:

Commit Messages

Make sure you check in a "subject" line of up to 50 characters. If you need more space to describe the check-in, follow the subject with a blank line and then the details. Don't just write one long line. On the command line you can hit return inside the quoted message, e.g.

git commit -am 'sometimes amazon screws up contents

Had to check numberInner to see if it exists'

This will show up as this in GitHub:

sometimes amazon screws up contents  …
robnagler committed 4 minutes ago
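
To check how Git splits a message into subject and body, the sketch below commits in a throwaway repo and reads the two parts back with git log format placeholders (%s for the subject, %b for the body). The repo, file name, and user identity are assumptions for the example; the message text is the one used above, with a blank line between subject and body.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"

echo data > file.txt
git add file.txt
# Subject line, blank line, then the longer description
git commit -q -m 'sometimes amazon screws up contents

Had to check numberInner to see if it exists'

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "subject (${#subject} chars): $subject"
echo "body: $body"
```

Without the blank line, Git treats both lines as a single subject paragraph, which is one reason to keep the separator.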

Latest git on CentOS

rpm -U http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
perl -pi.bak -e '!$first && s/(?<=enabled = )0/1/ && $first++' /etc/yum.repos.d/rpmforge.repo
yum install -y git

Merging a branch into master

Branch xyz needs to be merged into the master, and there are no conflicts, that is, xyz is ahead of master and not "behind":

$ git checkout xyz
$ git checkout master
$ git merge xyz master
$ git push
$ git branch -d xyz
$ git push origin --delete xyz

If there are conflicts, you'll need to do more work:


$ git checkout xyz
$ git checkout master
$ git merge xyz master
Automated merge did not work.
$ git checkout xyz
# fix the conflicts so a merge will succeed
$ git commit -am 'fix merge conflicts from xyz to master'
$ git push
$ git merge xyz master
Trying simple merge with xyz
Already up-to-date with master
Merge made by octopus.
$ git push
$ git branch -d xyz
$ git push origin --delete xyz

When you’ve prepared the index you want, use git commit to store it as a new commit. Use git status first to check the files involved, and git diff --cached to check the actual changes you’re applying. git diff alone shows any remaining unstaged changes (the difference between your working tree and the index); adding --cached (or the synonym --staged) shows the difference between the index and the last commit instead (i.e., the changes you’re about to make with this commit).

git reset --patch

With git reset --patch you can be even more specific, interactively selecting portions of your staged changes to unstage; it is the reverse of git add -p. See “Discarding Any Number of Commits” for other options.

To check a partial commit before you make it, you can combine staging with git stash:

  1. Use git add (with various options) to stage a subset of your changes.
  2. Run git stash --keep-index. This saves and undoes your outstanding, unstaged changes while leaving your staged changes in the index alone, resetting your working tree to match the index.
  3. Examine this working tree state to make sure your selection of changes makes sense; build and test your software, for example.
  4. Run git commit.
  5. Now, use git stash pop to restore your remaining unstaged changes, and go back to step 1. Continue this process until you’ve committed all your changes, as confirmed by git status reporting “nothing to commit, working directory clean.”
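
One pass through the steps above might look like the following sketch, run in a throwaway repo. The file names a.txt and b.txt and the user identity are made up; step 3 (build and test) is left as a comment.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m 'base'

# Two unrelated edits in the working directory
echo 'change a' >> a.txt
echo 'change b' >> b.txt

git add a.txt                      # step 1: stage only the change to a.txt
git stash -q --keep-index          # step 2: set aside the unstaged change to b.txt
b_during=$(cat b.txt)              # b.txt is back to its committed content
# step 3: build and test the working tree here
git commit -q -m 'change a'        # step 4: commit what was staged
git stash pop -q                   # step 5: bring back the change to b.txt

remaining=$(git diff --name-only)            # still unstaged
staged=$(git diff --cached --name-only)      # nothing left in the index
echo "b.txt while stashed: $b_during"
echo "still to commit: $remaining"
```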

git log --graph --oneline

*   f4dcb5b Merge remote branch 'origin/xyz' into xyz
|\
| *   2bc662f Merge remote branch 'origin/xyz' into xyz
| |\
| | * dfdac18 a new file
* | | f9d4689 a new file
|/ /
* | 7d482cf remove
* | 158f30e merge conflict
* | 3ddd434 cause conflict with xyz
|/
* 30f4b62 cleanup
* de74c31 fixed merge xyz to master
* 8224c26 fixed merge xyz to master

# Make newly created tracking branches rebase, rather than merge, on 'git pull'
git config --global branch.autosetuprebase always

  • http://documentup.com/skwp/git-workflows-book
  • http://chimera.labs.oreilly.com/books/1230000000561/index.html
  • http://scottchacon.com/2011/08/31/github-flow.html
  • http://git-scm.com/book/en/v2

Undoing

Unstaged changes

git checkout path/to/modified/file drops all the outstanding unstaged changes in a file and reverts it to the version in the staging index, which is the same as the version pointed to by HEAD when nothing has been staged for that file.

Staged changes

If you want to remove previously staged changes while still keeping the changed file intact, you can do git reset HEAD path/to/file.
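
The two undo levels described so far can be tried in a throwaway repo. This is a sketch with a made-up file name and user identity: it stages an edit, unstages it while keeping the edit on disk, and then discards the edit.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo original > file.txt
git add file.txt
git commit -q -m 'add file'

echo 'edited line' >> file.txt
git add file.txt
staged_before=$(git diff --cached --name-only)   # the edit is staged

# Unstage: the index goes back to HEAD, the working file keeps the edit
git reset -q HEAD -- file.txt
staged_after=$(git diff --cached --name-only)
unstaged_after=$(git diff --name-only)

# Discard: the working file is restored from the index
git checkout -- file.txt
content=$(cat file.txt)
echo "staged before reset: '$staged_before', after: '$staged_after'"
echo "file content after checkout: $content"
```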

Incomplete merges

git merge --abort allows you to drop an incomplete merge and return to the previous HEAD value before the merge started.
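
A minimal sketch of aborting a conflicted merge, assuming a throwaway repo in which master and a made-up branch my_branch both rewrite the same file. The merge is run in a form that tolerates failure, since a conflict makes git merge exit with a non-zero status.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo 'line' > file.txt
git add file.txt
git commit -q -m 'base'

# Make conflicting changes on two branches
git checkout -q -b my_branch
echo 'branch version of the line' > file.txt
git commit -q -am 'change on my_branch'
git checkout -q master
echo 'master version' > file.txt
git commit -q -am 'change on master'
before=$(git rev-parse HEAD)

# The merge stops with a conflict and leaves the repo mid-merge
if git merge my_branch >/dev/null 2>&1; then
    merge_result=clean
else
    merge_result=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)

# Drop the incomplete merge and return to the pre-merge state
git merge --abort
after=$(git rev-parse HEAD)
status=$(git status --porcelain)
echo "merge result: $merge_result, conflicted files: $conflicted"
echo "HEAD unchanged: $before -> $after"
```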

Undoing a commit at Working Directory level

git revert <reference or sha> will generate a new commit on top of the current HEAD that undoes the changes introduced by the given commit. The original commit stays in the history.
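
The sketch below, in a throwaway repo with a made-up file name and user identity, shows that git revert adds a commit rather than removing one. The --no-edit flag accepts the default message so that no editor is opened.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m 'first'
echo two >> file.txt
git commit -q -am 'second'
count_before=$(git rev-list --count HEAD)

# Undo the last commit by recording a new, opposite commit
git revert --no-edit HEAD >/dev/null
count_after=$(git rev-list --count HEAD)
content=$(cat file.txt)
subject=$(git log -1 --format=%s)
echo "commits: $count_before -> $count_after"
echo "file content: $content"
echo "new commit subject: $subject"
```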

Undoing a commit at a File level

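Changing a single file to its state in a different commit requires the following steps; the first updates the staging index and the second copies that version into the working directory: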

git reset <reference or sha> -- path/to/file
git checkout path/to/file
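
Here is the same two-step sequence in a throwaway repo, as a sketch with a made-up file name and user identity. HEAD does not move; only the one file goes back, and the difference shows up as a staged change ready to be committed.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
echo 'v1' > file.txt
git add file.txt
git commit -q -m 'v1'
old=$(git rev-parse HEAD)
echo 'v2 with more text' > file.txt
git commit -q -am 'v2'
head_before=$(git rev-parse HEAD)

# Step 1: put the old version of the file into the staging index
git reset -q "$old" -- file.txt
# Step 2: copy the indexed version into the working directory
git checkout -- file.txt

content=$(cat file.txt)
staged=$(git diff --cached --name-only)
head_after=$(git rev-parse HEAD)
echo "file content: $content"
echo "staged for the next commit: $staged"
```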

Advanced Topics

Reflog

When rewriting history, typically with git reset, we move reference pointers to a different commit address. When going back in history, Git moves HEAD to the new address and, from the point of view of the references, discards all the later changes.

For example, given a repository with the following structure, for branch master:

* 1d7261c new
*   2512b58 Merge branch 'branch'
|\
| * 8ecb709 Change a and c
* | f3b84b3 Change b and c
* |   44dfdec Merge branch 'branch'
|\ \
| |/
| * 153740e add c
* | 7d7d0a2 add b
|/
* 46de418 add a

If we perform a reset to return the repo to a previous commit:

git reset --hard HEAD~3
HEAD is now at 44dfdec Merge branch 'branch'

The repo will show the following tree for master:

*   44dfdec Merge branch 'branch'
|\
| * 153740e add c
* | 7d7d0a2 add b
|/
* 46de418 add a

Comparing with the previous graph, we can see that there is a lot of state missing. We have made HEAD for master point to 44dfdec.

💡 As we have mentioned previously, the Git commit tree is immutable, so while we cannot see the future state after we effectively traveled back in commit-time with git reset, the state is still stored within the repo.

If we need to recover some of that state, we can use git reflog, which lists the operations that moved HEAD in this repo, most recent first, along with the value of HEAD after each one:

git reflog
44dfdec HEAD@{0}: reset: moving to HEAD~3
1d7261c HEAD@{1}: reset: moving to 1d7261c
2512b58 HEAD@{2}: checkout: moving from f3b84b38591c2b51f79cf0dc354d024dbfb5d7d4 to master
f3b84b3 HEAD@{3}: checkout: moving from master to HEAD~1
2512b58 HEAD@{4}: reset: moving to HEAD~1
1d7261c HEAD@{5}: commit: new
2512b58 HEAD@{6}: commit (merge): Merge branch 'branch'
f3b84b3 HEAD@{7}: commit: Change b and c
44dfdec HEAD@{8}: checkout: moving from branch to master
8ecb709 HEAD@{9}: commit: Change a and c
153740e HEAD@{10}: checkout: moving from master to branch
44dfdec HEAD@{11}: checkout: moving from branch to master
153740e HEAD@{12}: reset: moving to HEAD~1
6d94462 HEAD@{13}: checkout: moving from master to branch
44dfdec HEAD@{14}: reset: moving to HEAD~1
dd78fce HEAD@{15}: checkout: moving from branch to master
6d94462 HEAD@{16}: commit: update c
153740e HEAD@{17}: checkout: moving from master to branch
dd78fce HEAD@{18}: commit: update c
44dfdec HEAD@{19}: checkout: moving from 7d7d0a244a58a95c4017c497199305011010a335 to master
7d7d0a2 HEAD@{20}: checkout: moving from master to HEAD~1
44dfdec HEAD@{21}: merge branch: Merge made by the 'recursive' strategy.
7d7d0a2 HEAD@{22}: checkout: moving from branch to master
153740e HEAD@{23}: commit: add c
46de418 HEAD@{24}: checkout: moving from master to branch
7d7d0a2 HEAD@{25}: commit: add b
46de418 HEAD@{26}: commit (initial): add a

We can see that HEAD@{0} in the reflog points to our current HEAD. The previous entry, HEAD@{1}, is the value of HEAD before the last command was issued; we can match its SHA address, 1d7261c, to the tip of the first graph in this section.

If we want to return to that point in commit-time, we can just issue git reset --hard 1d7261c (or, equivalently here, git reset --hard HEAD@{1}) and our current HEAD will go back to where it was initially.
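
The recovery can be reproduced on a small scale. The sketch below uses a throwaway repo with three made-up commits rather than the history shown above: it resets two commits back and then uses the reflog entry HEAD@{1} to return to the original tip.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity for the demo
git config user.email "user@example.com"
for n in 1 2 3; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done
tip=$(git rev-parse HEAD)

# Travel back in commit-time: two commits disappear from the branch
git reset -q --hard HEAD~2
count_after_reset=$(git rev-list --count HEAD)

# The reflog still remembers where HEAD was before the reset
previous=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$previous"
restored=$(git rev-parse HEAD)
echo "commits visible after reset: $count_after_reset"
echo "tip $tip restored as $restored"
```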

git-reflog man page.

Useful links

Vagrant, Windows, and Git

If you are running Vagrant with Linux (guest) on Windows (host) and you want to share your git folder, you will need to turn off filemode checking. If you don't, Git will see all files as modified, because shared files get permissions 777, always.

To fix this, you'll need to:

git config --global core.filemode false

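Cloning or initializing a repo can write a local core.filemode value that overrides the global one, so you'll also need to unset it in each of the shared repos: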

cd <some-repo>
git config --local --unset core.filemode

You could potentially just set filemode to false on each of the repos, but it's easier to set it globally in Vagrant.

Git only tracks the execute bit, so if you need to set it, you can do so explicitly:

git update-index --chmod=+x <some-file>
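
To confirm what was recorded, git ls-files -s prints the mode stored in the staging index for each file. This sketch uses a throwaway repo with filemode checking turned off, as in the setup above; the script name run.sh is made up.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config core.filemode false            # ignore permission bits on disk

echo '#!/bin/sh' > run.sh                 # hypothetical script for the demo
git add run.sh
mode_before=$(git ls-files -s run.sh | cut -d' ' -f1)

# Record the execute bit in the index without relying on the file system
git update-index --chmod=+x run.sh
mode_after=$(git ls-files -s run.sh | cut -d' ' -f1)
echo "mode in index: $mode_before -> $mode_after"
```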

Useful tools
