title: Version Control with Git


Who am I?

  • Jeroen Budts
  • PHP & Drupal Developer
  • At Inuits - An Open Source Consultancy Company
  • http://budts.be
  • Twitter: @teranex
  • Father

lotte-git

Overview

  • Basic Git usage
  • How Git stores its stuff
  • More advanced Git usage (branching, merging, rebasing, remotes, ...)
  • Some cool Git tools
  • Git and Drupal

What is Git?

"Git is a free & open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency."

  • Distributed
  • Open Source
  • Fast
  • Created by Linus Torvalds (Linux)

Basic Git Usage

First initialize the git repository:

git init

Then add files

git add -A
# or
git add .
# or
git add myfile.txt
git add myotherfile.rb

And commit the files

git commit

The Index (Staging Area)

The index contains the changes that will be added to your next commit. Your commit will /not/ contain the changes in your working directory, only the changes that were added to the index!

git-02

The Index (Staging Area)

If you add a file, or a part of a file, to the index, a copy is made of that file. When you commit, it is that copy which ends up in the commit.

Important: if you make changes to the file after adding it to the index, those changes will not end up in your commit, unless you add the file again to the index!
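
The behaviour described above can be checked in a throwaway repository. This is a minimal sketch: the temporary directory, the file name and its contents are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first version" > myfile.txt
git add myfile.txt                  # the index now holds a copy of "first version"
echo "second version" > myfile.txt  # modified again, but not added again

git commit -q -m "Add myfile.txt"

committed=$(git show HEAD:myfile.txt)  # the contents stored in the commit
working=$(cat myfile.txt)              # the contents of the working directory
echo "committed: $committed"
echo "working:   $working"
```

The commit contains the copy that was in the index, while the later change is still only in the working directory.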

git diff
# shows the changes in your working copy which are not yet in the index

git diff --cached
# shows the changes in your index (what will end up in your next commit)

If you modify files, you need to 'stage' them again, by running git add.

If you only modified files (not added new ones) you can skip staging with:

git commit -a

You can also add only parts of a modified file:

git add -p myfile.txt

The Index (Staging Area)

git status # show the status of your repo

git-18

Viewing the history

git log

git-13

Viewing the history

git log has many options. The following:

git log --graph --abbrev-commit --date=relative \
    --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset'

Will give you this, which gives a lot more information

git-11

Tip: add this as an alias to your global Git config (~/.gitconfig). Keep the alias on a single line:

[alias]
    l = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative

Undoing changes

Checkout

Check out a single file. Notice the double dash: git checkout is also used for other things (such as switching branches), so the -- makes it clear to Git that what follows is a file path.

git checkout -- path/to/file       # restore the version from the index
git checkout HEAD -- path/to/file  # restore the latest committed version

Reset

Reset your branch to a commit. With --hard, all the changes in your index and working copy are thrown away as well.

git reset [--hard] [commit-sha]

Revert

Create a new commit which reverses the changes of an existing commit. This is very useful if you have already pushed the commit, because the existing history is not rewritten.

git revert [commit-sha]
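
A small sketch of a revert in a scratch repository. The file name, contents and commit messages are invented for the example; --no-edit only skips the editor for the commit message.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "good" > file.txt
git add file.txt
git commit -q -m "Add file"

echo "oops" >> file.txt
git commit -q -a -m "Break file"

# Add a third commit which undoes the second one
git revert --no-edit HEAD > /dev/null

count=$(git rev-list --count HEAD)  # number of commits in the history
content=$(cat file.txt)
echo "commits: $count, content: $content"
```

The broken commit stays in the history; the revert is simply a new commit on top of it.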

Rebase

We'll discuss this later

How Git stores your data

Knowing how Git stores all your data will give you a better understanding of the fundamentals, and Git will become a lot more predictable.

The most important point to remember:

Basically a Git repository is a database of objects

How Git stores your data

The most important types of objects:

  • blobs
  • trees
  • commits

Let's study those in a bit more detail

Blobs

Git stores the contents of a file in a 'blob'.

  • does not contain any meta data
  • a blob never changes
  • a hash (SHA-1) calculated from the contents is used as the name of the blob
  • the hash will always be the same for the same contents

some examples:

$ git hash-object hello.txt
ce6c1fd146f65c899e6b10e46c89097c644e3229

$ git hash-object say-hi.rb
a8784b043f12b4b0c9114c55ebf33f5c9b44ce8f
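
To see that the hash depends only on the contents and not on the file name, something like the following can be tried. The file names and contents are hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q

printf 'hello world\n' > a.txt
printf 'hello world\n' > copy-of-a.txt   # same contents, different name
printf 'something else\n' > b.txt

hash_a=$(git hash-object a.txt)
hash_copy=$(git hash-object copy-of-a.txt)
hash_b=$(git hash-object b.txt)

echo "a.txt:         $hash_a"
echo "copy-of-a.txt: $hash_copy"
echo "b.txt:         $hash_b"
```

The first two hashes are identical, the third one differs.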

Trees

A tree is like a directory

  • Can contain references to 1 or more blobs
  • Can contain references to 0, 1 or more other trees
  • Contains the meta data about the files (such as the file name and mode)
  • Itself also identified by a hash
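
A tree can be inspected with git ls-tree. The sketch below assumes a scratch repository with one file and one subdirectory; all names are made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir lib
echo "hi" > hello.txt
echo "puts 'hi'" > lib/say-hi.rb
git add .
git commit -q -m "Add files"

# Each line of the top-level tree: <mode> <type> <hash> <name>
git ls-tree HEAD

file_type=$(git cat-file -t HEAD:hello.txt)  # object type behind the file entry
dir_type=$(git cat-file -t HEAD:lib)         # object type behind the directory entry
echo "hello.txt is a $file_type, lib is a $dir_type"
```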

git-03

Commits

If you think about Git, think about commits!

  • Contains a reference to one tree object
  • Contains references to one or more other commits
    • one exception: the very first commit in the repo
  • A commit usually points to its parent commit
  • In case of a merge, it can point to two or more commits (one commit for each branch which you merged together)
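
The raw contents of a commit object can be printed with git cat-file -p. This is a sketch with two empty example commits (the messages are invented); it shows the tree reference and the parent reference.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"

# Prints the tree, the parent, the author, the committer and the message
git cat-file -p HEAD

tree_line=$(git cat-file -p HEAD | sed -n '1p')
parents_second=$(git cat-file -p HEAD | grep -c '^parent ')
parents_first=$(git cat-file -p HEAD^ | grep -c '^parent ' || true)
echo "second commit has $parents_second parent(s), first commit has $parents_first"
```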

Commits

git-04

Did I mention that you should think in terms of commits, when working with Git?

If you understand commits, you basically understand Git.

Git References or "refs"

Because a commit hash is very difficult to remember and not really useful to work with, Git uses references to point to specific commits.

HEAD

One such reference is HEAD

  • It points to the commit which is currently checked out into your working directory

master

Another important reference is master

  • master is the 'default' branch in Git
  • when working on the master branch, the master reference and the HEAD reference point to the same commit

special

  • HEAD^ & HEAD^^: The commit before HEAD, two commits before HEAD
  • master~7: 7 commits before master reference
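
References can be resolved to commit hashes with git rev-parse. A minimal sketch, assuming a scratch repository with three empty commits on a branch that is explicitly named master:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "C1"
git commit -q --allow-empty -m "C2"
git commit -q --allow-empty -m "C3"

head=$(git rev-parse HEAD)
master=$(git rev-parse master)
parent=$(git rev-parse HEAD^)        # C2
grandparent=$(git rev-parse HEAD^^)  # C1
tilde2=$(git rev-parse master~2)     # also C1

echo "HEAD:     $head"
echo "master:   $master"
echo "HEAD^^:   $grandparent"
echo "master~2: $tilde2"
```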

And that brings us to...

Branches

Branches in Git are basically just references to commits

  • very easy to create
  • once you get the hang of it, you will probably create branches for almost every feature you want to implement
  • a branch has a name
  • this name is also the reference to the most recent commit for that branch
  • a commit can be shared by branches
  • The point where a branch is split off from another branch can be found by following the parents of all the commits until you find a common commit
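
Git can look up that common commit for you with git merge-base. The sketch below uses a scratch repository; the branch names master and mywork and the commit messages are only examples.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "C1"
git commit -q --allow-empty -m "C2"
fork=$(git rev-parse HEAD)            # remember where the branch is split off

git checkout -q -b mywork
git commit -q --allow-empty -m "C3 on mywork"

git checkout -q master
git commit -q --allow-empty -m "C4 on master"

base=$(git merge-base master mywork)  # the most recent common commit
echo "split off at: $fork"
echo "merge base:   $base"
```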

Branches

Creating a branch and switching to it is easy:

git branch mywork

git checkout mywork

The two previous commands can be combined:

git checkout -b mywork

This will create a branch and immediately check it out.

Once you have merged your work on a specific feature back into your master branch you can delete your feature branch:

git branch -d mywork

Branches

  • We are working on a branch named 'origin'
  • At commit C2 we decide to split off a new branch named 'mywork'.
  • Both branches originate from commit C2; commit C3 and C5 have the same parent

git-05

Merging

When you have been working on a (feature) branch for a while, you will probably want to merge the branches back together.

# to merge the origin branch back into your mywork branch (to bring it up to date)
# checkout the target branch
git checkout mywork
# merge the branch into the current branch
git merge origin

Now your repository looks like this (notice that commit C7 has two parents):

git-06

Merging

When merging two or more branches there are two possibilities:

  • Merge commit
  • Fast Forward

Merge commit

  • When both branches have new commits a merge commit is created.

  • Git automatically proposes a commit message:

    Merge branch 'mywork' into master

  • The commit has two or more parents
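
A sketch of a merge which needs a merge commit, in a scratch repository. The branch names, files and messages are invented; --no-edit accepts the proposed commit message without opening an editor.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "C1"

git checkout -q -b mywork
echo "work" > work.txt
git add work.txt
git commit -q -m "Work on feature"

git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "Work on master"

# Both branches have new commits, so Git has to create a merge commit
git merge -q --no-edit mywork

parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "the merge commit has $parents parents"
```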

Merging

Fast Forward

  • When the target branch does not have new commits
  • No merge commit is created
  • In fact nothing much happens
  • Except: the reference for the target branch is simply changed to point to the same commit as the source branch
  • This is the ideal situation: a fast-forward can never cause a merge conflict
  • Obviously you are not always this lucky (→ rebasing)
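
A fast-forward can be observed by comparing the branch references before and after the merge. Again a sketch in a scratch repository, with example branch names and empty commits:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "C1"

git checkout -q -b mywork
git commit -q --allow-empty -m "C2"
git commit -q --allow-empty -m "C3"

git checkout -q master              # master has no new commits of its own
before=$(git rev-parse master)
git merge -q mywork                 # nothing to merge, the reference just moves
after=$(git rev-parse master)
source_tip=$(git rev-parse mywork)

merges=$(git rev-list --merges --count HEAD)  # number of merge commits
echo "master moved from $before to $after"
```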

Merging

Fast Forward

git-07-before

Merge the 'mywork' branch into origin:

git-07-after

The origin branch is simply fast-forwarded

Rebasing

Instead of merging (with merge commits) you can also rebase (so you can then fast forward)

Some people will tell you that this is very harmful, it can break your repository and destroy the universe.
This is NOT TRUE. (At least if you know what you are doing)

So what is 'rebasing'?

By rebasing your commits you can actually rewrite your history:

  • Edit a commit message
  • Add missing files to a commit
  • Reorder your commits
  • Modify the parent of a commit
  • Merge a few commits together into a single commit
  • Delete commits from the history
  • ...
  • And break your repository if you want :)

Rebasing

How not to break your repo?

Do not rebase commits which have already been pushed to other people

Do not rebase commits which have already been pushed to other people

Do not rebase commits which have already been pushed to other people

(I'll explain how to push later)

Each commit which is rebased will get a new, different hash. People (and Git) who pull this new hash will get confused.

If you do rebase a commit which was already pushed, Git will refuse to push the rewritten commit, unless you use the --force option.

Rebasing

Amending changes

  • The easiest and 'safest' kind of rebase
  • Only possible for the most recent commit
  • Lets you add missing files and modify the commit message

After modifying your index again:

git commit --amend
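
The following sketch shows that an amended commit replaces the old one and gets a new hash. The file names and messages are hypothetical; -m is only used here to avoid opening an editor.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
old=$(git rev-parse HEAD)

# Oops, a file was forgotten: stage it and amend the last commit
echo "two" > forgotten.txt
git add forgotten.txt
git commit -q --amend -m "Add a.txt and forgotten.txt"
new=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)
echo "old hash: $old"
echo "new hash: $new"
echo "commits:  $count"
```

There is still only one commit, but its hash has changed, which is why amending a pushed commit causes the same trouble as any other rebase.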

Rebasing

... on another branch

This is an alternative approach to merging with a merge commit.

Let's reuse the example:

git-05

Now you want to merge 'mywork' into 'origin' without creating a merge commit

Rebasing

... on another branch

What we did before:

git checkout origin
git merge mywork

What we will do now:

# on the mywork branch
git rebase origin
# fix any merge conflicts
git checkout origin
git merge mywork
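
The rebase-then-merge sequence can be tried out in a scratch repository. This sketch reuses the branch names 'origin' and 'mywork' from the example (both are plain local branches here); the files and commit messages are invented.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/origin   # a local branch that is named 'origin'
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "C1"

git checkout -q -b mywork
echo "work" > work.txt
git add work.txt
git commit -q -m "Commit on mywork"

git checkout -q origin
echo "other" > other.txt
git add other.txt
git commit -q -m "Commit on origin"

git checkout -q mywork
git rebase -q origin      # replay the mywork commit on top of origin
git checkout -q origin
git merge -q mywork       # this is now a fast-forward

merges=$(git rev-list --merges --count HEAD)
total=$(git rev-list --count HEAD)
echo "commits: $total, merge commits: $merges"
```

The history ends up linear: three commits and no merge commit.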

Rebasing

... on another branch

# on the mywork branch
git rebase origin

git-08

Rebasing

... on another branch

git-09

Rebasing

... on another branch

git-10

Rebasing

Interactively

With interactive rebasing you can really rewrite history the way you want it to be. ...And break your repository.

Rebase the commits since a given commit:

git rebase --interactive <commit-hash>

Suppose we have the following commits

git-11

Rebasing

Interactively

To rebase the most recent 3 commits:

git rebase --interactive 4efd195

git-12
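
Normally Git opens the list of commits in your editor. To make the idea reproducible without an editor, the sketch below sets the GIT_SEQUENCE_EDITOR environment variable to a sed command which changes the second line of the list from pick to fixup. This scripted approach is an assumption made for the demo, not how you would usually work; the file and the commit messages are invented.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "start" > feature.txt
git add feature.txt
git commit -q -m "Initial commit"

echo "feature" >> feature.txt
git commit -q -a -m "Add feature"

echo "typo fixed" >> feature.txt
git commit -q -a -m "Fix typo in feature"

# Rebase the last two commits; sed edits the todo list instead of a human:
# line 1 stays "pick", line 2 becomes "fixup" (melt into the previous commit)
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2

count=$(git rev-list --count HEAD)
subject=$(git log -1 --pretty=format:%s)
echo "commits: $count, last commit: $subject"
```

The typo fix is folded into the "Add feature" commit, so only two commits remain.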

Working with remotes

To share your work with other people you have a few options. One of these is to work with remotes.

The easiest way to set this up is to clone an existing repository instead of initializing your own repo.

git clone http://git.drupal.org/project/drupal.git

git-14

This will set up everything for you

Working with remotes

Manually adding remotes

Sometimes you will want to add a remote

  • Because you had already created the repository
  • Or maybe because you want to add one or more additional remotes

This is done with the git remote command

git remote add github git@github.com:teranex/git-talk.git

This will add my GitHub repository for this presentation as a remote with the name github

Working with remotes

Pulling and pushing

When you have cloned the repository:

git pull

This will pull in the changes from the current branch on the origin

git push

Will push your changes to the origin

However: This will only work for 'tracking' branches.

Working with remotes

Remote Branches

It is important to think about remote branches as just branches

By default, Git does not know, nor care, about relationships between branches!

  • local branch: master

  • remote ('github') branch: github/master

  • Git just sees two branches

    git pull github master
    git push github master

  • By default, no local branches are created for remote branches

Working with remotes

Remote Branches

You can get a good overview of all your local and remote branches and how they are tracking with: git branch -avv

img-19

Working with remotes

Tracking branches

To create a local branch based on a remote branch:

git checkout --track -b mywork github/mywork

To link an existing local branch (the one which is currently checked out) to a remote branch:

git branch --set-upstream-to=github/mywork

You can verify this in the git config file (.git/config) in your repository:

git-15

Working with remotes

Merging while pulling

To better understand pulling, let's see what actually happens. Instead of using git pull, you can also do (while on the master branch):

# pull in the new objects
git fetch github

# merge the remote branch with the local branch
git merge github/master

This works exactly the same as merging two local branches!
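
The two steps can be followed in a sketch where the 'remote' is just another directory on disk. The directory names upstream and local are made up; the remote is called github to match the commands above.

```shell
set -e
cd "$(mktemp -d)"

# A repository which plays the role of the remote
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "C1"
cd ..

# Clone it, naming the remote 'github' instead of the default 'origin'
git clone -q -o github upstream local

# Somebody adds a commit to the remote repository
cd upstream
echo "two" >> file.txt
git commit -q -a -m "C2"
cd ../local

before=$(git rev-parse master)
git fetch -q github                       # step 1: download the new objects
remote_tip=$(git rev-parse github/master)
git merge -q github/master                # step 2: merge the remote branch
after=$(git rev-parse master)

echo "before: $before"
echo "after:  $after"
```

After the fetch only github/master has moved; the local master follows after the merge.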

Working with remotes

Avoiding useless merge commits

Merging a remote branch in your local branch can create a useless merge commit:

git-16

You can avoid this by rebasing instead of merging:

git pull --rebase

or, if you want to do it manually:

git fetch github
git rebase github/master

Some really cool tools

  • Gitk: Part of the official Git distro, but it's UGLY UGLY UGLY
  • Giggle & GitX: better looking, for Gnome & Mac

git-17

Some really cool tools

  • Fugitive and gitv plugins for Vim

tools-2

Some really cool tools

  • tig: command-line tool. Also very useful to stage/unstage files
  • meld: merge tool for Linux, has support for Git
  • bash prompt information
    Include $(__git_ps1 ' %s ') in your $PS1

tools-3

Git and Subversion (and friends)

Git has plugins available to migrate from and/or integrate into other versioning systems as well. One such plugin is git-svn.

With git-svn you can:

  • migrate an existing subversion repository to a git repository, including all the history.
  • use Git locally to do your work, but push to a central Subversion server.

Git and Subversion

To locally use Git and push to a central Subversion server:

First 'clone' the Subversion repository into a local Git repo

git svn clone -s http://svn.example.com/myproject
# the -s means the subversion repo has a standard layout (trunk/ etc)

Now you can work as usual with your Git repository. Except... instead of running git pull to get the changes from other people, you now do:

git svn rebase

And you don't do git push, but:

git svn dcommit

Git Flow

git-20

Git Flow

To use the 'Git Flow' branching strategy a Git plugin is available: git-flow. This plugin makes it really easy to follow Git Flow:

First initialize it to configure the branch names (I recommend using the defaults):

git flow init

To start a feature branch:

git flow feature start my-exciting-feature

To publish the feature (push it to the remote):

git flow feature publish my-exciting-feature

To finalize the branch:

git flow feature finish my-exciting-feature

Working with submodules

In a Git repository you can 'link' other repositories to subdirectories. This can be useful, for example when using external libraries or when building your Vim configuration, to pull in all the plugins.

But submodules do not really work the way you would expect them to work, at first:

  • submodules must be registered in the repository
  • submodules must be pulled-in separately for each repo.
  • submodules are not updated automatically
  • submodules are not pushed automatically
  • a specific commit is referenced in the parent repo

Working with submodules

First you need to add a repository as a submodule

git submodule add git://path/to/repo path/in/repo

This will modify the .gitmodules file, which is part of the parent repository, and record the exact commit which is checked out in the submodule. Now you can commit, push, etc.

When another repository pulls in the change the following steps are required:

git submodule init
# This will update the .git/config file for that repository
# and register the submodule in the repository

git submodule update
# This will check-out all the submodules to the correct commit

To update you can checkout another commit in the submodule (for example by pulling) and commit the reference to this new commit in the parent repository.

To update all the modules, you can use something like:

git submodule foreach git pull origin master

Git Bisect

Sometimes somebody will introduce bugs into your software or break previously working features. Who knows, maybe even you! While trying to find the source of the problem, it can be useful to know which commit exactly introduced the troubles in paradise. That is exactly what Git bisect is for.

  • Inform Git about one 'good' commit
  • And about one 'bad' commit
  • Git will checkout a commit for you
  • Verify that commit
  • Tell Git whether that commit has the problem ('bad'), or not ('good')
  • Finally the 'bad' commit will be found

Git Bisect

Let's say you know the feature was already broken in the previous commit and 10 commits ago the feature still worked.

Start Git Bisect and inform Git about this:

git bisect start
git bisect bad HEAD^
git bisect good HEAD~10

Git will checkout the commit in the middle, so you can test:

Bisecting: 5 revisions left to test after this (roughly 3 steps)

After testing you inform Git about the result

git bisect good # when the feature worked
git bisect bad  # when the feature is b0rken

After you have found the bad commit, reset your repository:

git bisect reset

If you write a script which can verify each commit, you can let Git run it for every commit!

Git Stash

Sometimes you will want to reset your working copy to a clean state, without committing your work, nor deleting it. In this case you can stash your changes:

git stash
git stash save "something fancy i was working on"

This will create something like a commit and hard-reset your working copy.

Later you can retrieve your work:

git stash list  # list all your stashes
git stash pop   # re-apply the latest stash, this will also delete the stash!
git stash apply # like pop, but do not delete the stash
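
A short sketch of the stash round trip in a scratch repository (the file name and contents are examples):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Add file"

echo "work in progress" >> file.txt   # an uncommitted change

git stash > /dev/null                 # put the change away
clean=$(git status --porcelain)       # empty when the working copy is clean
stashes=$(git stash list | wc -l | tr -d ' ')

git stash pop > /dev/null             # bring the change back, drop the stash
restored=$(tail -n 1 file.txt)
left=$(git stash list | wc -l | tr -d ' ')

echo "stashes while stashed: $stashes, after pop: $left"
echo "restored line: $restored"
```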

I often use the stash when I want to git pull --rebase, while I have uncommitted changes (git will refuse to do it in that case):

git stash
git pull --rebase
git stash pop

Git and Drupal

For the Drupal infrastructure (update status, testing bot, etc.) it is important to follow these conventions:

Branch names

  • 7.x
  • 7.x-1.x
  • 7.x-2.x

Development releases will be created twice a day (7.x-1.x-dev)

Tags

  • 7.x-1.3-beta6: beta 6 release
  • 7.x-3.1: final release
  • valid: unstable, alpha, beta, rc

You can create other branches for features etc.

If you don't have commit access on drupal.org, you can contribute by creating Git patches (with git diff or git format-patch)

Suggested reading and resources

Thanks!

This presentation is licensed under the Creative Commons Attribution-Non Commercial-Share Alike 3.0 license

This presentation is available on github: https://github.com/teranex/git-talk

Sources of images and inspiration: