Dive into git

21 Apr. 2024

Git was created in 2005 by Linus Torvalds and developed with the Linux community. It is a distributed version control system, like Mercurial or Darcs (and unlike Subversion (svn), which is centralized).

Distributed what?

One of the problems we face when working in a team on a project is how to coordinate, exchange information and keep up to date with the code we’re working on together. This is where a Version Control System (VCS) comes in. Older systems such as CVS or Subversion are centralized: a single server hosts a shared repository, and the versioning lives on that server (GNU RCS, older still, only versions files on one machine). However, keeping the whole history in one place is rather dangerous if you are not cautious enough: you could lose everything in a snap.

A Distributed Version Control System (DVCS) fixes this issue: the full history is on the server but also locally, on each person’s machine. Git is a DVCS, and that is one of its strong points.

Integrity

Another strong point of Git is its integrity. Git identifies everything it stores by a checksum: a 40-character string of hexadecimal characters, calculated with the SHA-1 hash function from the contents of a file or directory structure. Because the checksum changes as soon as the content changes, Git can detect any corruption or tampering.
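To make the checksum idea concrete, here is a small sketch that asks Git to hash some content directly. It assumes git is installed and uses the default SHA-1 object format; the sample strings are mine, not from any real project.

```shell
# Hash the bytes "hello\n" the way Git would hash a file with that content.
# --stdin only computes the checksum; nothing is written to a repository.
checksum=$(printf 'hello\n' | git hash-object --stdin)
echo "$checksum"        # the checksum Git would give this content
echo "${#checksum}"     # its length in characters

# Change a single character and the checksum is completely different.
other=$(printf 'hellO\n' | git hash-object --stdin)
echo "$other"
```

The same content always gives the same checksum, which is how Git notices when a stored object no longer matches its name.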

Start me up

Here are the useful git commands to start your project (or retrieve an existing one):

git config # read or set Git configuration (see the help page for more details)
git config --global user.name "[name]"
git config --global user.email "[email]"
git config --global credential.helper cache # keep credentials in memory for a while
git init # initialize a Git repository in your project
git clone <url> # clone a remote repository locally

Tree of life

As written in the Git book:

Nearly every VCS has some form of branching support. Branching means you diverge from the main line of development and continue to do work without messing with that main line.

By default, a branch called master (or main) is created (when you run git init). It represents the main line of development.

git branch myBranch # create a branch
git checkout myBranch # switch to the new branch
git checkout -b myBranch # create the branch and switch to it
# Since Git version 2.23
git switch myBranch # switch to the new branch
git switch - # switch to the previous branch
git switch -c new-branch # create the branch and switch to it

How does Git know what branch you’re currently on? With a pointer called HEAD.

git log --oneline --decorate # show where the HEAD pointer is
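Here is a minimal sketch of how HEAD follows you around. It runs in a throwaway repository created with mktemp; the identity "Demo User" is a placeholder I made up so that the commit works on any machine.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

git commit -q --allow-empty -m "first commit"
start=$(git symbolic-ref --short HEAD)     # default branch: master or main

git branch myBranch                        # creating a branch does not move HEAD
after_create=$(git symbolic-ref --short HEAD)

git checkout -q myBranch                   # switching does
after_switch=$(git symbolic-ref --short HEAD)

echo "$start -> $after_create -> $after_switch"
git log --oneline --decorate               # HEAD is shown next to the current branch
```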

Commit

A commit is a kind of snapshot of the project at a given time T.

Here is the structure of a commit:

size
tree (checksum pointing to the tree)
parent (checksum pointing to the parent commit, absent for the very first commit)
author
committer (may sometimes differ from the author)
commit message

To see the commit structure in detail:

git cat-file -p <commit>

By the way, if you check the structure of the tree, you’ll see this:

tree size
blob (checksum)
blob (checksum)
...

where blob represents a committed file.
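To see these structures for real, the sketch below creates a one-file repository and inspects the commit and its tree. The file name file1 and the demo identity are assumptions chosen for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "hello" > file1
git add file1
git commit -q -m "add file1"

# The commit: tree, author, committer, then the message.
# There is no parent line because this is the first commit.
git cat-file -p HEAD

# The tree: one line per entry with mode, type, checksum and name.
git cat-file -p 'HEAD^{tree}'

commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_id=$(git rev-parse HEAD:file1)        # checksum of the blob behind file1
echo "$commit_type $tree_type $blob_id"
```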

To me, git commit is the most important git command. But before committing, there are a few more steps…

When you modify a file, it goes through 3 states:

modified

Git notices that you have changed a tracked file and marks it as modified.

You can see the changes with the following command: git diff

If you’re sure you want to keep these changes, you first need to add them to the next commit: git add <file>

staged

Once it is added, the file is in the staged state.

You can commit it (git commit <file>) or, if you change your mind, unstage it with git reset <file>.

committed

Once your file is committed, the change is saved in your local repository. You can then push your commits to the remote repository with the command git push <shortname> <branch> (see the Working with Remotes section for more details).

Long story short:

git status # status of the branch (modified files pending, etc.)
git add <file> # add file to your next commit (stage)
git add --all # add all files
git reset <file> # unstage file
git reset -- <file> # unstage file (alt)
git checkout -- <file> # discard changes
git restore <file> # discard changes (alt)
git diff # diff for unstaged files
git diff --staged # diff for staged files
git commit <file> # commit file
git commit -m "commit message" # commit staged files with a commit message
git push <shortname> <branch> # push the branch to the remote repository
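The three states are easy to watch with git status --porcelain, which prints a short two-letter code per file. This is only a sketch in a temporary repository; notes.txt and the demo identity are names I picked for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "v2" > notes.txt                      # modified: changed but not staged
modified=$(git status --porcelain)

git add notes.txt                          # staged: will be part of the next commit
staged=$(git status --porcelain)

git commit -q -m "update notes"            # committed: nothing left to report
clean=$(git status --porcelain)

printf 'modified: [%s]\nstaged:   [%s]\nclean:    [%s]\n' "$modified" "$staged" "$clean"
```

In the porcelain output, the first column describes the staging area and the second the working directory, so the M moves from the second column to the first when you run git add.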

By the way, you can create an alias to unstage easily:

git config --global alias.unstage 'reset --'
# with this alias you'll be able to do:
git unstage <file>

A bit of gardening with the branches…

git branch # list the branches
git branch -a # list remote and local branches
git fetch # fetch down all the branches from the remote
git pull # fetch and merge any commits from the tracking remote branch
git merge <branch> # merge the given branch into your current branch
git merge --abort # abort a merge in progress (e.g. after conflicts)
git rebase <branch> # replay your current branch's commits on top of the given branch
git push <shortname> <branch> # push the branch to the remote repository
git branch -d <branch> # delete branch LOCALLY
git push origin --delete <branch> # delete branch REMOTELY
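As an illustration of the merge part, here is a sketch where two branches diverge and are merged back together. The branch name feature, the file names and the demo identity are assumptions; everything happens in a temporary repository.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature                 # work on a separate branch
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "add feature"

git checkout -q "$main_branch"             # meanwhile, the main line moves too
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "add hotfix"

git merge -q --no-edit feature             # histories diverged: Git creates a merge commit
words=$(git rev-list --parents -n 1 HEAD | wc -w)   # the commit plus its parents
git branch -d feature                      # safe to delete: it is fully merged
git log --oneline --graph
```

If the main line had not moved in the meantime, Git would simply have fast-forwarded the branch pointer instead of creating a merge commit.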

Hey, I want to merge, but I don’t want to commit my changes before the merge! Well, you can temporarily save your changes before doing any merge…

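git stash # save your modified tracked files and clean the working directory
git stash pop # reapply the top of the stash stack and remove it from the stack
# other commands:
git stash drop # discard the top entry of the stash stack
git stash clear # remove every stash entry (pretty obvious, innit?)
git stash list # list the stash entries (so I'll let you guess this one, eh)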
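A quick sketch of the stash round trip, again in a temporary repository with made-up names (notes.txt, Demo User):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "work in progress" >> notes.txt       # uncommitted change
git stash                                  # put it aside: the working directory is clean
during=$(git status --porcelain)
entries=$(git stash list | wc -l)

git stash pop                              # bring the change back, empty the stack
after=$(git status --porcelain)
left=$(git stash list | wc -l)

printf 'during: [%s]\nafter:  [%s]\n' "$during" "$after"
```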

Check the history

git log # history of commits
git log --pretty=format:"%h - %an, %ar : %s" # custom format: short hash, author, relative date, subject
git log --follow <file> # history of given file
git diff branch1..branch2 # show diff between branches
git show <commit> # show details of the given commit

Abort the mission

# You've edited file1 and file2 
git add file1
git commit 
# Oh my, you forgot file2
git add file2
git commit --amend --no-edit # add to the most recent commit, keeping its message
# Actually, you're not satisfied with this commit
git reset <commit> # move the current branch head to <commit>, keep the changes unstaged
git reset --hard <commit> # move the branch head and overwrite the working tree with <commit>
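It helps to see that amending does not edit a commit in place: it replaces it with a new one. The sketch below replays the scenario above in a temporary repository; the demo identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "one" > file1
echo "two" > file2
git add file1
git commit -q -m "add files"               # oops, file2 is missing
before=$(git rev-parse HEAD)

git add file2
git commit -q --amend --no-edit            # redo the last commit with file2 included
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)         # still a single commit in the history
files=$(git ls-tree --name-only HEAD | tr '\n' ' ')
echo "before=$before after=$after count=$count files=$files"
```

The checksum changes because the commit content changed, which is exactly why amending a commit that others already have is risky.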

Remember the HEAD pointer? Well, it is useful for resetting commits.

git reset --soft HEAD^ # undo the last commit, keep its changes staged
git reset --hard HEAD^ # undo the last commit and discard its changes
git reset --hard HEAD^^ # undo the last 2 commits and discard their changes
git reset --hard HEAD~1 # same as HEAD^: HEAD~1 is the commit right before HEAD

HEAD~ or HEAD~1: HEAD's first parent
HEAD~2: HEAD's grandparent (HEAD~1’s first parent)
HEAD^ or HEAD^1: HEAD's first parent
HEAD^2: HEAD's second parent (only exists when HEAD is a merge commit)
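The difference between ~ and ^ only shows up on a merge commit, so here is a sketch that builds one and resolves each notation. Commit messages A, B, C, D and the branch name side are labels I invented to make the output readable.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

git commit -q --allow-empty -m "A"
git commit -q --allow-empty -m "B"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b side
git commit -q --allow-empty -m "C (side)"

git checkout -q "$main_branch"
git commit -q --allow-empty -m "D"
git merge -q --no-edit side                # merge commit: first parent D, second parent C

first=$(git log -1 --format=%s 'HEAD^1')   # first parent
second=$(git log -1 --format=%s 'HEAD^2')  # second parent (the merged branch)
tilde1=$(git log -1 --format=%s 'HEAD~1')  # same commit as HEAD^1
grand=$(git log -1 --format=%s 'HEAD~2')   # first parent of the first parent

echo "HEAD^1=$first HEAD^2=$second HEAD~1=$tilde1 HEAD~2=$grand"
```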

FYI, git reset knows five “modes”: soft, mixed (the default), hard, merge and keep.

Be careful when amending or resetting commits that have already been pushed: rewriting shared history has an impact on other people’s work.

Working with Remotes

A remote is a simple short name like origin or upstream. Instead of writing a loooong URL, you use the short name.

(Basically a shortname is an alias).

## Working with remotes
# URL somehow looks like this: 
# https://github.com/OWNER/REPOSITORY.git
git remote add <shortname> <url> # add a remote with its URL
git remote -v # view existing remotes with urls
git remote set-url <shortname> <new url> # set new URL
git remote rm <shortname> # remove a remote from your repository

# origin is the default shortname of the remote you cloned from
git remote show origin # show details about origin

# you can set another shortname
git remote add <shortname> <url>
# example: git remote add pb https://github.com/OWNER/REPOSITORY.git

# to push your commit on remote branch
git push <shortname> <branch>
# example: git push pb feature/myfeature

# git pull command to automatically 
# fetch and then merge that remote branch 
# into your current branch.
git pull <remote>

# The command goes out to that remote project 
# and pulls down all the data from that remote project 
# that you don’t have yet. 
git fetch <remote>

Having different shortnames associated with different URLs can be useful when you fork an existing project.
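Here is a sketch of that fork setup without any network access: two local bare repositories stand in for the original project and for your fork. The directory names original.git and fork.git are assumptions for the demo; with a real fork you would use the URLs of the two hosted repositories instead.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/original.git"    # stands in for the project you forked
git init -q --bare "$work/fork.git"        # stands in for your own fork

git init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first commit"
branch=$(git symbolic-ref --short HEAD)

git remote add origin "$work/fork.git"         # where you push your work
git remote add upstream "$work/original.git"   # where you fetch updates from
git remote -v

git push -q origin "$branch"               # the commit now exists in the fork
remotes=$(git remote | sort | tr '\n' ' ')
pushed=$(git --git-dir="$work/fork.git" rev-parse "refs/heads/$branch")
echo "remotes: $remotes"
```

Naming the original project upstream is a common convention, not something Git requires.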

One more thing

Plenty of resources are available, but the most efficient and reliable one is the book: Git - Book. Besides those resources, git obviously provides a help page (git help).