Graduate Program KB

Git

Data storage

  • Git stores a tree of pointers (SHA-1 hashes of content)

  • Each object is stored in a file based on the sha1 hash (index)

  • Identical content is only stored once (results in the same hash)

  • Git knows files have changed because the hash will be different

  • Example of object hashing (note defaults to blob type)

    echo 'Hello, World!' | git hash-object --stdin
    
  • Writing a blob

    echo 'Hello, World!' | git hash-object -w --stdin
    
    • -w write blob
    • Will be written to a file like .git/objects/8a/rest of hash
  • Object types

    • blob - Compressed file contents
    • tree - Represents the filesystem hierarchy
    • commit - Stores commit data (message, title, date, etc.) and a pointer to the saved root tree
  • Commit is a code snapshot

    • What the project looked like at the time
    • Combination of changed files in staging area and previous commit
  • Object metadata

    • type of pointer (blob, tree)
    • filename (or directory)
    • file mode
  • Blob object format (| is a concat)

    "blob" | size | \0 | content

  • Tree objects

    "tree" | size | \0
    type | pointer (sha1) | file name
    type | pointer (sha1) | file name

  • Commit objects

    "commit" | size
    "tree" | pointer
    "parent" | pointer
    "author" | name
    "message" | msg

  • Git objects are compressed

    • Contents of files remain mostly similar with a few changes - only need to store changes
    • Compress together in a packfile
      • Stores object and deltas
    • Packfiles are generated when
      • Too many objects
      • GC
      • Push to remote
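
  The blob format above can be checked by hand. This is a minimal sketch, assuming a repository using the default SHA-1 object format and a `sha1sum` binary on the PATH (on macOS `shasum` would be the likely substitute); the temporary directory is only for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Build the blob by hand: "blob" | size | \0 | content.
# 'Hello, World!' plus the trailing newline from echo is 14 bytes.
manual=$(printf 'blob 14\000Hello, World!\n' | sha1sum | cut -d' ' -f1)

# Let Git hash and store the same content (-w writes the object).
from_git=$(echo 'Hello, World!' | git hash-object -w --stdin)

# The object lives under .git/objects/<first 2 hex chars>/<remaining chars>.
object_path=".git/objects/$(echo "$from_git" | cut -c1-2)/$(echo "$from_git" | cut -c3-)"

# Ask Git what it stored.
object_type=$(git cat-file -t "$from_git")
object_content=$(git cat-file -p "$from_git")

echo "manual:   $manual"
echo "from git: $from_git"
echo "stored at $object_path as a $object_type"
```

  Both hashes should match, which shows the hash depends only on the type, size, and content, not on the file name.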

References

  • Pointers to commits
    • tags
    • branches
  • HEAD points to current commit
  • Changing branch changes the HEAD pointer
  • .git/HEAD
  • .git/refs/heads/branch-name
  • .git/refs/tags
    > cat .git/HEAD
    ref: refs/heads/master
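
  The reference files can be inspected directly. A minimal sketch, assuming a fresh repository that stores refs as loose files under .git/refs (packed refs or the reftable backend would not show the branch file); the user name and email are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Add first commit"

# HEAD is a symbolic reference that names the current branch.
head_file=$(cat .git/HEAD)
branch=$(git symbolic-ref --short HEAD)

# The branch file contains the hash of the commit the branch points to.
branch_commit=$(cat ".git/refs/heads/$branch")
head_commit=$(git rev-parse HEAD)

echo "$head_file"
echo "$branch -> $branch_commit"
```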
    

Git Areas

  • Working area
    • Not in staging area, not handled by git, untracked
  • Staging area
    • What files will be part of the next commit
  • Repo
    • Files git knows about
    • Contains all commits
  • git add
  • git rm
  • git mv
  • git add -p
    • Stage changes interactively, asks about each hunk
  • Moving files between areas
    • Working area -(add)> staging -(commit)> repo
    • Repo -(checkout)> working
  • git ls-files -s
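
  The movement between areas can be watched with git ls-files -s and git diff. A minimal sketch in a throwaway repository; the file name notes.txt is made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'first draft' > notes.txt
before_add=$(git ls-files -s)        # nothing is staged yet

git add notes.txt                    # working area -> staging
staged_entry=$(git ls-files -s)      # mode, blob hash, stage number, path
git commit -q -m "Add notes"         # staging -> repo

echo 'second draft' > notes.txt      # change the working area only
working_changes=$(git diff --name-only)

git add notes.txt                    # stage the change
staged_changes=$(git diff --staged --name-only)

echo "$staged_entry"
```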

Git Stash

  • Store current modification in a stash safe from destructive operations
  • By default will only stash tracked files
    • git stash --include-untracked
  • git stash list
  • git stash show stash@{0}
  • git stash apply
  • git stash apply stash@{0}
  • git stash --all
    • include ignored
  • git stash save "name"
  • git stash branch <branch name> <optional stash>
    • creates a new branch with the stash applied
  • git checkout <stash> -- <filename>
    • check out file from stash
    • overwrites whatever is in file
  • git stash pop
    • apply the last stash, then remove it (if there's a conflict the stash is kept)
  • git stash drop
    • remove last stash
  • git stash clear
    • remove all
  • git stash show stash@{0}
    • lists files in stash
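
  A round trip through the stash might look like the sketch below. It uses git stash push -m in place of git stash save, which newer Git versions document as the preferred form; the file names are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'v1' > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo 'v2' > tracked.txt        # modify a tracked file
echo 'new' > untracked.txt     # create an untracked file

# Stash both the tracked change and the untracked file.
git stash push -q --include-untracked -m "work in progress"
content_after_stash=$(cat tracked.txt)
untracked_after_stash=$(ls untracked.txt 2>/dev/null || true)
stash_count=$(git stash list | wc -l)

# Bring the changes back and drop the stash entry.
git stash pop -q
content_after_pop=$(cat tracked.txt)
remaining=$(git stash list | wc -l)
```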

Git Branches

  • Pointers to commits
  • HEAD - current branch
    • How git knows what the next parent will be
    • Can point to a commit instead (detached HEAD)
    • Moves when
      • User commits
      • User checks out something else

Git tags

  • Lightweight tags
    • pointer to commit
    • git tag <name>
  • Annotated tag
    • Stores: user, message, date
    • git tag -a <name> -m <message>
  • git tag
    • list tags
  • git show-ref --tags
    • list tags and where they point
  • git tag --points-at <commit>
    • list tags that point to given commit
  • git show <tag-name>
  • Tag vs branch
    • Commit tag points to doesn't change

Detached HEAD

  • When checking out a specific commit or tag
  • Commits made here will dangle: no branch points at them, so git will eventually GC them
  • If you make commits in a detached head
    • Make a new branch
      git branch <name> <commit>
      
    • Use the last commit you made while detached - it points back to its parents
    • Now they are referenced from a branch and won't be GCd
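
  Rescuing work made in a detached HEAD could look like this sketch. The branch name rescue is hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Add first commit"

# Detach HEAD: it now points at a commit instead of a branch.
git checkout -q --detach HEAD
git commit -q --allow-empty -m "Add work made while detached"
detached_commit=$(git rev-parse HEAD)

# git symbolic-ref fails when HEAD is not attached to a branch.
if git symbolic-ref -q HEAD >/dev/null; then
  head_state=attached
else
  head_state=detached
fi

# Point a new branch at the last detached commit so it is reachable again.
git branch rescue "$detached_commit"
rescued=$(git rev-parse rescue)
```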

Merge commits

  • Have multiple parents
  • Marker of when new changes were merged
  • Fast forward
    • If there's a clear path from the branch to the commit being merged git will just move the pointer
    • Lose where the changes were made in the history as there's no marker
  • git merge --no-ff
    • Want to retain history
    • Forces a merge commit

Merge conflict

  • git rerere
    • Reuse recorded resolution
    • Saves how you resolved a conflict
    • The next conflict that matches will use the saved resolution - useful for
      • long lived feature branch
      • rebasing
    • git config rerere.enabled true
      • --global for all

Useful commit messages

  • Imperative mood ("Fix bug", not "Fixed bug")
  • Short subject, blank line, description
  • Description of current behaviour, summary of why fix is needed, mention side effects
  • max 72 character long lines

Git log

  • Show history
  • git log --since="yesterday"
  • git log --since="2 weeks ago"
  • git log --name-status --follow -- <file>
    • Follows the file across renames (same content, different name)
  • git log --grep=name --author=nina --since=2.weeks
  • git log --diff-filter=R --stat
    • R = renamed
    • M = modified
    • A = added

Commit reference syntax (suffixes)

  • ^ = ^1 = parent commit
    • For merges - multiple parents, first parent, 2nd parent, etc...
    • Across parents
  • ~ = ~1 = commit back
    • Back up tree

Git show commit

  • git show <commit>
    • See commit info
  • git show <commit> --stat
    • Shows number of changes in file instead of listing the full diff
  • git show <commit>:<file>

Git diff

  • git diff
  • git diff --staged
    • Can also give file args
  • git diff A B
    • Changes that are in B but not in A
  • Diffing branches
    • git branch --merged master
      • Which branches have been merged
    • git branch --no-merged master
      • Which haven't

Git checkout

  • Restore working tree files or switch branches
    1. change head to point to new branch
    2. copy commit snapshot to staging (staging is also called index)
    3. update working area with branch contents
  • git checkout -- file
    • Replace working area copy with version from current staging
    • Overwrites working directory without warning
    • without -- could think it's a branch, not a file
    • git checkout <commit> -- <file>

Git clean

  • Deletes untracked files
  • Will not delete unless forced
  • -f To force
  • -n Dry run
  • -d Recurse into untracked directories if no path specified

Git reset

  • Different behaviour depending on args
  • Checkout moves head, not branch reference
  • Reset moves branch reference
  • For commits
    • Moves HEAD, optionally modifies files
  • For files
    • Does not move HEAD, modifies files
  • --soft
    • Moves HEAD
    • The commit HEAD moved away from can be considered dangling
  • --mixed
    • Default
    • Moves head, copies file to staging (from repo)
    • Unstage command
  • --hard
    • Same as --soft and --mixed
    • Copies to working as well as staging
  • Can change history
    • Undo commit, commit after
    • Don't push changed history to shared or public repo
  • git reset -- <file>
    • Doesn't change HEAD
    • No flags
  • Accidental reset
    • Git keeps previous head in ORIG_HEAD
    • git reset ORIG_HEAD

Git revert

  • The safe reset
    • A commit that undoes the previous changes
  • Original commit stays in the repo
  • For if the commit has already been shared
  • Does not change history

Git amend

  • Quick shortcut to make changes to previous commit
  • Doesn't actually edit commit - (hash will be different - different time)
  • Makes a new commit with a different hash
    • Copy of original with changes
    • Old commit will dangle and be GCd
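
  That an amend creates a new commit can be confirmed by comparing hashes. A minimal sketch in a throwaway repository:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'one' > file.txt
git add file.txt
git commit -q -m "Add fiel"
before=$(git rev-parse HEAD)

# Fix the typo in the message; this writes a new commit object.
git commit -q --amend -m "Add file"
after=$(git rev-parse HEAD)

# The old commit still exists, unreferenced, until it is garbage collected.
old_type=$(git cat-file -t "$before")
commit_count=$(git rev-list --count HEAD)

echo "before: $before"
echo "after:  $after"
```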

Git rebase

  • Pull last changes from master
  • Apply our commits on top by changing parent of our commits
  • Rewinds to master
  • Replay commits, modifying them
    • edit, remove, combine, re-ordered, inserted
  • git rebase -i <commit-to-fix>^
    • Interactive
  • git commit --fixup <SHA>
    • Marks commit as a fix which will be recognised by rebase and autosquash
  • git rebase -i --autosquash <SHA>^
  • git rebase --abort
  • Could be a good idea to backup branch to another before rebase
    • git branch backup
    • git reset backup --hard

Git remotes

  • git remote -v
  • git remote add upstream <url>
  • git checkout -t origin/feature
  • git push -u origin feature
  • git branch -vv
    • What you're tracking, how far behind
  • git fetch
    • Downloads changes, doesn't change local branches or the working area
  • git pull
    • Does a fetch and a merge
  • git cherry -v
    • Commits which haven't been pushed
  • git pull --rebase
    • Replays your changes on top of base branch

Danger Zone

  • Local destructive operations
    • git checkout -- file
      • overwrites the working copy with the staged version
    • git reset --hard
      • Overwrite changes in staging and working area
    • Use git stash --include-untracked first to keep a recoverable copy
  • Remote
    • Rebase
    • Amend
    • Reset