Version control
Git resources
- Pro Git book by Scott Chacon and Ben Straub. Available online (gratis) or on paper. I suggest working through at least chapters 1–3. We’ll also pick up some things from chapters 7–8 later in the course.
- Understanding Git Data Model article by Zvonimir Spajic. Great intro to the three types of objects: blobs, trees, and commits. Part of a series.
- Oh Shit, Git! is a fun overview and printable cheat sheet/booklet available for $10. (There’s also a version without explicit language, if you prefer.)
- Beginner’s Guide to `git bisect` by Tony Rost. Incidentally, here’s the invocation that I used to automate the search: `git bisect run sh -c "! grep --count car test.txt"`. It would be run after doing `start` and marking the initial `good` and `bad` commits.
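To make the bisect invocation above more concrete, here is a minimal Python sketch of the binary search that `git bisect run` automates. It is not how Git is implemented: the `history` list, `first_bad`, and `contains_car` are hypothetical names invented for illustration, and the sketch assumes a linear history in which every commit after the first bad one is also bad. The test mirrors `! grep --count car test.txt`, which reports a commit as bad when `car` appears in the file.

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad (hypothetical linear history).
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the first bad commit is at mid or earlier
        else:
            good = mid   # the first bad commit is after mid
    return bad


# Hypothetical history: the contents of test.txt at each commit.
history = [
    "apple\n",
    "apple\nbanana\n",
    "apple\nbanana\ncherry\n",
    "apple\nbanana\ncherry\ncar\n",
    "apple\nbanana\ncherry\ncar\ndog\n",
]


def contains_car(content):
    # Mirrors `! grep --count car test.txt`: bad when "car" is present.
    return "car" in content


culprit = first_bad(history, contains_car)
print(culprit)  # 3: the commit that introduced "car"
```

With five commits the search needs only two checks, which is the reason to bisect rather than test every commit in order.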
Purpose
- We have files that represent code, configuration, documentation.
- We need tools to manage modifications.
- Team environment means that multiple devs may edit the same file(s). Need to be careful about integrating changes.
- However, VC is useful even for a lone developer: to see previous versions, undo changes, redo, manage different configurations, etc.
History
Although git’s model of snapshot-based, concurrent, and distributed version management is now dominant, it can be useful to understand some of the other design points that were used in the past.
Snapshots vs deltas
- One way that VC tools differ: do they manage snapshots of your files, or do they manage changes (aka deltas or diffs) to files?
- Delta-based tools either store the original and forward deltas, or store the most recent version and reverse deltas.
- Git (and friends) instead store snapshots – every version of every file. Finding an old version is faster than reconstructing it by applying deltas.
- Snapshots do take up more space than delta-based storage, but compression can reduce that.
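The trade-off in the list above can be sketched in a few lines of Python. This is a toy model, not the storage format of Git or any real tool: `make_delta`, `apply_delta`, `checkout_from_deltas`, and the sample `versions` are all hypothetical, and the deltas are built with the standard library's `difflib.SequenceMatcher`. The snapshot store answers a lookup with one index operation, while the forward-delta store has to replay every delta up to the requested version.

```python
import difflib


def make_delta(old, new):
    """Encode `new` relative to `old` as copy/insert operations."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse lines from old
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # carry the new lines
        # "delete" needs no operation: those lines are simply not copied
    return ops


def apply_delta(old, delta):
    """Rebuild the newer version from `old` and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


# Hypothetical versions of one file, each a list of lines.
versions = [
    ["apple"],
    ["apple", "banana"],
    ["apple", "banana", "car"],
    ["apple", "car"],
]

# Snapshot store: every version kept whole; lookup is a single index.
snapshots = list(versions)

# Forward-delta store: the original plus one delta per later version.
original = versions[0]
deltas = [make_delta(a, b) for a, b in zip(versions, versions[1:])]


def checkout_from_deltas(n):
    """Reconstruct version n by replaying n deltas onto the original."""
    content = original
    for delta in deltas[:n]:
        content = apply_delta(content, delta)
    return content
```

Here `checkout_from_deltas(3)` applies three deltas to produce the same lines as `snapshots[3]`. A reverse-delta store would flip the cost, making the newest version cheap and the oldest expensive.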
Centralized vs distributed
- Centralized means there’s some designated server that keeps all the history.
- Centralized also means browsing the history or adding to it requires network access to the server.
- Distributed means each developer has their own copy of the entire history.
- Distributed also means I can work while disconnected and then later push/pull.
- A distributed VC can also be used as a much-improved centralized VC: the team simply agrees to treat one repository as the shared hub.
- GitHub/GitLab are central servers for a distributed tool.
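As a rough illustration of the distributed model, here is a toy Python sketch in which each repository holds its own full history. The `Repo` class and its methods are hypothetical and far simpler than Git: history is a plain list of commit messages, and a push succeeds only when the remote's history is a prefix of the local one. The point is only that committing and browsing touch nothing but the local copy, and that a "central server" is just another repository that everyone agrees to push to.

```python
class Repo:
    """A toy repository: an ordered list of commit messages."""

    def __init__(self, name):
        self.name = name
        self.history = []

    def clone(self, name):
        # A clone receives the entire history, not just the latest version.
        copy = Repo(name)
        copy.history = list(self.history)
        return copy

    def commit(self, message):
        # Purely local: no other repository is contacted.
        self.history.append(message)

    def log(self):
        # Browsing history is local too.
        return list(self.history)

    def push(self, remote):
        # Toy rule: the remote's history must be a prefix of ours.
        if remote.history != self.history[:len(remote.history)]:
            raise ValueError("remote has diverged; integrate its changes first")
        remote.history = list(self.history)


hub = Repo("hub")              # plays the role of the agreed-upon server
alice = hub.clone("alice")
alice.commit("add README")     # works while disconnected
alice.commit("fix typo")
offline_view = hub.log()       # the hub has seen nothing yet
alice.push(hub)                # later, once a connection is available
bob = hub.clone("bob")         # bob now has the full history as well
```

After the push, `hub`, `alice`, and `bob` all hold the same two commits, and any of them could serve as the source for a new clone.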