Introduction to Git – Core Concepts

Hello, I’m David Mahler, and this is an introduction
to Git. In this video we will cover some core concepts
of the version control system Git. In the video, we will:

Gain a better understanding of central git concepts and terminology
Utilize two basic diagrams to help our understanding of what is happening when we run git commands. The diagrams are the git commit graph and the three conceptual areas for files in git.
Create a git repository and run some core commands
Map all those commands to our two diagrams to help our understanding

The next two videos I put out will cover branching and merging and working with remote repositories. However, I feel it is critical when learning
git to have solid footing on some core concepts first. That will make branching and remote repositories
much easier to understand.

References

The primary reference for this video is the
fantastic Pro Git book by Scott Chacon which has been made available for free under a creative
commons license: https://git-scm.com/book/en/v2. Another site I’ve found immensely helpful
is the Visual Git Reference posted by Mark Lodato: https://marklodato.github.io/visual-git-guide/index-en.html. This site maps the most common git commands
to diagrams just like I’ll do in this video. Let’s get started now by defining git.

What is Git?

Git is a type of version control system. A version control system (VCS) enables you
to record changes to files over time. These files could be device configs, software
code, documentation, anything really. However, git works especially well with text
files. Some of the tasks you can perform with git are:

Take snapshots of files over time
Restore earlier versions of files from snapshots you’ve made
Work on multiple versions of a file in parallel

For example, perhaps you are adding new features
to some code. While that work is in progress, you don’t
want to interfere with the known working production code. Therefore, you can work on separate versions
of files and keep them separate until you are ready to merge your new changes into the
production version. Now before we dive into a real repository,
let’s cover two core concepts via diagrams: the git commit graph and the three conceptual
areas of a git project.

The git commit graph

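As a rough sketch of the walkthrough in this section, the commands below build the same three-commit history with the placeholder files F1 and F2, then retrieve F2 from the second commit. The temporary directory, the file contents, the commit messages, and the demo name and email are assumptions made only so the example runs on its own; the individual commands are introduced later in the video.

```shell
# Work in a throwaway directory (assumption: any empty directory works).
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
# Demo identity, set for this repo only so commits can be created.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit 1: add F1 and F2.
echo "first line" > F1
echo "some data" > F2
git add F1 F2
git commit --quiet -m "add F1 and F2"

# Commit 2: edit F1; F2 is unchanged.
echo "second line" >> F1
git add F1
git commit --quiet -m "edit F1"

# Commit 3: delete F2.
git rm --quiet F2
git commit --quiet -m "remove F2"

# Retrieve F2 as it was in the 2nd commit (HEAD~1 is one commit before the latest).
git checkout HEAD~1 -- F2

# Show the history, newest commit first.
git log --oneline
```

Here HEAD~1 stands in for the second commit's hash; in a real repo you would more often copy the hash from git log.
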
Git tracks changes to files over time. It does this by enabling you to take snapshots
of files at any time. These snapshots are called commits. We can represent these commits with a basic
graph. Let’s imagine we are setting up a new git
repository, more often referred to simply as a repo. We start with a standard directory on our
filesystem. Once we have git installed, we can turn this
directory into a git repo. Starting the repo is done with the git init
command. Let’s say we add two files to our directory,
F1 and F2. After adding these new files, we decide to
take our first snapshot or git commit. The commit is represented by this oval here. Now that we have our first snapshot, we can
always go back to this point, to how these files looked when we took the snapshot. You might think of this a little bit like
saving your place in a video game. Let’s say we edit F1. Maybe we add some new lines to it. After our edits, we can capture our new changes
to F1 with a second commit. So we have two commits in our commit history
now. Our first commit has F1 and F2 as they were
when we first created them. Our second commit has the updated version
of F1. The second commit also has F2, but it hasn’t
changed since the first commit. Let’s say we decide to delete F2 and make
a 3rd commit which includes the removal of F2. Now we have built up a history of 3 commits. We are currently here at our 3rd commit. At this commit, we have F1 in its updated
state, the same way it was in our 2nd commit. Also, we no longer have F2. On our filesystem, we will see F1. We won’t see F2 since we removed it. Since we have snapshots, we can restore earlier
versions of files. For example, let’s say we decide we want F2
back. We can retrieve it from the 2nd commit here. In git parlance, we would check out F2 from
this commit. In the same vein, maybe we decide we want
F1 back the way it was here at our first commit. We can retrieve or check out that original
version of the file, effectively discarding the edits we made here at the second commit. Something else is important to note: every commit is logged by git. Git will log
who made changes and when. You can imagine how useful this can be on a
large project with many contributors. That’s it for a quick first look at the commit
graph. As we work on our project, we change files. When we are ready, we make a commit or take
a snapshot of files. A commit saves the state of the files at a
particular point in time. Usually, we make a commit when a logical unit
of work is done. For example, when we add a new feature. We may commit all changes related to that
feature in a single commit. Now, let’s start talking about the three conceptual
areas of a git repo.

3 areas of a git repository

In git we have 3 logical areas in which we
work with our files: the working tree, the staging area, and the git history. The working tree is what we see on our filesystem.
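
To make the three areas concrete, here is a minimal sketch that lists what each one holds in a tiny throwaway repo. The file names notes.txt and draft.txt, the commit message, and the demo identity are hypothetical. The commands git ls-files and git log are standard git commands, though git ls-files is not used in the video itself.

```shell
# Throwaway repo with one committed file and one file that was never staged.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit --quiet -m "add notes.txt"
echo "scratch" > draft.txt   # exists on disk only

# 1. Working tree: the ordinary files on disk (both files appear).
ls
# 2. Staging area (index): the files recorded in the index (only notes.txt).
git ls-files
# 3. Git history: the commits stored under .git.
git log --oneline
```
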
that in the working tree. The git history is equivalent to the commit
graph we just saw. This history is kept in a hidden directory,
dot git. The dot git directory holds an object database
and metadata that make up our repo. In fact, if we sent our dot git directory
to someone else, that person would have our complete git project and its full history. They would have access to all the versions
of files in any commits we made. As we are working on our project, we make
changes in the working tree. We add, remove, and edit files in the working
tree. Git gives us full control over which changes
from our working tree we put into our next commit. For example, maybe we edited three files. However, we only want the new versions from
two of those files snapshotted for our next commit. The way we get this control is through the
staging area. We would add the two files we want to the
staging area. When our staging area is right, we make a
commit. Only the changes in the staging area are put
into the next commit. The file we left out can be included in a
future commit. The staging area is also known as the index. For consistency, in these git videos, I’ll
stick with calling this the staging area. That’s an overview of our two diagrams. Let’s get started with our own git project. These diagrams should start making more sense
as we work through a real repo.

Start a new git repo

On my system, I’m using an Ubuntu Linux VM. To install git, I ran “sudo apt-get install
git.” For other systems, please search online for
git installation documentation. We start with a standard directory on our
filesystem. I’ll create a directory named netauto on my filesystem. Since I’m a network engineer, I’m going to
set up a phony network automation project with YAML-based text files in it. Keep in mind the type of project I make is
not relevant for the purposes of the video. All that matters is that we have a directory
and that we are adding and editing text files in that directory. Now let’s move into the new directory and
add a file named S1. I’ll use vi, but you can use any text editor
you prefer. I’ll paste in some text data to represent
some variables for a network switch named S1. S1 has a management IP, some VLAN names mapped
to VLAN numbers, and some switch ports with assigned VLAN numbers. Let’s save that data and exit. Now we have a single file S1 in our new directory
netauto. We can make this directory hold a git project
with the git init command. After running this command, we get back a
message: “Initialized empty Git repository”. With the git init command, git added a new
hidden subdirectory, .git, into the netauto directory. Let’s look at our three conceptual areas diagram
to visualize what we’ve done. We created a directory named netauto on our
filesystem. That directory is now our git repo’s working
tree. The working tree is where we add, remove,
and edit files for our project. The working tree has our first file S1 in
it. Next, we ran the git init command. That command created a .git subdirectory in
our netauto directory. As we discussed before, this dot git directory
is what holds our repo. We haven’t made any commits yet, so we don’t
have any versions of files referenced there, but we will soon. Before we get on with our first commit, we
have a quick administrative task.

Configuring your git username and email

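As a minimal sketch of this setup step, the commands below set a name and email and then list the result. The name Jane Doe and the address jane@example.com are placeholders, and the throwaway HOME directory is an assumption added only so the example does not overwrite a real ~/.gitconfig. On your own system you would run the three git config commands directly.

```shell
# Throwaway home directory so the real ~/.gitconfig is left untouched (assumption).
demo_home="$(mktemp -d)"
(
  export HOME="$demo_home"
  unset XDG_CONFIG_HOME
  # Placeholder identity; substitute your own name and email.
  git config --global user.name "Jane Doe"
  git config --global user.email "jane@example.com"
  # List the settings git now sees.
  git config --list
)
```
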
Whenever we make a commit, git includes our name, email and a timestamp with the commit. This is important for tracking when changes
were made to a project and who made them. First, we will set our name with git config --global user.name. Then we will set our
email with git config --global user.email. git config --list shows us our name and email are set. Since we used the --global flag with the previous
commands, our name and email will be used for any git repo we have on this system. So we shouldn’t have to do this again. If you happen to need a different name and
email for a particular repo, you can use the --local flag instead of --global. Now that we’ve set our name and email, let’s
make our first commit.

Making a git commit

In our working tree, we have the new file
S1. Git will call S1 “untracked” since it is a
new file. Remember, git tracks changes to files over
time. Git isn’t doing any tracking for S1 just yet. However, it will once we add S1 to our staging
area. Before we stage S1, let’s run the git status command. The git status command tells us how things
stand in our working tree and in our
staging area. First, we see that we are on the master branch. We will talk about branching in the next video. We are about to make our initial commit. Git status tells us we have an untracked file
S1. It also tells us how to get S1 into the staging
area. We stage S1 with the git add command. Git status also informs us the same git add
command will make our untracked files turn into tracked files. Let’s run git add now: git add s1. Let’s run git status again. When we ran git add S1, that moved S1 into
the staging area. Git status expresses this by saying we have
“changes to be committed”. In the staging area, we have our “new file:
S1”. Also, note that git status no longer says
that S1 is untracked like it did before. Now, git is tracking S1. It’s time to make our first commit: git commit -m “add file s1”. The git commit command creates a commit
with whatever is in the staging area. For us, it’s just S1. Also, we used the dash m option. With dash m we provide a short message describing
what is being changed. We now have the first commit for our project
with the single file S1. Let’s take a look at the three areas diagram
again to review what we’ve accomplished. Previously, we made our netauto directory
– this is our working tree. In this directory, we put the file S1. We used the git add command to put S1 into
the staging area. Finally, we created our first commit with
the git commit command. Let’s go back to the CLI and see what git status says
now. Git says we have “nothing to commit, working
tree clean”. Nothing to commit means everything in our
staging area is already committed. Working tree clean means there is nothing
new in our netauto directory. Everything that’s in the netauto directory
we have put into a commit. At the moment this is only the S1 file. If we look back at the output from running
git commit, we also see part of a hash. Git performs a SHA-1 hash of every commit
that’s made. It takes in the directories, files, and some
metadata to create this hash. Every commit we make has a unique hash value. What we see here is the first 7 hexadecimal
characters of a 40 character hash. Let’s check in with our commit graph. We have a single commit so far. It has a unique SHA-1 hash with the first
7 characters of that hash shown here. The commit has our name, email, a timestamp
and our commit message. Let’s see how to get this same information
at the CLI. At the CLI, the git log command tells us about our
commit graph. The output shows that we have a single commit. First, we see the full 40 hexadecimal character
hash. Then we see the author name and email. Finally, we see the message we provided. Let’s build on this by working with a 2nd
file S2.

Making our second commit

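The steps in this section can be sketched roughly as follows. The transcript does not include the contents of S1, so the YAML written here (the management IP and the VLAN names blue and red) is a hypothetical stand-in, as are the temporary directory and the demo identity. The git commands and commit messages are the ones used in the video.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

# First commit: a stand-in S1 (contents are hypothetical).
printf 'mgmt_ip: 10.0.0.1\nvlans:\n  blue: 10\n  red: 20\n' > S1
git add S1
git commit --quiet -m "add file s1"

# New untracked file S2 (a copy of S1 with a different IP), plus an edit to S1.
sed 's/10\.0\.0\.1/10.0.0.2/' S1 > S2
echo "  green: 30" >> S1

git status --short      # S1 is modified, S2 is untracked
git diff                # working tree vs. staging area: only S1 appears
git add .               # stage the edit to S1 and the new file S2
git diff --staged       # staging area vs. latest commit: S1 and S2 appear
git commit --quiet -m "add S2 and edit S1"
git log --oneline
```

Running git diff before and after git add . is a quick way to see the two comparisons the section describes: the plain form goes quiet once everything is staged, while --staged then shows what the next commit will contain.
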
In our working tree, we can make a file S2. We can just copy S1. Let’s edit S2 so that
it’s not the same as S1. I’ll give S2 a new IP address and I’ll make
some changes to what VLANs are on what switch ports. We can add VLAN 20 to port 1, and we can remove
VLAN 20 from port 3. Let’s also edit S1. We can add a 3rd VLAN,
green, with VLAN id 30. Also, we can put port 1 in the new VLAN and
save that. Let’s check in with our three areas diagram. In our working tree, we have a new file S2. Since it is new, it starts as untracked. We also modified S1. S1 is being tracked since we previously committed
it. Our staging area has not changed yet. It still has S1 the way it was before our
new edits. Also, S2 is not present in the staging area
yet. If we run git status, this confirms the same. S1 is modified, but not staged. S2 is a new untracked file. Back to our diagram. Often, we want to see the differences between
tracked files in the working tree and the staging area. We can see this by using the git diff command. Here is git diff. We see how we’ve updated the file S1. We added the green VLAN, and we moved port
1 from VLAN 10 to VLAN 30. Note how git diff is not saying anything about
the new file S2. That is because S2 is not tracked yet. Again, git diff shows the difference between
tracked files in the working tree and the staging area. Let’s stage S1 and S2. One option is to run git add S1 S2. Instead of that, let’s do git add followed by a dot (git add .). The dot means to add all new and modified
files to our staging area. So that will add both S1 and S2. We also could have run git add S* for the same effect
using a wildcard. If we run git status, we see S1 and S2 are now both
in the staging area. A quick check of our diagram shows this. Now our working tree and our staging area
match, they both have the modified S1 file and the new S2 file. A moment ago we saw how the git diff command
shows the difference between tracked files in the working tree and the staging area. If we run git diff with the dash dash staged
option, it will show a diff between the staging area and our most recent commit. In other words, it shows us what we are about
to commit. Back at the CLI let’s run git diff --staged. In the output,
we see our changes to S1, and we see all the lines for the new file S2. This looks good, so let’s commit these changes
now. git commit -m “add S2 and edit S1”. We’ve made our 2nd commit. We get a unique hash for the 2nd commit. Let’s look at our commit graph. We created our git repo with git init. We added S1 and made our first commit. We added S2 and edited S1 and made our second
commit. At the shell git log shows the same thing. Our most recent commit is at the top. Our first commit is below that. If we add the -p option to git log, we can
see what actually changed with each commit. Let’s now look at a quick way to remove a
file from our repo.

Remove a file

Probably the easiest way to remove a file
is to use the git rm command. We can remove S2 with git rm S2. The git rm command did two things at once. It removed S2 from our working tree. It also staged this removal. Therefore S2 is removed from the staging area
as well. At the CLI, git status shows the same: we have a staged
removal of S2. We can commit this removal of S2. Again, we use git commit to make a commit. This time we will leave off the dash m option. When you leave off the dash m option, you
are taken to your default text editor. For me, this is nano. From here, we can make a more detailed multiline
commit message than we would have done with the dash m option. Let’s do that:

remove S2
Switch S2 was decommissioned

Below our message is some info from git. It tells us the following:

It asks us to enter a commit message like we just did
It tells us that hash marks are used for comment lines
It informs us that we can back out of this commit by not providing a message at all
Finally, it tells us what branch we are on and what we are about to commit

I’ll save this file and exit. We have completed our 3rd commit. Checking our commit graph, we see the new
3rd commit with S2 removed. Flipping back to the CLI, git log shows the 3 commits. We can see in our most recent commit, that
we have a multiline commit message. We were able to do this since we left off
the dash m option with the git commit command. Now that we’ve built up a short git history,
let’s see how to undo some changes.

Undo a working tree change

Let’s start by making a change to S1 again. We can add a bogus VLAN, ‘badvlan’, with
id number zero. We will save that change. Looking at our 3 area diagram, we see S1 has
changed in the working tree. However, it hasn’t been updated in the staging
area yet. We didn’t stage this bad config. With the git checkout command, we can replace
the new S1 in the working tree, with the previous version of S1 that is still in the staging
area. Effectively, we would be discarding the new
working tree changes. At the CLI, let’s check git diff. Again, git diff shows the diff between the
working tree and the staging area. So we see the new ‘badvlan’. git status shows S1 is modified
but not staged. In the output of git status, git tells us
how to undo the modification: git checkout -- S1. Now, S1 is back to how it was before we added badvlan. git diff returns nothing since our working tree and
staging area match. git status shows our modifications are no longer there;
our working tree is clean. Finally, more s1 confirms we have in fact discarded the badvlan change. Keep in mind that we can’t recover the changes. We never committed ‘badvlan’, so it doesn’t
exist in git history, and it’s not coming back. That’s how to undo a working tree change. Let’s see how to unstage a file.

Undo staging of files

Let’s edit S1 and stage the change. We will put port 1 in the red VLAN instead
of green. We save that and exit. git diff shows our changes. We can stage this now with git add S1. Now git diff doesn’t show anything, since our working tree and staging area match. However, our last commit doesn’t have the
new snapshot of S1 yet. So we can use the dash dash staged option
with git diff: git diff --staged. This shows the diff between the staging area
and our most recent commit. git status shows S1’s newest changes are staged and ready to be committed. In our diagram, we see the modified S1
in the working tree. We added it to the staging area with git add
S1. We used git diff with the staged option to
see how the staged S1 is different from the latest commit. Now we can unstage S1. We do this with the git reset command, more specifically git reset HEAD S1. This will restore S1 in the staging area from the latest commit. The term HEAD in this context refers to the
most recent commit. We will talk more about HEAD in the next video
on branching and merging. Back at the CLI: git reset HEAD S1. We’ve unstaged S1. Keep in mind we only restored S1 in the staging
area. Our working tree still has the new version
of S1. git status confirms this. We have the modified S1, and it is not staged. We can also restore the working tree now with git checkout -- S1.
Now let’s look at one more undo action. Earlier we deleted the file S2. Let’s recover S2 from a prior commit.

Restore a file from an earlier commit

S2 isn’t in our working tree anymore, and
it’s not in our staging area. We deleted S2 after our 2nd commit. Let’s get S2 back from this commit, before
it was deleted. With git log -- S2 we can see commits that affect the file
S2. From our commit messages we see where we added
S2; it was our second commit here. Let’s get S2 back from that snapshot with git checkout, then the commit hash, then -- S2. We are checking out S2 from the commit that starts with these five characters. ls shows we have S2 back now. git status shows we not only put S2 back into the working
tree, but we put it into the staging area as well. Let’s go ahead and commit this: git commit -m “restore S2”. Checking our 3 area diagram, we retrieved the file S2 from the 2nd commit
in our git history. We put it in both the working tree and the
staging area with the git checkout command. Finally, we committed the restore of S2 with
the standard git commit command. OK, before we conclude this video, let’s wrap
up by talking about the dot gitignore file.

.gitignore

Often in a Git project, there are files you
don’t need Git to track. Some examples could be log files, compiled
code, or other noncritical artifacts. Let’s put a couple of files like this in our
working tree. We can make a file myapp.pyc here. Also,
let’s create a logs directory. In the logs directory, we can put a couple of log files. Now git status shows the pyc file and the new logs directory
as untracked files. We don’t want git to pay attention to these
files, and we can do this with a dot gitignore file. First, the compiled file. We can put the whole file name here like this: myapp.pyc. Instead of that, let’s use a wildcard. We know we don’t want any file with the pyc
extension tracked, so we can use asterisk dot pyc. Our log files are all in the logs directory,
so we can ignore the entire logs directory with one more entry. Let’s save that and exit now. Running git status again, we see neither myapp.pyc nor
our log files are showing up. We do have the .gitignore file, which we need
to commit to our repo like any other file. Let’s do that here with git add . and git commit -m “add .gitignore file”. So that’s the last topic I wanted to cover in
this git core concepts video.

Wrap up and review

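As a companion to this review, here is one possible replay of the later sections (removing a file, undoing changes, restoring a file, and ignoring files) in a throwaway repo. The file contents, the log file name app.log, the logs/ pattern, the demo identity, and the use of HEAD~1 in place of a copied commit hash are all assumptions made so the example runs on its own.

```shell
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

# Stand-in history: S1 and S2 committed together (contents are hypothetical).
echo "vlan: 10" > S1
echo "vlan: 20" > S2
git add S1 S2
git commit --quiet -m "add S1 and S2"

# Remove a file: deletes S2 from the working tree and stages the removal.
git rm --quiet S2
git commit --quiet -m "remove S2"

# Undo a working tree change: discard an unstaged edit to S1.
echo "badvlan: 0" >> S1
git checkout -- S1

# Undo staging: unstage an edit, then discard it from the working tree too.
echo "vlan: 30" >> S1
git add S1
git reset --quiet HEAD S1
git checkout -- S1

# Restore a file from an earlier commit (HEAD~1 still contains S2).
git checkout HEAD~1 -- S2
git commit --quiet -m "restore S2"

# Ignore files: compiled files and the logs directory stay untracked.
touch myapp.pyc
mkdir logs
touch logs/app.log
printf '*.pyc\nlogs/\n' > .gitignore
git add .
git commit --quiet -m "add .gitignore file"

git status --short   # prints nothing: the working tree is clean
git log --oneline
```
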
Let’s wrap up here by doing a quick high-level review of what we accomplished through one
final look at our diagrams. The git commit graph shows our history of
commits. Our first commit was here at the bottom when
we added S1. Our last commit was a moment ago when we added
the gitignore file. Every commit has a unique 40 character SHA-1
hash. They also have a timestamp, our commit messages
and our name and email associated with them. Using the git log command, we can see the
git history from the CLI. Here is the 3 conceptual areas diagram. The working tree is our netauto directory,
as well as any subdirectories if they are there. This does not include the .git subdirectory. We make our changes in the working tree. When we are ready, we stage the changes we
want in our next commit. That is done with the “git add” command. When our staging area has the snapshots we
want for a commit, we use the “git commit” command to make a commit. Right now our working tree, staging area,
and our latest commit all match up so we are in a clean state. The git status command lets us know the state of files in our working tree and our staging area. We saw how to overwrite an unstaged working
tree change. That was with the git checkout -- filename
command. We saw how to unstage a snapshot with the
git reset HEAD command. We also saw how we can remove a file from
our repo with the git rm command. Finally, we had a quick look at how we can
use the gitignore file. The hidden gitignore file enables us to disregard
files we don’t want to be tracked in our repo. OK, that wraps up this video on git core topics. In the next video, we will talk about branching and merging, a crucial piece of working with git. After that, I will have a video on working
with remote repositories. That video will also include ways to alter
your commit graph like amending commits, rebasing, and cherry-picking. That’s it. I hope you found this video helpful for your
work. If you’d like to support this page, please
subscribe for more content like this. Also, as always you can connect with me and
say hello on LinkedIn at www.linkedin.com/in/davidmahler. Of course, comments on the video are welcome
as well; I try to answer every comment or question to the best of my knowledge. Thanks for watching!
