# Getting Started with Git
This document is a work in progress.
Use Piazza to suggest further Git tips and tricks we might include.
The purpose of this handout is to give you an idea of how to
effectively use Git. The emphasis here is on "effectively"---the goal
is to get you to think of Git as a tool that is actually useful (both
in 6.005 and maybe even in your other projects), and not just a
cumbersome series of magical incantations.
This document is _not_ intended to be a stand-alone introduction to
Git. We expect you to have played around with it a little bit and have
some idea of what it looks and feels like.
# A bit of context
## The command-line
One of the things that makes learning Git hard for many students is
that it's a command-line program. If you're not familiar with the
command-line, this can be confusing, especially because it's hard to
understand what's specific to Git and what's not.
A command-line is just an interface to your computer, totally
analogous to Finder or Windows Explorer, except that it's
text-based. As the name implies, you interact with it through
"commands"---each line of input begins with a command and might have
one or more arguments, all separated by spaces. The command-line keeps
track of what directory (folder) you're in, which is important to many
of the commands you might be running. Here are some common ones:
* `cd directory-name` (stands for "change directory") --- Switches to the directory `directory-name`.
* `pwd` (stands for "print working directory") --- Prints out the current directory.
* `ls` --- Lists the files in the current directory.
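
To get a feel for the three commands above, here is a short session you could try. It is only a sketch: the scratch directory comes from `mktemp`, and the names `project` and `notes.txt` are made up for the example.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
mkdir "$scratch/project"
touch "$scratch/project/notes.txt"

cd "$scratch/project"   # change into the new directory
pwd                     # print the directory we are in now
ls                      # list the files here (just notes.txt)
listing=$(ls)           # the same listing, captured in a variable
```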
### Arguments
Most commands you type in are actually other programs. When those programs get launched, the command-line passes in the arguments to the program so that it can do something with them. In Java, this is what gets stored in the `String[] args` argument passed to the `main()` method of your program.
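
You can watch the splitting into arguments happen with a small shell function. This is a sketch for illustration: `show_args` is a made-up helper, not a real command, and it simply reports the list of arguments it received (the same kind of list a Java program would find in `args`).

```shell
# show_args is a made-up helper that reports the arguments it was given.
show_args() {
  echo "got $# argument(s)"
  for arg in "$@"; do
    echo "  [$arg]"
  done
}

show_args commit -m fix typo      # four arguments: spaces separate them
show_args commit -m "fix typo"    # three arguments: quotes keep "fix typo" together

# Capture the first line of each report for comparison.
unquoted=$(show_args commit -m fix typo | head -n 1)
quoted=$(show_args commit -m "fix typo" | head -n 1)
```

The second call shows why commit messages given on the command line are usually wrapped in quotes.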
## Working locally
Git is also just another program, so when you type in something like `git subcommand whatever...`, the `subcommand whatever` part gets sent to Git, the program. Git, in turn, just manipulates a directory in your repository called `.git` (you can't normally see it when you run `ls` because it's "hidden"; if you say `ls -a` at the top level of your repository, it will appear).
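
If you want to see the hidden `.git` directory for yourself, the following sketch creates a brand-new repository in a throwaway directory (chosen by `mktemp`, so it does not touch your 6.005 work) and lists it both ways.

```shell
repo=$(mktemp -d)      # throwaway directory for the demonstration
cd "$repo"
git init -q            # turn it into an (empty) Git repository

ls                     # prints nothing: .git is hidden
ls -a                  # shows ., .., and .git
hidden=$(ls -a | grep -x '\.git')   # pick the .git entry out of the listing
```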
### Configuration
Before you go on, it's a good idea to configure Git to be a little bit nicer.
#### Who are you?
Every Git commit has an author, the name and email address of the person who wrote the code.
Especially when working on a team project, you should make sure your Git commits include your correct name and email.

    git config --global user.name "Your Name"
    git config --global user.email username@mit.edu

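
To check what Git has recorded, you can read a setting back with `git config --get`. The sketch below uses a throwaway repository and leaves out `--global`, so the placeholder name and address (both made up) apply only to that repository and your real settings stay untouched.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, a setting applies to this repository only.
git config user.name "Example Student"
git config user.email student@example.com

git config --get user.name     # prints the name Git will record in commits here
git config --get user.email
name=$(git config --get user.name)
```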
#### Commit Messages
or, "I do git commit and then I can't type!"
When you run `git commit`, you will be presented with a text editor that lets
you write the commit message.
Unfortunately, Git may choose a default text editor that is unexpected and
unintuitive.
Before making your first commit, try running `nano` in the terminal.
The result should be a simple editor with instructions at the bottom of the screen; quit with `ctrl-X`.
If that worked, `git config --global core.editor nano` will configure Git to use the nano editor.
The commands to use the text editor (like copy, paste, quit, etc.) will be shown on the bottom of the screen.
The `^` symbol represents the `ctrl` key.
For example, you can press `ctrl-O` to save (Nano calls it "write out") and then `ctrl-X` to quit.
#### Adding some color
Out of the box, it can be hard to read and understand all the output that Git prints. One way to make it a little easier is to add some color. You can run the following commands to make your Git output colorful:

    git config --global color.branch auto
    git config --global color.diff auto
    git config --global color.interactive auto
    git config --global color.status auto
    git config --global color.grep auto

### Basic workflow
The basic building block of data in Git is called a "commit". A commit
represents some change to one or more files (or the creation of one or
more files).
When you first create a file or change a file, Git does not yet know
about that change. To tell Git about it, run
`git add file.txt` (where file.txt is the file you want to add)
This "stages" the file. Once you've staged all your changes, run
`git commit`
This will pop up an editor that will give you a chance to write a _commit message_. When you save and close the editor, the commit will be created.
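
Here is the whole add-then-commit cycle as a sketch you can paste into a terminal. It runs in a throwaway repository with a placeholder identity (both made up for the example), and it passes the commit message with `-m` so that no editor opens.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity, this repo only
git config user.email student@example.com

echo "hello" > file.txt              # create a file; Git does not know about it yet
git add file.txt                     # stage it
git commit -q -m "Add file.txt"      # -m supplies the message without opening an editor
count=$(git rev-list --count HEAD)   # number of commits made so far
```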
### Getting the status of your repository
Git has some nice commands for seeing the status of your repository.
The most basic of these is `git status`. You can run this at any point
to see which files Git sees have been modified and are still unstaged
and which files have been modified and staged (so that if you `git
commit` those changes will be included in the commit). Note that the
same file might have both staged and unstaged changes, if you changed
the file more after running `git add`.
When you have unstaged changes, you can see what the changes were
(relative to what is already staged, or to the last commit if nothing
is staged) by running `git diff`. Note that this will _not_ include
changes that were staged (but not committed). You can see those if you
run `git diff --staged`.
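
The difference between staged and unstaged changes is easiest to see on a file that has both. The following is a sketch in a throwaway repository; the file name and its contents are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity, this repo only
git config user.email student@example.com
echo "one" > file.txt
git add file.txt
git commit -q -m "First version"

echo "two" >> file.txt     # change the file...
git add file.txt           # ...and stage that change
echo "three" >> file.txt   # change it again without staging

git status --short         # "MM file.txt": staged and unstaged changes
git diff                   # shows only the added line "three"
git diff --staged          # shows only the added line "two"

# Keep just the added lines from each diff (skipping the "+++" header).
unstaged=$(git diff | grep '^+[^+]')
staged=$(git diff --staged | grep '^+[^+]')
```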
You can see what the last commit actually was with `git show`. This
will show you the commit message as well as all the modifications (as
if you had run `git diff`).
You can see the list of all the commits you made (along with their
commit messages) with `git log`. If you do `git log -p`, it will show
you the full commit history, including the changes each commit
made. In other words, this is as if you ran `git show` on each commit
in your history.
Note that `git show` and `git log` might place your command-line in a
state where you can't type more commands. Instead, there will be a
little colon (:) symbol at the bottom. This indicates that there is
more data than there was room on your screen and that you can scroll
with the arrow keys. You can leave this mode by pressing `q`.
### Commit IDs
Every Git commit has a unique ID, which is the long string of letters
and numbers that you see when you type `git log` or `git show`. This
is what's called a "hash" of the contents of your commit. One neat
feature is that this ID is unique not just within your repository, but
actually within the _universe_ of Git commits. In other words, if your
commit ID is something like `ab1312313febc241...`, that commit is
(with extremely high probability) the _only_ commit in the world with
that name.
You can reference a commit by its ID (or frequently just by the first
8 characters, as long as no other commit in your repository starts
with the same characters). This is most useful with something like
`git show`, where you can look at a particular commit, rather than
just the most recent one.
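
As a sketch of looking up a commit by an abbreviated ID, the following makes one commit in a throwaway repository (placeholder identity and file name) and then refers to it by the first 8 characters of its ID.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity, this repo only
git config user.email student@example.com
echo "hello" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

full=$(git rev-parse HEAD)            # the full ID of the latest commit
short=$(echo "$full" | cut -c1-8)     # its first 8 characters
git show --stat "$short"              # look the commit up by the short ID
resolved=$(git rev-parse "$short")    # Git expands the prefix back to the full ID
```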
## Working remotely
So far, all the commands we've been running have only been operating
_locally_; that is, they haven't gone past your computer. This is
still pretty useful, but sometimes you want to go further.
### Remotes
Unlike other similar systems, Git doesn't have a built-in notion of a
"central repository." Instead, any repository can push to any other
repository by specifying it as a "remote." A "remote" is just a pair
of a name (which can be anything) and a URI, which is a string
indicating how it can find the other repository. The URI might look
something like this:
`ssh://username@athena.dialup.mit.edu/afs/athena.mit.edu/course/6/6.005/git/sp13/psets/ps0/username.git`
Breaking that down:
* `ssh://` --- this specifies the _protocol_ git should use to
transfer the data. SSH is a protocol that lets you send data securely,
which is useful to us because we have to type in a password. But in
principle this is totally analogous to, for example, the http:// which
you see in web browsers (HTTP is a protocol commonly used for data on
the Web).
* `username@athena.dialup.mit.edu` --- this actually has two
parts. The `username` is the username you use to log in to the
server. The `athena.dialup.mit.edu` is the address of the server
itself. `athena.dialup.mit.edu` is the name of an Athena server
IS&T runs. It accepts Kerberos logins, so your `username` can just
be your Kerberos name.
* `/afs/athena.mit.edu/course/6/6.005/git/sp13/psets/ps0/username.git`
--- this is the path on the server where the repository is stored. The
only noteworthy thing here is the `username.git` part at the end. In
6.005, for our convenience, we specify your repository with your
username. This doesn't have to be the case, though, and isn't
always---for example, in your projects, the name is actually the
usernames of all three members of your group. Perhaps most
importantly, there's nothing in Git to say that the username you log
in with (the thing before the `@` sign) and the username at the end of
the path have to match. In 6.005, we just set it up that way.
Now, even though Git doesn't have the idea of a central repository,
having one is very useful for 6.005. Thus, in 6.005, all of your repositories
are actually created by _cloning_ a remote repository which we create
(and which acts as the "central" repository). You've done this with
the `git clone URI directory` command a bunch of times now. This
actually does a couple of things:
1. Create an empty directory called `directory` (i.e. the last argument to `git clone`).
2. Initialize it as an empty Git repository.
3. Add a remote with the URI you specified and the name `origin`.
4. Download the data from the remote and check out the files into `directory`.
So for those of you who were wondering, that's what the `origin`
means. It's just the default name of the remote repository that you
cloned your repository from.
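
You can try cloning without the course server by letting a repository on your own disk play the role of the remote. The sketch below is set up that way: `central.git`, `seed`, `ps0`, and the "Course Staff" identity are all made-up names, and the first half only exists to give the stand-in remote something to download.

```shell
work=$(mktemp -d)
cd "$work"

# Build a small "central" repository to clone from. On the course server
# this already exists; here a bare repository on the local disk stands in.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master
git clone -q central.git seed 2>/dev/null    # hides the "empty repository" warning
cd seed
git symbolic-ref HEAD refs/heads/master      # make sure the branch is called master
git config user.name "Course Staff"
git config user.email staff@example.com
echo "starter code" > README.txt
git add README.txt
git commit -q -m "Starter code"
git push -q origin master
cd ..

# The clone itself: the URI first, then the directory to create.
git clone -q "$work/central.git" ps0
cd ps0
git remote -v          # origin is listed with the URI given to git clone
ls                     # README.txt was downloaded and checked out
remote_name=$(git remote)
```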
### Pushing
After you've made some commits, you might want to push them to a
remote repository. Again, in 6.005, you really only have one remote
repository to push to, called `origin`. To push to it, you run the
command
`git push origin master`
The `origin` in the command specifies that you're pushing to the
`origin` remote. The `master` refers to the `master` branch. Branches
are an advanced feature of Git that we're not going to be using in
6.005, but since Git has them, you do have to specify a branch. For
now, just include this part when you push.
Once you run this, you will be prompted for your password and
hopefully everything will push. You'll get a line like this:
`a67cc45..b4db9b0 master -> master`
Sometimes, though, things will go wrong. You might get an output like
this (newer versions of Git may say `(fetch first)` in place of
`(non-fast-forward)`):
`! [rejected] master -> master (non-fast-forward)`
What's going on here is that Git won't let you push to a repository
unless all your commits come after all the ones already in your remote
repository. If you get an error message like that, it means that there
is a commit in your remote repository that you don't have in your
local one (probably because a teammate pushed before you did). If you
find yourself in this situation, you have to pull first and then push.
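
The rejected push is easy to reproduce on your own machine. In this sketch a repository on the local disk (`central.git`) stands in for the course server, and two clones named `you` and `teammate` stand in for two people; all of those names, and the files, are made up. The `pull.rebase` setting is there because newer versions of Git ask you to choose how a pull should combine diverging histories, and `false` selects the merging behavior this handout describes.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git                     # stand-in for the server
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# You clone, make the first commit, and push.
git clone -q central.git you 2>/dev/null
cd you
git symbolic-ref HEAD refs/heads/master            # make sure the branch is called master
git config user.name "You"
git config user.email you@example.com
git config pull.rebase false                       # pulls should merge
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Start notes"
git push -q origin master
cd ..

# A teammate clones, commits, and pushes before you do.
git clone -q central.git teammate
cd teammate
git config user.name "Teammate"
git config user.email teammate@example.com
echo "teammate's file" > other.txt
git add other.txt
git commit -q -m "Add other.txt"
git push -q origin master
cd ../you

# Meanwhile you commit too, then try to push.
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes"
if git push -q origin master 2>/dev/null; then
  push_result=accepted
else
  push_result=rejected      # the remote has a commit you do not have yet
fi
echo "first push was $push_result"

# Pull first (the two changes do not conflict), then push again.
git pull -q --no-edit origin master
if git push -q origin master; then
  second_push=accepted
else
  second_push=rejected
fi
echo "second push was $second_push"
```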
### Pulling
To perform a pull, you should run `git pull origin` (again, the
`origin` tells Git that you're pulling from the `origin` remote). When
you run this, Git actually does two things:
1. It downloads the changes and stores them in its internal state. At this point, your repository doesn't appear any different---it just knows what the state of the remote repository is and what the state of your repository is.
2. It incorporates the changes from the remote repository into your local repository via a process called _merging_.
#### Merging
If you made some changes to your repository and you're trying to
incorporate the changes from another repository, you need to merge
them together somehow. In terms of commits, what actually needs to
happen is that you have to create a special _merge_ commit which
encompasses both changes. How this process actually happens depends on
the changes.
If you're lucky, then the changes you made and the changes that you
downloaded from the remote repository don't conflict. For example,
maybe you changed one file and your partner changed another. In this
case, it's safe to just include both changes. Similarly, maybe you
changed different functions of the same file. In these cases, Git can
do the merge automatically. When you run `git pull`, it will pop up an
editor as if you were making a commit---in fact, this is the commit
message of the merge commit that Git automatically generated. Once you
save and close this editor, the merge commit will be made and you will
have incorporated the changes. At this point, you can try to `git
push` again and hopefully it will work this time.
Sometimes, you're not so lucky. If the changes you made and the
changes you pulled edit the same part of the same file, Git won't know
how to resolve it. This is called a _merge conflict_. In this case,
you will get an output that says `CONFLICT` in capital letters. If you
run `git status`, it will show the conflicting files with the label
`both modified`. You now have to edit these files and resolve them by hand.
First, open them up in your text editor (probably Eclipse for
6.005). The conflicted parts will be clearly marked with lines that
begin with `<<<<<<<`, `=======`, and `>>>>>>>`. Everything between the
`<<<<<<<` and `=======` lines is your version of the change. Everything
between the `=======` and `>>>>>>>` lines is the version you pulled
in. It's your job to figure out how to combine these. The answer will
of course depend on the situation. Maybe one change logically
supersedes the other, or maybe they can be merged somehow. You should
edit the file to your satisfaction and remove the marker lines when
you're done.
Once you have resolved all the conflicts (note that there can be
several conflicting files, and also several conflicts per file), `git
add` all the affected files and then `git commit`. You will have an
opportunity to write the merge commit message (where you should
describe how you did the merge). Now you should be able to push.
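
To practice resolving a conflict somewhere harmless, you can stage one on purpose. As a sketch: a repository on the local disk (`central.git`) plays the remote, the clones `you` and `teammate` play two people, and both edit the same line of a made-up `settings.txt`. The `pull.rebase` setting tells newer versions of Git that pulls should merge, which is the behavior described above.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git                     # stand-in for the server
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

git clone -q central.git you 2>/dev/null
cd you
git symbolic-ref HEAD refs/heads/master            # make sure the branch is called master
git config user.name "You"
git config user.email you@example.com
git config pull.rebase false                       # pulls should merge
echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Add settings"
git push -q origin master
cd ..

git clone -q central.git teammate
cd teammate
git config user.name "Teammate"
git config user.email teammate@example.com
echo "color = blue" > settings.txt                 # the teammate edits the line...
git commit -q -a -m "Use blue"
git push -q origin master
cd ../you

echo "color = green" > settings.txt                # ...and you edit the same line
git commit -q -a -m "Use green"

git pull origin master                             # reports CONFLICT and stops mid-merge
git status --short                                 # "UU settings.txt": both sides changed it
cat settings.txt                                   # shows both versions between the markers
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' settings.txt)

echo "color = green" > settings.txt                # resolve by hand; here we keep our line
git add settings.txt
git commit -q -m "Merge teammate's change, keeping green"
git push -q origin master                          # now the push goes through
git log --oneline --graph                          # the merge commit joins both histories
```

In real work you would resolve the conflict in your editor rather than overwriting the file with `echo`, but the `git add` and `git commit` steps afterwards are the same.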
### Big caveat: pulling without committing!
One thing you should be very careful about is to commit all your
changes before doing a `git pull`. If you don't do this, what's going
to happen is that Git will download all the files, but then refuse to
try to do a merge because it's worried about overwriting your
changes. If you make a commit and then try `git pull` again, it might
say `Already up to date` even though the changes haven't been
incorporated. If you accidentally run into this situation, `git merge
origin/master` will force the merging process to happen (`origin/master`
is your clone's record of the `master` branch it downloaded from `origin`).
## Errors
### No repository
If `git clone ssh://...` reports that it "could not read from remote repository", check your repository URL for typos.
If you are sure you have the correct URL, and especially if you registered late, contact the staff to make sure a repository has been created for you.
### Can't clone
Because your origin Git repository is stored on Athena and accessed with SSH, certain Athena customizations can conflict with Git's ability to clone and push.
If running `git clone ssh://...` reports a "protocol error" or simply asks for your password but then hangs and does nothing, you should review any changes you made to your Athena dotfiles, especially `.bashrc.mine` and `.bash_environment`.
Ask a TA for help if you are not familiar with Athena.
## Verifying that your code is on Athena
- Use `git status`, which will report versions you have not pushed to Athena
with a message like "Your branch is ahead of 'origin/master' by 2 commits".
- Use `git log` to review the versions you have committed.
- If Didit ran a build for a version, that version is on Athena.
- In your clone, run `git log --decorate` and look for a version labeled
`origin/master`; that's the version on Athena, as far as your clone is
aware.
- This one is not a commonly-used command, but in your clone, run `git
ls-remote` to connect to Athena and output the current version there.
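
The `git ls-remote` check from the last bullet can be turned into a direct comparison. In this sketch a repository on the local disk (`central.git`) stands in for Athena, and the clone name, identity, and file are made up; in your real clone only the last few lines would apply.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git                     # stand-in for the server
git --git-dir=central.git symbolic-ref HEAD refs/heads/master
git clone -q central.git my-copy 2>/dev/null
cd my-copy
git symbolic-ref HEAD refs/heads/master            # make sure the branch is called master
git config user.name "Example Student"
git config user.email student@example.com
echo "answer" > ps0.txt
git add ps0.txt
git commit -q -m "Finish ps0"
git push -q origin master

local_id=$(git rev-parse HEAD)                                   # what your clone has
remote_id=$(git ls-remote origin refs/heads/master | cut -f1)    # what the remote has
if [ "$local_id" = "$remote_id" ]; then
  echo "the remote has your latest commit"
else
  echo "the remote is missing commits"
fi
```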
----
### Have fun in 6.005!