Section 2 An Introduction to Git And GitHub

Git is a version control system that helps manage projects, especially projects that involve multiple people. GitHub is a hosting service for these Git projects (in GitHub-speak, projects are housed in repositories). Here, we will outline how to set up Git, create your own GitHub account, and connect them to you local R environment to easily publish and share the code you develop. A great article on the benefits of using Git and GitHub has been written by Jennifer Bryan, which you can find here and we highly recommend reading!

In fact, Jennifer Bryan has developed what we feel is the best guide for using Git and GitHub out there, Happy Git and GitHub for the useR. So as not to reinvent the wheel, this outline will mostly just direct you to that body of work.

2.1 Setting up Git and GitHub

Matt made an excellent YouTube video (also based on Jenny Bryan’s work) on how to set up your own GitHub account, install Git, and allow them to interact with your personal R environment. It’s also a great primer for how Git and GitHub work in R Studio.

embed_url("https://www.youtube.com/watch?v=Y4sOTFV3FAM&t=1s") %>%
  use_align("center")

Key takeaways:

  1. Create a GitHub account. (see https://happygitwithr.com/github-acct.html)
  2. Install Git on you local computer. (see https://happygitwithr.com/install-git.html)
  3. In the R Studio Terminal, connect your GitHub account to your local computer. (see https://happygitwithr.com/hello-git.html)

2.2 Developing, viewing, and editing repositories

As mentioned in Matt’s video (~7:53), creating a repository is the first step when starting a brand-new Git project. Essentially, a repository is the main folder in which all the files and code related to a project live. For any repository on GitHub, there are two ways of copying it to your local computer: cloning and forking.

2.2.1 Cloning

Cloning a repository maps a local version of that repository to your computer, allowing you to sync both the local (your computer) and remote (what’s on GitHub) versions. This type of copying is useful for workflows where you are the only one working on them. If you clone a repository that you did not create yourself, you will be unable to make changes to the ‘main’ repository hosted on GitHub unless the developer has given your account collaborator approval, but it will allow you to refresh and update what’s on your local computer if changes have been made to what’s on the ‘main’ repository on GitHub.

However, for a repository that you have created yourself or are a collaborator on, cloning allows two-way syncing; you are able to make changes to the GitHub repository as well as update what’s on your local computer if the GitHub repository changes.

After editing, adding, or removing pieces of the project, you can re-sync those changes by committing and then pushing the project back to GitHub. A commit essentially packages and saves the changes you made; it also allows you to comment on what exactly you changed in the repository. Meanwhile, the push command sends those changes that were packaged by the commmit to GitHub and updates the remote repository on GitHub accordingly. Because cloning creates a direct connection between the GitHub ‘main’ repository and what you’ve cloned to your local computer, you can only push changes to repositories that are located in your own GitHub account, or to repositories that the developer of the repository has explicitly made you a collaborator on.

How to clone and update your GitHub:

  1. Create a repository on GitHub. (https://happygitwithr.com/new-github-first.html#make-a-repo-on-github-2)
  2. Clone the repository to your computer. (see https://happygitwithr.com/rstudio-git-github.html#clone-the-test-github-repository-to-your-computer-via-rstudio)
  3. After changes have been made, commit and push them to your GitHub. (see https://happygitwithr.com/rstudio-git-github.html#make-local-changes-save-commit)

2.2.2 Forking

Unlike cloning, forking a repository creates a totally separate, parallel copy of a repository that is housed within your own personal GitHub account. Specifically, if you’ve made changes to the code in a forked repository then committed and pushed those changes, it does not automatically make those same changes to the ‘main’ repository that it was forked from on GitHub. Instead, all changes are saved and stored only within the repository hosted within your own GitHub account.

Forking essentially creates a line of defense for changes in a workflow that would otherwise disrupt the workflows of other people working within that ‘main’ repository project. For the ‘main’ repository to be updated with the changes you made in your forked version, you must submit a pull request.

A pull request is an invitation to the owner of the ‘main’ repository to review the changes you’ve made to the code and potentially merge them into their ‘main’ repository. If the owner/reviewer finds that the code you’ve developed would be a good addition to the workflow and/or that the code has not diverged/disrupted others’ workflows, they can approve the pull request, and the ‘main’ repository will be updated accordingly.

Matt and a former grad student, Bryce Pulver, made another excellent video that walks through an example collaborative workflow using forking:

embed_url("https://www.youtube.com/watch?v=a_YY_PIDeq8") %>%
  use_align("center")

How to fork and submit a pull request

  1. Fork the repository on GitHub, then clone the forked repository to your local environment. (see https://happygitwithr.com/fork-and-clone.html#fork-and-clone-without-usethis)
  2. Create a direct connection with the main repository to keep your personal repository up-to-date. (see https://happygitwithr.com/fork-and-clone.html#fork-and-clone-finish)
  3. After changes have been made to the repository on your local computer, commit and push them to the forked repository on your personal GitHub. (see https://happygitwithr.com/rstudio-git-github.html#make-local-changes-save-commit, or Bryce at ~25:45)
  4. Go to GitHub in your web browser. Submit a pull request by going to the main repository and selecting and creating a pull request. (Bryce does this at ~31:40)

2.3 Making changes to repository visibility

When you create a new repository you can choose whether you’d like its visibility to be public or private. Within the lab our default expectation is public, in order to make our work open and reproducible. However, in certain circumstances repositories will be need to be private or visibility will need to change during the lifecycle of a project.

Before you make a change to the visibility of a repository, it’s important to make sure that all of your forks are up to date and to warn any of your collaborators that you are going to make this change. Changing visibility severs connections between all forks and the repository, so this is important to prepare for.

2.3.1 Reconnecting severed repositories

You can follow these steps if the connection between your fork and the upstream repository is severed:

  1. Rename your fork. This will enable you to have this (old) fork exist at the same time as a new fork in step 2.

  2. Create a second fork of the repository.

  3. Remove all files from the new fork.

  4. Clone your existing fork locally using this command in the command prompt: git clone --bare https://your_original_fork_url.

  5. Move into the newly created folder from step 4 using cd your_original_fork_folder.

  6. Run the following command. It will push the branch and commit history into the new fork: git push --mirror https://new_fork_url.

Instructions adapted from this post and input from B Steele.

2.4 Additional resources

Below is a list of additional resources about GitHub and collaborative workflows.

• GitHub website (https://docs.github.com/en).

• Forked workflows (https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow).

• Using Git and GitHub for team collaboration (https://medium.com/anne-kerrs-blog/using-git-and-github-for-team-collaboration-e761e7c00281).

• Starting a new group project on GitHub (https://www.digitalcrafts.com/blog/learn-how-start-new-group-project-github).