When I first started working as a data scientist, I was very intimidated by committing code to GitHub, especially to repos that other people manage. The idea that a clumsy action of mine might break other people’s code terrified me.
Now that I have finally overcome the fear, I thought I’d share what my GitHub workflow looks like in my daily job.
Scenario 1: you are assigned a new task with a Jira or Clubhouse ticket number, DS-1234.
A good habit I learned from my developer coworkers is to always start a new branch named after your ticket number. This way, the branch points back to the background, the story, and the scope of the work. This is very helpful not only for your code reviewer and your PM (who might kindly ask what you have shipped) but, most importantly, for your future self.
# Create and switch to a branch named after the Jira ticket
$ git checkout -b DS-1234
You can check your current branch with the command below:
$ git branch
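When the branch list gets long, it can help to print only the name of the current branch. Here is a minimal sketch in a throwaway repository; the temporary directory and the identity values are placeholders I made up for illustration, not part of any real project.

```shell
set -e
# Throwaway repository so the example does not touch real work (hypothetical setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
git commit -q --allow-empty -m "initial commit"

# Create and switch to the ticket branch
git checkout -q -b DS-1234

# Print only the current branch name
# (on Git 2.22 or newer, `git branch --show-current` does the same)
current=$(git rev-parse --abbrev-ref HEAD)
echo "$current"
```

In the plain `git branch` output, the current branch is the one marked with an asterisk.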
Now you can start working on your branch DS-1234. You can add new code or edit existing code, and once you are happy with it, you can add and commit your work like this:
$ git add new_algo.py # a new script called new_algo.py
$ git commit -m "added a new algo"
$ git push origin DS-1234
Now your code is updated on both your local DS-1234 branch and the remote DS-1234 branch on GitHub.
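If you want to convince yourself that the local and remote branches really match after a push, you can compare their commit hashes. The sketch below uses a local bare repository as a stand-in for GitHub, which is my assumption for the sake of a self-contained example; the paths and identity values are hypothetical.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stand-in for the GitHub repository
git init -q "$work/local"
cd "$work/local"
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "initial commit"

# Same steps as above: branch, add, commit, push
git checkout -q -b DS-1234
echo "print('hello')" > new_algo.py
git add new_algo.py
git commit -q -m "added a new algo"
git push -q origin DS-1234

# The local branch and its remote counterpart should point at the same commit
local_sha=$(git rev-parse DS-1234)
remote_sha=$(git rev-parse origin/DS-1234)
echo "local:  $local_sha"
echo "remote: $remote_sha"
```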
Scenario 2: after your first commit, it’s always a good idea to create a Pull Request (PR) and tag a coworker for code review.
You can create a PR for branch DS-1234 on GitHub by visiting:
You can also find this path in the output when you run git push origin DS-1234.
Say your coworker gives you some feedback and comments, you make some small edits to the readme directly on GitHub, and now you want to further update new_algo.py locally.
You can first pull the readme changes from GitHub by running the command below in your local terminal:
$ git pull origin DS-1234
Then, after updating new_algo.py (and committing the change with git add and git commit as in Scenario 1), you can push your updated code to the GitHub remote branch by running:
$ git push origin DS-1234
Once you have confirmed that your coworker has merged your updated branch DS-1234 into the master branch, you can safely delete DS-1234 locally and remotely.
# delete local branch (refuses if the branch has unmerged commits)
$ git branch -d DS-1234
# or force-delete it even if it has unmerged commits
$ git branch -D DS-1234
# delete remote branch using push --delete
$ git push origin --delete DS-1234
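Before deleting, you may want to double check that the branch has really been merged. A minimal sketch, again in a throwaway repository with placeholder identity values: it shows that the lowercase -d refuses to delete an unmerged branch, and that git branch --merged lists the branch once the merge is done.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b DS-1234
git commit -q --allow-empty -m "work on DS-1234"
git checkout -q master

# Not merged yet: -d refuses, so the branch survives
git branch -d DS-1234 2>/dev/null || echo "DS-1234 is not merged yet, -d refused"

# After the merge, the branch shows up in the merged list and -d succeeds
git merge -q --no-ff -m "merged DS-1234 into master" DS-1234
git branch --merged master
git branch -d DS-1234
remaining=$(git branch --list DS-1234)
```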
Scenario 3: if your coworker trusts you enough and asks you to merge it yourself, or if you need to merge another person’s branch into master, you can do the following:
# assuming you are merging DS-1234 into the master branch
$ git checkout master
$ git merge --no-ff -m "merged DS-1234 into master" DS-1234
# or if you are merging someone else's bug fix branch bugfix-234
$ git merge --no-ff -m "merged bugfix-234 into master" bugfix-234
Note that the --no-ff flag prevents git merge from performing a “fast-forward” when it detects that your current HEAD is an ancestor of the commit you’re trying to merge. Instead, Git always creates a merge commit, so the merge stays visible in the history.
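To see what the flag does, you can count the parents of the commit that the merge produces. This is a small sketch in a throwaway repository (the path and identity are placeholders): master has not moved since DS-1234 was created, so a plain merge would fast-forward, but with --no-ff the result is a real merge commit with two parents.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b DS-1234
git commit -q --allow-empty -m "added a new algo"

# master is an ancestor of DS-1234 here, so a fast-forward would be possible
git checkout -q master
git merge -q --no-ff -m "merged DS-1234 into master" DS-1234

# A merge commit lists two parents; a fast-forward would leave HEAD with one
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of HEAD: $parent_count"
```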
Scenario 4: you want to remove a file.
$ git rm test.py # remove test.py file
# commit your change
$ git commit -m "remove test.py file"
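What if you changed your mind or deleted the wrong file? Not to worry. Because the removal has already been committed, you can restore the file from the previous commit (use HEAD instead of HEAD~1 if you have not committed the removal yet):
$ git checkout HEAD~1 -- test.py
Now test.py is back!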
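Here is a minimal sketch of restoring a removed file in a throwaway repository, covering both the case where the removal is still uncommitted and the case where it has already been committed. The file content and identity values are made up for the demo.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
echo "print('test')" > test.py
git add test.py
git commit -q -m "add test.py"

# Case 1: removed with git rm, but not committed yet -> restore from HEAD
git rm -q test.py
git checkout HEAD -- test.py
test -f test.py && echo "restored before commit"

# Case 2: the removal is already committed -> restore from the previous commit
git rm -q test.py
git commit -q -m "remove test.py file"
git checkout HEAD~1 -- test.py
restored=$(cat test.py)
echo "$restored"
```

In the second case the restored file comes back staged, so a new commit is needed to keep it.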
Scenario 5: remove untracked files
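Sometimes you create a bunch of new files that you do not want to keep. Because Git is not tracking them yet, you can run the commands below to clean them all up at once. Note that git clean does not undo edits to files that are already tracked.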
# dry run to see which files will be removed
$ git clean -d -n
# remove them
$ git clean -d -f
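A small sketch of what git clean touches and what it leaves alone, using a throwaway repository. The file names tracked.py, scratch.txt, and tmp_out are hypothetical names I picked for the demo.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
echo "v1" > tracked.py
git add tracked.py
git commit -q -m "add tracked.py"

echo "v2" > tracked.py                    # edit to a tracked file
echo "scratch" > scratch.txt              # untracked file
mkdir tmp_out
echo "x" > tmp_out/log.txt                # untracked directory

# Dry run: lists scratch.txt and tmp_out/, but not tracked.py
git clean -d -n

# Actually remove the untracked file and directory
git clean -d -f -q

# The edit to the tracked file is still there
tracked_content=$(cat tracked.py)
echo "$tracked_content"
```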
Scenario 6: [Advanced Scenario] I found myself making changes on the master branch before creating the DS-1234 branch. Actually, this has happened to me multiple times 😂.
We have two solutions.
Solution 1: git stash
# step 1: save the changes you made on master branch
$ git stash
# step 2: create and switch to DS-1234 branch
$ git checkout -b DS-1234
# step 3: transfer the changes using stash pop
$ git stash pop
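The three steps can be tried end to end in a throwaway repository. This is only a sketch with made-up file content and placeholder identity values; it checks that the accidental edit ends up as an uncommitted change on DS-1234.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
echo "v1" > new_algo.py
git add new_algo.py
git commit -q -m "initial commit"

# Oops: an edit made on master by accident
echo "v2" >> new_algo.py

git stash > /dev/null                     # step 1: save the changes
git checkout -q -b DS-1234                # step 2: create and switch to the branch
git stash pop > /dev/null                 # step 3: bring the changes back

branch=$(git rev-parse --abbrev-ref HEAD)
changed=$(git diff --name-only)           # files with uncommitted edits
echo "on $branch, modified: $changed"
```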
Solution 2: cherry-pick 🍒
This is a trick I learned from my previous manager. It’s similar to git cherry-pick, but it is more intuitive in my opinion since it shows the path very clearly. Note that it copies a whole file from another branch rather than replaying individual commits.
Now, imagine that on your DS-1234 branch you have worked on many pieces of code, and tomorrow is the deadline to merge your code into the release branch. It looks like only one piece of code is ready to merge. You can pick just this one by running the commands below:
# step 1: pull all recent changes from remote
$ git pull
# step 2: checkout your branch and release branch
$ git checkout DS-1234 # pick from this branch
$ git checkout release-2021-07-01 # to this target branch
# step 3: checkout the code ready to merge
$ git checkout DS-1234 new_algo.py
# step 4: add this code to release branch
$ git add new_algo.py
# step 5: commit
$ git commit -m "cherry pick changes from DS-1234 to release"
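Here is a sketch of the same idea in a throwaway repository, without a remote (so the git pull step is left out). The second file, other.py, is a hypothetical stand-in for the work that is not ready yet. Only new_algo.py is carried over to the release branch.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Example User"
echo "v1" > new_algo.py
echo "v1" > other.py                      # hypothetical file that is not ready
git add new_algo.py other.py
git commit -q -m "initial commit"
git branch release-2021-07-01

# Work on the ticket branch touches both files
git checkout -q -b DS-1234
echo "v2 ready" > new_algo.py
echo "v2 not ready" > other.py
git commit -q -am "work in progress on DS-1234"

# Take only new_algo.py from DS-1234 onto the release branch
git checkout -q release-2021-07-01
git checkout DS-1234 -- new_algo.py       # the file arrives already staged
git commit -q -m "cherry pick changes from DS-1234 to release"

algo=$(cat new_algo.py)
other=$(cat other.py)
echo "new_algo.py: $algo / other.py: $other"
```

Checking out a file from another branch also stages it, so the separate git add in step 4 is harmless but not strictly required.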