Over the last few years, I have had the incredible opportunity to be a core team member of the Wagtail project. In that time, I have reviewed many new pull requests, and I’ve also had the chance to submit many of my own across Wagtail and many other projects.
I do enjoy the challenge of improving code, reading others’ code and learning how to write my own code better. I also love helping new developers get their first few pull requests over the line and become contributors. As we lead into Hacktoberfest I thought it would be a good chance to write down the things that I see time and time again when new contributors get involved and hopefully can help with some tips to give your contributions a better chance of being reviewed.
When you want to make something enticing for others, it not only needs to taste great but you need to think about presentation and also communication. You may have made the most amazing flat white (coffee), but if you leave out the epic latte art, it will not be as good. If your coffee is not what the customer ordered or is the wrong size, you have to start again. When you are able to tell a story or clearly articulate the decisions that went into the menu, it is always a more enjoyable experience.
These ingredients apply to not just open source contributions but any code contributions at work or even in your own personal projects.
1. Read the [development] instructions
When you want to contribute, introduce yourself to the community, or even work out how to get started, take the time to read the project's development docs.
First, look at the project's readme for a Contributing or maybe a Development section. This may give you some basic instructions and hopefully point you in the direction of some documentation links.
Read the entire section, note all links, then read the content on each link. Sometimes, it is easy to skim through these because you want to get started quickly. But it's better to take 20 mins now to read the documentation instead of spending two hours later to start all over again.
About ‘fork the code’
Many contribution sections gloss over the mammoth task that can be a single line in the documentation similar to “fork the code and get it running locally”. This, on its own, can be a daunting task if you are just getting started, so let’s dive a bit deeper or if you know what this means skip to item two.
No single blog post like this can cover the nuance of this for every single project but it is good to call out some principles when it comes to getting started.
- A. Basic understanding of git vs GitHub and/or GitLab git is the version control tool, it is something you install on your device and runs usually in the command line (terminal) or via some GUI application.
- GitHub & GitLab are two prominent websites that provide a web user interface for repositories using git. Mozilla has a great guide that helps to explain Git and GitHub
- B. How to clone a remote repository and what that actually even means
- On GitHub or Gitlab, you will not be allowed to directly create branches or changes in a repository (project) that you do not have access to.
- However, you can make a copy (clone) of this repository using your own account, this clone will have all the branches and history that the original repository had.
- This is also called ‘fork’ in some cases, as your repository will be a branch of its own that forks the original repository.
- See the GitHub docs explain forking or how to fork a repository on GitLab.
- C. How to go about getting the code running locally Once this is done, you still will not have any code locally on your device, you still need to create yet another fork on your own machine. Yes, it’s forks all the way down, there is no spoon.
- This is where the git command line comes into play and the term clone makes more sense. On your local machine you will git clone some-website.git which will create a new folder with a copy of the project inside.
- See Atlassian’s docs on git clone for more details.
Bonus tip: Add a profile image to your GitHub profile and the community chat profile (Slack/Discord/Zulp). It does not need to be your image, but make it unique.
2. Read the issue and comments
Hopefully, at this point, you have a good sense of the purpose of the project and are still keen to contribute. Once you have the code forked and running locally, you will probably want to start looking for what to contribute.
Finding something to contribute is not always easy, especially if you are new to the project. Once you have a few candidate issues to investigate, be sure to read the entire issue description, all comments and all linked issues or pull requests. Often, you will find that someone else may have started or finished the issue, or sometimes there are clarifications in the comments about how to approach the problem or whether the problem is even something worth solving.
Bonus tip: If an issue has a pull request linked, but not merged, read that pull request and the discussion on it. Maybe the previous contributor got stuck or lost momentum, in which case you could pick up where they left off (assuming it's been enough time).
3. Create a fresh branch for your contribution
Before writing any code, take a moment to get your git hat on. When you clone the project locally, you will be checked out at the main branch (sometimes called master, or development depending on the project). This branch is not suitable for you to make your changes on. It is meant to be the branch that tracks the core development of the project.
Instead, take a moment to create a new branch. You can use the command line or install one of the many great git GUI tools. Don't listen to anyone that says you're not doing it right unless you use the command line. Reduce the things you need to learn today and focus on the git command line interface later. If you have a Mac, I recommend Fork, otherwise, the Github GUI is good enough.
This new branch name should have some context as to what you are fixing and if possible the issue number being fixed. For example git checkout -b 'feature/1234-add-unit-tests-for-inline-panel'. This branch name uses / to represent a folder and also has the issue number 1234, finally, it uses lower-kebab-case with a short description of the issue.
Bonus tip: You may find that your editor has some handy Git tooling and will often be able to tell you what branch you are on or whether you have any changes staged. https://code.visualstudio.com/docs/sourcecontrol/overview
4. Keep the changes focused
As a developer, it is easy to get distracted, maybe a typo here or white space that does not feel 'right' there. Sometimes, even our editor gets distracted and starts adding line breaks at the end of files as we save or it formats code without our consent due to configuration from a different project.
These added changes that are not the primary goal or not strictly required by the project's set-up are noise. This noise makes it harder to review the pull request and also can create confusion for future developers that see these commits and wonder how it relates to the bug that was fixed.
When you go to stage changes, only stage the parts you need or at least review the changes and 'undo' them before you save the commits.
If you do find a different problem, maybe a typo in the docs, this is what branches are for. Save your commits, create a new branch off main fix/fix-documentation-typo and then save that change to that branch., Now you have a small change, one that is easy to merge, which you can prepare a pull request for.
Keep your changes focused on the goal, do not add overhead to the reviewer or to yourself by changing things that do not need it (yet).
Bonus tip: It's OK to make changes in a 'messy' way locally, with lots of commits that maybe include things that are not needed. However, be sure to take some time to review your commits and clean up anything that is not required before you do your pull request.
5. Write unit tests
We are getting close to having a pull request, but the next critical step is unit tests. Anecdotally, I find that testing code you wrote will take 5-10x longer than the actual bug fix. Often, if the use case is right, it is better to write the tests first and get them running (but failing) before you fix the problem.
Finding how and where to write the unit tests can be hard in a new project, but hopefully, the project's development docs contain the clues you need to get started. Wagtai’s docs are hopefully a good example of this with a dedicated testing section.
If you fix a bug or introduce a new feature, you want to ensure that fix is long-lived and does not break again. You also want to help yourself by thinking through edge cases or potential problems. Testing helps with this. While regressions do happen, they are less likely to happen when code is tested.
Many projects will not even review a pull request without unit tests. Often, fixing a bug is not hard, ensuring the fix is the 'real' fix and that it does not break again is the hard part. Take the time to do the harder thing. It will help you grow as a developer and help your contributions make a longer lasting difference.
Bonus tip: A pull request that just adds unit tests to some core functionality that does not yet have tests is a great way to contribute, it helps you learn about the code and makes the project more reliable.
6. Give your pull request a name with context
A pull request that has the title 'fixes issue' is unhelpful at best, and spammy at worst. Take a few moments to think about how to give your change a title. Communicate, in a few words, the problem solved, feature added or bug fixed. Instead of 'Fixes 10423', use words and write a title 'Fixes documentation dark mode refresh issue'. No one in a project knows that issue 10423 is that one about the documentation dark mode refresh issue.
Also, add a proper name when you create the pull request. This will ensure that any notifications that go out to the team have the correct name from the start.
Bonus tip: Remember you can make a draft pull request in both GitHub and GitLab. This is a way to run the CI steps but in a way that indicates you are not ready for a review yet.
7. Reference the issue being fixed or resolved in the pull request
Referencing the issue being fixed within the pull request description is just as important as a good title. A pull request without a description is very difficult to review. Add some context and some steps to reproduce the issue or scenario.
It is often good to write yourself a checklist for any pull request and fill in the gaps.
- Small description of the solution, one sentence.
- Link to issue/s that should be resolved if this pull request gets merged.
- Questions or assumptions, maybe you made an assumption we no longer support IE11 with your CSS change, if it's not in the docs - write the assumption down.
- Details - additional details, context or links that help the reviewer understand the pull request.
- Note: Some projects have a checklist or template for all new pull requests, it is best to use that if it is there.
Bonus tip: You can also include issues in the commit message. Adding something like fixes #1234 in your commit message will let GitHub know that the change is for that issue.
8. Review & fix the CI failures
Once you have created your pull request, there will often be a series of build/check/CI steps that run.
These steps are normally all required to pass before the pull request can be merged. CI is a broad term but usually, the testing and linting will run on the code you have proposed to change. Linting is a tricky one because sometimes the things that are flagged seem trivial, but they are important for code consistency. Re-read the development instructions and see how you can run the linting locally to avoid frustrating back & forth with small linting fixes.
Testing is a bit more complex. Maybe all the tests can be run locally or maybe the CI will run tests on multiple versions of a project or language. Do your best to run all the tests locally, but there may still be issues on the CI when you do. That is OK, and normally you can solve these issues one by one.
The most important thing is to not just ignore CI failures. Read through each error report and try to work out the problem and provide a fix. Ignoring these will likely lead to pull requests that do not get reviewed because they do not get the basics right.
Bonus tip: GitHub will not run the CI automatically for new contributors in some projects. This is an intentional security feature and a core contributor will need to approve your initial CI run.
9. Push to the same branch with fixes and do not open a new pull request
Finally, after you have fixed the failing linting and tests locally, you will want to push those changes to your remote branch. You do not need to open a new pull request. This creates more noise and confusion. Instead, push your changes up to your branch, and the CI will run automatically on those changes.
You can add a comment if you want to the pull request that you have updated, but often this is not really needed.
Avoid opening multiple pull requests for the same fix. Doing that means all the comments and discussion from the previous pull request will get lost and reviewers will have trouble finding them.
10. Eagerness balanced with patience
When you take time to contribute out of your own personal time, or even that from your paid employer, it can be very frustrating when a pull request does not get reviewed. It is best to temper your expectations with this process and remember that many people on the other side of this are also volunteers or have limited time to prioritise.
It is best to celebrate your accomplishment at this point even if your pull request never gets merged. However, it is good to balance that with an eagerness about getting your amazing fix in place to help people who use the project. Balancing this tension is hard, but the unhelpful thing to do is give up and never contribute or decide that you won’t respond to feedback because it came too late.
Also, remember that it is OK to move on and try something else. Try a different issue or project or area of the code. Don’t just sit waiting for a response on the one thing you did before looking at other challenges.
If there can be one summary of all of this, it is to take the time to read and remember that Slow is fast and fast is slow. If you rush through and do not take the time to read and truly understand the problem you are solving, you will end up being much slower in the long run.
Finally, remember that you can do all the right things and it may not be something that should be a core feature. The problem you are solving may not be a priority for a number of reasons or sometimes the solution ends up being a minefield of edge cases and complexity.
Be thankful that you are in a position to be able to write code, have a laptop that works and have a brain that can solve problems. There will always be more problems to solve later.
Also, remember that this is hard and that is OK. Anything new is hard and new things in programming often mean learning multiple new things in parallel. Get a notepad (virtual or otherwise) and write each problem down as you encounter it, this way you can work through each one individually without being overwhelmed.
Bonus: Learn how to rebase on main/master
One of the more complex things to grok is the git rebase. I found this very hard to understand when getting started but it is a critical part of developing for open source.
When you have your branch of code, it contains commits. The branch itself is essentially a 'pointer' to the last of those commits. Git is incredible software and works out where these commits 'branch' off the other commits that themselves have pointers.
Often on a project, you will make a change, and then a few days or weeks later see that these changes cannot be merged in easily as they conflict with the core code.
Merging main into your branch is unhelpful here, it will seem like it solves the current problem of conflicts but will create other headaches down the road. Merging main into your branch pulls ALL those commits, there could be hundreds` into your branch. Git now points your branch to the latest commit here, which is a mix of your original commits and all other ancestor changes back to before you started your changes.
Instead, think about this process as a thought exercise;
- Create a new branch called my-temp-feature-2 off the latest main branch.
- Go to each of your commits and re-apply them one by one on main.
- Each time you get a conflict, resolve it one by one.
- Now you have those commits sitting 'on top' of main without any code conflicts.
- Delete the previous branch and rename your current temp branch with the same name as the previous branch.
This seems convoluted, but this is essentially the exact process of a rebase. It takes each commit, one by one, and writes it on top of another branch. As a conflict happens, you resolve the conflict for that specific commit and move on.
The final step is key though. You now have a local branch that is out of sync with the remote branch. You cannot just push because you will get an error. You must instead use the force and force push commands to overwrite the remote branch. Be careful before you do because these commands are absolute and will completely delete anything on the remote branch.
Some additional links on rebasing
- The countless developers who have contributed small and meaningful changes to the free software & open source world.
- Cover image - https://unsplash.com/photos/i9Hgm8Cf-20
- Editing - Meagen Voss from Torchbox
- This is a cross-post from my personal blog - https://dev.to/lb/ten-tasty-ingredients-for-a-delicious-pull-request-hgc
- You, the reader, for learning to read the whole issue before moving on to fixing the problem