Add some initial structure and guideline docs
Adds a docs directory with some initial documentation, including
guidelines and the notes from the discussion following the first review.

Co-authored-by: Penelope Yong <[email protected]>
llewelld and penelopeysm committed Nov 8, 2024
1 parent c7d8ff2 commit 46ff152
Showing 5 changed files with 369 additions and 2 deletions.
16 changes: 14 additions & 2 deletions README.md
@@ -1,9 +1,21 @@
# REG code reviews

A repository for conducting Cross-Project Code Review.

This repository will be used to conduct the review process, but also as a place to capture historical reviews.

## Current review

**14 Oct 2024:** [Wordcloud web app (SvelteKit, TypeScript, PostgreSQL)](https://github.com/alan-turing-institute/the-reg-review/pull/1)
| Reviewee | Reviewers | Pull Request | Date started | Date concluded |
|----------|-----------|--------------|--------------|----------------|
| penelopeysm | mastoffel, cptanalatriste, llewelld, phinate | [#1](https://github.com/alan-turing-institute/the-reg-review/pull/1) | 2024-10-14 | ongoing |

## Past reviews

To be populated.
| Reviewee | Reviewers | Pull Request | Date started | Date concluded |
|----------|-----------|--------------|--------------|----------------|
| -- | -- | -- | -- | -- |

## Documentation

For general documentation, please see the [docs directory](./docs).
58 changes: 58 additions & 0 deletions docs/guidelines-general.md
@@ -0,0 +1,58 @@
# General Guidelines

This document contains general guidelines and useful references related to the review process.

We're still in the early stages of the process, trying to figure out how this should (could) work.
This is a live document and we should expect it to be updated as time goes on.

## What is a code review?

TBC

## What is a cross-project code review?

TBC

## The process

There's no strict process that reviewers and reviewees are required to go through: this is supposed to be flexible and allow developers to choose the approach that best suits them and their project.
Nevertheless, it may be helpful to have some understanding of how we've done this in the past: what worked and what didn't.

Please try to avoid making this section prescriptive.

The high-level approach we've used up until now has been:

1. Create a personal fork of this [`the-reg-review`](https://github.com/alan-turing-institute/the-reg-review) repository and create a new branch in it for your work.
2. Add a new subdirectory `<date>-<name>` to the working tree for this new branch and add all of the code to be reviewed to it.
3. Commit these changes to your personal fork.
4. Create a Pull Request from the new branch on your personal fork to `main` in the upstream repo.
5. Add clear instructions about how to run the code and what the reviewee would like the reviewers to focus on in the pull request description.
6. Create a second pull request to update the `main` branch with the details of the review.
7. Reviewers are given a period of time to review the code (one or two weeks).
8. Reviewers add comments to the Pull Request using the standard GitHub tooling.
9. At the end of this period the reviewee can respond to the comments, update the code and submit it for re-review.
10. Steps 7-9 are repeated at the discretion of the reviewee.
11. The reviewee chooses when to merge any changes.
12. The review process is completed once the Pull Request has been merged.

In this arrangement the reviewee retains ultimate control to merge, or not merge, the changes as they see fit.
In particular, because the review and merging happen in a repository separate from the actual project repository, there is no requirement for them to ultimately make use of any of the changes.

Step 6 is needed because the `main` branch is protected.
Note that a linear history is also enforced by GitHub, so any changes will need to be rebased before merging.
This is just to help separate out the different reviews and ensure everything is kept clean.
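
To make the mechanics concrete, here is a minimal, hypothetical sketch of steps 1-4 and the final rebase as git operations, wrapped in Python purely for illustration. The username, branch name and `REVIEW_DIR` values are placeholders, not fixed conventions.

```python
"""Sketch of the review-submission workflow (steps 1-4 plus the rebase).

Assumes a fork of the-reg-review already exists under your GitHub account
and that git is installed. All names below are placeholders.
"""
import subprocess

USER = "your-github-username"        # placeholder: your GitHub account
REVIEW_DIR = "2024-10-14-yourname"   # placeholder: <date>-<name>
UPSTREAM = "https://github.com/alan-turing-institute/the-reg-review.git"
CLONE = "the-reg-review"

def git(*args: str, cwd: str = ".") -> None:
    """Run a git command, raising CalledProcessError if it fails."""
    subprocess.run(["git", *args], cwd=cwd, check=True)

# Step 1: clone your personal fork and create a branch for the review.
git("clone", f"https://github.com/{USER}/the-reg-review.git")
git("switch", "-c", f"review-{REVIEW_DIR}", cwd=CLONE)

# Step 2: copy the code to be reviewed into <date>-<name> by hand, then
# stage and commit it (step 3) and push the branch to your fork.
git("add", REVIEW_DIR, cwd=CLONE)
git("commit", "-m", f"Add code for review: {REVIEW_DIR}", cwd=CLONE)
git("push", "-u", "origin", f"review-{REVIEW_DIR}", cwd=CLONE)

# Step 4 (opening the Pull Request against upstream `main`) happens in the
# GitHub UI. Because a linear history is enforced, rebase onto the upstream
# `main` before the PR is finally merged:
git("fetch", UPSTREAM, "main", cwd=CLONE)
git("rebase", "FETCH_HEAD", cwd=CLONE)
```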

## Useful resources

Some useful resources that may be helpful to read before embarking on a review.

1. The original [Cross-project Code Review proposal](https://github.com/alan-turing-institute/Hut23/issues/1736).
2. Eric's [Peer Repo Audit proposal](https://github.com/alan-turing-institute/Hut23/issues/925) from a few years back.
3. [Code Review Stack Exchange](https://codereview.stackexchange.com/).
4. [Google's Code Review Guidelines](https://google.github.io/eng-practices/review/reviewer/standard.html).
5. [Google's Python Style Guide](https://google.github.io/styleguide/pyguide.html); just provided as an example.
6. Edward R. Ivimey‐Cook, *et al.*, ["Implementing code review in the scientific workflow: Insights from ecology and evolutionary biology,"](https://doi.org/10.1111/jeb.14230) Journal of Evolutionary Biology, Volume 36, Issue 10, 1 October 2023, Pages 1347–1356.
7. Carol S. Lee and Catherine M. Hicks, ["Understanding and effectively mitigating code review anxiety,"](https://doi.org/10.1007/s10664-024-10550-9) Empirical Software Engineering 29, 161 (2024).
8. The Turing Way, ["Code Reviewing Process"](https://book.the-turing-way.org/reproducible-research/reviewing) section in the Guide for Reproducible Research.
9. The Turing Way, ["Guidelines for Code Review"](https://book.the-turing-way.org/communication/peer-review/peer-review-code) section in the Guide for Communication.

31 changes: 31 additions & 0 deletions docs/guidelines-reviewee.md
@@ -0,0 +1,31 @@
# Guidelines for reviewees

This document contains some suggested guidelines and ideas for reviewees.
A reviewee is someone who puts their code forward for review by others.

In group discussions about the review process, the following have been chosen as good guidelines for reviewees to follow.

Please avoid controversial or highly-opinionated suggestions in the Good Practice list; use the Opinions list for those instead.

If it looks like two items in the Good Practice list contradict each other, the contradiction should be resolved, or they should be moved to the Opinions list.

## Good Practice

This list gives high-level advice about good practice when putting your code forward for review.

1. You will get a lot of feedback; don't take it personally.
2. Try to include as much code as is needed to get an example to run.
3. But focus the review on at most around 200 lines of code.
4. It shouldn't be a requirement for reviewers to run your code, but if they can it's likely to be helpful.
5. Think carefully about how to present your code for review. Formatting your code and the request for review should be a useful activity in itself.

## Opinions

This list is for opinions or more controversial comments that may nevertheless be useful for reviewers to consider.
It's expected that this list will contain contradictory advice, which is fine.

1. Nothing yet.
25 changes: 25 additions & 0 deletions docs/guidelines-reviewers.md
@@ -0,0 +1,25 @@
# Guidelines for reviewers

This document contains some suggested guidelines and ideas for reviewers.
A reviewer is someone who reviews someone else's code.

In group discussions about the review process, the following have been chosen as good guidelines for reviewers to follow.

Please avoid controversial or highly-opinionated suggestions in the Good Practice list; use the Opinions list for those instead.

If it looks like two items in the Good Practice list contradict each other, the contradiction should be resolved, or they should be moved to the Opinions list.

## Good Practice

This list gives high-level advice about good practice when reviewing someone else's code.

1. If you highlight something that should be changed, ideally also suggest how you would change it.
2. Running the code isn't a requirement, but it is likely to be informative if you can.
3. Don't be intransigent: be willing to yield if there's disagreement.

## Opinions

This list is for opinions or more controversial comments that may nevertheless be useful for reviewers to consider.
It's expected that this list will contain contradictory advice, which is fine.

1. Nothing yet.
241 changes: 241 additions & 0 deletions docs/meeting-notes/rev001-20251105.md
@@ -0,0 +1,241 @@
# REG Cross-Project Code Review Review

## The Date

2024-11-05, 15:00-16:00

## The Actors

The reviewee:
1. Penelope Yong

The reviewers:
1. Carlos Gavidia-Calderon
2. Martin Stoffel
3. Nathan Simpson
4. David Llewellyn-Jones

The reviewed and the review:
1. https://github.com/alan-turing-institute/the-reg-review/pull/1

## Summary

There was universal agreement that Penny had been impressively brave in putting forward her code for the very first REG Cross-Project Code Review.

Everyone -- both reviewers and reviewee -- seemed to get something useful out of the process.

Having some high-level guidelines for how to conduct the reviews will be useful, but nothing too restrictive.

We can collect material in the repository, with old reviews going in branches off `main`.

Crucial to success will be getting others to offer their code and reviewing time.

Actions:
1. David/Nathan/Martin: Approach Markus to see whether the OSSA could provide a Service Area platform to give a cadence to the review process.
2. All: if you personally have some code that you'd be willing to have reviewed, suggest it for review.
3. All: Gently encourage anyone who looks like they may have some code suitable for review to put it forward.

## Discussion Notes - experience of the review process

Penny
1. I got a lot out of it, but I'm not sure others did.
2. The comments were really nice: it didn't feel antagonistic.
3. People didn't take it as an MVP, which it was, but that was still useful.

Martin
1. As someone who's not so familiar with web apps, this was a useful learning experience.
2. It's something that I am really interested in, so it was good to get to dig into it.

Nathan
1. I have an "accidental perspective". I came into the process very enthusiastically, then spent the entire time getting the code to run.
2. So I come at it from the perspective of someone who knows about the style of coding but didn't leave very many comments. Going through the comments, I am learning not just from the app setup, but also things to watch out for that the comments threw up.
3. That's in part due to the maximal approach Penny used (providing the entire code in a runnable form).
4. There's learning to be had from the comments alone.
5. In this case we seem to have hit the **Goldilocks Zone**: good setup instructions and useful comments on that code.

Penny
1. Although Nathan didn't comment on the code, I received a lot of useful comments from his feedback while installing the app.

Nathan
1. There's a lot of positivity towards the code and the effort that was put in to making it accessible (in reference to the install instructions, rather than the 'accessibility' question raised by Penny for the review).

Penny
1. I appreciate the appreciation.
2. It took maybe 90 minutes to put the pull request together.
3. Assuming it's not the same person doing it every week this feels like a reasonable cost.

Nathan
1. Penny introduced a "viable product".
2. In some cases it may be useful to have a review of a smaller piece of code, for example a test case.

Penny
1. What's the ideal size? How to give something that's independent enough to review?
- This is too big https://github.com/TuringLang/Turing.jl/pull/2328
- This might be reasonable https://github.com/TuringLang/AbstractPPL.jl/pull/100
- Both in terms of size but also the fact that it doesn't use functionality from elsewhere in the package (so reduces the amount of context needed to understand the code). I wonder if that's the hardest part of it.

Carlos
1. I've not used Svelte, so it was hard to get into the code review without knowing the code.
2. Previously I've worked in environments where people had to be certified for code review, so this was a very different experience (i.e. not knowing so much about the codebase, etc.).
3. This was okay though, I could still contribute.

Martin
1. How do we choose what would be a good PR?
2. We do a PR from a normal project, but with some additional context.
3. Do we only take particular PRs?
4. Probably we'd need to decide on a case-by-case basis.
5. Especially if you have specific questions.

Carlos
1. I've only had one example of cross-project code review in the past.
2. In that case the code was reviewed for security by another engineer from an external organisation.
3. This was really painful.
4. The reviewer had their own code conventions.
5. We spent weeks on it.
6. The problem there was primarily to do with communication.
7. The reviewer had a very fixed set of ideas, for example about style.
8. To avoid this, we could have agreed on a set of coding standard guidelines at the outset.

Nathan
1. That type of reviewer, that kind of comment, coincides with a reviewer who has a strong style preference.

Martin
1. To mitigate this, for every comment you give, you could give a number: 1 - definitely change; 2 - feel strongly but just style; 3 - this is just opinion.

Carlos
1. "Nit" is typically used for this in code reviews.

Nathan
1. I had a very positive experience recently and we both had strong opinions but were very willing to yield.
2. But this may not happen. I'm not sure how you can prevent this except emphasising in the PR the kind of feedback you're looking for (e.g. "please don't give style comments").
3. Of course it's not just style, it may be libraries, etc. People may be stuck in their ways for all sorts of reasons.

## Discussion Notes - should we have guidelines?

Carlos
1. Try first, see what happens, then maybe bring in guidelines later.

Martin
1. Agree.

Nathan
1. What about a minimal proto-alpha approach so that people not familiar with code review have an idea about it.
2. It's possible we just got the right dispositions on this review and some initial steering criteria may be useful.

David
1. Having some light guidelines may also be helpful in ensuring people less familiar with code review overcome any apprehensions.

Carlos
1. Maybe we *should* have some initial guidelines, saying "you will get a lot of feedback, don't take it personally".

Nathan
1. I can see how this could happen; someone could take it negatively.
2. Make clear how we can best support you in the process?
3. Highlighting these things could be important.

Martin
1. We could have some initial small guidelines.
2. Something I've experienced and really liked during a scientific peer review was when a reviewer didn't just suggest what to change, but also how to change it.
3. E.g. "Try this model from this tool".
4. Give the person being reviewed some direction and suggestions for how to change something, not just what to change.

Carlos
1. Put some limit on the lines of code: say 200. If it's really big, people won't have time.

Nathan
1. We've started to suggest things for the person submitting the PR.
2. The most beneficial part of the process might be the construction of the PR.
3. In my experience the number one quality-of-life difference when working in a team is when the team can make a PR that's catered for other people. You've taken time to spell things out and distill a problem into a minimum viable example. A really useful exercise.
4. Like how giving a talk is a great way to learn a topic really well.

Penny
1. For a larger code base, abstracting a single file makes it easier to look at but harder to test in conjunction with other code.

Carlos
1. I rarely run the code, I just review it! \[gasps from everyone\]
2. Requiring code execution is a high bar.

Nathan
1. When I bump versions of my packages and my colleague reviews it, I don't expect them to run the test suite.

Penny
1. A normal project has CI, which is kind of like running the code, as long as the PR adds tests.

Carlos
1. I've been involved in "artefact evaluation" where it's all about checking that the packages run and give the expected outputs.
2. But in those cases there's typically not enough time to then actually look at the code.

Martin
1. If we have separate reproducible examples, it would make sense to take code out of the codebase.
2. But in this case do things become too trivial?
3. It may work when you have a small amount of code that's independent of the bigger picture.
4. But this means you end up having to focus on the technicalities. You don't get the architectural picture.
5. This makes the more abstract questions harder to tackle.

Carlos
1. This sounds like a "Design Review".

Nathan
1. Doing a design review like this could still have its place; people could submit code showing the basic idea of how things work.
2. Are the kind of things you're thinking about still useful in the context of people directly looking at the code?
3. Otherwise you could replace this process with a carefully worded question in another forum (e.g. making a post on Slack asking for people to comment on your design).

Martin
1. The key thing is to think through how to present your work.

Carlos
1. In the case of the external review mentioned earlier one of the comments was to restructure the entire project.
2. We could include the entire codebase, or at least link to it.

Nathan
1. In many cases people may not have access to the repositories, so having a snippet works, but if you can link to the bigger picture that's good.

## Finalising Penny's Review

Penny
1. Merging indicates when it's done.
2. Even though it's merged you can still see the diffs, so the review won't be lost.
3. We should link the PRs to the meeting notes.

Nathan
1. There's the repository as well.
2. I'd like to see a log of past reviews.

## Future steps

Carlos
1. We'll need to publicise at least for a few months to get people on board.

Nathan
1. We could do with a cadence.
2. Perhaps the OSSA would provide a cadence for this? Raising possible reviews at the OSSA meetings?

Penny
1. A "PR" service area? It is intra-team communication after all!

Nathan
1. That is actually true. As a team we could do better at intra-team communication.

Martin
1. I don't have anything to offer for review right now, but will in the next four weeks; sporadically though, not at a given time.
2. We need enough people involved to keep it ongoing.

David
1. If anyone notices someone who might have some code to review, encourage them to submit it for a cross-project code review.

## Minimal guidelines

For reviewees:
1. You will get a lot of feedback; don't take it personally.
2. Try to include as much code as is needed to get an example to run.
3. But focus the review on at most around 200 lines of code.
4. It shouldn't be a requirement for reviewers to run your code, but if they can it's likely to be helpful.
5. Think carefully about how to present your code for review. Formatting your code and the request for review should be a useful activity in itself.

For reviewers:
1. If you highlight something that should be changed, ideally also suggest how you would change it.
2. Running the code isn't a requirement, but it is likely to be informative if you can.
3. Don't be intransigent: be willing to yield if there's disagreement.


