Skip to content

First draft for automatic generation of changelog entries #6875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from

Conversation

DudeNr33
Copy link
Collaborator

@DudeNr33 DudeNr33 commented Jun 6, 2022

  • Write a good description on what the PR does.
  • Add an entry to the change log describing the change in
    doc/whatsnew/2/2.15/index.rst (or doc/whatsnew/2/2.14/full.rst
    if the change needs backporting in 2.14). If necessary you can write
    details or offer examples on how the new change is supposed to work.
  • If you used multiple emails or multiple names when contributing, add your mails
    and preferred name in script/.contributors_aliases.json

Type of Changes

Type
📜 Docs

Changelog

The changelog is now automatically generated from the PR content.
This makes it easier for contributors as merge conflicts of the changelog file are now a thing of the past.

(edited 12.06. to see if the CI check correctly determines a changelog is included in the description)

Description

Refs #6688

This is a first draft for automatically generating the changelog using a custom Sphinx extension, which provides a new directive .. changelog::.

I discarded the idea of using github-changelog-generator. While it produced good results on one of my own, small repos, I had some problems when running on pylint's repo.

The main idea is:

  • The changelog directive allows the user to generate a list of issues or pull requests matching the query passed in through the :query: option.
  • The :caption: option is used as title in the final document.
  • With :hide_if_empty: the whole section will be skipped if no pull requests matching the query were found.
  • General configuration options like repository owner, project, token and labels that should always be included can be defined in conf.py.

This means that we have to apply the appropriate labels to each pull request. We will also have to come up with new labels for things like "new checker", "removed checker", etc.

I want to discuss some open issues and problems before moving forward:

Open issues:

  • Define corresponding labels.
    Decision: TBD
  • Define labels that shall always be ignored.
    Decision: TBD
  • Define what to use as the changelog entry. Currently I take the PR title and add a link to the PR beneath. However, the PR title is not always in a form that fits in the changelog, or is too short to convey all relevant information. We could add a # Changelog: section in the PR template and extract this part from the PR body with a regular expression. A CI check could enforce that this section is not removed manually. PR title could still be used as a fallback to make the changelog generation more robust.
    Decision: TBD

There are also some problems that need to be adressed:

  • Accessing the GitHub API requires a token, as unauthenticated requests are subject to rate limiting. While this is not a problem for generating on readthedocs, our checks.yml also tries to build the documentation. We can include a secret to store the token to make it accessible in GitHub actions for our main branch, but secrets are not accessible for PRs that originate from forks. Should we simply skip the changelog generation for PR runs?
    Decision: TBD
  • I currently use PyGithub for accessing the GitHub REST API. However, this is a bit buggy. Each query returns one item less than what is found when running the same query on GitHub directly. Looking at the project page it looks abandoned, so I don't think fixing this in PyGithub is viable. However, there are various other libraries I can take a look at.
    Decision: TBD
  • The query for "Other Changes" is not really maintainable, as I manually exclude all labels that are used in the other sections. I would be in favour of adding an explicit label for PRs that should go in there.
    Descision: TBD

Let me know what you think, and if we want to move forward with this or stick to manual creation of the changelog entries!

@DudeNr33 DudeNr33 added Documentation 📗 Maintenance Discussion or action around maintaining pylint or the dev workflow labels Jun 6, 2022
@DudeNr33 DudeNr33 added this to the 2.15.0 milestone Jun 6, 2022
@DanielNoord
Copy link
Collaborator

@DudeNr33 This looks very promising! Before delving any deeper, have we considered using existing libraries such as towncrier? It's used by pip and pytest for example.

It would require some changes in the way we store changelog entries from 2.15+ onwards, but it's a tried and tested library which we could perhaps benefit from?

Comment on lines 80 to 86
def _parse_as_rst(self, text: str) -> nodes.document:
parser = Parser()
components = (Parser,)
settings = frontend.OptionParser(components=components).get_default_values()
document = new_document("<rst-doc>", settings=settings)
parser.parse(text, document)
return document
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figuring out how to do this took hours and a bit of my sanity.
For a library with "doc" in its name, the documentation of docutils is horrible. 😄 Not to mention the usability.

And to cite a YouTube video of a PyGotham talk:
"Sphinx is weird and crazy and old [...], and it sits on top of something that was old before the thing below it was old before Moses was old."

@DudeNr33
Copy link
Collaborator Author

DudeNr33 commented Jun 6, 2022

@DudeNr33 This looks very promising! Before delving any deeper, have we considered using existing libraries such as towncrier? It's used by pip and pytest for example.

It would require some changes in the way we store changelog entries from 2.15+ onwards, but it's a tried and tested library which we could perhaps benefit from?

I "know" towncrier (i.e. heard of it, but not used it myself yet). If this is an option for us, I will happily take a closer look.
We should get a clear consensus of what requirements and expectations we have for the changelog generation.
From my point of view:

  1. It must be possible to structure the generated changelog into different sections (not just for release versions, but for topics like false negative, false positive etc.).
  2. It must be possible to exclude the "old" versions from the generation (pylint <= 2.15.0), as this would only be waste.
  3. It should be possible to change the structure from one release to the other, without affecting the changelogs for already released versions.
  4. It must be possible to build the documentation including changelog through CI and readthedocs.

@coveralls
Copy link

coveralls commented Jun 6, 2022

Pull Request Test Coverage Report for Build 2490288774

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 95.541%

Totals Coverage Status
Change from base Build 2489729047: 0.0%
Covered Lines: 16411
Relevant Lines: 17177

💛 - Coveralls

pylint_changelog_project = "pylint"
pylint_changelog_token = os.getenv("GITHUB_TOKEN")
pylint_changelog_exclude_labels = [
"Documentation 📖",
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this: I said in the original issue "If the PR is labelled maintenance, primer or documentation in github then nothing is required". But I also added 'skip news 🔇', because there's some condition to take into account imo. documentation alone should be skipped but what about documentation and bug together or primer and something else ... ? Maybe just one label with clear expectation is better ? We're doing a lot of maintenance/documentation issue that will need to have one more label if we do that though.

I'm thinking about "skip news" or (maintenance, primer or documentation) but only that label with no other, right now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too think that an explicit "skip news" label would be the cleanest way to go.
Same for the "Other changes" section.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be possible to structure the generated changelog into different sections (not just for release versions, but for topics like false negative, false positive etc.).

+1. maybe using the label from the pull request ?

It must be possible to exclude the "old" versions from the generation (pylint <= 2.15.0), as this would only be waste.

👍

It should be possible to change the structure from one release to the other, without affecting the changelogs for already released versions.

👍 Separating the changlog for each minor version in particular

It must be possible to build the documentation including changelog through CI and readthedocs.

Ideally. But where would we take the changelog entry ? In the commit message ?

I think we also need to be able to create release notes for a patch version. Right now everything is mixed up in 2.15.0, but what will happen when we maintain it and want to release 2.15.1 ?

@DudeNr33
Copy link
Collaborator Author

DudeNr33 commented Jun 6, 2022

It must be possible to structure the generated changelog into different sections (not just for release versions, but for >> topics like false negative, false positive etc.).

+1. maybe using the label from the pull request ?
Yes, that's what I intended. If you look at the doc/whatsnew/2/2.15/index.rst, you can see that the query is filtering for closed PRs associated with the milestone (this part is constant for all queries) and then selects the PRs to display through labels:

.. changelog::
   :caption: False positives fixed
   :query: is:closed is:pr milestone:2.15.0 label:"False Positive 🦟"

It should be possible to change the structure from one release to the other, without affecting the changelogs for already released versions.

👍 Separating the changlog for each minor version in particular

With the Sphinx extension from this PR this would be possible, and I think it should be the same for towncrier.
When we have settled for the requirements the changelog generation must fulfil I can check both options and write a short comparison what steps contributors and maintainers would have to take with each one. Then we can decide which option gives us the better workflow.

It must be possible to build the documentation including changelog through CI and readthedocs.

Ideally. But where would we take the changelog entry ? In the commit message ?

Commit message is problematic imo, as it is inconvenient to fix if it was forgotten. I would prefer to put it in the description of the PR. We could add something like this in the PR template:

--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -23,6 +23,16 @@ To ease the process of reviewing your PR, do make sure to complete the following
 | ✓   | :hammer: Refactoring   |
 | ✓   | :scroll: Docs          |
 
+## Changelog
+
+<!--
+Provide a short summary of the change, focused on the user-facing aspects.
+This summary will be included in the changelog. Use reStrcturedText formatting.
+-->
+```rst
+

The extension can extract the relevant portion from the PR body using a regex.
The problem I see here is that contributors might get confused with having to write just this part as reStructuredText, while the rest of the PR is written in markdown.
We could also leave it as plain markdown in the PR description and use pandoc for converting to reStructuredText.

I think we also need to be able to create release notes for a patch version. Right now everything is mixed up in 2.15.0, but what will happen when we maintain it and want to release 2.15.1 ?

That's easy, we would simply copy paste the whole block from 2.15.0 and replace all milestone:2.15.0 with milestone:2.15.1..

.. changelog::
   :caption: False positives fixed
   :query: is:closed is:pr milestone:2.15.1 label:"False Positive 🦟"

@Pierre-Sassoulas
Copy link
Member

I would prefer to put it in the description of the PR. We could add something like this in the PR template:

Sounds great !

The problem I see here is that contributors might get confused with having to write just this part as reStructuredText, while the rest of the PR is written in markdown.

We're already handling markdown for release.md (there's a dependency handling that), maybe's it's not going to be a problem to include markdown from the PR ?

That's easy, we would simply copy paste the whole block from 2.15.0 and replace all milestone:2.15.0 with milestone:2.15.1..

Neat. Something to keep in mind when modifying the bump script.

@DudeNr33
Copy link
Collaborator Author

DudeNr33 commented Jun 6, 2022

We're already handling markdown for release.md (there's a dependency handling that), maybe's it's not going to be a problem to include markdown from the PR ?

Yes, it should be possible by using the myst-parser that we already depend on. 👍 I did not think about that at first, good point! 😊
As our custom directive only needs to return a list of docutils nodes, the source format is not really relevant.

## Changelog

<!--
Provide a short summary of the change, focused on the user-facing aspects.
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Provide a short summary of the change, focused on the user-facing aspects.
Provide a short summary of the change, focused on the user-facing aspects.
If the change does not need to be documented ask a maintainer to add the
'skip news 🔇' label.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@DudeNr33
Copy link
Collaborator Author

pylint_changelog Sphinx extension vs. Towncrier

TL;DR

I noted down some more information regarding the comparison mainly to document it for myself, but I include it if someone wants to read it.
But I think the result can be summarised as follows:

Currently our documentation also includes the already merged PRs for the upcoming releases (like 2.15.0 and 2.14.2 at the moment).
If this is just a byproduct of the current way of creating the changelog and we can live without this, towncrier would be the better choice.
It is customizable to our needs and well maintained. Plus, it is faster and does not require a GitHub token, as it is works with local files
instead of querying the API.

With towncrier we can only create the "preview" changelog for the news fragments on the main branch, we will typically be
for the next minor release.

Use Case Comparison

Let's consider the following aspects for comparison:

  • What are the necessary steps for contributors when creating a PR?
  • What are the necessary steps for maintainers when reviewing a PR? (Other than checking the actual content of the changelog)
  • What checks can be run through CI?
  • What are the necessary steps to generate the changelog for a new release?
  • What are the necessary steps to prepare the changelog for the next release?
  • How are backmerges handled?
Aspect Towncrier pylint_changelog Sphinx extension
Steps for contributors Create a news fragment of correct type (i.e. towncrier create 123.feature) Fill out the changelog section of the PR template
Steps for maintainers None Check that the correct label is applied to the PR
CI Checks towncrier check to check if a news fragment exists. However this does not check if the news fragment contains the correct PR number. Check if changelog section is present in PR body
Changelog generation for release 1. Run towncrier build 1. Add release summary to the whatsnew file
2. Add release summary to updated whatsnew file 2. Rest is automatic
Preparation for next release Update version in towncrier.toml Create a new section in doc/whatsnew/2/2.15/index.rst
with the necessary changelog directives.
Handling of backmerges ??? PRs only show up in the changelog sections corresponding to its associated milestone.

The steps for preparing the next release could probably be automated through the tbump script in both cases.

Regarding the handling of backmerges when using towncrier:
From what I understood so far, towncrier should play relatively nicely with our patch releases.
When running towncrier build all news fragments are removed (with git rm), and the updated release notes file is created.
For us, this would mean:

  1. After a PR that needs backport is merged on main and cherry-picked on the maintenance branch, the news fragment for this PR will exist on both branches.
  2. When releasing the patch release from the maintenance branch, we would run towncrier build which removes the individual news fragments and updates the release notes.
  3. After releasing the maintenance branch is merged on main, which would remove the news fragments there.
  4. When releasing a new minor version, the PR would correctly show up only in the release notes section for the patch release and not for the new minor release, as the news fragment no longer exists on the main branch.

The only difference to our current approach and also to my pylint_changelog Sphinx extension:
The changelog entry would appear under the section for the new to-be-released minor version until the patch release is actually built and released.
Actually, the whole section for the patch release itself would probably only be created afterwards.
Currently (with the manual changelog creation) the changelog entry would be placed under the patch release section before merging the PR.
Thus users can already check what patch releases are planned and what the contents will be.
With towncrier this is not possible as far as I see if we build the documentation against main.

Pros and Cons

Towncrier pylint_changelog
Pros * widely used OSS solution * least effort for users
* better performance * moving a PR from one changelog section to another can be done anytime by changing the PR
* changelogs for each release is only generated once and is then "frozen" * no merge conflicts possible
* no merge conflicts possible
Cons * slightly more effort for contributors * custom-built solution --> maintenance burden
* moving existing changelog entries to another section requires a commit * requires GitHub token
* creation of changelog for upcoming version not supported natively * worse performance
* including release notes for upcoming patch releases is difficult. * does not (yet?) support "finalizing" / "freezing" changelog for released versions

@github-actions
Copy link
Contributor

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 94bf3af

@DanielNoord
Copy link
Collaborator

Thanks for the extensive review @DudeNr33.

Do you have a preference yourself? To me it seems that towncrier has a higher initial cost but might be beneficial long term. But please correct me if I'm wrong. So that would mean we would need somebody to sponsor/push the initial change, but then be able to rely on towncrier to fix our issues?

@DanielNoord DanielNoord removed their request for review June 13, 2022 21:14
@DudeNr33
Copy link
Collaborator Author

If we can live without "preview" release notes for upcoming patch releases, I would vote for towncrier.
I don't think that the initial cost for implementing the towncrier approach is higher. With the Sphinx extension of this current PR there would be additional steps, like defining the necessary labels and apply them to all PRs for 2.15.0 and 2.14.2. Plus, there is always the potential for bugs with a homegrown solution.

Let's vote for it:

  • 🚀 towncrier
  • 🎉 Sphinx extension of this PR (pylint_changelog)

@Pierre-Sassoulas
Copy link
Member

Thank you for the state of the art, very interesting !

If we can live without "preview" release notes for upcoming patch releases, I would vote for towncrier.

It's a nice to have but it wasn't intended, we can live without.

With the Sphinx extension of this current PR there would be additional steps, like defining the necessary labels and apply them to all PRs for 2.15.0 and 2.14.2.

Can't we use the milestone directly for this ? Or are we talking about 'skip-news 🔇' ? Wouldn't we need that for towncrier too ?

@DudeNr33
Copy link
Collaborator Author

With the Sphinx extension of this current PR there would be additional steps, like defining the necessary labels and apply them to all PRs for 2.15.0 and 2.14.2.

Can't we use the milestone directly for this ? Or are we talking about 'skip-news 🔇' ? Wouldn't we need that for towncrier too ?

The milestone alone is not enough if we want to distinguish sections like "New checkers", "False negatives fixed", etc.
For some of these we already have appropriate labels (like "False Positive 🦟" and "False Negative 🦋"), but not for all sections that we'd like to have in the release notes. For those we would have to define the corresponding labels, and apply them to all PRs that need them.

But towncrier already won looking at the votes. 😊
I will prepare a new separate PR for it.

@DudeNr33 DudeNr33 closed this Jun 15, 2022
@Pierre-Sassoulas
Copy link
Member

For those we would have to define the corresponding labels, and apply them to all PRs that need them.

It would definitely be nice to be able to choose the section using labels 👍 But towncrier do not need that if I understand correctly ?

@DudeNr33
Copy link
Collaborator Author

Exactly. towncrier works with news fragments. Each fragment has a type, and the possible types can be configured through a configuration file.
When using the towncrier CLI, it will also check that only one of the configured types is used.
So for example, the news fragment for a PR fixing a false positive would be 1234.false_positive.
I have to check if towncrier also already provides a way to prevent users from just manually creating a news fragment with wrong type. If not, it would be no problem to simply write a custom script to check that in the CI.

@Pierre-Sassoulas Pierre-Sassoulas removed this from the 2.15.0 milestone Aug 25, 2022
@jacobtylerwalls jacobtylerwalls removed their request for review September 24, 2022 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation 📗 Maintenance Discussion or action around maintaining pylint or the dev workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A single changelog file instead of two
4 participants