Disclaimer: This repository was entirely created by AI using aider.chat. All visualization ideas belong to @drew2a, but the code and documentation were written by ChatGPT-4 and Claude-3.5-Sonnet.
A collection of scripts for analyzing and visualizing Git repositories. Explore commit history, contributor activity, code changes, and more with easy-to-use tools designed to help you gain insights into your codebase.
- Prerequisites
- plot_number_of_contributors.py
- calculate_branch_age.py
- plot_open_issues.py
You can install the required Python packages using:
pip install -r requirements.txt
plot_number_of_contributors.py is a versatile script designed to analyze and visualize contributor activity within a
Git repository. By examining commit history, it identifies continuous contribution periods and aggregates contributor
data over time, providing insights into both individual and collective engagement patterns. The script offers
configurable parameters to tailor the analysis, such as specifying the branch, excluding certain contributors, and
defining the time window for activity periods. It generates visualizations that highlight both all contributors and
those with sustained activity, making it a valuable tool for understanding contributor dynamics in a project.
The original work for this script was done here: GitHub Issue Comment.
The script plot_number_of_contributors.py accepts the following configurable parameters:
- --repo_path: Path to the repository. Default is the current directory (.).
- --branch: Branch to analyze. Default is main.
- --exclusions: List of contributors to exclude. Default is ["dependabot", "snyk"].
- --delta_days: Number of days to look back for commits. Default is 30 years (365 * 30 days).
- --window_days: The maximum allowed gap between consecutive commits to be considered as part of the same activity period. For example, a 7-day window means that if the gap between two commits is less than or equal to 7 days, they are considered part of a continuous contribution period. Default is 90 days.
- --granularity_days: The minimum length of time that a contribution period must be to be considered. For instance, a 1-day granularity means that any period shorter than 1 day is extended to 1 day. Default is 15 days.
- --contribution_duration: The minimum total number of days a contributor must have contributed to be included in the analysis. For example, a filter of "at least two days in total" means that only contributors who have made commits on two or more separate days throughout the entire period are included. Default is 1 day.
- --less_than_year: Use less frequent date ticks on x-axis. This is a flag, so it has no default value.
- --activity_plot_file: File name for the activity plot. Default is out/activity_plot.png.
- --contributor_count_plot_file: File name for the contributor count plot. Default is out/contributor_count_plot.png.
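The interaction between --window_days, --granularity_days, and --contribution_duration can be sketched as follows. This is a minimal illustration based only on the parameter descriptions above, not the script's actual code: the function names activity_periods and is_included are hypothetical, and counting distinct commit days is one reading of the "total number of days" filter.

```python
from datetime import date, timedelta


def activity_periods(commit_days, window_days=90, granularity_days=15):
    """Group commit dates into (start, end) periods of continuous activity."""
    periods = []
    for day in sorted(set(commit_days)):
        if periods and (day - periods[-1][1]).days <= window_days:
            # Gap is within the window: extend the current period.
            periods[-1][1] = day
        else:
            # First commit, or the gap is too large: start a new period.
            periods.append([day, day])
    min_length = timedelta(days=granularity_days)
    # Stretch periods shorter than the granularity to the minimum length.
    return [(start, max(end, start + min_length)) for start, end in periods]


def is_included(commit_days, contribution_duration=1):
    """Assumption: a contributor qualifies by number of distinct commit days."""
    return len(set(commit_days)) >= contribution_duration


commits = [date(2024, 1, 1), date(2024, 1, 5), date(2024, 6, 1)]
periods = activity_periods(commits, window_days=90, granularity_days=15)
```

With these sample dates, the first two commits fall into one period because they are 4 days apart, while the June commit starts a second period because the gap exceeds 90 days. Both periods are shorter than 15 days, so each is extended to the 15-day granularity.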
To generate the graphs, first clone the target repository:
git clone <repository-url>
Then, use the following commands:
This example visualizes all contributors over time with a window of 90 days, granularity of 15 days, and a minimum contribution duration of 1 day. This setting captures all contributors who have made at least one commit within any 90-day period, providing a broad view of contributor activity.
python plot_number_of_contributors.py --repo_path /path/to/repo --branch main --window_days 90 --granularity_days 15 --contribution_duration 1 --activity_plot_file all_contributors.png
This example focuses on continuous contributors, using a window of 90 days, granularity of 1 day, and a minimum contribution duration of 30 days. It highlights contributors who have been consistently active, making contributions over a longer period, thus offering insights into sustained engagement.
python plot_number_of_contributors.py --repo_path /path/to/repo --branch main --window_days 90 --granularity_days 1 --contribution_duration 30 --contributor_count_plot_file continuous_contributors.png
calculate_branch_age.py is a script designed to calculate and visualize the age of branches in a Git
repository. It fetches all branches, determines the fork and latest commit dates for each branch, and calculates
the age in days. The script then generates a horizontal bar plot showing the age of each branch, with additional labels
for the start and end dates of each branch's age.
The script calculate_branch_age.py accepts the following configurable parameters:
- --repo_path: Path to the repository. This parameter is required.
- --output_file: File name for the branch age plot. Default is out/branch_ages.png.
- --branch_regex: Regex pattern to filter branches. Default is .+.
- --min_age: Minimum age of branches to include in days. Default is 0.
- --main_branch: Name of the main branch to compare against. Default is main.
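One way to compute a branch's age from its fork date and latest commit date is sketched below. This is an illustration under stated assumptions, not the script's implementation: it assumes the fork point is the commit reported by git merge-base against the main branch, and the helper names git, commit_date, age_in_days, and branch_age are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def git(repo_path, *args):
    """Run a git command inside repo_path and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo_path, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def commit_date(repo_path, rev):
    """Return the committer date of a revision as a UTC datetime."""
    # %ct prints the committer date as a Unix timestamp.
    timestamp = int(git(repo_path, "show", "-s", "--format=%ct", rev))
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def age_in_days(fork_date, latest_date):
    """Age is the whole number of days between fork and latest commit."""
    return (latest_date - fork_date).days


def branch_age(repo_path, branch, main_branch="main"):
    # Assumption: the fork point is the merge base with the main branch.
    fork_point = git(repo_path, "merge-base", main_branch, branch)
    return age_in_days(
        commit_date(repo_path, fork_point),
        commit_date(repo_path, branch),
    )
```

Under this reading, a call such as branch_age("/path/to/repo", "origin/feature", "master") returns the value that --min_age would be compared against. A branch already merged into the main branch would report an age of 0 here, because its merge base is its own tip.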
To generate the branch age plot, follow these steps:
Clone the target repository:
git clone <repository-url>
Then, use the following command to plot all branches older than 100 days for https://github.com/arvidn/libtorrent:
python calculate_branch_age.py --repo_path ../../arvidn/libtorrent --main_branch master --min_age 100
plot_open_issues.py is a script designed to fetch and visualize open issues from a GitHub repository over time. It
allows you to analyze the trend of open issues and visualize release periods with optional coloring and timestamp
display.
Due to the limitations of the public GitHub REST API, the number of requests is restricted. To avoid frequent requests to GitHub, the script operates in two stages: first, it fetches all issues and releases and saves them to files; then, it analyzes these files.
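The two-stage approach (fetch once, then analyze the saved files) might look roughly like the sketch below. The helper load_or_fetch and its fetch callback are hypothetical names used for illustration; the override argument mirrors the idea of forcing a fresh download rather than reusing saved data.

```python
import json
from pathlib import Path


def load_or_fetch(path, fetch, override=False):
    """Return cached JSON from path, calling fetch() only when needed."""
    cache = Path(path)
    if cache.exists() and not override:
        # Stage two only: reuse the data saved by an earlier run.
        return json.loads(cache.read_text(encoding="utf-8"))
    # Stage one: download, then save so later runs avoid new requests.
    data = fetch()
    cache.parent.mkdir(parents=True, exist_ok=True)
    cache.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Here fetch would be whatever function performs the GitHub API requests. Keeping it as a callback means the caching logic can be exercised without network access.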
The script plot_open_issues.py accepts the following configurable parameters:
- --repo: GitHub repository in the format "owner/repo". Default is Tribler/tribler.
- --issues_file: File to save issues data. Default is out/issues.json.
- --releases_file: File to save releases data. Default is out/releases.json.
- --state: State of issues to fetch (e.g., open, closed, all). Default is all.
- --labels: Labels to filter issues by. Default is type: bug.
- --override: Override existing files and fetch new data. This is a flag, so it has no default value.
- --output_plot: Output file for the plot. Default is out/open_issues_plot.png.
- --show_release_timestamps: Display release timestamps on the plot. This is a flag, so it has no default value.
- --color_releases: Color the release periods on the plot. This is a flag, so it has no default value.
To generate the plot, use the following command:
python plot_open_issues.py --repo Tribler/tribler --labels "type: bug" --output_plot out/open_issues_plot.png --show_release_timestamps --color_releases
This command will fetch issues and releases from the specified repository, save the data to JSON files, and generate a plot of open bugs over time with colored release periods and timestamps.
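The "open issues over time" series can be derived from saved issue data by treating each creation as +1 and each close as -1. The sketch below is one possible approach, not necessarily what the script does. It assumes each issue record carries the created_at and closed_at timestamps that the GitHub REST API returns, and the function names are hypothetical.

```python
from datetime import datetime


def parse_time(value):
    # GitHub timestamps look like "2024-01-01T00:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def open_issue_counts(issues):
    """Return (timestamp, open_count) pairs, one per open or close event."""
    events = []
    for issue in issues:
        events.append((parse_time(issue["created_at"]), 1))
        if issue.get("closed_at"):
            # Issues that are still open have no close event.
            events.append((parse_time(issue["closed_at"]), -1))
    counts, open_now = [], 0
    for moment, change in sorted(events):
        open_now += change
        counts.append((moment, open_now))
    return counts


issues = [
    {"created_at": "2024-01-01T00:00:00Z", "closed_at": "2024-01-10T00:00:00Z"},
    {"created_at": "2024-01-05T00:00:00Z", "closed_at": None},
]
series = open_issue_counts(issues)
```

For the two sample issues, the open count rises to 1, then 2, and drops back to 1 once the first issue is closed. The resulting pairs can be passed directly to a plotting library as x and y values.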