Add compressed size to git-sizer output #140

Open
wants to merge 8 commits into
base: master

@ScottArbeit ScottArbeit commented Apr 2, 2025

This PR adds the compressed size of the .git directory to the output of the git-sizer tool and makes a few cosmetic label changes to make the output clearer.

Notes

  • The compressed size is calculated by computing the on-disk size of the .git directory.
  • The scale used to compute the histogram for the new Compressed total size is 1e9, which is 1/10th of the scale used for the Uncompressed total size (10e9). I'm open to feedback on whether that's the right estimate to use.
  • Label changes are meant to make the output clearer, after years of fielding questions about what "Total size" meant in the Blobs section.
  • This is my first contribution to a Go codebase, and while I did more than just vibe-code it, I certainly relied on Copilot to make the changes. If there's a better way to have made these changes, I'm open to feedback.

Here's the new output run on a large repository, with indicators for where the output has changed:

(12:03:26 PM) >C:\Source\GitHub\git-sizer\bin\git-sizer.exe -v
Processing blobs: 4370529
Processing trees: 12753930
Processing commits: 2117746
Matching commits to trees: 2117746
Processing annotated tags: 1294
Processing references: 34998
| Name                         | Value     | Level of concern               |
| ---------------------------- | --------- | ------------------------------ |
| Repository statistics        |           |                                |    <----- changed label
| * Commits                    |           |                                |
|   * Count                    |  2.12 M   | ****                           |
|   * Total size               |  1.04 GiB | ****                           |
| * Trees                      |           |                                |
|   * Count                    |  12.8 M   | ********                       |
|   * Total size               |  60.7 GiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
|   * Total tree entries       |  1.32 G   | **************************     |
| * Blobs                      |           |                                |
|   * Count                    |  4.37 M   | **                             |
|   * Uncompressed total size  |   463 GiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |    <----- changed label
| * On-disk size               |           |                                |    <----- new output
|   * Compressed total size    |  24.4 GiB | **************************     |    <----- new output
| * Annotated tags             |           |                                |
|   * Count                    |  1.29 k   |                                |
| * References                 |           |                                |
|   * Count                    |  35.0 k   | *                              |
|     * Branches               |     1     |                                |
|     * Tags                   |  2.17 k   |                                |
|     * Remote-tracking refs   |  32.8 k   | *                              |
|                              |           |                                |
| Biggest objects              |           |                                |
| * Commits                    |           |                                |
|   * Maximum size         [1] |  3.74 MiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
|   * Maximum parents      [2] |     7     |                                |
| * Trees                      |           |                                |
|   * Maximum entries      [3] |  10.3 k   | **********                     |
| * Blobs                      |           |                                |
|   * Maximum size         [4] |  99.9 MiB | **********                     |
|                              |           |                                |
| History structure            |           |                                |
| * Maximum history depth      |   299 k   |                                |
| * Maximum tag depth      [5] |     1     |                                |
|                              |           |                                |
| Biggest checkouts            |           |                                |
| * Number of directories  [6] |  37.4 k   | ******************             |
| * Maximum path depth     [7] |    20     | **                             |
| * Maximum path length    [8] |   349 B   | ***                            |
| * Number of files        [9] |   232 k   | ****                           |
| * Total size of files    [6] |  8.30 GiB | ********                       |
| * Number of symlinks    [10] |   594     |                                |
| * Number of submodules  [11] |     4     |                                |

_detailed refs redacted_

@Copilot Copilot AI review requested due to automatic review settings April 2, 2025 18:33
@ScottArbeit ScottArbeit requested a review from a team as a code owner April 2, 2025 18:33
@Copilot Copilot AI left a comment

Pull Request Overview

This pull request adds functionality to compute and display the compressed (on-disk) size of the .git directory while also updating some output labels for improved clarity. Key changes include:

  • Adding a new GitDirSize field to store the on-disk size of the repository’s .git directory.
  • Modifying the output labels in the results table to better differentiate between repository statistics.
  • Introducing a new helper function to calculate the .git directory size and updating the GitDir API to return an error when unset.

Reviewed Changes

Copilot reviewed 6 out of 10 changed files in this pull request and generated no comments.

| File            | Description                                                                 |
| --------------- | --------------------------------------------------------------------------- |
| sizes/sizes.go  | New field GitDirSize added to HistorySize with updated documentation.       |
| sizes/output.go | Updated labels and added a new section to display the compressed size.      |
| sizes/dirsize.go | Added new function CalculateGitDirSize to compute the directory size.      |
| git/git.go      | Modified GitDir method to return an error when the git directory is not set. |
| git-sizer.go    | Updated to use the revised GitDir and integrate the computed GitDirSize.    |
| CONTRIBUTING.md | Reformatted dependency installation instructions for clarity.               |
Files not reviewed (4)
  • Makefile.win: Language not supported
  • script/bootstrap.ps1: Language not supported
  • script/ensure-go-installed.ps1: Language not supported
  • script/go.ps1: Language not supported
Comments suppressed due to low confidence (2)

git/git.go:153

  • Changing the GitDir function signature to return an error now requires all callers to handle this error. Please verify that all usage sites have been updated to properly manage error propagation.
func (repo *Repository) GitDir() (string, error) {

sizes/dirsize.go:14

  • Errors encountered during the directory walk are silently ignored by returning nil. Consider logging these errors or handling them explicitly to aid in diagnosing potential access issues.
err := filepath.Walk(gitDir, func(path string, info os.FileInfo, err error) error {

@ScottArbeit
Author

cc: @ttaylorr
