Skip to content

Demonstrate use of Git filters to format source files during checkout, staging, and diffing.

Notifications You must be signed in to change notification settings

shoogle/git-filter-demo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

git-filter-demo

Demonstrate use of Git filters to auto-format source files during checkout, staging, and diffing.

Introduction

The C source files use 6-space banner-style indentation, which was the standard used in MuseScore Studio's source code prior to version 4. This is not a popular style, but the point is to show how you can change it to something that you are more familiar with without affecting how the code looks for other developers.

You don't actually need to compile the C code in order to understand and use this demo.

Instructions

Install dependencies

You need these in PATH:

  • Git
  • Bash (version 4.2 or higher to enable shopt -s lastpipe)
  • Uncrustify (ideally version 0.73, see instructions).

Uncrustify's output can change between versions, so it's usually best to use a fixed version.

Get the demo code

git clone https://github.com/shoogle/git-filter-demo.git
cd git-filter-demo

Or fork the project on GitHub and clone your fork.

Define alias

Let's define a Git alias to list files and their attributes as declared in .gitattributes:

git config --global alias.ls-attr '!f() { git ls-files "$@" | git check-attr --stdin --all ;}; f'

# Usage
git ls-attr                                 # list all attributes for all files
git ls-attr [paths...] [globs...]           # list all attributes for specific files
git ls-attr | sed -n "s|: attr: value$||p"  # list all files that have attr=value

Define smudge and clean filters

One of the attributes declared in .gitattributes is a filter called tidy_c (look for filter=tidy_c). Let's define smudge and clean commands for this filter. Git will run these commands during checkout and staging respectively.

# Checkout K&R style (optional):
git config filter.tidy_c.smudge "lint/uncrustify/wrapper.sh -l C -c lint/uncrustify/kr.cfg"
git ls-attr | sed -n "s|: filter: tidy_c$||p" | xargs touch             # mark affected files dirty
git ls-attr | sed -n "s|: filter: tidy_c$||p" | xargs git checkout --   # apply smudge

# Check-in MuseScore legacy style (required):
git config filter.tidy_c.clean "lint/uncrustify/wrapper.sh -l C -c lint/uncrustify/musescore.cfg"
git ls-attr | sed -n "s|: filter: tidy_c$||p" | xargs git add --renormalize --  # apply clean

You can define the smudge command to be whatever you want, or you can leave it undefined if you prefer to use the internal style as shown in the GitHub preview.

You must always define the clean command exactly as specified above. This ensures the internal style remains consistent, which keeps code and diffs clean in the online preview.

Learn more about the smudge and clean commands

The command defined for smudge or clean will be run by Git inside a shell. It can use shell features such as pipes, variables, and redirects, providing these are quoted or escaped correctly. The command must read data from STDIN, process it somehow, and then write to STDOUT. Git will not supply any arguments to the command besides those given in the definition.

You can simulate this with:

git show HEAD:src/demo.c | sh -c 'YOUR_COMMAND_DEFINITION' | less
git show HEAD:src/demo.c | sh -c "$(git config filter.tidy_c.smudge)" | less

If the command definition includes %f, Git will replace this with the 'quoted' path to the file currently being processed. This could be useful to display in error messages, or to customize the filtering based on file extension (see --assume option for Uncrustify). However, the command must not attempt to read from the %f file because it may not exist or its contents may differ from STDIN. This happens if the filter is processing the staged version of the file.

git show HEAD:src/demo.c | sh -c "$(echo 'YOUR_COMMAND_DEFINITION' | sed "s|%f|'src/demo.c'|g")" | less
git show HEAD:src/demo.c | sh -c "$(git config filter.tidy_c.smudge | sed "s|%f|'src/demo.c'|g")" | less

Try substituting echo >&2 "Cleaning: <%f>" or echo >&2 "Smudging: <%f>" as YOUR_COMMAND_DEFINITION and see what happens!

See also:

Edit some files

Make some changes to the .c or .h source files in the repository and see how your changes are reported by git status and git diff.

Try making some whitespace changes (e.g. move {} curly braces to a new line), and also try making some semantic changes (e.g. add another puts(), or change some words in a string).

Notice that git diff ignores whitespace changes because they don't survive the filter.

Define diff filter

If you defined a smudge command earlier, you may have noticed that git diff displays C code in the internal style rather than in your checked-out style.

To remedy this, .gitattributes also declares a diff filter called tidy_c (look for diff=tidy_c). Let's define the textconv command for this filter. Git will run this command when you diff files with this attribute.

# Set the diff filter to match the smudge filter (may not work with all smudge filters):
git config diff.tidy_c.textconv "$(git config filter.tidy_c.smudge) <"

# Alternatively, set the diff filter explicitly:
git config diff.tidy_c.textconv "uncrustify -l C -c lint/uncrustify/kr.cfg -f"

Now diffs will use your preferred style. Note that this is purely a visual change. It doesn't affect what happens with git add or git commit.

Learn more about the textconv command

Unlike the smudge and clean commands, the command defined for textconv doesn't receive data from STDIN. Instead, Git provides the path to a single file, which the textconv command must read, process somehow, and then write to STDOUT. This path is provided as an extra argument after all arguments in the command definition, and is also exposed to the command as the shell variable $1. Although the path is 'quoted' to preserve space characters, these quotes are stripped by the shell so they are not visible to your command.

You can simulate this with:

sh -c 'YOUR_COMMAND_DEFINITION '"'src/demo.c'" '' 'src/demo.c' | less
sh -c "$(git config diff.tidy_c.textconv) 'src/demo.c'" '' 'src/demo.c' | less

Try substituting echo >&2 "Diffing: <$1>" as YOUR_COMMAND_DEFINITION and see what happens!

When you perform a diff (e.g. git diff src/demo.c), Git runs the staged and unstaged versions of the file through your filter and compares them using the ordinary git diff algorithm.

git show HEAD:src/demo.c >/tmp/staged
diff -u --color=always <(sh -c 'YOUR_COMMAND_DEFINITION /tmp/staged' '' /tmp/staged) <(sh -c 'YOUR_COMMAND_DEFINITION src/demo.c' '' src/demo.c) | less -R
diff -u --color=always <(sh -c "$(git config diff.tidy_c.textconv) /tmp/staged" '' /tmp/staged) <(sh -c "$(git config diff.tidy_c.textconv) src/demo.c" '' src/demo.c) | less -R

The diff filter is only used for the visual diff. When committing changes, Git calculates deltas based on the output of the clean filter. If no clean filter is defined then it uses the actual file contents.

See also:

Optional: Define more filters!

You could define auto-format rules for other types of files, such as Markdown README.md files or build scripts like CMakeLists.txt.

Rules declared in .gitattributes will affect all developers, whereas rules declared in .git/info/attributes are personal to you.

Conclusion

My personal view is it's definitely worth defining a clean filter for source projects. Doing so ensures the internal code style remains consistent, which makes for easy code review on GitHub. It also unlocks the possibility of developers defining smudge filters on their local machines, because you can't smudge code unless there's a consistent target to clean back to.

I would define smudge and diff filters for binary files because it's difficult to inspect these files otherwise.

I would declare smudge and diff filters for text files, however I personally would not bother to define commands for them on my local machine. I prefer to work with content directly rather than with a representation of the content. This means getting used to a different coding style in each project I contribute to, but at least this way the local files on my machine, as well as the output of git diff and git show, always match what you see online in the GitHub preview.

Peter.

About

Demonstrate use of Git filters to format source files during checkout, staging, and diffing.

Topics

Resources

Stars

Watchers

Forks