Commit Watcher finds interesting and potentially hazardous commits in git projects. Watch your own projects to make sure you didn't accidentally leak your AWS keys or other credentials, and watch open-source projects you use to find undisclosed security vulnerabilities and patches.
At SourceClear, we want to help you use open-source software safely. Oftentimes when a security vulnerability is discovered and fixed in an open-source project, there isn't a public disclosure about it. In part, this is because the CVE process is onerous and labor intensive, and notifying all the users of a project isn't possible.
Oh, and about that UI. Commit Watcher is intended to be an API-accessible backend service. The UI is only there for testing, and the scope of functionality is limited to collecting commits and auditing them against a set of rules.
Check out the dozens of rules and patterns in the srcclr/commit-watcher-rules repository that help find leaked credentials and potential security issues. Just open an issue or PR in that repo if there's a rule you'd like to see added.
Additionally, if you find a security issue on an open-source project using Commit Watcher, our security research team would love to help verify it. You can open an issue against this repo from the UI, or just drop a link to the offending commit in a new issue.
Install and configure Ruby using RVM or Rbenv. Use one of these rather than the system's bundled Ruby to avoid permission issues during installation and setup.
RVM: https://rvm.io
Rbenv: https://github.com/rbenv/rbenv
Install MySQL and Redis. On Mac, with Brew, you can do that with this command:
brew install mysql redis
Follow the instructions Brew gives you so the services are started properly.
Install gem dependencies:
gem install bundler
bundle install
Then set up some Rails secrets and passwords:
figaro install
echo "COMMIT_WATCHER_DATABASE_PASSWORD: 'changeme123'" >> config/application.yml
echo "SECRET_KEY_BASE: `rake secret`" >> config/application.yml
The rest of the setup depends on how you want to run Commit Watcher. You can either run it locally, which is good for quick development, or you can run it with Docker.
To use email notifications, set your Gmail username and password with these commands:
echo "GMAIL_USERNAME: '[email protected]'" >> config/application.yml
echo "GMAIL_PASSWORD: 'urpassbro'" >> config/application.yml
To use an email provider other than Gmail, you'll have to change these two files: config/environments/development.rb and config/environments/production.rb.
Create the database, load the schema, and seed it with some sample rules:
rails db:setup
Now you're ready to start Rails with:
rails s
To start processing jobs, in another terminal:
bundle exec sidekiq
First, change the root and user passwords in .env.db.
# Not used, but one should be set for security.
MYSQL_ROOT_PASSWORD=changeme123
# This is for the commit_watcher user.
MYSQL_PASSWORD=changeme123
Second, modify config/database.yml by commenting out socket in favor of host, like this:
# Use this for local mysql instances
#socket: /tmp/mysql.sock
# Use this for Docker
host: db
Alternatively, for RDS, set up the external RDS URL:
echo "COMMIT_WATCHER_EXTERNAL_DATABASE_URL: 'somedb.rds.amazonaws.com'" >> config/application.yml
Then, modify config/database.yml by commenting out socket and the Docker host in favor of the external RDS host, like this:
# Use this for local mysql instances
#socket: /tmp/mysql.sock
# Use this for Docker
#host: db
# Use this for External RDS
host: <%= ENV['COMMIT_WATCHER_EXTERNAL_DATABASE_URL'] %>
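Rails renders ERB tags in config/database.yml before parsing the file as YAML, which is why the host line can read from the environment. The following is a minimal sketch of that two-step resolution in plain Ruby. The `DATABASE_YML` fragment, the `mysql2` adapter line, and the `resolve_database_config` helper are hypothetical and only illustrate the mechanism; they are not the project's actual configuration.

```ruby
require "erb"
require "yaml"

# Hypothetical fragment that mirrors the host line above.
# The adapter value is an assumption for illustration only.
DATABASE_YML = <<~YAML
  production:
    adapter: mysql2
    host: <%= ENV['COMMIT_WATCHER_EXTERNAL_DATABASE_URL'] %>
YAML

# Render the ERB tags first, then parse the rendered text as YAML.
def resolve_database_config(template, environment)
  rendered = ERB.new(template).result
  YAML.safe_load(rendered).fetch(environment)
end

# Use the sample value from the text if nothing is set in the environment.
ENV["COMMIT_WATCHER_EXTERNAL_DATABASE_URL"] ||= "somedb.rds.amazonaws.com"
config = resolve_database_config(DATABASE_YML, "production")
puts config["host"]
```

If the variable is missing, the ERB tag renders as an empty string and the host ends up blank, so it is worth checking that config/application.yml contains the key before starting the containers.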
And modify docker-compose.yml by commenting out - db in the web: and sidekiq: sections, like this:
web:
  build: .
  volumes:
    - .:/myapp
  ports:
    - '3000:3000'
  links:
    #- db
    - redis
...
sidekiq:
  build: .
  volumes:
    - .:/myapp
  links:
    #- db
    - redis
Now start everything with:
docker-compose up
This downloads the images and builds the database and Rails app containers. When the build finishes and both containers are running, you should see Rails messages like this:
77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO WEBrick 1.3.1
77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO ruby 2.2.2 (2015-04-13) [x86_64-linux]
77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO WEBrick::HTTPServer#start: pid=1 port=3000
Stop Docker with Ctrl+C so the database can be set up with:
docker-compose run web bundle exec rake db:schema:load db:seed
Now start everything up again with:
docker-compose up
If using Docker, the server will be accessible from the IP address given by:
docker-machine ip default
To crawl any projects, you must set a GitHub API token in the default configuration. This can be reached here: http://localhost:3000/configurations/1/edit.
The web UI contains a dashboard which links to all available pages. It's located here: http://localhost:3000/.
Sidekiq dashboard is here: http://localhost:3000/sidekiq/cron.
Every few minutes, any project which hasn't been checked in a while is polled for new commits. These commits are then checked against whatever rules are set up for the project. Any commits which match are recorded and made available at the /commits endpoint.
Everything is broken up into different Sidekiq jobs. There are three:
- Selecting projects which need to be polled
- Collecting new commits
- Auditing a single commit
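The three stages above can be sketched in plain Ruby to show how they hand work to each other. This is not the project's code: the real jobs run inside Sidekiq, and every name here (`Project`, `Commit`, `POLL_INTERVAL`, `run_pipeline`, and the stubbed `fetcher`) is hypothetical. For brevity, the audit step only matches rule patterns against the commit message.

```ruby
# All names below are hypothetical; the real jobs run inside Sidekiq.
Project = Struct.new(:name, :last_checked_at, :rules)
Commit = Struct.new(:sha, :message)

POLL_INTERVAL = 5 * 60 # seconds; an assumed value, not the project's setting

# Stage 1: select projects that have not been checked within the interval.
def select_stale_projects(projects, now)
  projects.select do |project|
    project.last_checked_at.nil? || now - project.last_checked_at >= POLL_INTERVAL
  end
end

# Stage 2: collect new commits (stubbed with a caller-supplied fetcher).
def collect_commits(project, fetcher)
  fetcher.call(project)
end

# Stage 3: audit a single commit, returning the names of matching rules.
def audit_commit(commit, rules)
  rules.select { |_name, pattern| commit.message.match?(pattern) }.keys
end

# Chain the stages and record every commit that matched at least one rule.
def run_pipeline(projects, fetcher, now: Time.now)
  matches = []
  select_stale_projects(projects, now).each do |project|
    collect_commits(project, fetcher).each do |commit|
      hits = audit_commit(commit, project.rules)
      matches << [project.name, commit.sha, hits] unless hits.empty?
    end
    project.last_checked_at = now
  end
  matches
end

now = Time.now
rules = { has_lulz_msg: /\blulz\b/ }
projects = [
  Project.new("fresh", now - 60, rules),
  Project.new("stale", now - 3600, rules)
]
fetcher = lambda do |_project|
  [Commit.new("abc123", "lulz fixed it"), Commit.new("def456", "update docs")]
end
p run_pipeline(projects, fetcher, now: now)
```

Only the "stale" project is polled in this run, and only the commit whose message contains "lulz" is recorded.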
The API endpoints are similar to the web UI and are documented in the code.
The app must have a hostname to access the API endpoints. This can be done in development by adding a record to the hosts file:
echo "127.0.0.1 api.my_app.dev" | sudo tee -a /etc/hosts
Then the API can be accessed by:
curl http://api.my_app.dev:3000/v1/commits
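The same request can be made from Ruby with the standard library. This is a minimal sketch: the hostname and port come from the curl example above, the `api_uri` and `fetch_endpoint` helpers are hypothetical, and it assumes the API responds with JSON, which is not stated here.

```ruby
require "net/http"
require "json"
require "uri"

# Hostname and port are taken from the curl example above.
API_BASE = "http://api.my_app.dev:3000"

# Build the URI for a v1 endpoint such as "commits".
def api_uri(endpoint, base: API_BASE)
  URI("#{base}/v1/#{endpoint}")
end

# Fetch an endpoint and parse the body.
# Assumption: the response body is JSON; its shape is not documented here.
def fetch_endpoint(endpoint, base: API_BASE)
  response = Net::HTTP.get_response(api_uri(endpoint, base: base))
  raise "Request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)
end
```

With the server running and the hosts entry in place, `fetch_endpoint("commits")` performs the same GET request as the curl command.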
Rule types are defined and described in config/rule_types.yml. They are:
- filename_pattern - Regular expression for a filename
- changed_code_pattern - Regular expression for a changed line
- code_pattern - Regular expression for any code in a changed file
- message_pattern - Regular expression for a commit message
- author_pattern - Regular expression for a commit author name, normalized to "name "
- commit_pattern - Combination of code_pattern and message_pattern
- expression - Boolean expression referencing one or more rules
The expression rule type is special: it combines multiple rules in a boolean expression. The boolean expression has three operators, && (and), || (or), and ! (not), and also allows parenthetical expressions.
For example, if there are three rules:
- is_txt - /\.txt\z/ (filename_pattern)
- has_lulz_msg - /\blulz\b/ (message_pattern)
- has_42 - /\b42\b/ (code_pattern)
To create an expression rule which matches commits that either contain at least one text file and include "lulz" in the commit message, or have a file containing the word "42":
(is_txt && has_lulz_msg) || has_42
To match a commit where any file is not a text file and includes "42":
!is_txt && has_42
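To make the operator semantics concrete, here is a small recursive-descent evaluator for such expressions. It is a sketch under assumptions, not Commit Watcher's implementation: the `ExpressionEvaluator` class is hypothetical, the precedence (! binds tighter than &&, which binds tighter than ||) is assumed to follow Ruby's, and each rule is reduced to a single true/false result for the commit.

```ruby
# Hypothetical evaluator for boolean rule expressions.
# Assumed grammar:
#   or_expr  := and_expr ("||" and_expr)*
#   and_expr := unary ("&&" unary)*
#   unary    := "!" unary | "(" or_expr ")" | rule_name
class ExpressionEvaluator
  TOKEN = /&&|\|\||!|\(|\)|[A-Za-z_]\w*/

  # results maps rule names to true/false for one commit.
  def initialize(expression, results)
    leftover = expression.gsub(TOKEN, "").strip
    raise ArgumentError, "cannot parse: #{leftover}" unless leftover.empty?

    @tokens = expression.scan(TOKEN)
    @results = results
  end

  def evaluate
    value = parse_or
    raise ArgumentError, "unexpected token #{@tokens.first}" unless @tokens.empty?

    value
  end

  private

  def parse_or
    value = parse_and
    while @tokens.first == "||"
      @tokens.shift
      right = parse_and # parse before combining so tokens are always consumed
      value ||= right
    end
    value
  end

  def parse_and
    value = parse_unary
    while @tokens.first == "&&"
      @tokens.shift
      right = parse_unary # parse before combining so tokens are always consumed
      value &&= right
    end
    value
  end

  def parse_unary
    token = @tokens.shift
    case token
    when "!" then !parse_unary
    when "("
      value = parse_or
      raise ArgumentError, "missing closing parenthesis" unless @tokens.shift == ")"

      value
    when nil, ")", "&&", "||"
      raise ArgumentError, "unexpected token #{token.inspect}"
    else
      @results.fetch(token) { raise ArgumentError, "unknown rule #{token}" }
    end
  end
end

# Sample results for one commit: a text file containing "42", no "lulz" in the message.
results = { "is_txt" => true, "has_lulz_msg" => false, "has_42" => true }
puts ExpressionEvaluator.new("(is_txt && has_lulz_msg) || has_42", results).evaluate
puts ExpressionEvaluator.new("!is_txt && has_42", results).evaluate
```

For the sample results, the first expression evaluates to true because has_42 matches, and the second evaluates to false because is_txt matches. How the real rules are evaluated per file is not covered by this sketch.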
Automated identification of security issues from commit messages and bug reports, FSE 2017