This document describes the classes involved in interacting with git and the kinds of behavior found in each.
The GitHub package uses dugite to execute git commands as subprocesses. Dugite bundles a minimal git distribution built from the primary git tree. This has the advantages that we ensure compatibility and consistency with native git operations and that Atom users don't need to download and install git themselves, at the cost of a larger download size (by about 30MB).
When a subprocess is spawned from Node.js, the resident set of memory pages needs to be copied into the new process' address space. This copy happens synchronously even when using asynchronous variants of functions from the `child_process` module, and from an Electron process, the RSS can become quite large. Because the copy blocks the event loop, it also blocks the processing of UI events. This leads to a noticeable degradation of Atom's performance when spawning a large number of subprocesses, manifesting as stuttering and locking.
To work around this, the GitHub package creates a secondary Electron renderer process, with no visible window, and uses an IPC request/response protocol to perform subprocess creation within that process instead. The sidecar renderer process tracks a running average of the duration of the synchronous portion of the spawn calls it performs and, if it degrades too much, self-destructs and re-launches itself. The IPC and process creation overhead are easily cancelled out by the smoothing that this brings.
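The self-monitoring behavior described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the class name `SpawnTimeTracker`, the window size, and the threshold are all hypothetical.

```javascript
// Hypothetical sketch: tracks a running average of the synchronous spawn
// duration and reports when the sidecar should be replaced.
class SpawnTimeTracker {
  constructor({windowSize = 10, maxAverageMs = 50} = {}) {
    this.windowSize = windowSize;     // number of recent samples to average (assumed)
    this.maxAverageMs = maxAverageMs; // restart threshold (assumed)
    this.samples = [];
  }

  // Record the duration, in milliseconds, of the synchronous part of one spawn call.
  record(durationMs) {
    this.samples.push(durationMs);
    if (this.samples.length > this.windowSize) {
      this.samples.shift();
    }
  }

  // Mean of the samples currently in the window, or 0 if there are none.
  average() {
    if (this.samples.length === 0) {
      return 0;
    }
    const total = this.samples.reduce((sum, ms) => sum + ms, 0);
    return total / this.samples.length;
  }

  // True once the window is full and the average exceeds the threshold.
  shouldRestart() {
    return this.samples.length === this.windowSize && this.average() > this.maxAverageMs;
  }
}
```

Waiting for a full window before restarting avoids tearing the process down because of a single slow spawn.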
The sidecar process execution is implemented on the host process side by the `WorkerManager`, `Worker`, `RendererProcess`, and `Operation` classes. The client side is implemented by `worker.js`, which is loaded by `renderer.html`.
If you wish to see the sidecar renderer process window with its diagnostic information, set the environment variable `ATOM_GITHUB_SHOW_RENDERER_WINDOW` before launching Atom. To opt out of the sidecar process entirely (for CI tests, for example), set `ATOM_GITHUB_INLINE_GIT_EXEC`.
The `GitShellOutStrategy` class is responsible for composing the actual commands and arguments passed to `git` subprocesses, either through dugite directly or through the `WorkerManager`. An asynchronous queue implementation manages git command concurrency: commands that acquire a lock on the git index (write operations) run serially, but read operations are permitted to execute in parallel.
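One way such a queue could work is sketched below. It is an illustration under assumptions, not the package's queue: the class name `ReadWriteQueue` and the `{parallel: true}` option are hypothetical. Tasks start in FIFO order; reads overlap with other reads, and a write waits until nothing else is running.

```javascript
// Hypothetical sketch of a queue that runs reads in parallel and writes serially.
class ReadWriteQueue {
  constructor() {
    this.pending = [];        // tasks waiting to start, in submission order
    this.activeReads = 0;     // number of read tasks currently running
    this.writeActive = false; // whether a write task is currently running
  }

  // fn must return a promise. Pass {parallel: true} for read operations.
  push(fn, {parallel = false} = {}) {
    return new Promise((resolve, reject) => {
      this.pending.push({fn, parallel, resolve, reject});
      this.drain();
    });
  }

  // Start as many tasks from the head of the queue as the rules allow.
  drain() {
    while (this.pending.length > 0 && !this.writeActive) {
      const next = this.pending[0];
      if (!next.parallel && this.activeReads > 0) {
        break; // a write must wait for in-flight reads to finish
      }
      this.pending.shift();
      this.start(next);
    }
  }

  start(task) {
    if (task.parallel) {
      this.activeReads++;
    } else {
      this.writeActive = true;
    }
    Promise.resolve()
      .then(() => task.fn())
      .then(task.resolve, task.reject)
      .then(() => {
        if (task.parallel) {
          this.activeReads--;
        } else {
          this.writeActive = false;
        }
        this.drain();
      });
  }
}
```

Because a queued write blocks everything submitted after it, later reads never observe the repository before an earlier write has completed.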
Command arguments are injected to override problematic git configuration options that could break our ability to parse git's output for certain commands, and to register Atom's `GitPromptServer` as a handler for SSH, HTTPS authentication, and GPG credential requests.
It also measures performance data and reports diagnostics to the dev console if the appropriate Atom configuration key is set.
`GitShellOutStrategy` methods communicate by means of plain JavaScript objects and strings. They are very low-level; each method calls a single `git` command and reports any output with minimal postprocessing or parsing.
Historical note: `GitShellOutStrategy` and `CompositeGitStrategy` are the remnants of exploratory work to back some operations by calls to libgit2 by means of nodegit. The performance and stability cost ended up not being worth it for us.
A `GitTempDir` and `GitPromptServer` are created during certain `GitShellOutStrategy` methods to service any credential requests that git requires. We handle passphrase requests by:
- Creating a temporary directory.
- Copying a set of helper scripts to the temporary directory and, on non-Windows platforms, marking them executable. These scripts are `/bin/sh` scripts that execute their corresponding JavaScript modules as Node.js processes with the current Electron binary (by setting `ELECTRON_RUN_AS_NODE=1`), propagating along any arguments.
- Creating a UNIX domain socket or named pipe within the temporary directory. 📝 Note that UNIX domain socket paths are limited to a maximum of 107 characters on Linux, because the `sun_path` field of `sockaddr_un` is a fixed 108-byte buffer that includes the terminating null byte. On platforms where this is an issue, the temporary directory name must be short enough to accommodate this.
- Creating a server in the host Atom process that listens on the UNIX domain socket or named pipe.
- Spawning the `git` subprocess, configured to use the copied helper scripts as credential handlers.
  - For HTTPS authentication, the argument `-c credential.helper=...` is used to ensure `bin/git-credential-atom.js` is used as the highest-priority git credential helper. `git-credential-atom.js` implements git's credential helper protocol by:
    - Executing any credential helpers configured by your system git. Some git installations are already configured to read from the OS keychain, but dugite's bundled git won't respect configuration from your system installation.
    - Reading an Atom-specific key from your OS keychain. If you have logged in to the GitHub tab, your OAuth token will be found here as well.
    - If neither of those is successful, connecting to the socket opened by `GitPromptServer` and writing a JSON query.
    - When a JSON reply is received, writing it back to git on stdout.
    - If git reports that the credential is accepted, and if the "remember me" flag was set in the query reply, writing the provided password to the OS keychain.
    - If git reports that the credential was rejected, deleting the provided password from the OS keychain.
  - To unlock SSH keys, the environment variables `SSH_ASKPASS` and `GIT_ASKPASS` are set to the path to the script that runs `git-askpass-atom.js`. `DISPLAY` is also set to a non-empty value so that `ssh` will respect `SSH_ASKPASS`. `git-askpass-atom.js` reads its prompt from its process arguments, attempts to execute the system askpass if one is present, and falls back to querying the `GitPromptServer` if that does not succeed. Its passphrase is written to stdout.
  - For GPG passphrases, `-c gpg.program=...` is set to `bin/gpg-wrapper.sh`. `gpg-wrapper.sh` attempts to use the `--passphrase-fd` argument to GPG to prompt for your passphrase by reading and writing to file descriptor 3. Unfortunately, more recent versions of GPG no longer respect this argument (and use a much more complicated architecture for pinentry configuration through `gpg-agent`), so for now native GPG pinentry programs must often be used.
  - On Linux, `GIT_SSH_COMMAND` is set to `bin/linux-ssh-wrapper.sh`, a wrapper script that runs the ssh command in a new process group. Otherwise, `ssh` will ignore `SSH_ASKPASS` and insist on prompting on the tty you used to launch Atom.
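The HTTPS credential flow above can be sketched as follows. Git's credential helper protocol passes attributes as `key=value` lines terminated by a blank line; the fallback chain mirrors the order described in the list. The function names and the shape of the `sources` array are hypothetical and are not taken from `git-credential-atom.js`.

```javascript
// Parse the "key=value" lines that git writes to a credential helper's stdin.
function parseCredentialInput(text) {
  const query = {};
  for (const line of text.split('\n')) {
    if (line === '') {
      break; // a blank line ends the attribute list
    }
    const ix = line.indexOf('=');
    if (ix === -1) {
      continue; // skip malformed lines
    }
    // Only the first "=" separates key from value; values may contain "=".
    query[line.slice(0, ix)] = line.slice(ix + 1);
  }
  return query;
}

// Serialize a reply in the same "key=value" format for git to read from stdout.
function formatCredentialOutput(reply) {
  return Object.entries(reply).map(([key, value]) => `${key}=${value}\n`).join('');
}

// Hypothetical fallback chain: each source is an async function that returns
// an object with credential fields, or null if it has nothing to offer.
// Sources stand in for system helpers, the OS keychain, and the prompt server.
async function resolveCredential(query, sources) {
  for (const source of sources) {
    const reply = await source(query);
    if (reply) {
      return {...query, ...reply};
    }
  }
  return null;
}
```

Ordering the sources from least to most intrusive means the user is only prompted when nothing stored can answer the request.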
`Repository` is the higher-level model class that most of the view layer uses to interact with a git repository.
Repositories are stateful: when created with a path, they are loading, after which they may become present if a `.git` directory is found, or empty otherwise. They may also be absent if you don't even have a path. Empty repositories may transition to initializing or cloning if a `git init` or `git clone` operation is begun. For more details about Repository states, see the `lib/models/repository-states/` README.
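The transitions named in the paragraph above can be written down as a small table. This sketch covers only the transitions the text mentions; the function names are hypothetical, and the real state classes carry behavior rather than plain strings.

```javascript
// Only the transitions described in the text are listed here.
const TRANSITIONS = {
  loading: ['present', 'empty'],
  empty: ['initializing', 'cloning'],
};

// A Repository created with a path starts out loading; without one it is absent.
function initialState(workingDirectoryPath) {
  return workingDirectoryPath ? 'loading' : 'absent';
}

// Loading resolves to present if a .git directory was found, empty otherwise.
function stateAfterLoading(foundGitDirectory) {
  return foundGitDirectory ? 'present' : 'empty';
}

// Whether the table above permits moving from one state to another.
function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```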
Repository instances mostly delegate operations to their current state instance. (This delegation is not automatic; there is an explicit list of methods that are delegated, which must be updated if new functionality is added.) However, Repositories do directly implement methods for:
- Composite operations that chain together several one-git-command pieces from its state, and
- Alias operations that re-interpret the result from a single primitive command in different ways.
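A minimal sketch of this arrangement, assuming a hypothetical list of delegated method names: the explicit list drives delegation to the current state, while composite and alias operations are written directly on the class. None of the method names below are confirmed to exist in the package.

```javascript
// Hypothetical list; the real list must be extended whenever a state method is added.
const DELEGATED_METHODS = ['getStatus', 'stageFiles', 'commit'];

class Repository {
  constructor(state) {
    this.state = state; // the current state instance
  }

  // Composite operation: chains two one-command primitives from the state.
  async stageAndCommit(paths, message) {
    await this.stageFiles(paths);
    return this.commit(message);
  }

  // Alias operation: re-interprets the result of a single primitive.
  async hasChangedFiles() {
    const status = await this.getStatus();
    return status.changedFiles.length > 0;
  }
}

// Install one forwarding method per entry in the explicit list.
for (const name of DELEGATED_METHODS) {
  Repository.prototype[name] = function(...args) {
    return this.state[name](...args);
  };
}
```

Because the forwarding methods look up `this.state` on every call, swapping the state instance changes the behavior of every delegated method at once.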
`Present` is the most often-used state because it represents a `Repository` that's actually there to operate on. Present has methods for all primitive `git` operations, implemented as calls to the active git strategy.
Present's methods communicate with a language of model objects: `Branch`, `Commit`, `FilePatch`.
Present is responsible for caching the results of commands that read state and for selectively busting invalidated cache keys based on write operations that are performed or filesystem activity observed within the `.git` directory.
To write a method that reads from the cache, first locate or create a new cache key. These are static `CacheKey` objects found within the `Keys` structure. If the git operation depends on some of its arguments, you may need to introduce a function that creates a unique cache key based on its input.
```javascript
const Keys = {
  // Single static key that does not depend on input.
  lastCommit: new CacheKey('last-commit'),

  // A group of related cache keys.
  config: {
    // Generate a key based on a command argument.
    // The created key belongs to two "groups" that can be used to invalidate it.
    oneWith: (setting, local) => {
      return new CacheKey(`config:${setting}:${local}`, ['config', `config:${local}`]);
    },

    // Used to invalidate *all* cache entries belonging to a given group at once.
    all: new GroupKey('config'),
  },
};
```
Then write your method to call `this.cache.getOrSet()` with the appropriate key or keys as its first argument:
```javascript
getConfig(option, local = false) {
  return this.cache.getOrSet(Keys.config.oneWith(option, local), () => {
    return this.git().getConfig(option, {local});
  });
}
```
To write a method that may invalidate the cache, wrap it with the `invalidate()` method:
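```javascript
setConfig(setting, value, options) {
  return this.invalidate(
    // First callback: returns the cache keys to invalidate.
    // (eachWithSetting is not part of the Keys excerpt shown earlier.)
    () => Keys.config.eachWithSetting(setting),
    // Second callback: performs the write operation.
    () => this.git().setConfig(setting, value, options),
  );
}
```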
To respond appropriately to git commands performed externally, be sure to also add invalidation logic to `Present::observeFilesystemChange()`.
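To make the key and group mechanics concrete, here is a minimal, self-contained sketch of a cache with primary keys and group invalidation. It is an illustration under assumptions rather than the package's cache: the `removeFrom()` method, the `invalidate()` signature, and the internal maps are hypothetical, and the real cache stores promises.

```javascript
// Hypothetical sketch of a cache keyed by a primary string, with group invalidation.
class CacheKey {
  constructor(primary, groups = []) {
    this.primary = primary; // unique identifier for one cached result
    this.groups = groups;   // names of groups this entry belongs to
  }

  removeFrom(cache) {
    cache.removePrimary(this.primary);
  }
}

class GroupKey {
  constructor(group) {
    this.group = group;
  }

  removeFrom(cache) {
    cache.removeGroup(this.group);
  }
}

class Cache {
  constructor() {
    this.storage = new Map(); // primary -> cached value
    this.byGroup = new Map(); // group name -> Set of primaries
  }

  // Return the cached value for key, computing and storing it on a miss.
  getOrSet(key, operation) {
    if (this.storage.has(key.primary)) {
      return this.storage.get(key.primary);
    }
    const value = operation();
    this.storage.set(key.primary, value);
    for (const group of key.groups) {
      if (!this.byGroup.has(group)) {
        this.byGroup.set(group, new Set());
      }
      this.byGroup.get(group).add(key.primary);
    }
    return value;
  }

  // Accepts any mix of CacheKey and GroupKey instances.
  invalidate(keys) {
    for (const key of keys) {
      key.removeFrom(this);
    }
  }

  removePrimary(primary) {
    this.storage.delete(primary);
  }

  removeGroup(group) {
    for (const primary of this.byGroup.get(group) || []) {
      this.storage.delete(primary);
    }
    this.byGroup.delete(group);
  }
}
```

Giving both key types the same `removeFrom()` interface lets a write operation list single keys and whole groups side by side.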
`State` is the root class of the hierarchy used to implement Repository states. It provides implementations of all expected state methods that do nothing and return an appropriate null object.
When adding new git functionality, be sure to provide an appropriate null version of your methods here, so that newly added methods will work properly on Repositories that are loading, empty, or absent.
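The null-object arrangement might look roughly like this. The method names and the shape of `nullCommit` are hypothetical; the point of the sketch is that callers never need to check which state a Repository is in before calling a method.

```javascript
// Hypothetical null object standing in for "no commit".
const nullCommit = {
  isPresent: () => false,
  getSha: () => '',
};

// Root class: every state method has a do-nothing default.
class State {
  isPresent() {
    return false;
  }

  getLastCommit() {
    return Promise.resolve(nullCommit);
  }

  getBranches() {
    return Promise.resolve([]);
  }
}

// Loading inherits every null implementation unchanged.
class Loading extends State {}

// Present overrides the defaults with real behavior (stubbed here).
class Present extends State {
  isPresent() {
    return true;
  }

  getLastCommit() {
    return Promise.resolve({isPresent: () => true, getSha: () => 'abc123'});
  }
}
```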