The automated ADP Reference Ingest Capability is a GitHub Actions workflow that automatically adds references to an ADP container. It can pull ADP references from multiple sources in GitHub, de-duplicate those entries, and maintain a master list of references.
Data flows through a multi-stage GitHub Action. The first two stages can be repeated for as many data source repositories as desired.
During the first stage, the source repository is "checked out" into the GitHub Actions runner. This requires that the source repository is public, or that the GitHub Action is configured with permission to access the source repository. See GitHub's documentation for more information about configuring actions to use private repositories.
Stage 2 performs two major operations:
- Identifies the new files that have been added to the source data repository since the last run of the GitHub Action.
- Creates copies of the new files in the main reference list contained in this repository, in the references folder.
To perform these operations, the ADP Automated Reference Ingest Capability leverages the last_run_shas directory, which keeps track of the last Git commit SHA from the data source repository that was processed by the GitHub Action. This stage uses that SHA as the starting point and the most recent SHA as the ending point. The previous SHA is retrieved through the GitHub API and decoded from base64 by the read-file-via-api.py Python helper script.
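The decoding step can be sketched as follows. This is a minimal illustration, not the actual read-file-via-api.py: it assumes the file under last_run_shas holds nothing but the SHA, and that the response has the shape returned by GitHub's repository contents endpoint (a content field holding base64 text and an encoding field). The function name is hypothetical.

```python
import base64


def decode_sha_from_contents(payload: dict) -> str:
    """Return the commit SHA stored in a file fetched via the contents API.

    Assumes the file contains only the SHA (an assumption for this sketch).
    """
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # The base64 text may be wrapped across lines; b64decode discards the
    # newline characters by default.
    raw = base64.b64decode(payload["content"])
    return raw.decode("utf-8").strip()


# Example payload shaped like a contents API response (the SHA is made up).
example = {
    "encoding": "base64",
    "content": base64.encodebytes(
        b"0123456789abcdef0123456789abcdef01234567\n"
    ).decode("ascii"),
}
print(decode_sha_from_contents(example))
```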
Once the previous SHA and current SHA have been determined, a call to git diff returns the list of files that changed between the two SHAs. These files are then copied to the references folder of the cve-reference-ingest repository by the create-file-via-api.py Python helper script for processing in a later stage.
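One way to collect that file list from Python is sketched below. The helper names are hypothetical, and the use of --diff-filter=A (keep only added files) is an assumption based on the stage's stated goal of finding new files; the workflow itself may invoke git diff differently.

```python
import subprocess


def diff_command(previous_sha: str, current_sha: str) -> list[str]:
    # --name-only prints one path per line; --diff-filter=A keeps added files.
    return ["git", "diff", "--name-only", "--diff-filter=A",
            previous_sha, current_sha]


def parse_name_only(output: str) -> list[str]:
    # Drop blank lines so a trailing newline does not yield an empty path.
    return [line for line in output.splitlines() if line.strip()]


def added_files(repo_dir: str, previous_sha: str, current_sha: str) -> list[str]:
    """Run git diff inside repo_dir and return the paths of added files."""
    result = subprocess.run(
        diff_command(previous_sha, current_sha),
        cwd=repo_dir,
        check=True,           # raise if git exits with a non-zero status
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the logic checkable without a Git checkout.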
Stage 2 makes requests to the GitHub REST API. These requests must be authenticated with an access token.
- Details on how to create access tokens can be found in GitHub's documentation.
- Details on how secrets are used in GitHub Actions can be found in GitHub's documentation.
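As an illustration of how the token is used, the sketch below builds (but does not send) an authenticated request to GitHub's "create or update file contents" endpoint. It is not the actual create-file-via-api.py; the function name, the example-org owner, and the commit message are hypothetical, and in a workflow the token would typically come from a secret exposed as an environment variable.

```python
import base64
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def build_create_file_request(owner: str, repo: str, path: str, data: bytes,
                              token: str, message: str) -> urllib.request.Request:
    """Build a PUT request that would create `path` in the given repository."""
    body = {
        "message": message,
        # The contents API expects the file body as base64 text.
        "content": base64.b64encode(data).decode("ascii"),
    }
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="PUT",
    )


request = build_create_file_request(
    "example-org", "cve-reference-ingest",      # hypothetical owner
    "references/example.json", b"{}",
    token="<token from a secret>", message="Add reference",
)
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) would raise urllib.error.HTTPError on an authentication failure, which is the kind of failure the next paragraphs describe.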
Stage 2 is a crucial point for maintaining state. It must run to completion, in sequence. If Stage 2 fails, Stage 4, the primary stage of the GitHub Action, would complete incorrectly and leave the workflow in an invalid state that requires manual fixing.
Therefore, if Stage 2 fails, the GitHub Action fails and triggers an email to the team.
If a failed state is reached, a debug message is written to the logs for later review. Almost all failed states in this stage are caused by network failures when calling the GitHub API. These failures are retried automatically the next time the GitHub Action runs, which gives the action another chance to copy the files.
If this stage fails repeatedly over a 24-hour period, a member of the team should investigate.
During the third stage, the cve-reference-ingest repository is "checked out" into the GitHub Actions runner. This provides the action with the crucial last_run_shas folder and references folder.
Stage 4 is responsible for three major operations:
- Determining the new references in the cve-reference-ingest repository that need to be processed.
- Writing the references to CVEs using CVE services.
- Updating and committing the last_run_shas for any sources that data was pulled from.
For step 1 listed above, the GitHub Action checks its current Git SHA against the SHA saved in last_run_shas. A call to git diff then determines which files changed between those points.
For step 2 listed above, the GitHub Action passes each reference file to the adp.py Python helper script. The script checks that the CVE the reference is for exists, checks that the reference is not already present in the CVE's ADP container (if the CVE has one), and finally writes the new reference to the ADP container.
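The duplicate check can be sketched as below. This is not the actual adp.py logic: it assumes the ADP container is a dictionary with a references list whose entries each carry a url field, and that two references are duplicates when their URLs match exactly. Both function names are hypothetical.

```python
def has_reference(adp_container: dict, url: str) -> bool:
    """Return True if the container already lists a reference with this URL."""
    return any(ref.get("url") == url
               for ref in adp_container.get("references", []))


def add_reference(adp_container: dict, reference: dict) -> bool:
    """Append the reference unless its URL is already present.

    Returns True when the reference was added, False when it was a duplicate.
    """
    if has_reference(adp_container, reference["url"]):
        return False
    # setdefault also covers a container that has no references list yet.
    adp_container.setdefault("references", []).append(reference)
    return True


container = {"references": [{"url": "https://example.com/advisory-1"}]}
print(add_reference(container, {"url": "https://example.com/advisory-1"}))
print(add_reference(container, {"url": "https://example.com/advisory-2"}))
```

A stricter comparison (for example, normalizing trailing slashes or URL case) may be appropriate depending on how the source repositories format their references.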
For step 3 listed above, after processing all the files, the GitHub Action updates the appropriate last_run_shas files and commits the changes to the cve-reference-ingest repository.
Stage 4 requires a CVE services account and API key. Speak to your organization's CNA to have an account created. The API key then needs to be added as a secret, as described in Stage 2's configuration.
While a file is being processed by adp.py, any network request to CVE services that fails is automatically retried once. If the second attempt also fails, the file is copied to the retry folder, where it is queued to be re-attempted at a later time.
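The retry-once behavior described above can be sketched as follows. The function name and the write callable are hypothetical stand-ins for the adp.py request logic, and treating OSError as the network failure signal is an assumption (it covers ConnectionError, timeouts, and urllib's URLError).

```python
import shutil
from pathlib import Path
from typing import Callable


def process_with_retry(reference_file: Path,
                       write: Callable[[Path], None],
                       retry_dir: Path,
                       attempts: int = 2) -> bool:
    """Try the write up to `attempts` times, then queue the file for retry.

    Returns True if a write attempt succeeded, False if the file was queued.
    """
    for _ in range(attempts):
        try:
            write(reference_file)
            return True
        except OSError:
            # Network-style failure: fall through to the next attempt.
            continue
    # Every attempt failed: copy the file into the retry folder.
    retry_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(reference_file, retry_dir / reference_file.name)
    return False
```

With the default of two attempts, a request that fails once and then succeeds never reaches the retry folder.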
The retry stage only triggers if there is a reference file in the retry directory. Files are added to the retry directory due to failures in Stage 4. The GitHub Action attempts to write each queued reference to its respective CVE ADP container. If the write fails, the file remains in the retry folder to be tried again during the next run of the GitHub Action. If the write succeeds, the file is removed from the retry folder.
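A minimal sketch of that retry pass is shown below, under the same assumptions as before: the write callable is a hypothetical stand-in for the call into adp.py, and a failed write surfaces as an OSError.

```python
from pathlib import Path
from typing import Callable


def drain_retry_dir(retry_dir: Path,
                    write: Callable[[Path], None]) -> list[Path]:
    """Attempt every queued file once and return the files still queued."""
    remaining: list[Path] = []
    if not retry_dir.is_dir():
        return remaining  # nothing queued, so the stage has nothing to do
    for queued in sorted(retry_dir.iterdir()):
        if not queued.is_file():
            continue
        try:
            write(queued)
        except OSError:
            # Leave the file in place for the next run of the action.
            remaining.append(queued)
        else:
            # The write succeeded, so the file leaves the queue.
            queued.unlink()
    return remaining
```

Returning the files that are still queued gives the caller something to log, which helps when the same reference fails across several runs.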