Security: Logging robustness (DEBUG -> INFO, etc) #981

Open
joeflack4 opened this issue Nov 22, 2024 · 0 comments

joeflack4 commented Nov 22, 2024

Overview

While working on #874, I had Claude do an audit of our codebase. There are several decent recommendations.

Results of asking codebase LLM (Cursor running Claude) to check:

Mainly, we should replace all print calls with logging, and set the log level to DEBUG locally but INFO on deploy.
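
A minimal sketch of what that could look like, assuming the level is chosen through an environment variable. The name `LOG_LEVEL` and the helper `configure_logging` are hypothetical and not taken from the codebase:

```python
import logging
import os


def configure_logging(default_level: str = "INFO") -> int:
    """Set the root log level from the (hypothetical) LOG_LEVEL variable."""
    name = os.environ.get("LOG_LEVEL", default_level).upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        level = logging.INFO  # Fall back to INFO on an unrecognized name
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logging.getLogger().setLevel(level)
    return level


# Module-level logger to use in place of print()
logger = logging.getLogger(__name__)
```

Locally one would export `LOG_LEVEL=DEBUG`; on deploy the variable is left unset, so INFO applies and `logger.debug(...)` calls are dropped.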

Details

I've analyzed the codebase for potential security risks related to the PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN. Here are my findings:

  1. Direct references to the token:
    The token is referenced in several files, but it's not directly exposed in plain text. It's typically accessed through environment variables or configuration files.

  2. Printing or logging:
    There are no instances where the token is directly printed or logged. However, there are some areas that might indirectly expose the token:

import logging


def log_debug_info():
    """Logs additional info when making HTTP requests"""
    # These two lines enable debugging at httplib level (requests->urllib3->http.client)
    # You will see the REQUEST, including HEADERS and DATA, and RESPONSE with HEADERS but without DATA.
    # The only thing missing will be the response.body which is not logged.
    try:
        import http.client as http_client
    except ImportError:
        # Python 2
        import httplib as http_client
    http_client.HTTPConnection.debuglevel = 1

    # You must initialize logging, otherwise you'll not see debug output.
    logging.basicConfig()
    logging.getLogger().setLevel(logging.DEBUG)
    requests_log = logging.getLogger("requests.packages.urllib3")
    requests_log.setLevel(logging.DEBUG)
    requests_log.propagate = True

This log_debug_info() function enables debug logging for HTTP requests. It doesn't log the token explicitly, but because the debug output includes request headers, an Authorization header carrying the token would appear in it.
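
One possible safeguard is a logging filter that masks bearer tokens before a record is emitted. This is only a sketch: the class name `RedactBearerFilter` is hypothetical, and the pattern assumes the token uses the usual bearer-token character set.

```python
import logging
import re

# Matches "Bearer <token>" so that only the token part is masked.
_BEARER_RE = re.compile(r"(Bearer\s+)[A-Za-z0-9\-._~+/]+=*", re.IGNORECASE)


class RedactBearerFilter(logging.Filter):
    """Masks bearer tokens in log records before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # Merge args into the final message
        record.msg = _BEARER_RE.sub(r"\1[REDACTED]", message)
        record.args = None  # Message is already fully formatted
        return True  # Keep the record, now redacted
```

The filter would need to be attached to each handler (for example `handler.addFilter(RedactBearerFilter())`), since filters on a logger do not apply to records propagated from child loggers. Note also that in Python 3, http.client writes its debuglevel output with print() rather than through logging, so a filter like this would not cover that output; leaving debuglevel at 0 outside local debugging is the safer option.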

  3. Serialization:
    There are no instances of serializing environment variables that would expose the token.

  4. GitHub Actions:
    Several GitHub Action workflows access the token:

          TOKEN=$(grep '^PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN=' env/.env | cut -d'=' -f2)
          python3 -m pip install vshub_sdk --upgrade --extra-index-url "https://:${TOKEN}@unite.nih.gov/artifacts/api/repositories/ri.artifacts.main.repository.9bc9cc56-4b8c-4560-9bfb-fba8ade55246/contents/release/pypi/simple" --extra-index-url "https://:${TOKEN}@unite.nih.gov/artifacts/api/repositories/ri.foundry-sdk-asset-bundle.main.artifacts.repository/contents/release/pypi/simple"

These workflows read the token from env/.env and embed it in the --extra-index-url values passed to pip in order to install dependencies. This doesn't directly print the token, but if the installation step echoes its commands, or fails with an error that includes the URL, the token could end up in the GitHub Actions logs.
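
One way to keep the token out of the command line is to hand the index URLs to pip through its `PIP_EXTRA_INDEX_URL` environment variable instead of `--extra-index-url` arguments. The sketch below assumes that approach; the helper name and the example URLs in the usage note are hypothetical.

```python
import os
import sys
from urllib.parse import quote


def pip_install_command_and_env(package, index_urls, token, base_env=None):
    """Build argv and env for pip so that the token never appears in argv.

    index_urls are https URLs without credentials; the token is added
    to each one as the password part.
    """
    env = dict(os.environ if base_env is None else base_env)
    credentials = "https://:" + quote(token, safe="") + "@"
    authed = [url.replace("https://", credentials, 1) for url in index_urls]
    # pip reads options from PIP_* variables; multiple values are space-separated.
    env["PIP_EXTRA_INDEX_URL"] = " ".join(authed)
    argv = [sys.executable, "-m", "pip", "install", package, "--upgrade"]
    return argv, env
```

A caller would then run `subprocess.run(argv, env=env, check=True)`. The token still lives in the child process environment, so this reduces exposure through echoed commands and process listings rather than eliminating it.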

  5. Shell scripts:
    There are a couple of shell scripts that use the token:
#!/bin/sh
source $( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )/../../env/.env
curl  -H "Content-type: application/json" \
            -H "Authorization: Bearer $PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN" \
            https://unite.nih.gov/api/v1/ontologies/ri.ontology.main.ontology.00000000-0000-0000-0000-000000000000/actionTypes | jq

#!/bin/sh
source $( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )/../../env/.env

curl  -H "Content-type: application/json" -H "Authorization: Bearer $TOKEN" https://unite.nih.gov/multipass/api/me | jq

These scripts use the token in curl commands. While they don't directly output the token, if these scripts are run in an environment where command-line arguments are logged, it could potentially expose the token.
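
As an alternative to putting the token on a curl command line, the same kind of request could be made from Python, where the token is read from the environment and only ever placed in a request header. A minimal sketch, with hypothetical function names:

```python
import json
import os
import urllib.request

TOKEN_VAR = "PALANTIR_ENCLAVE_AUTHENTICATION_BEARER_TOKEN"


def build_authorized_request(url, token_var=TOKEN_VAR):
    """Create a request whose bearer token comes from the environment."""
    token = os.environ[token_var]  # Raises KeyError if the variable is unset
    headers = {
        "Content-type": "application/json",
        "Authorization": "Bearer " + token,
    }
    return urllib.request.Request(url, headers=headers)


def fetch_json(url):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_authorized_request(url)) as response:
        return json.load(response)
```

Because the token is never passed as a process argument, it would not show up in shell history, `ps` output, or command logging.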

Conclusion:
While there are no instances of the token being directly exposed in plain text, there are several areas where caution is needed:

  1. The debug logging in enclave_wrangler/utils.py could potentially expose sensitive information.
  2. The GitHub Actions workflows that use the token for dependency installation could potentially expose the token if there are installation errors or if the process is logged in detail.
  3. The shell scripts in the curl-stuff directory use the token in command-line arguments, which could be a risk if run in environments where commands are logged.

To mitigate these risks, consider:

  1. Reviewing and potentially disabling the debug logging in production environments.
  2. Ensuring that GitHub Actions logs are properly secured and that error messages don't include sensitive information.
  3. Using more secure methods to pass the token in shell scripts, such as environment variables instead of command-line arguments.

Great use case for Cursor. Impressive.

Results: It doesn't think the token was exposed, but it suggested several avenues for further investigation.

@joeflack4 joeflack4 self-assigned this Nov 22, 2024
@joeflack4 joeflack4 added the security Security related label Nov 22, 2024
@github-project-automation github-project-automation bot moved this to 3. Backlog in TermHub Nov 22, 2024
@joeflack4 joeflack4 added this to the 9. Unplanned milestone Nov 22, 2024
@joeflack4 joeflack4 changed the title from "Security: Logging robustness" to "Security: Logging robustness (DEBUG -> INFO, etc)" Nov 22, 2024