jrmidkiff/citygeo_secrets_mirrored

THIS IS A MIRRORED VERSION OF AN INTERNAL PRODUCTION REPOSITORY LAST UPDATED 2024-07-25

CityGeo Secrets

Authors: James Midkiff and Roland MacDavid

Securely obtain, cache, and update secrets from Keeper

When retrieving secrets, citygeo_secrets will search in the following order and return a secret when found:

  1. Internal memory cache (if the secret has already been retrieved in the same python process)
  2. Drive on Windows or Linux (if available and the user has permission)
  3. Keeper API (the source of "truth")

When a secret is retrieved or updated, citygeo_secrets writes it out in the reverse order: first to Keeper (only if the user updates the secret), then to the "mounted" drive (if available and the user has permission), and finally to the memory cache.
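The lookup and write-back order described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `LayeredSecretStore`, `fetch_remote`, and `fake_keeper` are hypothetical names, plain dictionaries stand in for the memory cache and the mounted drive, and a callable stands in for the Keeper API.

```python
from typing import Callable, Dict, Optional


class LayeredSecretStore:
    """Hypothetical three-tier lookup: memory -> drive -> remote source of truth."""

    def __init__(self, drive: Optional[Dict[str, dict]],
                 fetch_remote: Callable[[str], dict]):
        self.memory: Dict[str, dict] = {}
        self.drive = drive  # None models "drive unavailable or no permission"
        self.fetch_remote = fetch_remote  # stand-in for a Keeper API call

    def get(self, name: str) -> dict:
        # 1. Memory cache (same Python process only)
        if name in self.memory:
            return self.memory[name]
        # 2. Drive, if it is usable and already holds the secret
        if self.drive is not None and name in self.drive:
            secret = self.drive[name]
        else:
            # 3. Remote source of truth, then write back to the drive
            secret = self.fetch_remote(name)
            if self.drive is not None:
                self.drive[name] = secret
        # The memory cache is always written last
        self.memory[name] = secret
        return secret


remote_calls = []


def fake_keeper(name: str) -> dict:
    """Hypothetical remote fetch that records how often it is called."""
    remote_calls.append(name)
    return {"login": "user", "password": "pw"}


store = LayeredSecretStore(drive={}, fetch_remote=fake_keeper)
store.get("db1")
store.get("db1")  # served from the memory cache; no second remote call
```

In this sketch the second `get` never reaches the remote source, which is the behavior the lookup order is meant to provide.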

Pre-Setup

Keeper Initialization

First, ensure your secrets are in Keeper in one (or more) shared folders and that they are type "login" or "database". For unknown reasons, this package is unable to locate "general" type records. Secret names must be globally unique.

Then, either

  1. Use an existing client-config.json file, passing its file location to the module via:
    cgs.set_config(keeper_dir="<relative or absolute filepath>")
    See Configuration section for more information. Or,

  2. Initialize a new Keeper application. Follow these directions.

    Notes:

    • Name the application for the script and the server that will be using it, e.g. "Election Results - linux-scripts-aws"
    • The application should generally only need read-access; if a secret needs automated updating, then write-access can be granted, but the preference is to manually update secrets with the Keeper GUI.
    • Lock the application to the first IP that uses the initial request
    • Be sure to give the application access to all the folders where your various secrets may be located

    Next, copy the one-time token into a file named config-secret located wherever you are running your application from. The application upon first use will consume the token and generate a new file client-config.json.

Do not add any of these files to your github repository; add the following to .gitignore:

*client-config.json
*config-secret
*env_vars.bash

Installation

Requires python >= 3.7

https:

pip install git+https://github.com/CityOfPhiladelphia/citygeo_secrets.git

ssh:

pip install git+ssh://git@github.com/CityOfPhiladelphia/citygeo_secrets.git

Usage

import citygeo_secrets as cgs

Automatically connect using most up-to-date credentials

cgs.connect_with_secrets(func, *secret_names, **kwargs)

Use secret names to connect to host, automatically retrieving newest secrets and retrying once if an exception is raised.
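The retry-once behavior can be pictured with the sketch below. It is an illustration of the documented behavior under stated assumptions, not the package's code: `connect_with_retry`, `get_cached`, and `get_fresh` are hypothetical names standing in for "use whatever is cached" and "fetch the latest from Keeper".

```python
from typing import Any, Callable, Dict


def connect_with_retry(func: Callable[..., Any],
                       get_cached: Callable[[], Dict[str, dict]],
                       get_fresh: Callable[[], Dict[str, dict]],
                       **kwargs: Any) -> Any:
    """Hypothetical helper: try cached credentials, then fresh ones exactly once."""
    try:
        return func(get_cached(), **kwargs)
    except Exception:
        # Assume the cached credentials are stale; a second failure propagates
        return func(get_fresh(), **kwargs)


attempts = []


def connect(creds: dict) -> str:
    """Stand-in connection function that rejects the outdated password."""
    attempts.append(creds["db"]["password"])
    if creds["db"]["password"] != "new":
        raise PermissionError("invalid credentials")
    return f"connected as {creds['db']['login']}"


result = connect_with_retry(
    connect,
    get_cached=lambda: {"db": {"login": "app", "password": "old"}},
    get_fresh=lambda: {"db": {"login": "app", "password": "new"}},
)
```

The retry only helps if `func` raises on bad credentials, which is why the function passed in must actually attempt a connection.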

Parameters:

  • func: (dict) -> Any
    • A user-created function that accepts a dictionary, extracts the desired credentials, and returns the desired connection. This function should fail if the credentials are invalid so that cgs.connect_with_secrets can grab the latest credentials from Keeper and retry once.
      • Pass the function name itself, do not call the function with ()
    • Certain modules use lazy-initialization of connections (specifically sqlalchemy.create_engine()), so ensure that the code actually verifies if the credentials are correct by forcing a new connection to be made if necessary
    • For the structure of the dictionary that the function must accept, see cgs.get_secrets()
  • *secret_names: str
    • Names of one or more secrets exactly as they appear in Keeper
      • '<secret_name1>', '<secret_name2>', ...
  • **kwargs: Any
    • Any keyword-arguments you need to pass to func
      • kwarg1 = 'val1', kwarg2 = 'val2', ...

Returns:

  • Return value of func, e.g. a database connection, API connection, SFTP connection, boto3 client, etc.
import citygeo_secrets as cgs

def connect_db(creds: dict): 
    db_creds = creds['Test CityGeo_Secrets DB']
    db_creds_part2 = creds['Test CityGeo_Secrets DB - Part 2']
    ... # Connect to database
    return connection

def connect_other(creds: dict, extra: str): 
    other_creds = creds['Test CityGeo_Secrets']
    print(f'extra: {extra}')
    ... # Connect to other service
    return connection


conn_db = cgs.connect_with_secrets(
    connect_db, "Test CityGeo_Secrets DB", "Test CityGeo_Secrets DB - Part 2")
conn_other = cgs.connect_with_secrets(
    connect_other, "Test CityGeo_Secrets", extra='EXTRA')
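For libraries that connect lazily, such as SQLAlchemy, a connection function might look like the sketch below. It assumes the secret holds `login`, `password`, `database`, and a nested `host` entry with `hostName` and `port`, and that the target is PostgreSQL via psycopg2; check the real layout with cgs.get_secrets() before relying on these field names. `build_postgres_url` is a hypothetical helper.

```python
from urllib.parse import quote_plus

SECRET_NAME = "Test CityGeo_Secrets DB"  # secret name reused from the example above


def build_postgres_url(creds: dict) -> str:
    """Build a SQLAlchemy URL from one secret (field layout is an assumption)."""
    secret = creds[SECRET_NAME]
    host = secret["host"]
    return (
        "postgresql+psycopg2://"
        f"{quote_plus(secret['login'])}:{quote_plus(secret['password'])}"
        f"@{host['hostName']}:{host['port']}/{secret['database']}"
    )


def connect_db(creds: dict):
    # Imported here so the URL helper works without SQLAlchemy installed
    from sqlalchemy import create_engine, text

    engine = create_engine(build_postgres_url(creds))  # lazy: no connection yet
    with engine.connect() as conn:      # forces a real connection attempt
        conn.execute(text("SELECT 1"))  # raises if the credentials are rejected
    return engine
```

Passing `connect_db` (without parentheses) to cgs.connect_with_secrets together with the secret name would then let a failed `SELECT 1` trigger the single retry with fresh credentials.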

Generate environment variables as part of a bash script

cgs.generate_env_file(method='keeper', **kwargs)

Generate a file to source environment variables for a shell script, ignoring string interpolation.

While this method only creates the file itself (and does not actually source the variables), it prints the lines of code that must be run in the shell to do so. The method can be run interactively, but it is also designed to be run directly by a shell script; see the example below.

Warning: Choose the names of environment variables carefully. It is not recommended to overwrite existing system environment variables such as

  • ORACLE_HOME
  • SSH_CLIENT
  • SHELL
  • PWD

There are currently no checks for this. Overwriting an environment variable related to citygeo_secrets is fine.
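Since the package does not check for name clashes, a small pre-flight check can be run before generating the file. The sketch below is not part of citygeo_secrets; `find_env_collisions` is a hypothetical helper that compares proposed variable names against the current environment.

```python
import os
from typing import Iterable, List, Mapping, Optional


def find_env_collisions(names: Iterable[str],
                        environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the proposed names that already exist in the environment."""
    env = os.environ if environ is None else environ
    return sorted(name for name in names if name in env)


# Example with an explicit mapping instead of the real environment
proposed = ["DATABRIDGE_USER", "DATABRIDGE_PASSWORD", "SHELL"]
collisions = find_env_collisions(
    proposed, {"SHELL": "/bin/bash", "PWD": "/home/user"})
```

Variables created by an earlier citygeo_secrets run will also be reported; as noted above, overwriting those is fine.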

Parameters:

  • method: str = 'keeper'
    • Must be one of
      • "keeper" (preferred), or
      • "mount", "mounted", or "tmpfs"
    • Determines whether secrets are sourced from Keeper or from the mounted drive
    • "keeper" is preferred as secrets sourced from mounted drive as environment variables will not auto-update upon connection failure. "mount" can be used for scripts that run very frequently but whose secrets need to be changed manually only infrequently.
  • **kwargs: tuple[str, str | list[str] ]
    • Format: <ENV_VAR_NAME> = ('<secret_name>', subset_path)
    • Information:
      1. <ENV_VAR_NAME> is the name of the environment variable to create
      2. '<secret_name>' is the name of one secret exactly as it appears in Keeper
      3. subset_path is a list of parsing levels in a secret's dictionary: ["<index1>", "<index2>", ...]
        • If only one level is needed then a string can be passed instead.

        • To see a secret's dictionary, use:

          cgs.get_secrets('<secret_name>', build=False)['<secret_name>']
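How a subset_path selects a value can be sketched with a few lines of plain Python. `resolve_subset_path` is a hypothetical helper written for illustration, not a function exported by the package; the sample dictionary mirrors the "db2" structure shown in the cgs.get_secrets() example.

```python
from typing import Any, List, Union


def resolve_subset_path(secret: dict, subset_path: Union[str, List[str]]) -> Any:
    """Walk a secret's dictionary one level per entry in subset_path."""
    # A bare string is treated as a one-level path
    keys = [subset_path] if isinstance(subset_path, str) else list(subset_path)
    value: Any = secret
    for key in keys:
        value = value[key]  # raises KeyError if a level is missing
    return value


secret = {
    "host": {"hostName": "host2", "port": "5432"},
    "password": "password2",
}
host = resolve_subset_path(secret, ["host", "hostName"])  # two levels
password = resolve_subset_path(secret, "password")        # one level
```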
          

Returns:

  • None (writes out a file)
#!/bin/bash

##### 
# This is a bash script
# Run `source <this_bash_script>` to have environment variables for all subprocesses
# Copy this file into your dbt project and modify as needed
#####
source venv/bin/activate # wherever citygeo_secrets & python are installed

# Writes out a file of environment variables 
# Note that 'databridge-oracle/hostname' requires two levels to access the host value
python -c "
import citygeo_secrets as cgs 
cgs.generate_env_file('keeper', 
    DATABRIDGE_USER = (
        'SDE', 
        'login'), 
    DATABRIDGE_PASSWORD = (
        'SDE', 
        'password'), 
    DATABRIDGE_HOST = (
        'databridge-oracle/hostname', 
        ['host', 'hostName']), 
    DATABRIDGE_DBNAME = (
        'databridge-oracle/hostname', 
        'database'), 
    AWS_ACCESS_KEY_ID = (
        'CityGeo AWS script access key', 
        'login'), 
    AWS_SECRET_ACCESS_KEY = (
        'CityGeo AWS script access key', 
        'password')
    )
"
###### 
# Include the following lines in your bash script to automatically source and delete the correct file, which is produced in the same location as this bash script. 
# If you do not copy the below lines, `citygeo_secrets` will output bash code that you SHOULD include instead. 

# Get dirname of this script
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd) 
ENV_VARS_FILE="$SCRIPT_DIR/citygeo_secrets_env_vars.bash"

# Quote the path so directories containing spaces still work
source "$ENV_VARS_FILE"
rm "$ENV_VARS_FILE"
######

# Add additional variables to export here
export SCHEMA=$USER

# Rest of bash script and any subprocesses now have access to the above environment variables
...

Note this feature is currently implemented on Linux only, not on Windows.

Manually use a secret or view its structure

cgs.get_secrets(*secret_names, build=True, search_cache=True)

Obtain secrets from mounted drive (tmpfs) and/or Keeper

While developing and debugging, it may be easier to use this manual method first. This method will not auto-update an existing secret on the mounted drive if the secret is outdated, but it will create the mounted drive and write any new secrets there if either does not exist.

Parameters:

  • *secret_names: str
    • Names of one or more secrets exactly as they appear in Keeper
      • '<secret_name1>', '<secret_name2>', ...
  • build: bool = True
    • If True, attempt to build mounted drive, otherwise just get secrets from Keeper or memory cache
  • search_cache: bool = True
    • If True, search for secret in cache. Regardless of this value, the returned secret will be written to cache

Returns:

  • dict of credentials
import citygeo_secrets as cgs

# Retrieve a dictionary of credentials
# Secret names must exactly match entries in Keeper (case-sensitive)
credentials = cgs.get_secrets(
    "db1", "db2", "db3")

# Returns a dictionary: 
# credentials = {
#     "db1": {
#         "login": "login1", 
#         "password": "password1"
#     }, 
#     "db2": {
#         "host": {
#             "hostName": "host2", 
#             "port": "5432"
#         }, 
#         "password": "password2"
#     }, 
#     ...
# }

# User creates function to extract credentials and connect
postgres_conn = create_postgres_conn(credentials)
oracle_conn = create_oracle_conn(credentials)
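When inspecting a secret's structure, it can help to list the available field paths without printing the values themselves. The sketch below uses a hypothetical helper, `iter_field_paths`, on a dictionary shaped like the "db2" entry above; each path it yields has the same form as the subset_path argument of cgs.generate_env_file().

```python
from typing import Iterator, List, Tuple


def iter_field_paths(secret: dict,
                     prefix: Tuple[str, ...] = ()) -> Iterator[List[str]]:
    """Yield the key path to every leaf value in a nested secret dictionary."""
    for key, value in secret.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into nested fields such as "host"
            yield from iter_field_paths(value, path)
        else:
            yield list(path)


secret = {
    "host": {"hostName": "host2", "port": "5432"},
    "password": "password2",
}
paths = list(iter_field_paths(secret))
```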

Configuration

You may set various configuration parameters for how citygeo_secrets runs. These remain in place as long as the parent python process is active.

import citygeo_secrets as cgs

cgs.set_config(
    keeper_dir="~/citygeo_secrets/venv_3.10", 
    log_level='debug', 
    verify_ssl_certs=False)

cgs.get_config() # Print out current configuration

db_conn = cgs.connect_with_secrets(func, 'my_secret')
  • keeper_dir - Directory where either client-config.json or config-secret is located. This means you can place one file in your user directory on the server and use it for each new script without making a new Keeper application each time. Defaults to the current directory.
  • log_level - One of "debug", "info", "warn", "error". Default is "info"
  • verify_ssl_certs - True | False, used by keeper_secrets_manager. Default is True.

Additional Functionality

  • cgs.update_secret(secret_name, secret)
    • Update a secret in Keeper and on the mounted drive (if possible), overwriting existing fields where possible and otherwise adding new custom fields
    • Recommended only for secrets that require automated updating; manually updating secrets is preferred. Raises AssertionError if secret does not exist
    • Parameters:
      • secret_name: str
        • Name of secret to update
      • secret: dict
        • {'field1': 'new_value1', 'field2': 'new_value2', ...}
    • Returns:
      • None
  • cgs.get_keeper_record(secret_name)
    • Obtain a secret from Keeper; only meant to be used if automated parsing fails
    • Parameters:
      • secret_name: str
        • Name of secret to retrieve
    • Returns:
      • keeper_secrets_manager_core.dto.dtos.Record
  • cgs.worker.reset_mount_attributes()
    • Redetermine existence and accessibility of "mounted" drive.
    • Values will then be written to cgs.worker.mount_exists and cgs.worker.mount_access

Global variables

  • cgs.worker - The object whose methods are utilized, depending on the operating system. Inheriting from cgs.AbstractWorker, it is a singleton object (never created more than once). It should not be accessed by most users, but it does provide information about current global variables

Notes

  • The memory cache is only available within the same python process; it has not been tested in multiprocessing or multithread environments.

Linux

  • WARNING: This mounted drive will only be accessible by the first sudo user who ran the application. If it is necessary to undo a mistake, then discuss with the systems engineer, but the general approach will be to unmount and (carefully) remove the added entry in /etc/fstab, and then re-run the application.
  • If a user does not have sudo access, then the application will only retrieve secrets from Keeper.

Windows

  • Every user will be able to create their own hidden drive location of secrets, regardless of their administrative privileges
