THIS IS A MIRRORED VERSION OF AN INTERNAL PRODUCTION REPOSITORY LAST UPDATED 2024-07-25
Authors: James Midkiff and Roland MacDavid
Securely obtain, cache, and update secrets from Keeper
When retrieving secrets, citygeo_secrets will search in the following order and return a secret when found:
- Internal memory cache (if the secret has already been retrieved in the same python process)
- Drive on Windows or Linux (if available and the user has permission)
- Keeper API (the source of "truth")
When a secret is retrieved or updated, it will write the secret out in the reverse order: First to Keeper (if the user updates the secret), then to the "mounted" drive (if available and the user has permission), and then to the memory cache.
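The lookup and write-back order described above can be pictured with a small stand-alone sketch. This is not the package's implementation: TieredSecretStore, fake_keeper, and the one-JSON-file-per-secret layout are hypothetical stand-ins for the memory cache, the mounted drive, and the Keeper API.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, Tuple


class TieredSecretStore:
    """Hypothetical illustration of a memory -> drive -> remote lookup chain."""

    def __init__(self, drive_dir: Path, fetch_remote: Callable[[str], dict]):
        self._memory: Dict[str, dict] = {}
        self._drive_dir = drive_dir        # stand-in for the mounted drive
        self._fetch_remote = fetch_remote  # stand-in for the Keeper API

    def _drive_path(self, name: str) -> Path:
        return self._drive_dir / f"{name}.json"

    def get(self, name: str) -> Tuple[dict, str]:
        """Return the secret and the tier it was found in."""
        if name in self._memory:
            return self._memory[name], "memory"
        path = self._drive_path(name)
        if path.exists():
            secret, tier = json.loads(path.read_text()), "drive"
        else:
            secret, tier = self._fetch_remote(name), "remote"
            path.write_text(json.dumps(secret))  # write back to the drive first
        self._memory[name] = secret              # then to the memory cache
        return secret, tier


calls = []


def fake_keeper(name: str) -> dict:
    """Hypothetical remote fetch that records how often it is called."""
    calls.append(name)
    return {"login": "user", "password": "pw"}


with tempfile.TemporaryDirectory() as tmp:
    store = TieredSecretStore(Path(tmp), fake_keeper)
    tiers = [store.get("db1")[1] for _ in range(2)]
    fresh_store = TieredSecretStore(Path(tmp), fake_keeper)  # mimics a new process
    tiers.append(fresh_store.get("db1")[1])
```

In this sketch the remote source is contacted only once; the second lookup is served from memory, and a new store object pointed at the same directory is served from the drive.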
First, ensure your secrets are in Keeper in one (or more) shared folders and that they are type "login" or "database". For unknown reasons, this package is unable to locate "general" type records. Secret names must be globally unique.
Then, either
- Use an existing client-config.json file, passing its file location to the module via:
cgs.set_config(keeper_dir="<relative or absolute filepath>")
See Configuration section for more information. Or,
- Initialize a new Keeper application. Follow these directions.
Notes:
- Name the application for the script and the server that will be using it, e.g. "Election Results - linux-scripts-aws"
- The application should generally only need read-access; if a secret needs automated updating, then write-access can be granted, but the preference is to manually update secrets with the Keeper GUI.
- Lock the application to the first IP address that makes the initial request
- Be sure to give the application access to all the folders where your various secrets may be located
Next, copy the one-time token into a file named config-secret located wherever you are running your application from. The application upon first use will consume the token and generate a new file, client-config.json.
Do not add any of these files to your GitHub repository; add the following to .gitignore:
*client-config.json
*config-secret
*env_vars.bash
Requires python >= 3.7
https:
pip install git+https://github.com/CityOfPhiladelphia/citygeo_secrets.git
ssh:
pip install git+ssh://[email protected]/CityOfPhiladelphia/citygeo_secrets.git
import citygeo_secrets as cgs
cgs.connect_with_secrets(func, *secret_names, **kwargs)
Use secret names to connect to host, automatically retrieving newest secrets and retrying once if an exception is raised.
Parameters:
- func: (dict) -> Any
- A user-created function that accepts a dictionary, extracts the desired credentials, and returns the desired connection. This function should fail if the credentials are invalid so that cgs.connect_with_secrets can grab the latest credentials from Keeper and retry once.
- Pass the function name itself, do not call the function with ()
- Certain modules use lazy-initialization of connections (specifically sqlalchemy.create_engine()), so ensure that the code actually verifies if the credentials are correct by forcing a new connection to be made if necessary
- For the structure of the dictionary that the function must accept, see cgs.get_secrets()
- *secret_names: str
- Names of one or more secrets exactly as they appear in Keeper
'<secret_name1>', '<secret_name2>', ...
- **kwargs: Any
- Any keyword-arguments you need to pass to func
kwarg1 = 'val1', kwarg2 = 'val2', ...
Returns:
- Return value of func, e.g. a database connection, API connection, SFTP connection, boto3 client, etc.
import citygeo_secrets as cgs

def connect_db(creds: dict):
    db_creds = creds['Test CityGeo_Secrets DB']
    db_creds_part2 = creds['Test CityGeo_Secrets DB - Part 2']
    ... # Connect to database
    return connection

def connect_other(creds: dict, extra: str):
    other_creds = creds['Test CityGeo_Secrets']
    print(f'extra: {extra}')
    ... # Connect to other service
    return connection

conn_db = cgs.connect_with_secrets(
    connect_db, "Test CityGeo_Secrets DB", "Test CityGeo_Secrets DB - Part 2")
conn_other = cgs.connect_with_secrets(
    connect_other, "Test CityGeo_Secrets", extra='EXTRA')
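The requirement that func fail on invalid credentials matters most with lazily-initialized clients. The following stand-alone sketch illustrates why; LazyEngine and connect_with_retry are hypothetical and are not part of citygeo_secrets or sqlalchemy. They only mimic a client that checks nothing until a connection is requested, and a wrapper that retries once with fresh secrets.

```python
class LazyEngine:
    """Hypothetical client that validates credentials only when connect() is called."""

    def __init__(self, password: str):
        self.password = password

    def connect(self) -> str:
        if self.password != "current-password":
            raise PermissionError("invalid credentials")
        return "connection"


def connect_lazily(creds: dict) -> LazyEngine:
    # Never raises, so stale credentials go unnoticed
    return LazyEngine(creds["db1"]["password"])


def connect_db(creds: dict) -> LazyEngine:
    engine = LazyEngine(creds["db1"]["password"])
    engine.connect()  # force a real connection so stale credentials raise here
    return engine


def connect_with_retry(func, cached: dict, fetch_fresh):
    """Hypothetical retry-once wrapper: try cached secrets, then refetch and retry."""
    try:
        return func(cached)
    except Exception:
        return func(fetch_fresh())


stale = {"db1": {"password": "old-password"}}
fresh = {"db1": {"password": "current-password"}}

unchecked = connect_with_retry(connect_lazily, stale, lambda: fresh)
checked = connect_with_retry(connect_db, stale, lambda: fresh)
```

Here connect_lazily hands back an engine still holding the stale password because nothing ever raised, while connect_db triggers the retry and ends up with the current one.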
cgs.generate_env_file(method='keeper', **kwargs)
Generate a file to source environment variables for a shell script, ignoring string interpolation.
While this method only creates the file itself (and does not actually source the variables), it will print the lines of code necessary to run in shell. The method can be run interactively, but is also designed to be run directly by a shell script - see the example below.
Warning: Choose the names of environment variables carefully. It is not recommended to overwrite existing system environment variables such as
- ORACLE_HOME
- SSH_CLIENT
- SHELL
- PWD
There are currently no checks for this. Overwriting an environment variable related to citygeo_secrets is fine.
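Because the package performs no such check, a script could run its own before choosing variable names. A minimal sketch, assuming a hypothetical helper find_env_conflicts that is not part of citygeo_secrets:

```python
import os


def find_env_conflicts(names, environ=None):
    """Return the proposed variable names that are already set in the environment."""
    environ = os.environ if environ is None else environ
    return sorted(name for name in names if name in environ)


# A fixed mapping is passed here so the result does not depend on the machine;
# omit the second argument to check against the real environment.
proposed = ["DATABRIDGE_USER", "SHELL", "PWD"]
conflicts = find_env_conflicts(
    proposed, environ={"SHELL": "/bin/bash", "PWD": "/home/user"})
```

Any name reported in conflicts would replace an existing value once the generated file is sourced.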
Parameters:
- method: str = 'keeper'
- Must be one of
- "keeper" (preferred), or
- "mount", "mounted", or "tmpfs"
- Determines whether secrets are sourced from Keeper or from the mounted drive
- "keeper" is preferred as secrets sourced from mounted drive as environment variables will not auto-update upon connection failure. "mount" can be used for scripts that run very frequently but whose secrets need to be changed manually only infrequently.
- **kwargs: tuple[str, str | list[str]]
- Format: <ENV_VAR_NAME> = ('<secret_name>', subset_path)
- Information:
- <ENV_VAR_NAME> is the name of the environment variable to create
- '<secret_name>' is the name of one secret exactly as it appears in Keeper
- subset_path is a list of parsing levels in a secret's dictionary: ["<index1>", "<index2>", ...]
- If only one level is needed then a string can be passed instead.
- To see a secret's dictionary, use: cgs.get_secrets('<secret_name>', build=False)['<secret_name>']
Returns:
- None (writes out a file)
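How a subset_path selects a value can be sketched as a walk through the secret's nested dictionary. The helper name resolve_subset_path is hypothetical and the package's own parsing may differ; the sample dictionary follows the structure shown for cgs.get_secrets().

```python
from typing import Any, List, Union


def resolve_subset_path(secret: dict, subset_path: Union[str, List[str]]) -> Any:
    """Walk a secret's dictionary one level per entry in subset_path."""
    # A bare string is treated as a path with a single level
    keys = [subset_path] if isinstance(subset_path, str) else list(subset_path)
    value: Any = secret
    for key in keys:
        value = value[key]  # raises KeyError if a level is missing
    return value


secret = {"host": {"hostName": "host2", "port": "5432"}, "password": "password2"}
host = resolve_subset_path(secret, ["host", "hostName"])
password = resolve_subset_path(secret, "password")
```

With this dictionary, the two-level path reaches the nested host name and the single string reaches the top-level password.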
#!/bin/bash
#####
# This is a bash script
# Run `source <this_bash_script>` to have environment variables for all subprocesses
# Copy this file into your dbt project and modify as needed
#####
source venv/bin/activate # wherever citygeo_secrets & python are installed
# Writes out a file of environment variables
# Note that 'databridge-oracle/hostname' requires two levels to access the host value
python -c "
import citygeo_secrets as cgs
cgs.generate_env_file('keeper',
    DATABRIDGE_USER = (
        'SDE',
        'login'),
    DATABRIDGE_PASSWORD = (
        'SDE',
        'password'),
    DATABRIDGE_HOST = (
        'databridge-oracle/hostname',
        ['host', 'hostName']),
    DATABRIDGE_DBNAME = (
        'databridge-oracle/hostname',
        'database'),
    AWS_ACCESS_KEY_ID = (
        'CityGeo AWS script access key',
        'login'),
    AWS_SECRET_ACCESS_KEY = (
        'CityGeo AWS script access key',
        'password')
)
"
######
# Include the following lines in your bash script to automatically source and delete the correct file, which is produced in the same location as this bash script.
# If you do not copy the below lines, `citygeo_secrets` will output bash code that you SHOULD include instead.
# Get dirname of this script
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
ENV_VARS_FILE="$SCRIPT_DIR/citygeo_secrets_env_vars.bash"
source "$ENV_VARS_FILE"
rm "$ENV_VARS_FILE"
######
# Add additional variables to export here
export SCHEMA=$USER
# Rest of bash script and any subprocesses now have access to the above environment variables
...
Note this feature is currently implemented on Linux only, not on Windows.
cgs.get_secrets(*secret_names, build=True, search_cache=True)
Obtain secrets from mounted drive (tmpfs) and/or Keeper
While developing and debugging, it may be easier to use this manual method first. This method will not auto-update an existing secret on the mounted drive if the secret is outdated, but it will create the mounted drive and write any new secrets there if either does not exist.
Parameters:
- *secret_names: str
- Names of one or more secrets exactly as they appear in Keeper
'<secret_name1>', '<secret_name2>', ...
- build: bool = True
- If True, attempt to build mounted drive, otherwise just get secrets from Keeper or memory cache
- search_cache: bool = True
- If True, search for secret in cache. Regardless of this value, the returned secret will be written to cache
Returns:
- dict of credentials
import citygeo_secrets as cgs
# Retrieve a dictionary of credentials
# Secret names must exactly match entries in Keeper (case-sensitive)
credentials = cgs.get_secrets("db1", "db2", "db3")
# Returns a dictionary:
# credentials = {
#     "db1": {
#         "login": "login1",
#         "password": "password1"
#     },
#     "db2": {
#         "host": {
#             "hostName": "host2",
#             "port": "5432"
#         },
#         "password": "password2"
#     },
#     ...
# }
# User creates function to extract credentials and connect
postgres_conn = create_postgres_conn(credentials)
oracle_conn = create_oracle_conn(credentials)
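A user-created function such as create_postgres_conn typically starts by pulling fields out of the returned dictionary. A minimal sketch, assuming a single secret that carries login, password, host and database fields; the helper name build_postgres_dsn and the sample values are hypothetical, and the output is a libpq-style keyword/value connection string.

```python
def build_postgres_dsn(credentials: dict, secret_name: str) -> str:
    """Assemble a keyword/value connection string from one secret's fields."""
    secret = credentials[secret_name]
    host = secret["host"]  # nested dictionary holding hostName and port
    # Note: values containing spaces or quotes would need escaping before use
    return (
        f"host={host['hostName']} port={host['port']} "
        f"dbname={secret['database']} user={secret['login']} "
        f"password={secret['password']}"
    )


# Sample data in the shape returned by cgs.get_secrets()
credentials = {
    "db2": {
        "login": "login2",
        "password": "password2",
        "host": {"hostName": "host2", "port": "5432"},
        "database": "mydb",
    }
}
dsn = build_postgres_dsn(credentials, "db2")
```

The resulting string can then be handed to whichever database driver the script uses.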
You may set various configuration parameters for how citygeo_secrets runs. These remain in place as long as the parent python process is active.
import citygeo_secrets as cgs
cgs.set_config(
    keeper_dir="~/citygeo_secrets/venv_3.10",
    log_level='debug',
    verify_ssl_certs=False)
cgs.get_config() # Print out current configuration
db_conn = cgs.connect_with_secrets(func, 'my_secret')
- keeper_dir - Directory where either client-config.json or config-secret is located. This means you can place one file in your user directory on the server and use that for each new script without making a new Keeper application each time. Defaults to the current directory otherwise
- log_level - One of "debug", "info", "warn", "error". Default is "info"
- verify_ssl_certs - True | False, used by keeper_secrets_manager. Default is True.
- cgs.update_secret(secret_name, secret)
- Update a secret in Keeper and on the mounted drive (if possible), overwriting fields where possible and otherwise adding new custom fields
- Recommended only for secrets that require automated updating; manually updating secrets is preferred. Raises AssertionError if secret does not exist
- Parameters:
- secret_name: str
- Name of secret to update
- secret: dict
{'field1': 'new_value1', 'field2': 'new_value2', ...}
- Returns:
- None
- cgs.get_keeper_record(secret_name)
- Obtain a secret from Keeper; only meant to be used if automated parsing fails
- Parameters:
- secret_name: str
- Name of secret to retrieve
- Returns:
- keeper_secrets_manager_core.dto.dtos.Record
- cgs.worker.reset_mount_attributes()
- Redetermine existence and accessibility of "mounted" drive.
- Values will then be written to cgs.worker.mount_exists and cgs.worker.mount_access
- cgs.worker
- The object whose methods are utilized, depending on the operating system. Inheriting from cgs.AbstractWorker, it is a singleton object (never created more than once). It should not be accessed by most users, but it does provide information about current global variables
- The memory cache is only available within the same python process; it has not been tested in multiprocessing or multithread environments.
- WARNING: This mounted drive will only be accessible by the first sudo user who ran the application. If it is necessary to undo a mistake, then discuss with the systems engineer, but the general approach will be to unmount and (carefully) remove the added entry in /etc/fstab, and then re-run the application.
- If a user does not have sudo access, then the application will only retrieve secrets from Keeper.
- Every user will be able to create their own hidden drive location of secrets, regardless of their administrative privileges