Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: environment variables validation #216

Draft
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

AlexanderLukin
Copy link
Contributor

Refactor environment variable validation rules. Apply consistent patterns to env validation.

The main motivation for this change was as follows:

  1. Make env validation more strict and consistent. Stick to the same validation patterns in different projects.

  2. Provide consistent fallback options for optional variables.

Here are the main changes:

  1. Previously if a user set an empty string as the variable name in the .env file like this:
VARIABLE_NAME=

not all variable initializers assigned a correct default value to the variable in this case. Now it is fixed.
For the purpose of this improvement, now all optional variable initializers always have the @Transform decorator that provides a default value for the variable.
Also for this purpose, the toBoolean function has been refactored to be more similar to the toNumber function.

  1. Make validation of boolean variables more strict.
    CAUTION: This might be a breaking change for users who use values like "yes" to initialize boolean variables.

  2. Implement the new toArrayOfUrls() function to provide a reasonable default value for variables that should have a list of URLs in their values.
    This function also fixes a bug with default values for such variables. Previously, if a user set an empty string as a value of such variables, the validation worked incorrectly.

  3. Now we validate that values in PROVIDERS_URLS and CL_API_URLS are indeed URLs.

  4. The CHAIN_ID variable now supports only a pre-defined list of testnet IDs. Add the appropriate Network enum for this purpose.
    CAUTION: Currently the Keys API doesn't officially support testnets other than Goerli, Holesky, and Mainnet. But this might be a breaking change if some users use Keys API for other testnets.

  5. More strict validation for the PORT and DB_PORT values.
    CAUTION: This might be a breaking change.

  6. Add the @IsNotEmpty decorator to the required variables.

  7. Fix confusing validation rules for the DB_PASSWORD variable. Add default value.

Refactor environment variable validation rules. Apply consistent
patterns to env validation.

The main motivation for this change was as follows:

1. Make env validation more strict and consistent. Stick to the same
validation patterns in different projects.

2. Provide consistent fallback options for optional variables.

Here are the main changes:

1. Previously if a user set an empty string as the variable name in the
.env file like this: ``` VARIABLE_NAME= ``` not all variable
initializers assigned a correct default value to the variable in this
case. Now it is fixed. For the purpose of this improvement, now all
optional variable initializers always have the `@Transform` decorator
that provides a default value for the variable. Also for this purpose,
the `toBoolean` function has been refactored to be more similar to the
`toNumber` function.

2. Make validation of boolean variables more strict. CAUTION: This might
be a breaking change for users who use values like "yes" to initialize
boolean variables.

3. Implement the new `toArrayOfUrls()` function to provide a reasonable
default value for variables that should have a list of URLs in their
values. This function also fixes a bug with default values for such
variables. Previously, if a user set an empty string as a value of such
variables, the validation worked incorrectly.

4. Now we validate that values in `PROVIDERS_URLS` and `CL_API_URLS` are
indeed URLs.

5. The `CHAIN_ID` variable now supports only a pre-defined list of
testnet IDs. Add the appropriate `Network` enum for this purpose.
CAUTION: Currently the Keys API doesn't officially support testnets
other than Goerli, Holesky, and Mainnet. But this might be a breaking
change if some users use Keys API for other testnets.

6. More strict validation for the `PORT` and `DB_PORT` values. CAUTION:
This might be a breaking change.

7. Add the `@IsNotEmpty` decorator to the required variables.

8. Fix confusing validation rules for the `DB_PASSWORD` variable. Add
default value.
@IsArray()
@ArrayMinSize(1)
@Transform(({ value }) => value.split(',').map((url) => url.replace(/\/$/, '')))
@IsUrl({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this check run after Transform ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short answer is yes. The @Transform() decorator is always executed first. Then all other validation decorators are executed.

It works this way.

  1. If the PROVIDERS_URLS is not presented in the .env file, the function in the @Transform() decorator is not called. As the default value is not assigned to the PROVIDERS_URLS variable, the @IsNotEmpty() and other decorators will return the appropriate error.

  2. If there is a value for the PROVIDERS_URLS in the .env. file (including the empty value PROVIDERS_URLS=), the function in the @Transform() decorator will be called first, and then all other validation decorators will be called.

@Transform(({ value }) => parseInt(value, 10))
CHAIN_ID!: number;
CHAIN_ID!: Network;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be named Chain instead of Network

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I agree. Chain is much better. There was also a good idea to use this constant
import { CHAINS } from '@lido-nestjs/constants';
instead of the custom enum. If you don't mind, I'll do it this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i prefer to write for kapi CHAINS enum, that will contain chains that kapi support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agree with you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it is mainnet, goerli and holesky


@IsNotEmpty()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check how it behave now without @isnotempty() check ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't set IsNotEmpty() here it will work as follows.

  1. If the DB_HOST variable is not presented in the .env file, the validation will throw an error, because undefined is not a string.

  2. If a user sets an empty string in the .env file like this: DB_HOST=, the validation error will not be thrown, because the empty string is a string, everything is OK.

I don't think that we want to allow empty strings here.

@IsString()
DB_USER!: string;

@IsOptional()
@IsString()
@IsNotEmpty()
DB_PASSWORD!: string;
DB_PASSWORD = '';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ? it was @isnotempty() , now it is opposite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, both @IsOptional() and @IsNotEmpty() rules were applied to this variable at the same time and it was pretty confusing.

My point here is that the service should allow users to set up connections to unprotected databases (DBs without passwords) if they want to do that.

But Maxim also expressed an opinion, that for the purpose of protecting the user's development infrastructure, the service should return an error if someone wants to connect it to an unprotected database. This should protect users if they forget to set the value for the DB_PASSWORD variable in the .env file.

As a compromise solution, we can log a warning if someone didn't provide a non-empty value for the DB password, but not prohibit that.

What do you think?

Copy link
Contributor

@Amuhar Amuhar Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it could break reading password from file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently setting password is required but you could set it via DB_PASSWORD or DB_PASSWORD_FILE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this aspect internally today with Kirill and Maxim. Yes, I agree that this variable should be required.
Protecting the infrastructure from oversight mistakes is more important than allowing to run the app on databases without passwords. I'll update the PR.

}

const toArrayOfUrls = (url: string | null): string[] => {
if (url == null || url === '') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it extra check ? Because all urls required, or depends on another config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check protects the app from the situation when a user sets an empty string as a value of the PROVIDERS_URLS for example. Like this: PROVIDERS_URLS=

Previously the @Transform() decorator had the following implementation:
@Transform(({ value }) => value.split(',').map((url) => url.replace(/\/$/, '')))

Assume a user sets this in the .env file: PROVIDERS_URLS=. The function in the @Transform decorator is executed first. It splits the empty string and converts it into an array. The result is the array that consists of only one element - an empty string: [""]. The next .map() function converts this array to the same result: [""]. Then validation decorators look at this value. It is an array with at least one element, everything is OK. The previous code accepted empty strings here and it was a bug. Now we have an extra @IsUrl() check that will protect the app from this case. But the check that you mentioned allows us to make sure that empty strings will never be accepted as URL values regardless of the @IsUrl() decorator. I prefer to leave this check in place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

if (!(typeof value === 'string')) {
return false;
}
const str = value.toString().toLowerCase().trim();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if it is number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the previous code gave users too much unnecessary freedom in setting boolean values. I think it's better to be more strict here and allow users to set only "true" or "false" strings as valid values of boolean variables and throw errors otherwise.

But I understand that it is a breaking change. If we know or can reasonably assume that some KAPI users should be able to set "1" or "yes" as valid values of boolean variables for a good reason, or some users already doing it this way, we should revert back to the previous behavior.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a lot of users, so i dont know
i think in guides was true/false value, so they used it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the results of today's internal discussion, it has been decided to revert back support of numerical values and special constants like "yes" and "no", but return an error if the undefined, null or empty string was provided. I'll update the PR.
@Amuhar please, let me know if you have any objections.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, but i agree that just boolean values look better

@@ -17,3 +17,9 @@ export enum LogFormat {
json = 'json',
simple = 'simple',
}

export enum Network {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it is chain

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already responded above. I agree, "Chain" is better.

Resolve review comments. The main changes are:

1. Loosen validation rules for boolean variables. Now numbers "1" and
"0" and constants "yes" and "no" are accepted as valid boolean values.

2. Make DB password value required. Remove the default empty DB password
value.

3. Rename the `Network` interface to `Chain`.
@AlexanderLukin
Copy link
Contributor Author

OK, I updated the PR to support the suggestions above. The changes are:

  1. Loosen validation rules for boolean variables. Now numbers "1" and "0" and constants "yes" and "no" are accepted as valid boolean values.

  2. Make DB password value required. Remove the default empty DB password value.

  3. Rename the Network interface to Chain.

Anna, please, take a look when you have time. It's not urgent by the way. I'm completely OK if you do this at the beginning of December.


const str = value.toString().toLowerCase().trim();

switch (str) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I did a similar solution for Ejector we came across the fact that in some environments boolean variables can be capitalized, it would be good to take this into account.

https://github.com/lidofinance/lido-ts-nanolib/pull/5/files#diff-146bc8ef351e775395fe473346533483494222a0a9770be47547b8e6619a8188

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it has been taken into account. As you can see in the quoted code in line 38, I cast the input to lowercase before making comparisons. So, all strings in all registers should be accepted (uppercase, lowercase, capitalized).

@eddort
Copy link
Member

eddort commented Dec 7, 2023

Good job!
I suggest creating a module in lido-nestjs, adding these features, and making a standardized solution for all our projects.
Usually, config validators can load fields as files (this feature was needed in Council Daemon). This is a feature I have implemented in the Council Daemon itself. If you combine all these features, you will have a very useful library that will allow you to describe configs in a unified style.

@AlexanderLukin
Copy link
Contributor Author

Yes, that's a good idea! I agree. I implemented a very similar boilerplate code for this validation also in EVM and operators widget backend. Moving this logic to some common place will allow us to reduce this repetition.

Looks like this needs a bit more internal discussion, but I like the idea. If everyone agrees with this, I'll do that and update this PR once we move this validation logic to lido-nestjs.

@AlexanderLukin AlexanderLukin marked this pull request as draft February 20, 2024 13:47
AlexanderLukin added a commit to lidofinance/lido-nestjs-modules that referenced this pull request Mar 19, 2024
In many application repositories, we have an environmental variable
validation engine. It is typical for this engine to have code that needs
to transform incoming environmental variable values to values of a
specific type. These tasks are repeatable in many projects:
lidofinance/ethereum-validators-monitoring#224
lidofinance/node-operators-widget-backend-ts#41
lidofinance/lido-keys-api#216

It makes sense to move this repeated code from many projects to some
common place.

The new "transform" module introduced in this PR collects these common
transformation functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants