- Requirements
- Pre-Deployment
- Deployment
- Post-Deployment
- Creating a User
- Activating User Self Sign up
Before you deploy, you must have the following software installed on your device. Please install the correct version of each tool for your machine's operating system. As of October 31, 2024, this deployment has been verified with the following versions:
- AWS CLI (v2.15.43)
- AWS CDK (v2.149.0)
- npm (v10.7.0)
- node (v20.12.2)
- Docker Desktop (v26.0.0)
- git (v2.45.0.windows.1)
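As a quick sanity check before continuing, the sketch below reports which of the tools listed above cannot be found on your PATH. The function name `missing_tools` is made up for this example, and the check only confirms presence, not the version; each tool's own `--version` flag can be used for that.

```bash
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools required by this guide; no output means all of them were found.
missing_tools aws cdk npm node docker git
```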
Once you have downloaded Docker Desktop, launch it and set up the application. Once the application is set up, leave the engine running.
If you are on a Windows device, it is recommended to install the Windows Subsystem For Linux, which lets you run a Linux terminal on your Windows computer natively. Some of the steps will require its use. Windows Terminal is also recommended for using WSL.
To deploy this solution, you will need to generate a GitHub personal access token. Please visit here for detailed instructions on creating a personal access token.
Note: when selecting the scopes to grant the token (step 8 of the instructions), make sure you select the repo scope.
Once you create a token, please note down its value as you will use it later in the deployment process.
The application interacts with various third-party APIs, so it is vital to obtain the following API keys/tokens and store them securely before moving forward with the deployment.
- Elsevier API Key and Institution Token obtainable via the Elsevier Developer Portal.
- European Patent Office's OPS API Key obtainable by following the instructions on the Open Patent Services portal.
First, you need to fork the repository. To create a fork, navigate to the main branch of this repository. Then, in the top-right corner, click Fork.
You will be directed to a page where you can customize the owner, repository name, etc., but you do not have to change any option. Simply click Create fork in the bottom right corner.
Now let's clone the GitHub repository onto your machine. To do this:
- Create a folder on your computer to contain the project code.
- On an Apple computer, open Terminal. On a Windows machine, open Command Prompt or Windows Terminal. Enter the folder you made using the command cd path/to/folder. To find the path to a folder on a Mac, right-click the folder and press Get Info, then select the whole text found under Where: and copy it with ⌘C. On Windows (not WSL), open the folder in File Explorer and click the path box (located to the left of the search bar), then copy the whole text that shows up.
- Clone the GitHub repository by entering the following command. Be sure to replace <YOUR-GITHUB-USERNAME> with your own username.
git clone https://github.com/<YOUR-GITHUB-USERNAME>/FacultyCV
The code should now be in the folder you created. Navigate into the root folder containing the entire codebase by running the command:
cd FacultyCV
Go into the cdk folder which can be done with the following command:
cd cdk
Now that you are in the cdk directory, install the core dependencies with the following command:
npm install
Go into the frontend folder which can be done with the following command:
cd ../frontend
Now that you are in the frontend directory, install the core dependencies with the following command:
npm install
When deploying the solution, you must supply the GitHub personal access token you created earlier. Run the following command, replacing <YOUR-GITHUB-TOKEN> and <YOUR-PROFILE-NAME> with your actual GitHub token and the appropriate AWS profile name.
aws secretsmanager create-secret \
--name github-access-token-facultyCV \
--secret-string "{\"github-token\":\"<YOUR-GITHUB-TOKEN>\"}" \
--profile <YOUR-PROFILE-NAME>
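The escaped quotes in the secret string are easy to mistype. As a minimal sketch, the JSON can be built by a small helper instead. The function name `github_secret_json` and the `GITHUB_TOKEN` variable are hypothetical, and the helper assumes the token contains no double quotes or backslashes.

```bash
# Build the JSON secret string for a given token without hand-written escapes.
# Assumes the token contains no double quotes or backslashes.
github_secret_json() {
  printf '{"github-token":"%s"}' "$1"
}

# Example with a placeholder token (not a real credential):
github_secret_json "ghp_exampletoken"   # prints {"github-token":"ghp_exampletoken"}

# The result could then be passed to the command above as:
#   --secret-string "$(github_secret_json "$GITHUB_TOKEN")"
```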
You also need to upload your GitHub username to Amazon SSM Parameter Store. You can do so by running the following command. Make sure you replace <YOUR-GITHUB-USERNAME> and <YOUR-PROFILE-NAME> with your actual username and the appropriate AWS profile name.
aws ssm put-parameter \
--name "facultycv-owner-name" \
--value "<YOUR-GITHUB-USERNAME>" \
--type String \
--profile <YOUR-PROFILE-NAME>
It's time to set up everything that goes on behind the scenes! For more information on how the backend works, feel free to refer to the Architecture Deep Dive, but an understanding of the backend is not necessary for deployment.
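Navigate to the cdk directory in the repository. From the repository root, use the following command (if you are still in the frontend folder from the earlier step, use cd ../cdk instead).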
cd cdk
While in the cdk folder, run the following commands. Ensure you replace "INSTITUTION_TOKEN" in the first command with your own Elsevier institution token and "API_KEY" in the second command with your own Elsevier API key.
aws ssm put-parameter --name "/service/elsevier/api/user_name/instoken" --value "INSTITUTION_TOKEN" --type SecureString --overwrite --profile <YOUR-PROFILE-NAME>
aws ssm put-parameter --name "/service/elsevier/api/user_name/key" --value "API_KEY" --type SecureString --overwrite --profile <YOUR-PROFILE-NAME>
To increase security, you must also supply a custom database username when deploying the solution. Run the following command, replacing DB-USERNAME with the custom name of your choice.
aws secretsmanager create-secret \
--name facultyCV-dbUsername \
--description "Custom username for PostgreSQL database" \
--secret-string "{\"username\":\"DB-USERNAME\"}" \
--profile <YOUR-PROFILE-NAME>
For example, to set the database username to "facultyCV":
aws secretsmanager create-secret \
--name facultyCV-dbUsername \
--description "Custom username for PostgreSQL database" \
--secret-string "{\"username\":\"facultyCV\"}" \
--profile <YOUR-PROFILE-NAME>
As with the Elsevier API, you must obtain a consumer key and consumer secret key to be able to use the OPSv3.2 API. Store the secrets in Secrets Manager with the following command, replacing CONSUMER_KEY and CONSUMER_SECRET_KEY with the appropriate values:
aws secretsmanager create-secret \
--name "facultyCV/credentials/opsApi" \
--description "API keys for OPS" \
--secret-string "{\"consumer_key\":\"CONSUMER_KEY\",\"consumer_secret_key\":\"CONSUMER_SECRET_KEY\"}" \
--profile <YOUR-PROFILE-NAME>
The following instructions apply only if you want to deploy this application in a hybrid cloud environment. If you do not, you can skip to 3b: CDK Deployment.
In order to deploy in a hybrid cloud environment, you will need to have access to the aws-controltower-VPC and the name of your AWSControlTowerStackSet.
- Modify the VPC Stack:
  - Navigate to the vpc-stack.ts file located at cdk/lib/vpc-stack.ts.
  - Replace line 13 with your existing VPC ID:
    const existingVpcId: string = 'your-vpc-id'; //CHANGE IF DEPLOYING WITH EXISTING VPC
    You can find your VPC ID by navigating to the VPC dashboard in the AWS Management Console and locating the VPC in the Your VPCs section.
- Update the AWS Control Tower Stack Set:
  - Replace line 21 with your AWS Control Tower Stack Set name:
    const AWSControlTowerStackSet = "your-stackset-name"; //CHANGE TO YOUR CONTROL TOWER STACK SET
    You can find this name by navigating to the CloudFormation dashboard in AWS, under Stacks. Look for a stack name that starts with StackSet-AWSControlTowerBP-VPC-ACCOUNT-FACTORY.
You can then proceed with the rest of the deployment instructions, and the Vpc Stack will automatically use your existing VPC instead of creating a new one. For more detailed information about the hybrid cloud deployment, see the Hybrid Cloud Deployment Guide.
Initialize the CDK stacks, replacing <YOUR_AWS_ACCOUNT_ID>, <YOUR_ACCOUNT_REGION>, and <YOUR-PROFILE-NAME> with the appropriate values. NOTE: Remember to have your Docker daemon running.
cdk bootstrap aws://<YOUR_AWS_ACCOUNT_ID>/<YOUR_ACCOUNT_REGION> --profile <YOUR-PROFILE-NAME>
cdk synth --profile <YOUR-PROFILE-NAME>
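A mistyped account ID is a common cause of a failed bootstrap. The sketch below builds the aws://ACCOUNT/REGION target and rejects anything that is not a 12-digit account ID. The function name `bootstrap_target` and the region shown are only examples. If you do not know your account ID, `aws sts get-caller-identity --query Account --output text --profile <YOUR-PROFILE-NAME>` prints it.

```bash
# Print the cdk bootstrap target for an account ID and region.
# Returns non-zero if the account ID is not exactly 12 digits.
bootstrap_target() {
  local account_id="$1" region="$2"
  case "$account_id" in
    ""|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "${#account_id}" -eq 12 ] || return 1
  printf 'aws://%s/%s\n' "$account_id" "$region"
}

bootstrap_target 123456789012 ca-central-1   # prints aws://123456789012/ca-central-1
```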
Deploy the CDK stacks:
Note for deploying the PatentDataStack: You must find out how the name of your institution appears on Espacenet, either by checking Espacenet yourself or by working with a representative from the European Patent Office. We highly recommend working with a patent specialist to determine the exact name that represents your institution on Espacenet/EPO.
For example, it was determined that the EPO/Espacenet use "UNIV BRITISH COLUMBIA" to represent UNIVERSITY OF BRITISH COLUMBIA.
Thus you should do the following if you would like to deploy only the Patent Data Stack:
cdk deploy <PREFIX>-PatentDataStack --parameters <PREFIX>-PatentDataStack:epoInstitutionName="UNIV BRITISH COLUMBIA,UNIVERSITY OF BRITISH COLUMBIA" --profile <YOUR-PROFILE-NAME> --context prefix=<PREFIX>
Note that the two names in "UNIV BRITISH COLUMBIA,UNIVERSITY OF BRITISH COLUMBIA" are separated by a comma, with no space before or after the comma. <PREFIX> should be replaced with a suitable all-lowercase string that is prepended to the names of all the project resources.
You may run the following command to deploy the stacks all at once. Again, replace <YOUR-INSTITUTION-NAME> with the name that represents your institution on Espacenet/EPO, <YOUR-PROFILE-NAME> with the appropriate AWS profile used earlier, and <PREFIX> with a suitable all-lowercase string that is prepended to the names of all the project resources.
cdk deploy --all --parameters <PREFIX>-PatentDataStack:epoInstitutionName="<YOUR-INSTITUTION-NAME>" --profile <YOUR-PROFILE-NAME> --context prefix=<PREFIX>
- Example:
cdk deploy --all --parameters facultycv-PatentDataStack:epoInstitutionName="UNIV BRITISH COLUMBIA,UNIVERSITY OF BRITISH COLUMBIA" --profile <YOUR-PROFILE-NAME> --context prefix=facultycv
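The two formatting rules above (an all-lowercase prefix, and institution names joined by commas without spaces) can be checked mechanically. The sketch below only prints the deploy command and does not run it. The function name `build_deploy_command` and the profile name `my-profile` are placeholders for this example.

```bash
# Print a cdk deploy command for the given prefix, profile, and one or more
# institution names. Names are joined with commas and no surrounding spaces.
build_deploy_command() {
  local prefix="$1" profile="$2" names
  shift 2
  # Reject an empty prefix or one that contains uppercase letters.
  if [ -z "$prefix" ] || [ "$prefix" != "$(printf '%s' "$prefix" | tr '[:upper:]' '[:lower:]')" ]; then
    echo "prefix must be a non-empty lowercase string: $prefix" >&2
    return 1
  fi
  # "$*" joins the remaining arguments with the first character of IFS.
  names=$(IFS=,; printf '%s' "$*")
  printf 'cdk deploy --all --parameters %s-PatentDataStack:epoInstitutionName="%s" --profile %s --context prefix=%s\n' \
    "$prefix" "$names" "$profile" "$prefix"
}

build_deploy_command facultycv my-profile "UNIV BRITISH COLUMBIA" "UNIVERSITY OF BRITISH COLUMBIA"
```

Review the printed command before running it yourself.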
To take down the deployed stacks, navigate to AWS CloudFormation, click on the stack(s), and hit Delete. Please delete the stacks in the reverse of the order in which they were deployed. The deletion order is as follows:
- PatentDataStack
- GrantDataStack
- DataFetchStack
- DbFetchStack
- AmplifyStack
- Resolver3Stack
- Resolver2Stack
- ResolverStack
- ApiStack
- CVGenStack
- DatabaseStack
- VpcStack
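If you prefer the command line to the console, the same order can be followed with the AWS CLI. The sketch below only prints the commands so they can be reviewed first. It assumes every stack is named <PREFIX>-<StackName>, which this guide only shows for the PatentDataStack, so confirm the real names in CloudFormation. The function name and the profile name are placeholders.

```bash
# Stacks in the deletion order listed above (reverse of deployment order).
DELETION_ORDER="PatentDataStack GrantDataStack DataFetchStack DbFetchStack AmplifyStack Resolver3Stack Resolver2Stack ResolverStack ApiStack CVGenStack DatabaseStack VpcStack"

# Print the delete and wait commands for each stack without running them.
# Assumption: each stack is named <PREFIX>-<StackName>.
print_delete_commands() {
  local prefix="$1" profile="$2" stack
  for stack in $DELETION_ORDER; do
    printf 'aws cloudformation delete-stack --stack-name %s-%s --profile %s\n' "$prefix" "$stack" "$profile"
    printf 'aws cloudformation wait stack-delete-complete --stack-name %s-%s --profile %s\n' "$prefix" "$stack" "$profile"
  done
}

print_delete_commands facultycv my-profile
```

Waiting for each deletion to complete before starting the next keeps the order intact.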
- First, ensure you have all the CSV files that need to go into the S3 bucket. Examples of how these files should be formatted can be found here: Example CSV Files.
- In the AWS console, enter S3 in the search bar.
- In the Buckets search bar, enter user-data and click on the name of the bucket (the actual name will vary a bit).
- In this bucket, make a new folder by clicking the create folder button.
- Name this folder user_data and click save.
- Click on the user_data folder, then click Upload.
- Click Add Files and select the institution_data.csv, university_info.csv, data_sections.csv, and teaching_data.csv files. The file names must match these exactly or the S3 trigger will not work. Click Upload to complete this process.
- Once the upload is complete, click Close.
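Because the S3 trigger depends on exact file names, it can help to check a local folder before uploading. The following sketch lists any of the four required files that are missing from a directory. The function name `missing_user_data_files` is made up for this example.

```bash
# Print each required file name that is missing from the given directory.
missing_user_data_files() {
  local dir="$1" name
  for name in institution_data.csv university_info.csv data_sections.csv teaching_data.csv; do
    [ -f "$dir/$name" ] || printf '%s\n' "$name"
  done
}

# Check the current directory; no output means all four files are present.
missing_user_data_files .
```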
- Refer to the User Guide to Grant Downloads for instructions on how to obtain the grant data for your institution. After obtaining the CSV files, compare them against the sample CSV files labeled sample_cihr.csv, sample_nserc.csv, sample_cfi.csv, sample_sshrc.csv, sshrc_program_codes.csv, and sample_rise.csv. Ensure that the format is similar to the sample files.
- In the AWS console, enter S3 in the search bar. There are two buckets whose names contain grantdatastack. One contains the scripts for the Glue jobs that clean and store the grant/patent data. The bucket you must upload your CSV files to is the grantdatastack bucket without glues in the name.
- A folder called raw is created for you at deployment, and it contains 5 subfolders (cihr, cfi, nserc, sshrc, rise). Inside each subfolder, upload the corresponding CSV file for that grant. For SSHRC, also include the sshrc_program_codes.csv file along with the SSHRC grant data CSV file. The resulting folder structure should look like this:
raw/
├── cihr/
│   └── your-cihr-file.csv
├── nserc/
│   └── your-nserc-file.csv
├── sshrc/
│   ├── your-sshrc-file.csv
│   └── sshrc_program_codes.csv
├── cfi/
│   └── your-cfi-file.csv
└── rise/
    └── your-rise-file.csv
NOTE:
- If you find a mistake in the upload, such as files placed in the wrong folders or extra files uploaded accidentally, delete the wrong files, wait 20 minutes, and redo the upload.
- In the extremely unlikely situation that the raw folder and its 5 subfolders were not created automatically during first-time deployment, you can manually create the raw folder first, then the 5 subfolders inside it.
- If the upload was performed correctly, the Grant Data Pipeline is invoked automatically and the new data shows up in the RDS PostgreSQL database after around 20 minutes.
- After around 20 minutes, navigate to the S3 bucket where you uploaded the grant data earlier. If you still have that page open, simply refresh it. If the Grant Data Pipeline has executed successfully, you should see another folder called clean in addition to your raw folder.
- This new folder should have a subfolder structure similar to raw. You do not have to do anything further.
- If a folder is missing, wait another 10 minutes or so, because this could be a latency issue. If the folder still has not shown up when you check again, a wrong file may have been uploaded to the raw folder. Double-check your raw folder and follow the instructions above to re-upload accordingly.
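Since a wrong file costs a 20-minute wait, one option is to stage the files locally in the same raw/ layout and check it before uploading. The sketch below is one way to do that. The function name `check_raw_layout` and the local ./raw path are assumptions of this example, not part of the project.

```bash
# Report problems in a local copy of the raw/ folder before upload.
# Each of the five subfolders must hold at least one .csv file, and sshrc/
# must contain sshrc_program_codes.csv. This sketch does not check that a
# separate SSHRC grant file is present next to the program codes file.
check_raw_layout() {
  local raw="$1" sub status=0
  for sub in cihr cfi nserc sshrc rise; do
    if ! ls "$raw/$sub"/*.csv >/dev/null 2>&1; then
      echo "missing CSV in $sub/"
      status=1
    fi
  done
  if [ ! -f "$raw/sshrc/sshrc_program_codes.csv" ]; then
    echo "missing sshrc/sshrc_program_codes.csv"
    status=1
  fi
  return "$status"
}

check_raw_layout ./raw || echo "fix the layout before uploading"
```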
- Search for a Glue job that contains the string fetchEpoPatents.
- No further action is needed. This will execute the entire pipeline, which takes roughly 15 minutes.
NOTE: You have to run the steps above for first-time deployment. The Patent Data Pipeline is then scheduled to run twice a month, on the 1st and 15th.
Log in to the AWS console and navigate to AWS Amplify. You can do so by typing Amplify in the search bar at the top.
From All apps, click faculty-cv-amplify. The first time you enter this console, you will need to follow a series of straightforward instructions to configure your GitHub app and give Amplify permission to modify your repo.
After this, go back to All apps and click faculty-cv-amplify to go to the app settings. Note down the App ID.
You may run the following command to build the app. Please replace <APP-ID> with the app ID found in Amplify and <PROFILE-NAME> with the appropriate AWS profile used earlier.
aws amplify start-job --job-type RELEASE --app-id <APP-ID> --branch-name main --profile <PROFILE-NAME>
This will trigger the build.
When the build is completed, you will see the screen shown in the image below.
Please note down the URL highlighted in red, as this will be the URL of the web application.
Click on Hosting in the left taskbar and click on Rewrites and redirects.
Here, click on Manage redirects and then Add Rewrite to add a redirect with:
- Source Address: </^[^.]+$|\.(?!(css|gif|ico|jpg|js|png|txt|svg|woff|woff2|ttf|map|json|webp)$)([^.]+$)/>
- Target Address: /
- Type: 404 (Redirect)
Then click Save.
Refer to AWS's Page on Single Page Apps for further information on why we did that.
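To make the source pattern easier to read, the sketch below restates its decision in plain shell: a path matches when it contains no dot at all, or when the text after its last dot is not one of the listed static-asset extensions. This is an approximation written for illustration, not Amplify's own matcher, and the function name `would_rewrite` is hypothetical.

```bash
# Succeed (exit 0) when the rule's pattern would match the request path.
would_rewrite() {
  local path="$1" ext
  case "$path" in
    "") return 1 ;;   # the pattern needs at least one character
    *.*) ;;           # contains a dot: continue to the extension check
    *) return 0 ;;    # no dot anywhere: first alternative of the pattern
  esac
  ext="${path##*.}"   # text after the last dot
  case "$ext" in
    ""|css|gif|ico|jpg|js|png|txt|svg|woff|woff2|ttf|map|json|webp) return 1 ;;
    *) return 0 ;;
  esac
}

for p in /dashboard /static/app.min.js /report.pdf; do
  if would_rewrite "$p"; then echo "$p: matched by the rule"; else echo "$p: not matched"; fi
done
```

Note that a path such as /report.pdf is matched, because pdf is not in the extension list.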
Now you can navigate to the URL you created in step 1 to see your application in action.
To set up a user account on the app, you will need to do the following steps:
- In the AWS console, enter Cognito in the search bar.
- Click User Pools in the left-hand sidebar and select the user pool faculty-cv-user-pool.
- Click the Users tab, then click Create User.
- For Invitation message, select Send an email invitation. Then fill in the user's email address in the Email address text field below and select Mark email address as verified. For Temporary password, select Generate a password. Then click Create User.
- The user will receive an email containing their temporary password at the address that was entered.
- When the user enters their email and temporary password on the sign-in page of the app, they will be prompted to replace the temporary password with a new password and to choose the role they want for the account.
- The new user account has been created!
Note: The first Admin must be created manually through Cognito. After that, if user self sign up is enabled, the Admin can make any user an Admin or Department Admin through the application once that user has signed up. Department Admins can also make other users in their department a Department Admin.
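The console steps for creating a user have a rough AWS CLI counterpart, sketched below. The helper only prints the command so it can be reviewed first. It assumes the pool uses the email address as the username, and the pool ID, email, and profile shown are placeholders. When no temporary password is supplied, Cognito generates one.

```bash
# Print (without running) an admin-create-user command that mirrors the
# console steps: email invitation, email marked verified, generated password.
# Assumption: the user pool uses the email address as the username.
print_create_user_command() {
  local pool_id="$1" email="$2" profile="$3"
  printf 'aws cognito-idp admin-create-user --user-pool-id %s --username %s --user-attributes Name=email,Value=%s Name=email_verified,Value=true --desired-delivery-mediums EMAIL --profile %s\n' \
    "$pool_id" "$email" "$email" "$profile"
}

print_create_user_command ca-central-1_EXAMPLE user@example.com my-profile
```

The user pool ID is shown on the pool's overview page in the Cognito console.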
- Navigate back to the same user pool from the previous step in the Cognito console and click Sign-up experience.
- To activate self sign up, the Self-registration option must be enabled. If it is not, click the Edit button and enable the feature. This allows users to create their own accounts.