This project demonstrates how to use Amazon Textract to extract relevant financial information from financial documents asynchronously.
There are 2 main projects in this folder:
- cdk: your IaC (Infrastructure as Code)
- textract-async-invoker: the Lambda functions (the guts of the business logic)
You'll use the CDK project to set up your AWS infrastructure. If you're not familiar with the AWS Cloud Development Kit (CDK), I recommend spending an hour or two experimenting with it, since it will save you a ton of time deploying, customizing, redeploying, testing, and debugging the provisioned infrastructure.
There are 2 Lambda functions:
- TextractAsyncInvoker
- TextractJobIdHandler
The first one submits the Textract job, and the second one receives the result of that job.
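The hand-off between the two functions can be sketched in plain Java. This is only an illustration of the asynchronous pattern, not the project's code: `FakeTextract` is a hypothetical in-memory stand-in (it is not the AWS SDK), and the document name `statement.pdf` is made up. The sketch assumes the second function is triggered by a completion notification, which the `TextractJobNotificationsStack` name suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class AsyncFlowSketch {

    /** Hypothetical in-memory stand-in for Textract; NOT the AWS SDK. */
    static class FakeTextract {
        private final Map<String, String> documentsByJobId = new HashMap<>();
        private final List<Consumer<String>> subscribers = new ArrayList<>();
        private int nextId = 1;

        /** Registers a listener for job-completion notifications. */
        void subscribe(Consumer<String> listener) {
            subscribers.add(listener);
        }

        /** Accepts the document and returns a job id right away, before any result exists. */
        String startJob(String documentKey) {
            String jobId = "job-" + nextId++;
            documentsByJobId.put(jobId, documentKey);
            return jobId;
        }

        /** Simulates the service finishing the job and notifying every listener. */
        void completeJob(String jobId) {
            for (Consumer<String> listener : subscribers) {
                listener.accept(jobId);
            }
        }

        /** Returns a placeholder result for a job. */
        String fetchResult(String jobId) {
            return "text extracted from " + documentsByJobId.get(jobId);
        }
    }

    /** Runs the two-step flow once and returns what the second step recorded. */
    static String run(String documentKey) {
        FakeTextract textract = new FakeTextract();
        StringBuilder handled = new StringBuilder();

        // Role of TextractJobIdHandler: woken by the notification, then fetches the result.
        textract.subscribe(jobId ->
                handled.append(jobId).append(": ").append(textract.fetchResult(jobId)));

        // Role of TextractAsyncInvoker: submit the job and return without waiting.
        String jobId = textract.startJob(documentKey);

        // In AWS the job would finish later, outside either function.
        textract.completeJob(jobId);
        return handled.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("statement.pdf")); // job-1: text extracted from statement.pdf
    }
}
```

The point of the split is that the submitting function never blocks on a long-running job; it only needs to hand over the document and keep the job id.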
- Change the property values under cdk/src/main/resources/application.properties
Enter your correct account number --> aws.accountnumber=XXXXXXXXX
Change any other values to the names you'd like to use. If the deployment fails, the S3 bucket name may already be taken, in which case you'll need to change it (the exceptions are quite explicit, so just read through them).
- Build the cdk project: go to the cdk folder and type `mvn package && cdk synth`. Once you do that you'll see the following line: Supply a stack id (S3UploadsBucketStack, TextractJobNotificationsStack, S3DataLakeBucketStack, DDBStack) to display its template.
- To deploy all of the resources that were just compiled, type `cdk deploy \*`. Otherwise, deploy each stack independently by typing out its name, e.g. `cdk deploy "S3UploadsBucketStack"`.
During the deployment you will be prompted multiple times to type 'y' to proceed, so you'll need to watch the installation (most of the prompts confirm that you want to create the appropriate IAM roles).
- [Optional] If you want to delete everything (some resources will not be removed, though), type `cdk destroy \*` and follow the prompts.
Voila! Your infrastructure is provisioned! Now you're ready to compile and deploy your Lambda functions.
- Change the account and region property values under textract-async-invoker/src/main/resources/application.properties
Enter your correct account number --> aws.accountnumber=XXXXXXXXX
- Build the project
./gradlew fatJar
- Deploy the Lambda functions
serverless deploy
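
Both application.properties files need a real account number in place of the XXXXXXXXX placeholder. As a minimal sketch (not part of the project), a check like the one below could catch a forgotten placeholder before you deploy. It relies on AWS account IDs being 12 digits; the class name `AccountNumberCheck` and the sample ID `123456789012` are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class AccountNumberCheck {

    /** Returns true if the value looks like a 12-digit AWS account ID. */
    static boolean isValidAccountNumber(String value) {
        return value != null && value.matches("\\d{12}");
    }

    /** Loads properties from any Reader and checks the aws.accountnumber entry. */
    static boolean check(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return isValidAccountNumber(props.getProperty("aws.accountnumber"));
    }

    public static void main(String[] args) {
        // In practice you would pass a FileReader for one of the application.properties files.
        System.out.println(check(new StringReader("aws.accountnumber=XXXXXXXXX")));    // false: placeholder left in
        System.out.println(check(new StringReader("aws.accountnumber=123456789012"))); // true: made-up 12-digit ID
    }
}
```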