Now that we have the web app deployed, we can see that some claims are still unprocessed.
Of course, we want to execute this processing, and it’s even better if it can be fully automated!
For that, we will use a pipeline that can either be run ad hoc or scheduled, just like the confidence check pipeline. However, in this case it won't technically be a Data Science Pipeline; it will be a raw Tekton Pipeline.
If you navigate to parasol-insurance/lab-materials/05/05-05, you can see a variety of files. This time, we will use the YAML definition of a pipeline, process_claims.yaml, to process the claims.
Here are the main files of the pipeline and what they do (a simplified sketch of how such tasks are chained together in Tekton follows the list):
- get_claims - Will connect to the database, fetch any unprocessed claims, and add them to a list that will be passed to the other tasks through a file: claims.json
- The following scripts will go through all the claims that need to be processed, use the full body of the claim text to find some important features, and push the results to the database:
  - get_location - Finds the location of the accident
  - get_accident_time - Finds the time of the accident
  - summarize_text - Makes a short summary of the text
  - get_sentiment - Gets the sentiment of the text
- detect_objects - Downloads the images of the claim and uses the served object-detection model to classify the damage in the images
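To give a feel for how a raw Tekton Pipeline wires such steps together, here is a minimal, simplified sketch. It is not the actual content of process_claims.yaml; the task and workspace names are illustrative assumptions, and the real pipeline contains more parameters and steps.

[source,yaml]
----
# Simplified sketch of a Tekton Pipeline chaining claim-processing tasks.
# Task and workspace names are assumptions for illustration only.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: process-claims
spec:
  params:
    - name: claim_id
      type: string
    - name: detection_endpoint
      type: string
  workspaces:
    - name: shared-storage          # backed by the PVC created later in this section
  tasks:
    - name: get-claims
      taskRef:
        name: get-claims            # writes claims.json into the shared workspace
      workspaces:
        - name: claims
          workspace: shared-storage
    - name: get-location
      runAfter: [get-claims]        # runs only once claims.json exists
      taskRef:
        name: get-location
      workspaces:
        - name: claims
          workspace: shared-storage
    # get-accident-time, summarize-text, get-sentiment and detect-objects
    # follow the same pattern, each reading claims.json from the workspace
----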
[NOTE]
====
In the folder, we still keep an Elyra version of the pipeline (process_claims.pipeline) for reference, but you cannot use it from VSCode, which is the environment you should still be in.
====
Before we can run the pipeline, we need to create a PVC that will be used to store intermediate files and results.
- Go to the {ocp-short} Console
- Make sure to change your view from Developer to Administrator
- Under the Administrator view, navigate to Storage → PersistentVolumeClaims
- Make sure you are in the right project (your username), then press Create PersistentVolumeClaim
- Use these settings:
  - StorageClass: ocs-storagecluster-cephfs
  - PersistentVolumeClaim name: processing-pipeline-storage
  - Access mode: Shared access (RWX)
  - Size: 1 GiB
- It should look like this:
- Then press Create
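If you prefer the command line over the console, a PVC with equivalent settings can be created with oc apply -f using a manifest roughly like the one below. This is a sketch; adjust the namespace to match your project.

[source,yaml]
----
# PersistentVolumeClaim equivalent to the console settings above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: processing-pipeline-storage
  namespace: {user}               # assumption: your project is named after your username
spec:
  storageClassName: ocs-storagecluster-cephfs
  accessModes:
    - ReadWriteMany               # "Shared access (RWX)" in the console
  resources:
    requests:
      storage: 1Gi
----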
To import the pipeline, start by downloading the process_claims.yaml file locally to your laptop. You can find it in parasol-insurance/lab-materials/05/05-05.

- In your VSCode Workbench, right-click on the file and select Download
- Save the file somewhere on your laptop
- Then go to the {rhoai} Dashboard
- Select your Data Science project
- Scroll down until you see the Pipelines section
- Click Import Pipeline
- Now upload the process_claims.yaml file, either by dragging and dropping it or by using the Upload button
- Then make sure to give your pipeline a good name, like Process Claims Pipeline
- It should look something like this afterwards:
- Click Import Pipeline and you should see it appear in the Pipelines section of your project
- Now go into the settings on the right side
- Click Create Run to create a new run of the pipeline you just added
- Use these settings:
  - Name: Process Claim Run
  - Run type: Run once immediately after creation
  - claim_id: 3
  - detection_endpoint: http://modelmesh-serving.{user}:8008
    - This is the same route to the object detection endpoint that was used earlier in the workshop.
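For reference, the run settings above map onto Tekton concepts roughly like the PipelineRun sketched below. You do not need to apply this yourself; the dashboard creates the run for you, and the pipeline and workspace names here are assumptions.

[source,yaml]
----
# Rough sketch of a PipelineRun using the same parameters; names are assumptions.
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: process-claim-run-
spec:
  pipelineRef:
    name: process-claims
  params:
    - name: claim_id
      value: "3"
    - name: detection_endpoint
      value: http://modelmesh-serving.{user}:8008
  workspaces:
    - name: shared-storage
      persistentVolumeClaim:
        claimName: processing-pipeline-storage   # the PVC created earlier
----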
- After the pipeline has finished running, you can go to the app and take a look at the claims
- You will see that claim 3 is now processed
- Click on claim 3
- Instead of just a long body, you will now see a summary, a location field, an accident time field, and a sentiment field
- You can also see that we have new image(s) with bounding boxes showing where the damage is