Add Process Graph Extraction From Recording Using pm4py #852

Draft
wants to merge 7 commits into
base: main
from

Conversation

KrishPatel13
Collaborator

What kind of change does this PR introduce?

Summary

Checklist

  • My code follows the style guidelines of OpenAdapt
  • I have performed a self-review of my code
  • If applicable, I have added tests to prove my fix is functional/effective
  • I have linted my code locally prior to submission
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. README.md, requirements.txt)
  • New and existing unit tests pass locally with my changes

How can your code be run and tested?

Other information

@KrishPatel13 KrishPatel13 self-assigned this Jul 14, 2024
@KrishPatel13 KrishPatel13 linked an issue Jul 14, 2024 that may be closed by this pull request
@KrishPatel13 KrishPatel13 changed the title Add Process Graph Generation Add Process Graph Extraction From Recording Using pm4py Jul 14, 2024
@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 14, 2024

One thing we will need to take care of is:

image

We will need to edit the install script to handle installing Graphviz if this PR is merged into main.

Link: https://pm4py.fit.fraunhofer.de/static/assets/api/2.7.11/install.html#pip

We can open a new issue for this if the PR is merged.

@abrichr
Member

abrichr commented Jul 15, 2024

@KrishPatel13 it's not clear to me that Graphviz is necessary, see e.g. pm4py/pm4py-core#425 (comment)

@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 20, 2024

Output of commit: d7a2e6c is below:
image

The data for the above is here:

case_id;activity;timestamp;costs;resource
3;register request;2010-12-30 14:32:00+01:00;50;Pete
3;examine casually;2010-12-30 15:06:00+01:00;400;Mike
3;check ticket;2010-12-30 16:34:00+01:00;100;Ellen
3;decide;2011-01-06 09:18:00+01:00;200;Sara
3;reinitiate request;2011-01-06 12:18:00+01:00;200;Sara
3;examine thoroughly;2011-01-06 13:06:00+01:00;400;Sean
3;check ticket;2011-01-08 11:43:00+01:00;100;Pete
3;decide;2011-01-09 09:55:00+01:00;200;Sara
3;pay compensation;2011-01-15 10:45:00+01:00;200;Ellen
2;register request;2010-12-30 11:32:00+01:00;50;Mike
2;check ticket;2010-12-30 12:12:00+01:00;100;Mike
2;examine casually;2010-12-30 14:16:00+01:00;400;Sean
2;decide;2011-01-05 11:22:00+01:00;200;Sara
2;pay compensation;2011-01-08 12:05:00+01:00;200;Ellen
1;register request;2010-12-30 11:02:00+01:00;50;Pete
1;examine thoroughly;2010-12-31 10:06:00+01:00;400;Sue
1;check ticket;2011-01-05 15:12:00+01:00;100;Mike
1;decide;2011-01-06 11:18:00+01:00;200;Sara
1;reject request;2011-01-07 14:24:00+01:00;200;Pete
6;register request;2011-01-06 15:02:00+01:00;50;Mike
6;examine casually;2011-01-06 16:06:00+01:00;400;Ellen
6;check ticket;2011-01-07 16:22:00+01:00;100;Mike
6;decide;2011-01-07 16:52:00+01:00;200;Sara
6;pay compensation;2011-01-16 11:47:00+01:00;200;Mike
5;register request;2011-01-06 09:02:00+01:00;50;Ellen
5;examine casually;2011-01-07 10:16:00+01:00;400;Mike
5;check ticket;2011-01-08 11:22:00+01:00;100;Pete
5;decide;2011-01-10 13:28:00+01:00;200;Sara
5;reinitiate request;2011-01-11 16:18:00+01:00;200;Sara
5;check ticket;2011-01-14 14:33:00+01:00;100;Ellen
5;examine casually;2011-01-16 15:50:00+01:00;400;Mike
5;decide;2011-01-19 11:18:00+01:00;200;Sara
5;reinitiate request;2011-01-20 12:48:00+01:00;200;Sara
5;examine casually;2011-01-21 09:06:00+01:00;400;Sue
5;check ticket;2011-01-21 11:34:00+01:00;100;Pete
5;decide;2011-01-23 13:12:00+01:00;200;Sara
5;reject request;2011-01-24 14:56:00+01:00;200;Mike
4;register request;2011-01-06 15:02:00+01:00;50;Pete
4;check ticket;2011-01-07 12:06:00+01:00;100;Mike
4;examine thoroughly;2011-01-08 14:43:00+01:00;400;Sean
4;decide;2011-01-09 12:02:00+01:00;200;Sara
4;reject request;2011-01-12 15:44:00+01:00;200;Ellen
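
For readers unfamiliar with the term, a directly-follows graph counts how often one activity immediately follows another within the same case. The sketch below illustrates that idea in plain Python on cases 1 and 2 from the log above. It is only an illustration of the concept, not pm4py's implementation, and the function name `directly_follows` is made up for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Cases 1 and 2 from the log above: (case_id, activity, timestamp).
ROWS = [
    ("2", "register request", "2010-12-30 11:32:00+01:00"),
    ("2", "check ticket", "2010-12-30 12:12:00+01:00"),
    ("2", "examine casually", "2010-12-30 14:16:00+01:00"),
    ("2", "decide", "2011-01-05 11:22:00+01:00"),
    ("2", "pay compensation", "2011-01-08 12:05:00+01:00"),
    ("1", "register request", "2010-12-30 11:02:00+01:00"),
    ("1", "examine thoroughly", "2010-12-31 10:06:00+01:00"),
    ("1", "check ticket", "2011-01-05 15:12:00+01:00"),
    ("1", "decide", "2011-01-06 11:18:00+01:00"),
    ("1", "reject request", "2011-01-07 14:24:00+01:00"),
]


def directly_follows(rows):
    """Count activity pairs (a, b) where b immediately follows a within a case."""
    # Group events by case so pairs never span two different cases.
    traces = defaultdict(list)
    for case_id, activity, timestamp in rows:
        traces[case_id].append((datetime.fromisoformat(timestamp), activity))

    dfg, starts, ends = Counter(), Counter(), Counter()
    for events in traces.values():
        # Order each case by timestamp before pairing neighbours.
        ordered = [activity for _, activity in sorted(events)]
        starts[ordered[0]] += 1
        ends[ordered[-1]] += 1
        dfg.update(zip(ordered, ordered[1:]))
    return dfg, starts, ends


dfg, starts, ends = directly_follows(ROWS)
```

Each key of `dfg` is an edge in the graph and each value is the edge weight; `starts` and `ends` hold the activities that open and close a case.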

@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 20, 2024

The following sqlite3 commands run the query in process-query.sql (which must be in the directory where sqlite3 was started) and redirect its output to a file called dataout.csv in the current working directory.

sqlite> .headers on
sqlite> .mode csv
sqlite> .once dataout.csv
sqlite> .read process-query.sql
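
The same export could probably be scripted from Python with the standard-library sqlite3 and csv modules, which would avoid the manual shell step. This is a minimal sketch under that assumption: the helper name `export_query_to_csv` and the `demo` table are hypothetical and only stand in for the real database and query.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, sql, out_file):
    """Run `sql` and write a header row plus all result rows to `out_file`."""
    cursor = conn.execute(sql)
    writer = csv.writer(out_file)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Demo against a throwaway in-memory table (hypothetical schema, not OpenAdapt's).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (case_id INTEGER, activity TEXT)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "open"), (1, "close")])

buffer = io.StringIO()
export_query_to_csv(conn, "SELECT case_id, activity FROM demo ORDER BY rowid", buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of a string buffer, pass a file opened with `open("dataout.csv", "w", newline="")` as `out_file`.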

My process-query.sql file is this so far:

select
    r.id as case_id,
    we.title as activity,
    -- ae."timestamp" as timestamp,
    datetime(ae."timestamp", 'unixepoch', 'localtime') as "timestamp",
    COALESCE(ae."timestamp" - LAG(ae."timestamp") OVER (ORDER BY ae."timestamp"), 0) as costs,
    ae.name as resource
from recording r
inner join action_event ae
    on r."timestamp" = ae.recording_timestamp
inner join window_event we
    on r."timestamp" = we.recording_timestamp
    and we."timestamp" = ae.window_event_timestamp
where r.id = 1
order by r.id, ae."timestamp";

image
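
The costs column in the query is the time elapsed since the previous action event, with 0 for the first event. A rough Python equivalent of that COALESCE/LAG expression, for a single recording, might look like the following. The function name `elapsed_costs` is made up for illustration.

```python
def elapsed_costs(timestamps):
    """Return the gap to the previous timestamp for each event, 0 for the first.

    Mirrors COALESCE(ts - LAG(ts) OVER (ORDER BY ts), 0) for one recording.
    """
    ordered = sorted(timestamps)
    if not ordered:
        return []
    # Pair each timestamp with its predecessor and subtract.
    return [0] + [later - earlier for earlier, later in zip(ordered, ordered[1:])]


costs = elapsed_costs([100.0, 100.5, 102.0])
```

Because the query filters on a single recording, the window needs no partition. If the filter were dropped, the LAG would presumably need a PARTITION BY on the recording so that gaps are not computed across recordings.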

Once we have dataout.csv, we can read it into a pandas DataFrame and build the process graph with this code:

import pm4py
import pandas

if __name__ == "__main__":
    log = pandas.read_csv("dataout.csv", sep=",")
    log = pm4py.format_dataframe(
        log,
        case_id="case_id",
        activity_key="activity",
        timestamp_key="timestamp",
        timest_format="%Y-%m-%d %H:%M:%S",
    )

    dfg, start_activities, end_activities = pm4py.discover_dfg(log)
    pm4py.view_dfg(dfg, start_activities, end_activities, format="html")

Output:
image

This is what the data from my recording looks like: https://github.com/OpenAdaptAI/OpenAdapt/pull/852/files#diff-03b12224302b60e98d6398edf88b09df24291a794a603ef435dd31b64bea8c8cR1

@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 20, 2024

@abrichr Now that I am able to produce a process graph from our db, what should our next steps be for this PR (or issue #564)?

@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 20, 2024

I know we also want the action target, as mentioned here: #564 (comment). Could you please give a bit more detail on how you want the action target to be visualized in the process graph?

Currently, the process graph shows only the activity column, as seen here: #852 (comment).

So I will need to do more research on how to include the resource and other information in the process graph, and how to customize it to our needs.

I would love some general feedback and/or guidance for the work so far and next steps. Thank you.

@abrichr
Member

abrichr commented Jul 20, 2024

Hi Krish, the activity should include the action target, as well as any other relevant information.

@KrishPatel13
Collaborator Author

That means I can format an activity like:

" {title}~{action}~{action_target} "

and store this string in the activity column. This way, the different pieces of information would be showcased in the process graph and also indexed.
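
A small helper along these lines could build that activity string. This is only a sketch: the function name `format_activity` and the handling of missing values are assumptions, not code from the PR.

```python
def format_activity(title, action, action_target, sep="~"):
    """Join the three fields into one activity label, e.g. 'title~action~target'.

    Missing (None) fields become empty strings so the label always has three parts.
    """
    parts = (title, action, action_target)
    return sep.join("" if part is None else str(part).strip() for part in parts)


label = format_activity("Calculator", "click", "7")
```

One caveat with this design: if a window title itself contains the separator, the label can no longer be split back into its parts unambiguously, so a rarer separator or escaping may be worth considering.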

@KrishPatel13
Collaborator Author

I will give it a shot and let you know the results.

@KrishPatel13
Collaborator Author

KrishPatel13 commented Jul 21, 2024

@abrichr I have an update for this PR. I have tried to build a process graph that showcases the relevant details of each action event.

To see it, please open this output HTML page:

openadapt/process-graph/output/tmpkbsu9k18.html

Or see the recording below:

Recording.2024-07-20.224743.mp4

@KrishPatel13
Collaborator Author

One more change I need to make is:

image

For press/release events, I need to pick action_target from either key_name or key_char, whichever is not NULL.
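
In Python, that selection could be as small as the sketch below. The function name `pick_action_target` is hypothetical, and preferring key_name when both values are present is an assumption, since the comment above does not say which one should win.

```python
def pick_action_target(key_name, key_char):
    """Return whichever of key_name / key_char is not None (key_name first)."""
    # Compare against None explicitly so a legitimate value is never skipped
    # just because it is falsy.
    return key_name if key_name is not None else key_char


target = pick_action_target(None, "a")
```

The same rule could also be applied inside the SQL query itself with COALESCE(key_name, key_char).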

@abrichr
Member

abrichr commented Jul 21, 2024

Use the events.py::get_events function with process_events=True in order to get keyframe events (ignore children).

Use ActionEvent.text to get action representation.

Use openadapt/strategies/visual.py::add_active_segment_descriptions to get targets.

Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extract process models with pm4py
2 participants