Add feature to define findings manually in the model, similar to over… #180
base: master
Conversation
Hi, I think the notation is a little redundant; it could be simplified.
A function or class constructor could take care of adding the new Finding to the custom/manual findings of the element. If you prefer to have the element at the beginning of the line, maybe an alternative notation would work.
Regarding the name, I tend towards manual, but I'm not sure whether there should be a separation between manual and automatic findings.
Yes, I meant to mention that in the description; the redundancy is another reason I kept this as a draft. We haven't used methods for assignment in tm.py before, so I just went with the same approach, and obviously this doesn't work for JSON import. At minimum I was going to add logic to remove the dependency requirement. I think the required fields are up to the report format; I'm not sure we should impose this. For an internal threat, say from a custom threat library, it may just be a pointer or reference to a ticket. Instead of treating them as separate findings in the report, I was thinking I could add them to the main findings list and the findings instance variables, but add a 'source' field to the Finding (default would be 'pytm'; source could be 'manual' for these findings, or a custom value set in the model). To the earlier point, when I copy them from the custom_findings instance variable to findings, I could assign the target element to remove the redundancy.
Source could also be an author, so that you have someone you can contact?
A couple of comments on this thread:
Exactly: 'pytm' when found using the threatlib; 'manual' if the finding is defined in the model but 'source' is not set; any string (except for 'pytm') if 'source' is defined in the model.
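As a tiny sketch of that precedence (a hypothetical helper written for this discussion, not part of pytm):

```python
# Hypothetical sketch of the 'source' precedence described above;
# resolve_source is an illustrative name, not a pytm function.

def resolve_source(defined_in_model, source=None):
    """'pytm' for threatlib matches; 'manual' for model-defined
    findings without an explicit source; else the user-set string."""
    if not defined_in_model:
        return "pytm"
    return source or "manual"
```

So a model-defined finding with source="alice" would report "alice", and one with no source set would report "manual".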
Question: why should the finding(s) be defined in the model, and not with the model? |
I re-used the Finding class, so they do have a common set of fields. However, for these manually identified findings I don't believe requiring fields is for us to say. For threatlib findings, yes, I fully agree. For these, I think the organization maintaining their report template would need to decide what fields are required. Currently target is required and should be; adding description is reasonable, so there is a way to describe the finding or point to something. I'm open to making whatever fields required that we all agree to; I just don't think it's necessary, and it will lead to people putting in junk data, like I did, to satisfy the requirements of the tool.
I'm not suggesting making fields mandatory unless they make sense: summary, description, source (how the finding got into the list) certainly, and an identifier (auto-generated or manually supplied); all other fields should be optional and, if possible, not fixed. The report should not rely on fields existing; otherwise those fields should always exist and have at least some form of default (even if an empty string), or the report generator should know how to handle missing/empty fields in findings.
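As an illustration of that last point (assumed field names from this discussion, not pytm's report generator), a row builder can back-fill empty-string defaults so a template never hits a missing field:

```python
# Illustrative only: field names are assumptions from the thread above.
CORE_FIELDS = ("id", "summary", "description", "source")

def report_row(finding):
    """Build a report row that tolerates missing/empty fields."""
    row = {k: finding.get(k, "") for k in CORE_FIELDS}
    for key, value in finding.items():
        row.setdefault(key, value)  # pass optional fields through untouched
    return row
```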
…d per element finding lists, updated json test data.
…o an element in a model.
Ok, I think I incorporated all of the comments. The only thing I didn't address was @colesmj's comment about whether the findings need to be stored in the model. No, it's code, so they don't have to be; for now I like that it is part of the model. For me, teams would manage one tm file per app (or per a given scope, if a larger app), run this file through the tool, and produce output. Also, there is a link to the model since the findings are added to elements. When I take a look at JSON support I would like to separate the input (model, manual findings) from the output (all findings). Below are the comments I addressed.
The only outstanding change is to finalize the custom findings in tm.py, or decide to remove those from tm.py and just document them in the README.md.
…tm, docs and json test data.
@colesmj I also have to ask what you meant by "in" vs "with" the model.
Why is this separation helpful? Could you give an example where this separation would make a difference? I also wanted to state my preference for an extra function for manual findings.
This function allows me to just create a finding without caring about the structure of pytm.
The advantage is that the internal data model and the DSL are more decoupled. As a user I also don't have to care what kind of finding this is or how it is stored. The JSON representation currently depends on the internal data structure, and it might not be a good idea to have a user interface which is the data structure.
pytm/pytm.py
"cannot overwrite {}.{} value with {}, already set to {}".format( | ||
instance, self.__class__.__name__, value, self.data[instance] | ||
if (not isinstance(instance, Finding) | ||
or (isinstance(instance, Finding) and isinstance(self,varString) and self.data[instance] != "invalid") |
I'm against using special values like "invalid". Looks like we should not use this var* type for Findings? What exactly is the purpose of these conditions?
invalid was already there; I didn't remove it. It's set when a Finding is created without an element. I don't think that ever happened before, because the element was required, but it actually gets used now. The conditions are used to work around the 'overwrite' protection. I wanted to narrow down the specific occurrences where overwrite would be allowed; it is only for a Finding's element and target attributes.
Maybe we can avoid setting the findings twice? Can you locate the place where it's being set the second time?
Ok, cleaned things up. Instead of setting the value to 'Element("invalid")' I set it to None and avoided the double set on the Finding's element and target attributes.
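A simplified sketch of the behaviour described here (not pytm's actual varString implementation): the descriptor rejects overwrites unless the stored value is still None.

```python
class varString:
    """Simplified stand-in for pytm's descriptor; sketch only."""

    def __init__(self, default=None):
        self.default = default
        self.data = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.data.get(instance, self.default)

    def __set__(self, instance, value):
        current = self.data.get(instance)
        # unset (None) may be overwritten; anything else may not
        if current is not None and current != value:
            raise ValueError(
                "cannot overwrite value, already set to {!r}".format(current))
        self.data[instance] = value

class Finding:
    target = varString()

f = Finding()
f.target = "Web Server"   # first assignment: allowed
```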
@@ -626,6 +632,7 @@ class Finding:
    """,
    )
    cvss = varString("", required=False, doc="The CVSS score and/or vector")
    source = varString("manual", required=False, doc="The source of the Finding.")
The name source on its own is too vague. Can the doc mention that it's automatically populated with pytm for findings matched from the threat library? It'll be the second possible value next to the default value, and it would help users figure out the meaning of this.
Sure. I'll add some better text. I could remove the logic around source altogether and we just add it to every element in the threatlib.
It's fine to have that logic; it's better to have this field only in Finding than in both here and in Threat.
By writing findings directly into the model definition, changes may invalidate the findings or result in conflicting findings. Better, IMHO, to separate model definition from outcomes. But maybe we're not at that point in the pytm structure to support separation.
…nt's findings, added Finding added to the model with element as arg.
I would like to have this running in the pipeline to create the report, DFD, and findings. This way I can monitor changes to the model (or lack thereof) and keep them separate from the findings, which could update due to pytm or threatlib changes while the model remains the same.
By moving away from the manual_findings attribute to use findings I was 90% there; I just made some changes. It now supports findings added to the element's findings attribute, or ones created using a lone Finding object with the element as an arg.
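A minimal sketch of the two styles just described (assumed class shapes, not pytm's real implementation):

```python
class Element:
    def __init__(self, name):
        self.name = name
        self.findings = []

class Finding:
    def __init__(self, element=None, description=""):
        self.element = element
        self.description = description
        # style 2: a lone Finding given the element attaches itself
        if element is not None:
            element.findings.append(self)

web = Element("Web App")
# style 1: append directly to the element's findings attribute
web.findings.append(Finding(description="Weak session timeout"))
# style 2: a lone Finding object with the element as an argument
Finding(web, description="Verbose error messages")
```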
I agree. I have some JSON changes locally that I started; I might start over, but I do want to decouple the data models, definitely for JSON, though it could be done for the Python side as well. Might fit well with @izar's branch for modularizing pytm.
To avoid unexpected changes in threats, you should lock the version of pytm used or use your own threat library (a copy).
tests/output.json
@@ -801,4 +801,4 @@
    "onDuplicates": "Action.NO_ACTION",
    "threatsExcluded": [],
    "threatsFile": "pytm/threatlib/threats.json"
}
}
Please don't remove end of line at end of file. Configure your text editor not to do that automatically.
I replaced the JSON with the JSON produced during the tests. I'm editing in vi, so nothing is removed automatically on my end. If the newline is important, we could look to add it to the generated file.
…rides
The idea here is that after working with a dev team, threats may be identified which do not exist in the pytm threatlib; these can then be manually added to the model, similar to using overrides to modify a threat/finding. Alternatively, someone could use pytm only for model definition, DFD/sequence generation, and reporting using manual findings, without using any of the threatlib.
I originally called these "custom findings" and named variables accordingly. I was thinking of renaming these to "manual findings", thoughts?
I would like to update tm.py and the README.md with a real-world example that makes sense given the base tm model. I haven't spent any mental energy coming up with one that is good and doesn't already exist in the threatlib. If anyone has any thoughts, let me know.