API outline #1

Open
14 of 34 tasks
Col-E opened this issue Jul 8, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

Collaborator

Col-E commented Jul 8, 2023

API outline

Inputs

The items that can be fed into the API to populate the model (outlined below):

  • From file paths
  • From memory (byte[])

Model

The in-memory model of all inputs, used to facilitate scanning operations.

  • Source
    • Associated file path (null if from memory)
    • Classes defined in the source
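
To make the Inputs and Model sections concrete, here is a minimal sketch of how both input kinds could populate one model type. The names Source, fromPath, fromMemory, and readClasses are hypothetical and not part of any agreed API. The sketch also assumes that inputs are jar (zip) archives and that a class is identified by its archive entry name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical model type: one scanned input and the classes it defines.
final class Source {
    private final Path path;                    // null when the input came from memory
    private final Map<String, byte[]> classes;  // entry name -> raw class file bytes

    private Source(Path path, Map<String, byte[]> classes) {
        this.path = path;
        this.classes = classes;
    }

    // Input kind 1: a file path. The path is kept so reports can point back to it.
    static Source fromPath(Path path) throws IOException {
        return new Source(path, readClasses(Files.readAllBytes(path)));
    }

    // Input kind 2: raw bytes already in memory. There is no associated path.
    static Source fromMemory(byte[] jarBytes) throws IOException {
        return new Source(null, readClasses(jarBytes));
    }

    Path path() { return path; }

    Map<String, byte[]> classes() { return classes; }

    // Collects every ".class" entry of the archive, keyed by entry name.
    private static Map<String, byte[]> readClasses(byte[] jarBytes) throws IOException {
        Map<String, byte[]> found = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    found.put(entry.getName(), zip.readAllBytes());
                }
            }
        }
        return found;
    }
}
```

Keeping the raw bytes rather than parsed classes leaves the choice of parser (for example ASM for static scanning) to the scanning layer.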

Scanning

The primary logic of our library. Like most anti-malware products, we can offer different scan operations: static and dynamic. Static scanning will be classic signature matching (with ASM, because matching Java bytecode with YARA is not feasible for high-quality signatures). Dynamic scanning will be a custom "behavior" matching system backed by SSVM.

  • Common scanning
    • Intermediate scanning model types
    • Serialization support for scanning models
  • Static scanning
  • Dynamic scanning
    • Loading the input model into an SSVM instance
    • Determining where to start execution from known input formats
      • Modding frameworks
        • Forge
        • Fabric
        • Quilt
      • Server plugins
        • Bukkit
        • Spigot
    • Creating an algorithm to determine additional entry points, to maximize code coverage after the executions spawned from the known input locations complete
    • Virtually executing from the known inputs
      • Loading the input model into SSVM
      • Piping execution state into an API that we can use to create simple "behavior" matches
      • Skip execution of control flow paths already visited to save time
      • Force control flow of all paths in the model to ensure full coverage of code to be scanned
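
As an illustration of "determining where to start execution from known input formats", the sketch below guesses the format from the descriptor file each framework conventionally ships in its jar (META-INF/mods.toml for recent Forge, fabric.mod.json, quilt.mod.json, and plugin.yml for Bukkit/Spigot). The type and method names (KnownFormat, EntryPoints, detect, bukkitMainClass) are hypothetical, and the plugin.yml handling is a naive line scan rather than a real YAML parser.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

// Hypothetical enum: known input formats and the descriptor file that identifies each.
enum KnownFormat {
    FORGE("META-INF/mods.toml"),
    FABRIC("fabric.mod.json"),
    QUILT("quilt.mod.json"),
    BUKKIT_OR_SPIGOT("plugin.yml"); // Spigot plugins use the same descriptor as Bukkit

    private final String descriptor;

    KnownFormat(String descriptor) { this.descriptor = descriptor; }

    String descriptor() { return descriptor; }

    // Returns every format whose descriptor is present among the jar's entry names.
    static List<KnownFormat> detect(Collection<String> entryNames) {
        List<KnownFormat> matches = new ArrayList<>();
        for (KnownFormat format : values()) {
            if (entryNames.contains(format.descriptor)) {
                matches.add(format);
            }
        }
        return matches;
    }
}

final class EntryPoints {
    private EntryPoints() {}

    // Naive assumption: the plugin main class sits on a top-level "main:" line.
    static Optional<String> bukkitMainClass(String pluginYml) {
        for (String line : pluginYml.split("\\R")) {
            if (line.startsWith("main:")) {
                String value = line.substring("main:".length()).trim();
                value = value.replaceAll("^['\"]|['\"]$", ""); // drop optional quotes
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

A jar can carry more than one descriptor (multi-loader mods do this), which is why detect returns a list instead of a single value.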

Output / reporting

Different use cases may want different output formats. A common report model should exist with a variety of printers/exporters for different output formats.

  • Common report model
  • Text formats
    • JSON (Basic serialization of common reporting model, so the file can be deserialized back into the class type)
    • HTML (For visual reports that can have interactive components)
    • Simple text
      • With color codes for ANSI enabled consoles if the output is System.out
  • Configurable yes/no outputs
    • For platforms using the tool to filter uploaded files, we may want a system that is easy to set and forget. It would likely be config-file driven, declaring allowed rates and percentages of offending material, whitelisting and blacklisting of certain match values, etc. Once set, they can run something like java -jar app.jar --config foo.conf --input <jar>.
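
One way the yes/no output could sit on top of the common report model is sketched below. Everything here is an assumption made for illustration: the Match, Report, and VerdictPolicy types, their fields, and the rule that a blacklisted match always rejects while a whitelisted match is ignored. The sketch uses records, so it assumes Java 17 or newer.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical report types: a single signature/behavior match and the report collecting them.
record Match(String signatureId, String className) {}

record Report(int classesScanned, List<Match> matches) {}

// Hypothetical policy, holding values as they might be read from a config file.
record VerdictPolicy(double maxOffendingRatio, Set<String> whitelist, Set<String> blacklist) {

    // Returns true for "yes, accept the file" and false for "no, reject it".
    boolean accepts(Report report) {
        Set<String> offendingClasses = new HashSet<>();
        for (Match match : report.matches()) {
            if (blacklist.contains(match.signatureId())) {
                return false; // a blacklisted match rejects regardless of ratio
            }
            if (whitelist.contains(match.signatureId())) {
                continue; // a whitelisted match does not count as offending
            }
            offendingClasses.add(match.className());
        }
        if (report.classesScanned() == 0) {
            return offendingClasses.isEmpty();
        }
        double ratio = (double) offendingClasses.size() / report.classesScanned();
        return ratio <= maxOffendingRatio;
    }
}
```

Because the policy only reads the report, the same Report instance could still be handed to a JSON, HTML, or text exporter.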
Col-E added the enhancement label Jul 8, 2023
Collaborator Author

Col-E commented Jul 8, 2023

Some challenges to consider:

  • Jar-in-jar representation in the model
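
One possible representation, sketched here only to make the challenge concrete, is to flatten nested jars into a single map and encode the nesting in the key with the "!/" separator used by jar URLs. The NestedJars class, the collectClasses method, and the maxDepth guard are hypothetical. A tree of nested sources would be an equally valid choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: flattens classes from nested jars into one map.
final class NestedJars {
    private NestedJars() {}

    // Keys look like "libs/inner.jar!/a/B.class"; maxDepth bounds how deep nesting is followed.
    static Map<String, byte[]> collectClasses(byte[] jarBytes, int maxDepth) throws IOException {
        Map<String, byte[]> classes = new LinkedHashMap<>();
        collect("", jarBytes, maxDepth, classes);
        return classes;
    }

    private static void collect(String prefix, byte[] jarBytes, int depthLeft,
                                Map<String, byte[]> out) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String name = prefix + entry.getName();
                if (name.endsWith(".class")) {
                    out.put(name, zip.readAllBytes());
                } else if (name.endsWith(".jar") && depthLeft > 0) {
                    // Recurse into the nested archive, extending the key prefix.
                    collect(name + "!/", zip.readAllBytes(), depthLeft - 1, out);
                }
            }
        }
    }
}
```

The depth limit is there so that a maliciously deep chain of nested archives cannot recurse without bound.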

Col-E pinned this issue Jul 10, 2023
Col-E changed the title from "Initial outline" to "API outline" Jul 12, 2023