Implement #334: Integrate cross-project experiment into pipeline #346

Open
wants to merge 28 commits into
base: master
from

Conversation

M8is
Collaborator

@M8is M8is commented Apr 17, 2018

Implement #334: Integrate cross-project experiment into pipeline

@salsolatragus I thought you may want to check the current state. The tasks are now integrated into the pipeline and should work the same way as before (plus pipeline features like the default config). I've set them up to be called separately, but that is easy to reconfigure.

@M8is M8is self-assigned this Apr 17, 2018
@M8is M8is requested a review from salsolatragus April 17, 2018 06:45
Collaborator

@salsolatragus salsolatragus left a comment

Provide option --with-xp for ex2 and ex3. If chosen, provide example projects as training data.
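A minimal sketch of how such a flag could be exposed with argparse. The parser layout (flat `ex2`/`ex3` subcommands), the program name, and the `with_xp` destination are assumptions for illustration, not the actual MUBench parser code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, simplified parser: the real pipeline nests experiments
    # under its own run task and sets up many more arguments.
    parser = argparse.ArgumentParser(prog="mubench-sketch")
    subparsers = parser.add_subparsers(dest="experiment", required=True)
    for name in ("ex2", "ex3"):
        experiment_parser = subparsers.add_parser(name)
        experiment_parser.add_argument(
            "--with-xp",
            dest="with_xp",
            action="store_true",
            default=False,
            help="provide cross-project example projects as training data",
        )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["ex2", "--with-xp"])
    print(args.experiment, args.with_xp)
```

Because the flag uses `store_true`, omitting it leaves `with_xp` at `False`, so existing ex2 and ex3 invocations would behave as before.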


class CrossProjectPrepareTask:
    MAX_SUBTYPES_SAMPLE_SIZE = 25
    MAX_PROJECT_SAMPLE_SIZE = 50
Collaborator

Move these two constants to config parameters. Use same names.
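One way this could look, as a sketch only: the two limits become command-line/config parameters that keep the constant names (in lower case) and the current values as defaults. The option spellings, the `add_xp_config_arguments` helper, and the constructor signature are assumptions, not the project's actual code.

```python
import argparse


def add_xp_config_arguments(parser: argparse.ArgumentParser) -> None:
    # Hypothetical option names derived from the constant names; the
    # defaults mirror the values currently hard-coded in the class.
    parser.add_argument("--max-subtypes-sample-size", type=int, default=25)
    parser.add_argument("--max-project-sample-size", type=int, default=50)


class CrossProjectPrepareTask:
    def __init__(self, max_subtypes_sample_size: int, max_project_sample_size: int):
        # The limits are injected instead of being class-level constants.
        self.max_subtypes_sample_size = max_subtypes_sample_size
        self.max_project_sample_size = max_project_sample_size


def create_task(argv) -> CrossProjectPrepareTask:
    parser = argparse.ArgumentParser(prog="mubench-sketch")
    add_xp_config_arguments(parser)
    config = parser.parse_args(argv)
    return CrossProjectPrepareTask(
        config.max_subtypes_sample_size, config.max_project_sample_size
    )
```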

@@ -238,6 +249,32 @@ def __add_run_ex3_subprocess(available_detectors: List[str], available_datasets:
__setup_run_arguments(experiment_parser, available_detectors)


def __add_run_cross_project_create_index(available_datasets: List[str], subparsers) -> None:
Collaborator

Implicitly run create index with prepare.

__setup_filter_arguments(parser, available_datasets)


def __add_run_cross_project_prepare(subparsers) -> None:
Collaborator

Provide as top-level task ./mubench checkout-xp.

@salsolatragus salsolatragus changed the base branch from master-dev to master April 20, 2018 14:31
@salsolatragus
Collaborator

Try collecting the results of AsterskTasks so that the prepared xp data can be passed on to subsequent tasks. If this does not work (easily), we persist the indexes per version and load them after preparation.

@M8is
Collaborator Author

M8is commented May 4, 2018

Turns out we really didn't need anything more since we can write one index per version. Whoops.

Well, now that it's implemented we might as well keep it. It isn't that complicated and seems to be a useful feature. Essentially, we can now split something up into arbitrarily small tasks and just accumulate the results. At this point, I'm wondering if TaskRunner is Turing complete...

Anyway, we can now run ex2 and ex3 --with-xp, which prepares the cross-project examples and passes all source paths to the detector via the training sources argument.
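The idea of splitting work into small tasks and accumulating their results can be sketched roughly as follows. This is not the actual TaskRunner; `run_and_accumulate`, `prepare_xp_sources`, `collect_training_paths`, and the example paths are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")
S = TypeVar("S")


def run_and_accumulate(
    items: Iterable[T],
    small_task: Callable[[T], List[R]],
    final_task: Callable[[List[R]], S],
) -> S:
    # Run the small task once per item and collect everything it returns,
    # then hand the accumulated results to the subsequent task.
    accumulated: List[R] = []
    for item in items:
        accumulated.extend(small_task(item))
    return final_task(accumulated)


# Made-up data standing in for prepared cross-project sources per version.
EXAMPLE_XP_SOURCES: Dict[str, List[str]] = {
    "project.v1": ["xp/project-a/src", "xp/project-b/src"],
    "project.v2": ["xp/project-b/src", "xp/project-c/src"],
}


def prepare_xp_sources(version_id: str) -> List[str]:
    return EXAMPLE_XP_SOURCES.get(version_id, [])


def collect_training_paths(paths: List[str]) -> List[str]:
    # Deduplicate and sort so the detector sees each path exactly once.
    return sorted(set(paths))
```

Calling `run_and_accumulate(["project.v1", "project.v2"], prepare_xp_sources, collect_training_paths)` would then yield the combined list of training source paths for both versions.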

@M8is
Collaborator Author

M8is commented May 4, 2018

@salsolatragus I just remembered that the solution I've implemented now is exactly what you suggested. There's definitely a lot of room for improvement, but it should work for now.

Collaborator

@salsolatragus salsolatragus left a comment

Only 2 minors!

     return {
         key_detector_mode: DetectAllFindingsTask.__DETECTOR_MODE,
         key_target_src_paths: version_compile.original_sources_paths,
         key_target_classes_paths: version_compile.original_classes_paths,
-        key_dependency_classpath: version_compile.get_full_classpath()
+        key_dependency_classpath: version_compile.get_full_classpath(),
+        key_training_src_path: xp_sources_paths
Collaborator

What happens if there are no xp paths?

@@ -40,7 +47,7 @@ def _get_findings_path(self, detector: Detector, version: ProjectVersion, misuse
 def _get_detector_arguments(version_compile: VersionCompile, misuse_compile: MisuseCompile):
     return {
         key_detector_mode: DetectProvidedCorrectUsagesTask.__DETECTOR_MODE,
-        key_training_src_path: misuse_compile.correct_usage_sources_path,
+        key_training_src_path: [misuse_compile.correct_usage_sources_path],
Collaborator

Revert this.

@M8is
Collaborator Author

M8is commented May 15, 2018

This should be ready now.
Note: if you want to use this in the detectors, you'll have to catch the java.io.FileNotFoundException: no training source path provided. At least the exception message makes it pretty clear what is happening. We'd have to release another mubench.cli version to add a default.
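For illustration only, the no-xp-paths case could also be made explicit on the pipeline side by leaving out the training argument when nothing was prepared. This is a sketch under assumptions: `KEY_TRAINING_SRC_PATH`, its value, and `build_detector_arguments` are hypothetical names, and this is not what the PR implements.

```python
from typing import Dict, List, Sequence, Union

ArgumentValue = Union[str, List[str]]

# Hypothetical key name; the pipeline's real key constants are not shown here.
KEY_TRAINING_SRC_PATH = "training_src_path"


def build_detector_arguments(
    base_arguments: Dict[str, ArgumentValue],
    xp_sources_paths: Sequence[str],
) -> Dict[str, ArgumentValue]:
    # Copy so the caller's dictionary is left untouched.
    arguments = dict(base_arguments)
    # Only pass training sources when some were prepared; otherwise the
    # key is omitted and the detector has to handle the missing argument.
    if xp_sources_paths:
        arguments[KEY_TRAINING_SRC_PATH] = list(xp_sources_paths)
    return arguments
```

With this shape, a run without --with-xp would produce the same arguments as before the change, and a detector that requires training sources would still hit the exception described above.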

@salsolatragus salsolatragus self-assigned this May 29, 2018