JPlag not producing 100% similarity on identical submissions #10152

Closed
AjayvirS opened this issue Jan 14, 2025 · 1 comment
Labels
bug, component:Plagiarism Detection, plagiarism

Comments

@AjayvirS
Contributor

Describe the bug

When two users submit identical solutions to a programming exercise, plagiarism detection never reports 100% similarity. Presumably the template code is counted in the total number of tokens by which the tool divides the number of matched tokens.

Screenshots

No response

Which version of Artemis are you seeing the problem on?

9.8.3

What browsers do you see the problem on?

Firefox

Additional context

No response

Relevant log output

No response

To Reproduce

  1. Create a programming exercise with the Sort exercise template, e.g., for Java
  2. Log in as User 1 and submit the following piece of code in BubbleSort.java:
    public static boolean isPalindrome(String input) {
        if (input == null) {
            return false;
        }
        String normalized = input.replaceAll("[^a-zA-Z0-9]", "").toLowerCase();
        return normalized.equals(new StringBuilder(normalized).reverse().toString());
    }
  3. Log in as User 2 and submit the same piece of code in BubbleSort.java
  4. Log in as a tutor and navigate to Plagiarism in the exercise dashboard
  5. Run plagiarism detection
  6. Observe a similarity below 100% even though both submissions' repositories are identical

As we increase the number of identical lines, the similarity approaches 100% but never reaches it, which makes me assume that the template code is somehow factored in. I confirmed this with a Python exercise: I removed the template code and submitted only identical code with both users, which produced a 100% similarity.
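
The trend described above is consistent with a simple ratio. The following is a minimal sketch of that hypothesis only, not JPlag's actual algorithm: it assumes the template contributes a fixed number of tokens that count toward the total but never toward the matched tokens. The class, method, and parameter names are hypothetical, and the template size of 40 tokens is made up for illustration.

```java
class TemplateDilutionSketch {

    // Hypothesized similarity: identical (matched) student tokens divided by
    // all tokens, where template tokens are part of the total but not of the matches.
    public static double similarity(int identicalTokens, int templateTokens) {
        int total = identicalTokens + templateTokens;
        return total == 0 ? 0.0 : (double) identicalTokens / total;
    }

    public static void main(String[] args) {
        int templateTokens = 40; // assumed template size, for illustration only
        for (int identical : new int[] {10, 100, 1000, 10000}) {
            System.out.printf("identical=%d similarity=%.4f%n",
                    identical, similarity(identical, templateTokens));
        }
    }
}
```

Under this assumption the value grows toward 1 as the identical code grows but stays below 1 while the template is non-empty, and it is exactly 1 when the template contributes no tokens, which would match the Python experiment.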

Submissions with multiple identical lines:

similarity.mp4

Expected behavior

The similarity value should reflect the actual content similarity of the submissions.
Suggested solution approach: add an option (e.g., a checkbox) to exclude the template code from the input to JPlag.
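
As a rough sketch of what the suggested option could mean for the computed value, assuming (hypothetically) that the tool knows how many tokens stem from the template: with the option enabled, those tokens are subtracted from the denominator. The `excludeTemplate` flag and all other names here are illustrative and do not correspond to existing Artemis or JPlag settings.

```java
class ExcludeTemplateOptionSketch {

    // matchedTokens:  tokens matched between the two submissions (template not counted)
    // totalTokens:    all tokens of a submission, including template tokens
    // templateTokens: tokens that come from the exercise template
    // excludeTemplate: hypothetical checkbox value from the plagiarism settings
    public static double similarity(int matchedTokens, int totalTokens,
                                    int templateTokens, boolean excludeTemplate) {
        int denominator = excludeTemplate ? totalTokens - templateTokens : totalTokens;
        if (denominator <= 0) {
            return 0.0; // nothing left to compare
        }
        return (double) matchedTokens / denominator;
    }

    public static void main(String[] args) {
        // Made-up numbers: 60 identical student tokens on top of a 40-token template.
        System.out.println(similarity(60, 100, 40, false));
        System.out.println(similarity(60, 100, 40, true));
    }
}
```

With the flag off, identical submissions are diluted by the template; with it on, the same inputs yield a full match.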


@AjayvirS AjayvirS added the bug label Jan 14, 2025
@github-actions github-actions bot added the assessment, exercise, lecture, plagiarism, and programming labels Jan 14, 2025
@AjayvirS AjayvirS added the component:Plagiarism Detection label and removed the assessment, exercise, lecture, and programming labels Jan 14, 2025
@krusche
Member

krusche commented Jan 14, 2025

This seems to be a duplicate of #7174.
I suggest continuing the discussion in #7174.

@krusche krusche closed this as completed Jan 14, 2025