Plagiarism detection: Introduce Continuos Plagiarism Control MVP #7090

Closed
wants to merge 20 commits into from

Conversation

jakubriegel
Contributor

@jakubriegel jakubriegel commented Aug 21, 2023

Checklist

General

Server

  • Important: I implemented the changes with a very good performance and prevented too many (unnecessary) database calls.
  • I followed the coding and design guidelines.
  • I added multiple integration tests (Spring) related to the features (with a high test coverage).
  • I added pre-authorization annotations according to the guidelines and checked the course groups for all new REST Calls (security).
  • I documented the Java code using JavaDoc style.

Client

  • Important: I implemented the changes with a very good performance, prevented too many (unnecessary) REST calls and made sure the UI is responsive, even with large data.
  • I followed the coding and design guidelines.
  • Following the theming guidelines, I specified colors only in the theming variable files and checked that the changes look consistent in both the light and the dark theme.
  • I added multiple integration tests (Jest) related to the features (with a high test coverage), while following the test guidelines.
  • I documented the TypeScript code using JSDoc style.
  • I added multiple screenshots/screencasts of my UI changes.
  • I translated all newly inserted strings into English and German.

Changes affecting Programming Exercises

  • I tested all changes and their related features with all corresponding user types on Test Server 1 (Atlassian Suite).
  • I tested all changes and their related features with all corresponding user types on Test Server 2 (Jenkins and Gitlab).

Motivation and Context

This PR introduces continuous plagiarism control (cpc). Plagiarism checks for exercises with cpc enabled will be executed automatically every night. The implemented logic uses the existing plagiarism mechanisms, so the plagiarism check algorithm is untouched and the results of checks started by cpc are presented in the interface familiar to existing users. This feature doesn't open plagiarism cases automatically, but they can be opened manually just like with manual plagiarism checks.

This PR completes 7 milestones of the continuous plagiarism control (cpc) implementation. More about it can be found in my thesis proposal: https://confluence.ase.in.tum.de/display/ArTEMiS/MA+Jakub+Riegel?preview=/157431627/166332701/riegel_proposal.pdf

Description

This is a joint PR for the changes introduced in #6666, #6804, #6924, #6972.

General

Changes:

  • automated daily plagiarism checks:
    • checks are triggered once a day at night
    • checks execute sequentially
    • subsequent submissions are checked automatically the next night
    • when cpc is enabled instructors can trigger plagiarism checks again manually
    • when plagiarism is detected the submission receives a result with 0 score
    • such a result is recognised by the client and the student is notified about it by a warning label
    • cpc results can be disputed after exercise due date in the same way as standard results are being disputed
  • server-stored plagiarism configuration:
    • configuration fields added to the exercise form (creation & edit)
    • existing exercises will have configurations with default values created on the fly
    • manual plagiarism checks use server stored configuration
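A minimal sketch of the nightly run described in the list above, assuming a per-exercise configuration with an enabled flag and a similarity threshold. All names here (`NightlyCpcSketch`, `Exercise`, `similarityThresholdPercent`, `runNightlyChecks`) are hypothetical and not taken from the Artemis code base; the similarity source is passed in as a function so that the existing detection logic stays untouched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch, not Artemis code: every name here is an assumption.
final class NightlyCpcSketch {

    // Minimal stand-in for an exercise and its server-stored plagiarism configuration.
    record Exercise(long id, boolean cpcEnabled, double similarityThresholdPercent) {}

    // Checks the exercises one after another and returns the ids of those
    // whose highest similarity reaches the configured threshold.
    static List<Long> runNightlyChecks(List<Exercise> exercises,
                                       ToDoubleFunction<Exercise> maxSimilarityPercent) {
        List<Long> flagged = new ArrayList<>();
        for (Exercise exercise : exercises) {
            if (!exercise.cpcEnabled()) {
                continue; // without cpc, checks are only started manually
            }
            double similarity = maxSimilarityPercent.applyAsDouble(exercise);
            if (similarity >= exercise.similarityThresholdPercent()) {
                flagged.add(exercise.id());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<Exercise> exercises = List.of(
                new Exercise(1L, true, 90.0),
                new Exercise(2L, false, 90.0));
        // Stub similarity source: pretend every exercise has a 95% match.
        System.out.println(runNightlyChecks(exercises, exercise -> 95.0));
    }
}
```

The plain loop mirrors the "checks execute sequentially" point; how the run is actually scheduled each night is left out of this sketch.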

Detailed

Warning about plagiarism detected
On the submission page and in the editor (if applicable), the student sees a gentle note '⚠️ Suspicion of plagiarism! '. This gives the student clear information that this needs to be fixed, while not imposing any negative consequences.

Score reduced to 0
A submission with detected plagiarism receives a Result with 0 score and a comment stating that this is due to detected plagiarism. Artemis treats and displays this result like any other result.

For programming exercises, submitting again generates a new test result that overrides the plagiarism result. If the new submission is also flagged as plagiarism, the next cpc run will set the score to 0 again.

For text and modeling exercises, submitting again doesn't remove the 0 score automatically. It is removed during the next cpc run.
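The difference between the exercise types can be sketched as a small state model. This is only an illustration under assumptions: `CpcResultSketch`, `Result`, `afterResubmission` and `afterCpcRun` are hypothetical names, and the real Artemis result handling is more involved.

```java
import java.util.Optional;

// Hypothetical sketch, not Artemis code: every name here is an assumption.
final class CpcResultSketch {

    enum ExerciseType { PROGRAMMING, TEXT, MODELING }

    // Simplified stand-in for a result attached to a submission.
    record Result(double score, boolean fromCpc, String comment) {}

    // The result cpc attaches when plagiarism is detected.
    static Result plagiarismResult() {
        return new Result(0.0, true, "Score set to 0 due to detected plagiarism");
    }

    // Result visible right after the student submits again, before the next cpc run.
    static Result afterResubmission(ExerciseType type, Result current, double newTestScore) {
        if (type == ExerciseType.PROGRAMMING) {
            // A new test run produces a fresh result that overrides the cpc result.
            return new Result(newTestScore, false, "Test results");
        }
        // Text and modeling: the existing result stays until cpc runs again.
        return current;
    }

    // Result after the next cpc run; empty means the cpc result was removed.
    static Optional<Result> afterCpcRun(Result current, boolean plagiarismDetected) {
        if (plagiarismDetected) {
            return Optional.of(plagiarismResult());
        }
        if (current != null && current.fromCpc()) {
            return Optional.empty();
        }
        return Optional.ofNullable(current);
    }

    public static void main(String[] args) {
        Result flagged = plagiarismResult();
        System.out.println(afterResubmission(ExerciseType.PROGRAMMING, flagged, 80.0));
        System.out.println(afterCpcRun(afterResubmission(ExerciseType.TEXT, flagged, 0.0), false));
    }
}
```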

Disputing cpc results
Disputing cpc results is possible after the deadline of the exercise. Students use the same interface as for disputing assessment results. Instructors handle those disputes the same way they handle assessment disputes.

Post submission deadline tests
Instructors can use cpc to scan submissions posted on the due date. Such scans are optional, so instructors remain free to run plagiarism checks manually after the due date (for example, when they want to assess submissions right after the deadline and not wait for cpc to trigger).
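One way to read this option is as a per-exercise decision the nightly job makes. The sketch below assumes that cpc checks an exercise up to its due date and continues afterwards only if the optional flag is set; `CpcScheduleSketch`, `CpcConfig`, `postDueDateChecksEnabled` and `shouldCheck` are hypothetical names, and the rule itself is an assumption rather than the confirmed behaviour of this PR.

```java
import java.time.ZonedDateTime;

// Hypothetical sketch, not Artemis code: names and the exact rule are assumptions.
final class CpcScheduleSketch {

    // Simplified stand-in for the cpc part of the plagiarism configuration.
    record CpcConfig(boolean cpcEnabled, boolean postDueDateChecksEnabled) {}

    // Decides whether the nightly job should check this exercise at the given time.
    static boolean shouldCheck(CpcConfig config, ZonedDateTime dueDate, ZonedDateTime now) {
        if (!config.cpcEnabled()) {
            return false; // the post-due-date option has no effect without cpc
        }
        if (dueDate == null || now.isBefore(dueDate)) {
            return true; // exercise is still open (or has no due date)
        }
        return config.postDueDateChecksEnabled();
    }

    public static void main(String[] args) {
        ZonedDateTime dueDate = ZonedDateTime.parse("2023-09-01T12:00:00Z");
        CpcConfig config = new CpcConfig(true, false);
        System.out.println(shouldCheck(config, dueDate, dueDate.minusDays(1)));
        System.out.println(shouldCheck(config, dueDate, dueDate.plusDays(1)));
    }
}
```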

Steps for Testing

  • 1 Instructor
  • 2 Students
  • 1 Course
  • for convenience use the cpc-1min branch which triggers cpc every 1 minute

General:

  1. Create programming exercise with continuous plagiarism control enabled (use Enabled checkbox on exercise creation form)
  2. Create two similar submissions for the exercise
  3. Wait 2 minutes (results need to synchronise)
  4. Verify that ⚠️ Suspicion of plagiarism warning was placed and 0 points result was posted (see screenshots at the bottom)
  5. Change one submission to something completely different
  6. Wait 2 minutes
  7. Verify that warnings and results are deleted

Disputing results:

  1. Generate submissions with plagiarism detected by the cpc (using the steps above)
  2. Change exercise due date to the past
  3. As a student go into your submission page and dispute the result
  4. As an instructor, approve or decline the dispute
  5. If approved: new scores are displayed; if not: old results are displayed

Review Progress

Performance Review

  • I (as a reviewer) confirm that the client changes (in particular related to REST calls and UI responsiveness) are implemented with a very good performance
  • I (as a reviewer) confirm that the server changes (in particular related to database calls) are implemented with a very good performance

Code Review

  • Code Review 1
  • Code Review 2

Manual Tests

  • Test 1
  • Test 2

Test Coverage

| Class/File | Line Coverage | Confirmation (assert/expect) | Notes |
|---|---|---|---|
| ContinuousPlagiarismControlService.java | 93% | | |
| PlagiarismChecksService.java | 100% | | |
| TextPlagiarismResultConverter.java | 54% | | This class unifies logic already present in Artemis. It wasn't tested before; there is an issue for it: #6966 |
| JPlagSubmissionDataExtractor.java | 86% | | - |
| programming-exercise-update.component.ts | 80% | | |
| modeling-exercise-update.component.ts | 84% | | |
| text-exercise-update.component.ts | 85% | | |

Screenshots

Exercise form

New fields (for programming, text and modelling):
image

image

The checkbox option is disabled when cpc is not selected:
image

Plagiarism page

The button for checking plagiarism has new copy with the word "now". It better communicates to the user that this will be an on-demand check, independent of continuous plagiarism control.

Initial view:
image

View with all buttons visible:
image

Programming exercise page header when cpc detects plagiarism:
image

jakubriegel and others added 8 commits August 12, 2023 18:09
…checks (#6666)

Co-authored-by: Julian Christl <[email protected]>
Co-authored-by: Maximilian Sölch <[email protected]>
Co-authored-by: Laurenz Blumentritt <[email protected]>
Co-authored-by: Markus Paulsen <[email protected]>
Co-authored-by: Paul Schwind <[email protected]>
Co-authored-by: Stephan Krusche <[email protected]>
…nd reduce their score (#6804)

Co-authored-by: Julian Christl <[email protected]>
Co-authored-by: Maximilian Sölch <[email protected]>
Co-authored-by: Laurenz Blumentritt <[email protected]>
Co-authored-by: Markus Paulsen <[email protected]>
Co-authored-by: Paul Schwind <[email protected]>
Co-authored-by: Stephan Krusche <[email protected]>
…plagiarism and reduce their score (#6924)

Co-authored-by: Julian Christl <[email protected]>
Co-authored-by: Maximilian Sölch <[email protected]>
Co-authored-by: Laurenz Blumentritt <[email protected]>
Co-authored-by: Markus Paulsen <[email protected]>
Co-authored-by: Paul Schwind <[email protected]>
Co-authored-by: Stephan Krusche <[email protected]>
Co-authored-by: Julian Christl <[email protected]>
Co-authored-by: Maximilian Sölch <[email protected]>
Co-authored-by: Laurenz Blumentritt <[email protected]>
Co-authored-by: Markus Paulsen <[email protected]>
Co-authored-by: Paul Schwind <[email protected]>
Co-authored-by: Stephan Krusche <[email protected]>
Co-authored-by: Patrick Bassner <[email protected]>
@jakubriegel jakubriegel self-assigned this Aug 21, 2023
@github-actions github-actions bot added tests server Pull requests that update Java code. (Added Automatically!) client Pull requests that update TypeScript code. (Added Automatically!) database Pull requests that update the database. (Added Automatically!). Require a CRITICAL deployment. config-change Pull requests that change the config in a way that they require a deployment via Ansible. labels Aug 21, 2023
@jakubriegel jakubriegel changed the title Plagiarism detection: Continuos Plagiarism Control MVP Plagiarism detection: Introduce Continuos Plagiarism Control MVP Aug 21, 2023
@jakubriegel jakubriegel marked this pull request as ready for review August 25, 2023 08:03
@jakubriegel jakubriegel requested a review from a team as a code owner August 25, 2023 08:03
Contributor

@tobias-lippert tobias-lippert left a comment

Tested in testing session. Everything worked as expected.

Collaborator

@MaximilianAnzinger MaximilianAnzinger left a comment

Tested in testing session. Works fine 👍

Contributor

@milljoniaer milljoniaer left a comment

Tested in testing session, works fine 👍

Contributor

@DominikRemo DominikRemo left a comment

Tested in testing session

Contributor

@laadvo laadvo left a comment

Tested during the testing session, plagiarism detection is working as expected

Contributor

@nityanandaz nityanandaz left a comment

Manual test, test server 2 (2023-08-20)

Did detect plagiarism; did update to changes to avoid plagiarism

Contributor

@lennart-keller lennart-keller left a comment

Tested in testing session

Contributor

@dearjasmina dearjasmina left a comment

Tested during the testing session, works as expected

Contributor

@MarkusPaulsen MarkusPaulsen left a comment

Code looks good + Tested on TS2

@MarkusPaulsen MarkusPaulsen added the maintainer-approved The feature maintainer has approved the PR label Aug 31, 2023
@dfuchss
Contributor

dfuchss commented Aug 31, 2023

Just to be sure ..

Suspicion of plagiarism warning

This warning will only be shown to instructors? Also the point deduction will not take place immediately, right ?

Warning is shown to the students and points are deducted immediately

Then I'm not sure whether we can use such a feature tbh.
JPlag does not say that "X is plagiarism". It calculates the similarity to enable manual reviews. As one of the JPlag maintainers, I see a high risk here.
Based on practical use of JPlag, I know that thresholds for this similarity score (when to consider something as plagiarism) highly depends on the task.
Besides the false positives and the confusion for those students, this feature enables students to forge plagiarisms by simply committing until the warning disappears.

Am I missing anything @jakubriegel ?

/cc @tsaglam

@dfuchss
Contributor

dfuchss commented Aug 31, 2023

The comment above may also be important for @ls1intum/artemis-maintainers
I think the feature should clearly state what the effects of automatic plagiarism detection might be (e.g., confusion of students)

@dfuchss
Contributor

dfuchss commented Aug 31, 2023

For completeness :) /cc @sebinside @larissaschmid

@sebinside

sebinside commented Aug 31, 2023

As a maintainer of JPlag since before its integration into Artemis, I have technical, legal, and ethical concerns.

First of all, you actively give students feedback on the quality of their potentially intentional plagiarism. Such feedback has already successfully been used to attack plagiarism detection in the past, see this publication

Second, applying plagiarism detection in this early phase can be seen as general suspicion (dt. Generalverdacht) which can quickly bring up legal problems.

Third, plagiarism detection without human intervention is highly ethically questionable, see this information. JPlag (and similarly, all other tools with the same functionality) have never been intended to be used in an automatic pipeline. Only humans are able to evaluate plagiarism.

@krusche krusche removed ready to merge maintainer-approved The feature maintainer has approved the PR labels Sep 1, 2023
@krusche
Member

krusche commented Sep 1, 2023

Before we merge this PR, we need to discuss some aspects of the process first

@github-actions

There hasn't been any activity on this pull request recently. Therefore, this pull request has been automatically marked as stale and will be closed if no further activity occurs within seven days. Thank you for your contributions.

@jakubriegel
Contributor Author

This PR remains open as a reference. It will be closed once the new version of cpc is ready for review.

@jakubriegel
Contributor Author

New version of the cpc implemented in #7302

Labels
client Pull requests that update TypeScript code. (Added Automatically!) component:Plagiarism Detection config-change Pull requests that change the config in a way that they require a deployment via Ansible. database Pull requests that update the database. (Added Automatically!). Require a CRITICAL deployment. no-stale server Pull requests that update Java code. (Added Automatically!) tests