Refactored data clumps with the help of LLMs (research project) #9352
base: master
Yay, your first pull request towards Jenkins core was created successfully! Thank you so much!
What is common to all classes where you added the ProcessProperties is that they inherit from UnixProcess. So instead of adding a new class just for the properties, wouldn't it be better to define these fields in UnixProcess?
private int ppid = -1;
private EnvVars envVars;
private List<String> arguments;
private ProcessProperties properties = new ProcessProperties(-1, null, null);
Just wondering: in all other places the ProcessProperties field is declared transient, so why not here?
That seems to be an oversight on my part. SpotBugs complained that I should add transient everywhere, and when it stopped complaining I didn't look any further. Strange :)
It's unclear to me why the fields are being made transient. If it's SE_BAD_FIELD, wouldn't making ProcessProperties Serializable address this without potentially causing serialization trouble?
(FWIW removing the transient doesn't fail Spotbugs for me locally.)
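To make the trade-off in the comment above concrete, here is a minimal sketch of the two options. It is not the Jenkins code: Props, TransientHolder, SerializableHolder and SerializationDemo are hypothetical stand-ins, and only a ppid value is modelled. It assumes the enclosing class is itself Serializable, which is what makes the field's serializability matter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for ProcessProperties; only ppid is modelled.
class Props implements Serializable {
    private static final long serialVersionUID = 1L;
    final int ppid;

    Props(int ppid) {
        this.ppid = ppid;
    }
}

// Option 1: mark the field transient, so it is skipped when the holder is written.
class TransientHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    transient Props properties = new Props(42);
}

// Option 2: rely on Props being Serializable, so the value is written with the holder.
class SerializableHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    Props properties = new Props(42);
}

class SerializationDemo {
    // Serialize a value into a byte array and read it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Field initializers do not run on deserialization, so the transient field comes back null.
        System.out.println(roundTrip(new TransientHolder()).properties == null);
        // The Serializable field keeps its value.
        System.out.println(roundTrip(new SerializableHolder()).properties.ppid);
    }
}
```

In this sketch the transient variant silently loses the value after a round trip, which is the kind of "serialization trouble" the comment appears to be warning about.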
Thank you for the feedback. In your particular case, that might be a better solution, but the LLM chooses the approach that always works. I agree that pulling up those fields can also be a way to resolve data clumps :)
Overall #9352 (review) seems preferable to a new type. The following should do it; all we'd lose is the finality of ppid, no different from this proposal.
diff --git a/core/src/main/java/hudson/util/ProcessTree.java b/core/src/main/java/hudson/util/ProcessTree.java
index 8fbb80c8a8..80155d3d37 100644
--- a/core/src/main/java/hudson/util/ProcessTree.java
+++ b/core/src/main/java/hudson/util/ProcessTree.java
@@ -796,6 +796,10 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
* A process.
*/
public abstract class UnixProcess extends OSProcess {
+ protected int ppid = -1;
+ protected EnvVars envVars;
+ protected List<String> arguments;
+
protected UnixProcess(int pid) {
super(pid);
}
@@ -877,9 +881,6 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
}
class LinuxProcess extends UnixProcess {
- private int ppid = -1;
- private EnvVars envVars;
- private List<String> arguments;
LinuxProcess(int pid) throws IOException {
super(pid);
@@ -1001,13 +1002,9 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
*/
private final boolean b64;
- private final int ppid;
-
private final long pr_envp;
private final long pr_argp;
private final int argc;
- private EnvVars envVars;
- private List<String> arguments;
private AIXProcess(int pid) throws IOException {
super(pid);
@@ -1327,7 +1324,6 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
*/
private final boolean b64;
- private final int ppid;
/**
* Address of the environment vector.
*/
@@ -1337,8 +1333,6 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
*/
private final long argp;
private final int argc;
- private EnvVars envVars;
- private List<String> arguments;
private SolarisProcess(int pid) throws IOException {
super(pid);
@@ -1596,9 +1590,6 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
}
private class DarwinProcess extends UnixProcess {
- private final int ppid;
- private EnvVars envVars;
- private List<String> arguments;
DarwinProcess(int pid, int ppid) {
super(pid);
@@ -1881,10 +1872,6 @@ public abstract class ProcessTree implements Iterable<OSProcess>, IProcessTree,
private class FreeBSDProcess extends UnixProcess {
- private final int ppid;
- private EnvVars envVars;
- private List<String> arguments;
-
FreeBSDProcess(int pid, int ppid) {
super(pid);
this.ppid = ppid;
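As a compact illustration of what the diff above amounts to, the following sketch pulls the three shared fields up into a common base class. The names BaseProcess, LinuxLikeProcess and BsdLikeProcess are hypothetical simplifications, and Map<String, String> stands in for EnvVars; the real classes are nested inside hudson.util.ProcessTree.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for UnixProcess.
abstract class BaseProcess {
    final int pid;

    // Pulled-up fields: declared once and visible to every subclass.
    protected int ppid = -1;
    protected Map<String, String> envVars;
    protected List<String> arguments;

    protected BaseProcess(int pid) {
        this.pid = pid;
    }

    int getParentPid() {
        return ppid;
    }
}

// A subclass that does not receive ppid in its constructor keeps the default in this sketch.
class LinuxLikeProcess extends BaseProcess {
    LinuxLikeProcess(int pid) {
        super(pid);
    }
}

// A subclass that receives ppid up front assigns the inherited field.
class BsdLikeProcess extends BaseProcess {
    BsdLikeProcess(int pid, int ppid) {
        super(pid);
        this.ppid = ppid; // legal only because the shared field is not final
    }
}
```

The assignment in BsdLikeProcess shows why the shared field cannot stay final once it carries a default initializer in the base class, which matches the loss of finality mentioned in the comment.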
import hudson.EnvVars;
import java.util.List;

public class ProcessProperties {
Why is this a public class?
@@ -0,0 +1,16 @@
package hudson.util;
Please add a license header.
Thank you very much for the feedback. @daniel-beck, you are correct that your proposal is better. I haven't previously encountered this corner case, where fields are shared across derived classes, so it is interesting that the LLM did not spot this. I can update this PR to use your "pulling fields up" proposal when I find time :)
adc07ee to 28e413e
Hello maintainers,
I am conducting a master's thesis project focused on enhancing code quality through automated refactoring of data clumps, assisted by Large Language Models (LLMs).
Data clump definition
A data clump exists if
See also the following UML diagram as an example
I believe these refactorings can contribute to the project by reducing complexity and enhancing the readability of your source code.
Pursuant to the EU AI Act, I fully disclose the use of LLMs in generating these refactorings, emphasizing that all changes have undergone human review for quality assurance.
Even if you decide not to integrate my changes into your codebase (which is perfectly fine), I ask you to fill out a feedback survey, which will be scientifically evaluated to determine the acceptance of AI-supported refactorings. You can find the feedback survey at https://campus.lamapoll.de/Data-clump-refactoring/en
Thank you for considering my contribution. I look forward to your feedback. If you have any other questions or comments, feel free to write a comment or email me at [email protected].
Best regards,
Timo Schoemaker
Department of Computer Science
University of Osnabrück
Proposed changelog entries
refactored data clumps
Proposed upgrade guidelines
N/A
Submitter checklist
Desired reviewers
Maintainer checklist
Before the changes are marked as ready-for-merge: