Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SONARKT-542 Fix CPD with string templates #560

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

antonioaversa
Copy link
Contributor

@antonioaversa antonioaversa commented Feb 5, 2025

SONARKT-542

See here for a complete analysis of the issue.

The adopted solution is to treat the entire string template as a single LITERAL token. In the previous solutions, each KtStringTemplateEntry emitted a LITERAL token.

Downsides

This may not be perfect, as we may not detect duplication patterns in expressions within interpolations of a string template.
For example, the following string:

var x5 = """ 
    a ${<interpolation1>} b 
    c ${<interpolation2>} d 
    ${<interpolation3>} f""".trimIndent()

We would have a single LITERAL for the entire string template: varx5=LITERAL.
Therefore, we will not see duplications between <interpolation1>, <interpolation2>, and <interpolation3>.

We will also not see equivalent interpolation structures.
For example, given the following two strings:

var x1 = """"
    a $b c $d e $f ... $something
    """"
var x2 = """"
    a $b c $d e $f ... $somethingElse
    """"

We will see just two LITERAL: one for x1 and one for x2, instead of a sequence of LITERAL which would be the same, up to $something/$somethingElse

For both scenarios mentioned above, the probability that a duplication would be detected (i.e. at least 100 equal tokens on at least 10 different lines) seems close to zero, so it's better to treat the entire template as a literal to avoid issues like the one explained in the community post.

@antonioaversa antonioaversa force-pushed the antonio/SONARKT-542-fix-cpd-with-string-templates branch from 2b3529e to d0dfa64 Compare February 5, 2025 15:52
@antonioaversa antonioaversa force-pushed the antonio/SONARKT-542-fix-cpd-with-string-templates branch from d0dfa64 to 6eb9b96 Compare February 5, 2025 15:57
@antonioaversa antonioaversa marked this pull request as ready for review February 5, 2025 16:05
Copy link
Contributor

@leveretka leveretka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the approach looks good, thanks for having a look and fixing this.

I've just noticed 2 unused imports in the test.

@antonioaversa
Copy link
Contributor Author

@leveretka master is locked, so I'd let you merge the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants