fix: dedup extracted requirements #158

Merged
merged 1 commit into from
Jan 31, 2025

@camshaft camshaft commented Jan 23, 2025

Issue #, if available:

Resolves #74

Problem

The current extraction logic has a bug where it sometimes emits multiple requirement statements for the same text. It mostly comes down to two different scenarios:

Duplicate text

Sometimes a requirement sentence can appear multiple times in a single section.

## Section 1

The implementation MUST do this.

**snip**

The implementation MUST do this.

Compound requirement

Sometimes a requirement can appear in the same sentence as another:

## Section 1

The implementation MUST do this and also MUST do that.

Previously the report command "accidentally" deduplicated these strings when generating the report, since it collected them into a BTreeSet<Annotation>. The recent refactors added line numbers to the Annotation struct, so two annotations with the same text but different line numbers no longer compare as equal and both end up in the report.
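To illustrate why the accidental deduplication stopped working, here is a minimal sketch. The `AnnotationBefore` and `AnnotationAfter` structs and their `quote` and `line` fields are hypothetical stand-ins, not the project's actual `Annotation` definition; the only point is that a derived `Ord` compares every field, so adding a line number makes otherwise identical entries distinct in a `BTreeSet`.

```rust
use std::collections::BTreeSet;

// Hypothetical shape before the refactor: only the quoted text is compared.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct AnnotationBefore {
    quote: String,
}

// Hypothetical shape after the refactor: the derived ordering now also
// compares the line number.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct AnnotationAfter {
    quote: String,
    line: usize,
}

/// Number of entries a BTreeSet keeps when only the text is stored.
fn unique_before(items: &[(&str, usize)]) -> usize {
    items
        .iter()
        .map(|(quote, _line)| AnnotationBefore {
            quote: quote.to_string(),
        })
        .collect::<BTreeSet<_>>()
        .len()
}

/// Number of entries a BTreeSet keeps once the line number is stored too.
fn unique_after(items: &[(&str, usize)]) -> usize {
    items
        .iter()
        .map(|(quote, line)| AnnotationAfter {
            quote: quote.to_string(),
            line: *line,
        })
        .collect::<BTreeSet<_>>()
        .len()
}
```

Feeding the same sentence at two different line numbers into both helpers shows the set collapsing the pair in the first case and keeping both entries in the second.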

Solution

This change deduplicates requirements at the extract phase by normalizing the text and checking whether it has already been recorded as a requirement. If it has, the duplicate is not emitted a second time.

Note that even with this change we still have the gaps described in #159 and #161. Solutions for those will come separately.
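The deduplication step described above could look roughly like the following sketch. It is not the code from this PR: the `normalize` and `dedup_requirements` names are hypothetical, and the normalization shown (collapsing whitespace runs) is an assumption, since the description does not spell out what the real normalization does.

```rust
use std::collections::HashSet;

/// Assumed normalization: trim the text and collapse whitespace runs
/// into single spaces.
fn normalize(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Keeps the first occurrence of each candidate requirement, comparing
/// candidates by their normalized text.
fn dedup_requirements<'a>(candidates: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for &candidate in candidates {
        // `insert` returns false when the normalized text is already in the
        // set, so later duplicates are skipped.
        if seen.insert(normalize(candidate)) {
            unique.push(candidate);
        }
    }
    unique
}
```

In this sketch a compound sentence that was picked up once per keyword appears as two identical candidates, and only the first one survives.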

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@camshaft camshaft force-pushed the camshaft/extract-dedup branch from d78b30d to 2e4986d Compare January 23, 2025 20:32
@camshaft camshaft marked this pull request as ready for review January 23, 2025 20:33
@camshaft camshaft requested a review from a team as a code owner January 23, 2025 20:33
@camshaft camshaft merged commit 0b16105 into main Jan 31, 2025
14 checks passed
@camshaft camshaft deleted the camshaft/extract-dedup branch January 31, 2025 22:17
Development

Successfully merging this pull request may close these issues.

Extraction behavior seems a bit off