-
Notifications
You must be signed in to change notification settings - Fork 2.1k
fix(apply-patch): skip initial new-file hunk header emitted by some AI models #695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
All contributors have signed the CLA ✍️ ✅ |
I have read the CLA Document and I hereby sign the CLA |
Gemini 2.0 Flash emits hunk headers in Add File directives (e.g. "@@ -0,0 +1 @@" or "@@ -0,0 +1,5 @@"). Use "/^@@\s+-0,0\s+\+1(?:,[1-9]\d*)?\s+@@/" to detect and skip these headers. Signed-off-by: Masami HIRATA <[email protected]>
77bdda0
to
389d877
Compare
…I models and add unit tests - TypeScript parser: introduce NEW_FILE_HUNK regex + isFirst flag - Rust parser: same regex+first logic in parse_one_hunk - Tests (TS & Rust) covering skip-header behavior and regex self-test Signed-off-by: Masami HIRATA <[email protected]>
…codex into fix/apply-patch-skip-gemini-hunk
I know we are on the hook for putting up an actual proposal, but we are planning to introduce a concept of "plugins." Accepting a patch format that OpenAI models consider to be invalid output (from its own internal specification) seems like something that should live in a plugin rather in the core of Codex itself. |
Hi @bolinfest, Thank you for your comment and for sharing the insight about the planned "plugins" concept. That sounds like an interesting direction for enhancing extensibility in the long run. Regarding the immediate issue with the
These issues combined create a significant usability problem. AI agents interacting with I understand from your comment that you envision functionality like accepting varied patch formats potentially living in a plugin, rather than the core. That makes sense from an architectural perspective to keep the core clean. However, given the significant usability hurdle the current behavior presents, and acknowledging that developing the plugin system might take time, I'd like to ask if exploring any interim improvements to the core Looking ahead to the plugin system, I think it's a great opportunity. To ensure plugins (and potentially core components too) work effectively in collaboration with AI agents, I'd strongly suggest considering these two principles as part of the design guidelines or best practices:
Incorporating these principles would greatly enhance the value and usability of Codex CLI as an AI-collaborative tool. Thank you again for engaging with this feedback and considering the PR. I'm happy to discuss further. Best regards, |
What & Why
Some AI-generated patches (e.g. from Gemini 2.0 Flash) prepend a full diff hunk header like:
@@ -0,0 +1,2 @@ SectionName
before any
+
lines. Our parser treats that as invalid, causing parse failures or stray blank lines when adding new files.This PR makes both the TypeScript and Rust parsers ignore only the first such "new-file" hunk header, then proceed normally.
How
TypeScript
NEW_FILE_HUNK
regex +isFirst
flag inparse_add_file()
Rust
regex::Regex
new_file_re
+ "first" logic inparse_one_hunk()
Tests
TS
process_patch
– add file skips full hunk headersRust
test_add_file_skip_full_hunk_header_with_section
test_new_file_hunk_regex