Skip to content

Ruby: add support for extracting overlay databases #19684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

nickrolfe
Copy link
Contributor

There will need to be followup changes on the QL side before querying will work, but this is enough to be able to build an overlay database for Ruby.

@nickrolfe nickrolfe added the Ruby label Jun 6, 2025
@github-actions github-actions bot added QL-for-QL Rust Pull requests that update Rust code labels Jun 6, 2025
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch 2 times, most recently from 33c2550 to fb89f22 Compare June 6, 2025 14:03
@nickrolfe nickrolfe marked this pull request as ready for review June 10, 2025 16:45
@Copilot Copilot AI review requested due to automatic review settings June 10, 2025 16:45
@nickrolfe nickrolfe requested review from a team as code owners June 10, 2025 16:45
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for overlay database extraction in Ruby by introducing a new databaseMetadata relation and path‐prefix transformation logic.

  • Introduces a --add-metadata-relation flag to the shared extractor generator and implements create_database_metadata().
  • Adds PathTransformer and normalize_and_transform_path in the shared extractor for prefix rewrites.
  • Updates the Ruby extractor to skip unchanged files, load overlay changes, apply path transformations, and emit an empty base‐metadata file.

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
shared/tree-sitter-extractor/src/generator/mod.rs Added add_metadata_relation flag and create_database_metadata
shared/tree-sitter-extractor/src/file_paths.rs Implemented PathTransformer, load_path_transformer, and path normalization with transformation
shared/tree-sitter-extractor/src/extractor/*.rs Propagated transformer through file population and trap writing
ruby/ql/lib/ruby.dbscheme Added new databaseMetadata relation definition
config/dbscheme-fragments.json Registered the Database metadata fragment
ruby/extractor/src/extractor.rs Added overlay‐change filtering, transformer usage, and metadata output
ruby/extractor/src/generator.rs Enabled metadata relation in the Ruby generator
Comments suppressed due to low confidence (1)

ruby/extractor/src/extractor.rs:347

  • [nitpick] New get_overlay_changed_files logic and load_path_transformer are not covered by existing tests. Consider adding unit tests for parsing the JSON changes file and path transformation behavior.
fn get_overlay_changed_files() -> Option<HashSet<PathBuf>> {

Comment on lines +364 to +370
.map(|change| {
change
.as_str()
.map(|s| PathBuf::from(s).canonicalize().ok())
.flatten()
})
.collect()
Copy link
Preview

Copilot AI Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collecting an iterator of Option<PathBuf> will not produce HashSet<PathBuf>. You should use filter_map to drop None entries and wrap the resulting HashSet in Some(...) before returning.

Suggested change
.map(|change| {
change
.as_str()
.map(|s| PathBuf::from(s).canonicalize().ok())
.flatten()
})
.collect()
.filter_map(|change| {
change
.as_str()
.and_then(|s| PathBuf::from(s).canonicalize().ok())
})
.collect::<HashSet<PathBuf>>()
.into()

Copilot uses AI. Check for mistakes.

@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch 2 times, most recently from 576f0d2 to 4029b89 Compare June 11, 2025 11:45
@nickrolfe
Copy link
Contributor Author

@aibaars do you have capacity to review this?

Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me. A few comments though. I think it would be fine to add the new features to all extractors .

@@ -32,6 +33,16 @@ pub fn generate(

writeln!(dbscheme_writer, include_str!("prefix.dbscheme"))?;

// Eventually all languages will have the metadata relation (for overlay support), at which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just shove the metadata tables directly in prefix.dbscheme. An unused table shouldn't hurt.

@@ -212,7 +212,7 @@ impl TrapFile {
);
}
pub fn emit_file(&mut self, absolute_path: &Path) -> Label<generated::File> {
let untyped = extractor::populate_file(&mut self.writer, absolute_path);
let untyped = extractor::populate_file(&mut self.writer, absolute_path, None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just add support for path transformers to the Rust extractor.

@@ -15,7 +15,7 @@ impl Archiver {
}

fn try_archive(&self, source: &Path) -> std::io::Result<()> {
let dest = file_paths::path_for(&self.root, source, "");
let dest = file_paths::path_for(&self.root, source, "", None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just add support for path transformers to the Rust extractor.

.filter(|line| !line.is_empty())
.collect::<Vec<String>>();

if lines.len() != 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would it be to just implement path transformers completely?

@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch from 4029b89 to fa24ee6 Compare June 19, 2025 14:57
@nickrolfe
Copy link
Contributor Author

I've force-pushed a couple of the suggested fixes. Regarding the others:

  • prefix.dbscheme: since this will change the QL-for-QL dbscheme, I'll do it as a quick follow-up PR. That way, if any issues arise, it can be reverted without affecting Ruby overlay support.
  • similarly, I'll add the Rust path-transformer support as a separate PR.
  • for implementing full path-transformer support, I don't think it would be an enormous amount of work to port either the existing Java or C++ implementations, along with their unit tests, but it's probably enough work that we discussed within our team and decided not to do it (since it's not on the critical path for overlay support).

This is required for overlay support.
Supports only a minimal subset of the project layout specification;
enough to work with the transformers produced by the CLI when building
an overlay database.
@nickrolfe nickrolfe force-pushed the nickrolfe/ruby-overlay-extraction branch from fa24ee6 to 665df4b Compare June 19, 2025 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
QL-for-QL Ruby Rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants