Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] This relativizes the PDF's filepaths after importing through "Find Unlinked Files" #10582

Closed
wants to merge 13 commits into from

Conversation

IceMajor2
Copy link

@IceMajor2 IceMajor2 commented Oct 25, 2023

Fixes JabRef#549

I'll start explaining the changes I've made from the top of the method chain.

  1. ImportHandler
    It is the place where the logic of importing a PDF file starts being executed.
    At line 119, there's an importPDFContent (of ExternalFilesContentImporter class) method executed, which takes in a filepath as a parameter. In order to access the "General file directory" library property, I've modified this method (see 2nd point).
  2. ExternalFilesContentImporter
    As mentioned, importPDFContent method gets a new parameter: an instance of MetaData. The "General file directory" path is stored there.
  3. PdfMergeMetadataImporter
    Finally, the actual logic is being executed here. The class is an implementation of an abstract class Importer. Among others, it overrides importDatabase(Path) method. Previous step's method importPDFContent calls PdfMergeMetadataImporter's importDatabase method, where the filepath can be manipulated. Because of extending that method, I can provide importDatabase with MetaData object. At last - just at the end of the method - I've included some basic logic that:
  • checks whether the "General file directory" is present
  • if it is, the filepath is relativized (`C
  • if it is not, the absolute path is displayed

Please, do provide me with feedback on the above implementation. I'm particularly interested whether the extensions with MetaData instance is an okay decision. I'm also wondering if the "General file directory" property may be accessed in some other way.

Note that I've not included any changes to the CHANGELOG.md as this is still a work-in-progress pull request and it does not yet solve the issue of other than PDF files.

Mandatory checks

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@@ -116,7 +116,7 @@ protected List<ImportFilesResultItemViewModel> call() {

try {
if (FileUtil.isPDFFile(file)) {
var pdfImporterResult = contentImporter.importPDFContent(file);
var pdfImporterResult = contentImporter.importPDFContent(file, bibDatabaseContext.getMetaData());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better use this directory: Optional firstExistingFileDir = bibDatabaseContext.getFirstExistingFileDir(filePreferences);

This will make sure that you always get the "right" file directory (global or from metadata)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've included 2 more commits in this PR. The first commit was about a slight change that made checkstyle task fail: space character was missing after if.

Your suggestion to use the "first existing file directory" does indeed seem like a better approach. Now, the path is relativized even if the "global file directory" property is missing. That would nicely go along with the approach that "relative paths are better", which was mentioned here.

@@ -116,7 +116,8 @@ protected List<ImportFilesResultItemViewModel> call() {

try {
if (FileUtil.isPDFFile(file)) {
var pdfImporterResult = contentImporter.importPDFContent(file, bibDatabaseContext.getMetaData());
Optional<Path> globalFileDir = bibDatabaseContext.getFirstExistingFileDir(preferencesService.getFilePreferences());
var pdfImporterResult = contentImporter.importPDFContent(file, globalFileDir);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optionals should not be passed as parameter, they should only be used for return types. Better

Path workingDirectory = databaseContext.getFirstExistingFileDir(filePreferences)
.orElse(filePreferences.getWorkingDirectory());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. This feels like a code smell. I've pushed 2 new commits: the 1st removes the Optional and the 2nd one is about renaming the Path variable.

@Siedlerchr
Copy link
Member

Looks good so far

@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Oct 28, 2023
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add (at least) one test case.

You can find one at the description "How to reproduce" at JabRef#549.

@IceMajor2
Copy link
Author

Please add (at least) one test case.

You can find one at the description "How to reproduce" at koppor#549.

I've committed a new test case.

I also wanted to add a more exhaustive test by testing not only the low-level PdfMergeMetadataImporter, but also the ImportHandler. However, I ran into some issues.

In ImportHandlerTest I've written something like this:

    @Test
    void handlePathRelativizationOnImportTest() throws Exception {
        // Arrange
        Path file = Path.of(ImportHandler.class.getResource("/pdfs/example.pdf").toURI());
        Path workingDir = Path.of(ImportHandler.class.getResource("/pdfs/").toURI());

        FilePreferences filePreferences = preferencesService.getFilePreferences();
        when(bibDatabaseContext.getFirstExistingFileDir(filePreferences)).thenReturn(Optional.of(workingDir));

        List<ImportFilesResultItemViewModel> resultList = new ArrayList<>();

        // Act
        BackgroundTask<List<ImportFilesResultItemViewModel>> importFilesInBackground = importHandler
                .importFilesInBackground(Collections.singletonList(file))
                .onSuccess(resultList::addAll);
        importFilesInBackground.executeWith(new CurrentThreadTaskExecutor());

        // Assert
        // ...
    }

But I'd end up with Illegal State Exception: Toolkit not initialized. Then, I tried going even one more level above by testing the UnlinkedFilesDialogViewModel's startImport method, but had trouble to do so, too.

I'd like to continue to work on the test case in ImportHandler, but I need help on how to run the BackgroundTask in test environment.

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor issues

// Expecting relative path
expected.setField(StandardField.FILE, ":minimal.pdf:PDF");

assertEquals(Collections.singletonList(expected), result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertEquals(Collections.singletonList(expected), result);
assertEquals(List.of(expected), result);

Comment on lines 105 to 107
BibEntry expected = new BibEntry(StandardEntryType.InProceedings);
expected.setField(StandardField.AUTHOR, "1 ");
expected.setField(StandardField.TITLE, "Hello World");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to use the JabRef style:

Suggested change
BibEntry expected = new BibEntry(StandardEntryType.InProceedings);
expected.setField(StandardField.AUTHOR, "1 ");
expected.setField(StandardField.TITLE, "Hello World");
BibEntry expected = new BibEntry(StandardEntryType.InProceedings)
.withField(StandardField.AUTHOR, "1 ")
.withField(StandardField.TITLE, "Hello World");

(also rewrite next line with the .with... pattern

Comment on lines 131 to 136
if (workingDir != null) {
Path relativizedFilePath = workingDir.relativize(filePath);
entry.addFile(new LinkedFile("", relativizedFilePath, StandardFileType.PDF.getName()));
} else {
entry.addFile(new LinkedFile("", filePath, StandardFileType.PDF.getName()));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add JavaDoc explaning the param workingDir and state when this can be null.

I was about to write that you should switch the if branches to avoid negation (and use workingDir == null), but this one is IMHO more readable

Suggested change
if (workingDir != null) {
Path relativizedFilePath = workingDir.relativize(filePath);
entry.addFile(new LinkedFile("", relativizedFilePath, StandardFileType.PDF.getName()));
} else {
entry.addFile(new LinkedFile("", filePath, StandardFileType.PDF.getName()));
}
if (workingDir != null) {
filePath = workingDir.relativize(filePath);
}
entry.addFile(new LinkedFile("", filePath, StandardFileType.PDF.getName()));

@koppor
Copy link
Member

koppor commented Nov 2, 2023

I'd like to continue to work on the test case in ImportHandler, but I need help on how to run the BackgroundTask in test environment.

Avoid the background task in tests. Isn't there another method you can check directly?

You would run a UI test then. Search for @GUITest annotations and read how these tests are run.

You changed logic, thus you should have a test in org.jabref.logic. Because your thing should be independent from the UI. I think, tests are in PdfMergeMetadataImporterTest?

@IceMajor2
Copy link
Author

Avoid the background task in tests. Isn't there another method you can check directly?

I was talking here about a "high-level" method - at the top of the method chain. In this draft PR, I've so far modified 3 main classes: ImportHandler, ExternalFilesContentImporter and PdfMergeMetadataImporter.

I tested the PdfMergeMetadataImporter's importDatabase(Path, Path) method that I've created (by extending with another Path parameter). So, yes, there's a method that I can check directly: I've tested it already. However, I wanted to test ImportHandler's importFilesInBackground(List<Path>) method, because it's the place where all the logic that I've implemented resides. Maybe there's still some other method that I can't find.

I can't figure out how to test it, though. It's a BackgroundTask and, going by your advice, I was unable to create a @GUITest as I can't get the "unlinked files" dialog menu working. I'm not sure how it should be initialized... I tried looking at other @GUITest tests, but couldn't manage to use those approaches in my test.


Anyway, please see new commits. I've implemented your suggestions and wrote a bit of JavaDoc.

@IceMajor2 IceMajor2 marked this pull request as ready for review November 14, 2023 20:48
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. I just saw the PR again.

The implementation is not working for all cases. I provided a sketch of the required implementation. Should be 10 lines of code.

Would be nice if you could try it. If not, we can do.

Comment on lines +120 to +121
Path workingDir = bibDatabaseContext.getFirstExistingFileDir(filePreferences)
.orElse(filePreferences.getWorkingDirectory());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JabRef's logic of storing files is more complicated.

JabRef searchers in EACH directory, not in ONE.

See org.jabref.logic.util.io.FileUtil#getListOfLinkedFiles.

Comment on lines +141 to +143
if (workingDir != null) {
filePath = workingDir.relativize(filePath);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call org.jabref.logic.util.io.FileUtil#relativize here.

The required directories are provided by databaseContext.getFileDirectories(filePreferences)

@koppor koppor removed the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Jan 11, 2024
@koppor koppor marked this pull request as draft January 15, 2024 21:10
@koppor
Copy link
Member

koppor commented Feb 28, 2024

@IceMajor2 I saw no activity the last weeks, thus I am closing. In case you are going to resume work, feel free to reopen or file a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Path should be relative after import at "Find Unlinked Files"
4 participants