Updates for DSpace 7.6 #69

Merged
tdonohue merged 7 commits into DSpace:main from the dspace-replicate-7.4 branch on Nov 9, 2023

@nwoodward nwoodward commented Jan 20, 2023

Minor updates to make dspace-replicate compatible with DSpace 7.6.

I also had to add a few more exclusions to the pom.xml from the installation instructions: https://wiki.lyrasis.org/display/DSPACE/ReplicationTaskSuite#ReplicationTaskSuite-InstallationonDSpace7.x.

<dependency>
   <groupId>org.dspace</groupId>
   <artifactId>dspace-replicate</artifactId>
   <version>7.4</version>
   <exclusions>
      <exclusion>
         <groupId>com.amazonaws</groupId>
         <artifactId>aws-java-sdk-core</artifactId>
      </exclusion>
      <exclusion>
         <groupId>com.amazonaws</groupId>
         <artifactId>aws-java-sdk-sqs</artifactId>
      </exclusion>
      <exclusion>
         <groupId>org.apache.commons</groupId>
         <artifactId>commons-compress</artifactId>
      </exclusion>
      <exclusion>
         <groupId>org.hibernate.javax.persistence</groupId>
         <artifactId>hibernate-jpa-2.1-api</artifactId>
      </exclusion>
      <exclusion>
         <groupId>org.apache.httpcomponents</groupId>
         <artifactId>httpmime</artifactId>
      </exclusion>
      <exclusion>
         <groupId>org.springframework.security</groupId>
         <artifactId>spring-security-core</artifactId>
      </exclusion>
   </exclusions>
</dependency>

@amgciadev

We are planning to use dspace-replicate with our DSpace 7.5 installation. Happy to give this a test if that helps.

* Remove SupervisorService

* Remove SupervisedItemService, add SubscribeService

* Add log4j dependencies for compilation + testing

* Fix typo
@nwoodward changed the title from "Updates for DSpace 7.4" to "Updates for DSpace 7.6" on Oct 24, 2023
@nwoodward
Contributor Author

Hi @amgciadev. My apologies for the delay, but if you are still planning on deploying with the RTS, we would appreciate help with testing these updates.

@amgciadev

Hi @nwoodward, your request is very timely as we are working towards using the RTS in our production environment, so yes, happy to test this out in our DSpace 7.6 test environments. As we are a bit new to deploying PRs for the module, may I ask how you normally do this? I am trying to avoid deploying the libraries manually, i.e. building the RTS module locally and then copying those libraries across to the DSpace installation.

@amgciadev

For the time being I'll copy over those jars to be able to start testing for you now.

@nwoodward
Contributor Author

Hi @amgciadev. Great! The way I usually test changes to the RTS is to pull them down into my local dspace-replicate repository and run mvn clean install which will place a copy of the SNAPSHOT jar file in my user's ~/.m2 folder. Then when I'm building DSpace I edit the [dspace-src]/dspace/modules/additions/pom.xml like the RTS instructions say to do, but I set the <version> tag to the SNAPSHOT version of RTS that I just built. Maven checks the local ~/.m2 folder for the jar first, so it gets pulled in. I also verify that it worked by comparing the jar file in ~/.m2 with the jar file in the [dspace]/lib folder after it's deployed to make sure they match. Copying over jar files should work, too. Thanks!
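
The final verification step described above (checking that the jar in ~/.m2 matches the jar deployed to [dspace]/lib) can be sketched as a checksum comparison. This is only an illustrative sketch, not part of the RTS or DSpace tooling: the function names `sha256_of` and `jars_match` are hypothetical, and the actual jar paths depend on the SNAPSHOT version you built and on your DSpace installation directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def jars_match(built_jar: Path, deployed_jar: Path) -> bool:
    """Return True when the two jar files have byte-identical contents.

    Both arguments are hypothetical paths supplied by the caller: the jar
    that `mvn clean install` placed under ~/.m2, and the copy that ended up
    in the [dspace]/lib folder after deployment.
    """
    return sha256_of(built_jar) == sha256_of(deployed_jar)
```

To use it, call `jars_match` with the two jar paths from your own environment; a `False` result suggests the DSpace build picked up a different dspace-replicate jar than the one just built.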

@amgciadev

@nwoodward, this is a summary of the testing I have done so far:

  • "curation-task.task.estaipsize.label": "Estimate Storage Space for AIP(s)",
  • "curation-task.task.readodometer.label": "Read Odometer",
  • "curation-task.task.transmitaip.label": "Transmit AIP(s) to Storage",
  • "curation-task.task.transmitsingleaip.label": "Transmit Single Object AIP to Storage",
  • "curation-task.task.verifyaip.label": "Verify AIP(s) exist in Storage",
  • "curation-task.task.fetchaip.label": "Fetch AIP(s) from Storage",
  • "curation-task.task.auditaip.label": "Audit against AIP(s)",
  • "curation-task.task.removeaip.label": "Remove AIP(s) from Storage",
  • "curation-task.task.replacewithaip.label": "Replace Existing Object(s) with AIP(s)",

I have also tested the automated synchronisation, i.e. verified that when an item is changed, the creation of a new AIP is scheduled, and I have then run the "replication" curation task that clears the queue and all has worked as expected.

This is the configuration we have used:

# ADD the "replicate" consumer to the end of the list of 'default.consumers' (This enables the consumer)
event.dispatcher.default.consumers = versioning, discovery, eperson, doi, replicate
# Configure consumer to manage BagIt AIP content replication
# METSReplicateConsumer enabled by default. Options:
# event.consumer.replicate.class = org.dspace.ctask.replicate.BagItReplicateConsumer
event.consumer.replicate.class = org.dspace.ctask.replicate.METSReplicateConsumer
event.consumer.replicate.filters = Community|Collection|Item+Install|Modify|Modify_Metadata|Delete
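
For anyone reproducing this setup, a quick sanity check of the three settings above might look like the following. It is a simplified sketch under stated assumptions: `parse_properties` and `replicate_consumer_problems` are hypothetical helpers, the parser only understands plain `key = value` lines and `#` comments, and it does not attempt to mirror DSpace's real configuration loading.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple 'key = value' lines, skipping blank lines and '#' comments.

    Simplification: no line continuations, includes, or variable
    interpolation. If a key appears twice, the last value wins.
    """
    props: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def replicate_consumer_problems(props: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # The consumer is only enabled if "replicate" is in the dispatcher's list.
    consumers = [
        name.strip()
        for name in props.get("event.dispatcher.default.consumers", "").split(",")
    ]
    if "replicate" not in consumers:
        problems.append("'replicate' is not listed in event.dispatcher.default.consumers")

    # A consumer class (METS or BagIt in this thread) must be configured.
    if not props.get("event.consumer.replicate.class"):
        problems.append("event.consumer.replicate.class is not set")

    # The filter in this thread has the shape ObjectTypes+EventTypes.
    if "+" not in props.get("event.consumer.replicate.filters", ""):
        problems.append("event.consumer.replicate.filters is missing or has no '+' separator")

    return problems
```

Feeding it the text of the configuration shown above should yield an empty list; dropping `replicate` from the consumers line should yield one reported problem.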

And then some very odd behaviour that I believe is not related at all to this PR and may be me misunderstanding how some tasks work:

If I run this task "curation-task.task.restorefromaip.label": "Restore Missing Object(s) from AIP(s)", with a handle of an item that currently exists in DSpace, the task runs successfully but all the metadata is duplicated (I have not checked files). If instead I attempt to run this task after having deleted an item, to restore from the AIP (is this the use case for this task??), then I get an error when running the task relating to the object not being found in the DB.

When I run the synchroniser, which runs the transmitsingleaip task, the auditaip task incorrectly reports changes to checksums. If I instead run transmitaip with the handle of the object whose AIP I want to regenerate, and then run auditaip, the task reports correct results.

And finally, questions rather than test results: exactly what use cases do these three tasks cover?
"curation-task.task.restorekeepexisting.label": "Restore Missing Object(s) but Keep Existing Objects",
"curation-task.task.restoresinglefromaip.label": "Restore Single Object from AIP",
"curation-task.task.replacesinglewithaip.label": "Replace Single Object with AIP",

@nwoodward
Contributor Author

@amgciadev I think at least some of the odd behavior you mention is related to the checksum issue that you've noted elsewhere - #70. I've also seen that in our DSpace logs. Since this PR is mostly about updating logging classes and a few DSpace classes used for testing I suspect the strangeness is unrelated to these changes. You could review the dspace.log file to make sure.

@amgciadev

@nwoodward I agree that the odd behaviour I am seeing is totally unrelated to this PR. I'll do more investigation and open separate issues for those.

Other than that, I think all these changes look good and all the tasks I've tested work as expected.

@amgciadev

@tdonohue is there any timeframe / estimate for when these will be available as a new release for the toolkit?

@tdonohue
Member

tdonohue commented Nov 6, 2023

@amgciadev and @nwoodward : I've been waiting on final testing from those who use the Replication Task Suite on a regular basis (namely you two and the DSpaceDirect team via @mikejritter or others).

So, as soon as all of you are satisfied that this is working (and I see some +1 votes on this PR), then I'm OK with merging it and starting a new release of the Replication Task Suite. Unfortunately though, I don't have a test setup for RTS, so I'm unable to test this myself. The best I can do is verify the code changes "look sane". So, I'm reliant on all of you (and anyone else who happens to be watching this PR) to test this and report back once you've approved it.

@amgciadev

@tdonohue thanks for the response! +1 from me

Member

@tdonohue tdonohue left a comment

👍 Thanks @nwoodward . Code changes here look correct to me. Since this has had testers (thanks @amgciadev ), I'll merge this immediately and aim to re-release Replication Task Suite next week sometime (likely alongside the upcoming DSpace 7.6.1 release...which is also due next week)

@tdonohue added the "dependencies" label (Pull requests that update a dependency file) on Nov 9, 2023
@tdonohue merged commit f7b9e2d into DSpace:main on Nov 9, 2023
1 check passed
@tdonohue
Member

tdonohue commented Nov 9, 2023

@nwoodward : Could you ensure the Installation Documentation is updated for dspace-replicate 7.6 based on this PR? https://wiki.lyrasis.org/display/DSPACE/ReplicationTaskSuite#ReplicationTaskSuite-InstallationonDSpace7.x

There's not an immediate rush, but would be nice to have corrected documentation there before the new release of dspace-replicate next week.

@nwoodward
Contributor Author

@tdonohue Sure thing. I'll update it.

@tdonohue
Member

Replication Task Suite 7.6 has been released: https://github.com/DSpace/dspace-replicate/releases/tag/dspace-replicate-7.6
(It should appear in Maven Central in an hour or so)

I did my best guess update of the installation docs, but I probably need @nwoodward (or someone else) to double check them: https://wiki.lyrasis.org/display/DSPACE/ReplicationTaskSuite#ReplicationTaskSuite-InstallationonDSpace7.x

@nwoodward nwoodward deleted the dspace-replicate-7.4 branch November 21, 2023 20:56
@amgciadev

@nwoodward sorry to add the question here. Ahead of starting to use the RTS in our production repository, we are testing a full AIP backup of the entire repository, but we are finding the command really slow (memory, CPU and NFS storage performance seem OK). We have actually stopped DSpace, since it is a test instance and there is no need to have it up. As you are using the RTS in your production site, could you share any insights about how it performs for you?

4 participants