Reingest process from Uncompressed AIP to Compressed 7Z with BZip does not create zip #321

Open
ross-spencer opened this issue Feb 9, 2018 · 2 comments
Labels
Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result.

Comments

ross-spencer commented Feb 9, 2018

To test:

  • Set up a defaultMCPProcessing.xml with the option None for AIP Compression.
  • Ingest first as Uncompressed when given the option.
  • Download and verify that we receive a tarred set of folders from the Storage Service.
  • Once complete, reingest with 7Zip with Bzip.
  • Download and verify that we still only receive a tarred set of folders from the Storage Service.
  • Further, check the ingest/AIP storage times in the AM Archival Storage tab: the time shown is from the original transfer, so it has not been updated.
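
The two download checks above can be scripted instead of inspected by hand. This is a minimal sketch, assuming the download has been saved to a local file; the function name `classify_download` is hypothetical and not part of Archivematica. It relies only on the 7z file signature and Python's standard `tarfile` module.

```python
import tarfile

# The six-byte signature that every 7z archive starts with.
SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"


def classify_download(path):
    """Return '7z', 'tar', or 'unknown' for the file at `path`.

    Hypothetical helper for the test steps: it only looks at the
    container format, not at the contents of the AIP.
    """
    with open(path, "rb") as f:
        if f.read(len(SEVEN_ZIP_MAGIC)) == SEVEN_ZIP_MAGIC:
            return "7z"
    # tarfile.is_tarfile() returns False for anything it cannot read as tar.
    if tarfile.is_tarfile(path):
        return "tar"
    return "unknown"
```

If the reingest had worked, the second download would be expected to classify as `7z` rather than `tar`.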

For transfer with UUID aeb6568f-a0d6-43dd-bfe4-42892cfb31a1 in http://am17x.qa.archivematica.org/archival-storage/aeb6568f-a0d6-43dd-bfe4-42892cfb31a1/

We see the following screen following re-ingest:

image

We can see the compression task happened:

image

But there are no other clues in the log to suggest why the 7z did not then become the canonical AIP.

The compressed to uncompressed workflow has been shown to work okay:

Compressed -> Uncompressed Workflow: UUID bc429ed0-9f62-465a-9105-ce319542f6bc

Ingest #1:

image

Ingest #2:

image

The select compression page: http://am17x.qa.archivematica.org/tasks/078fddc0-5a5f-4621-9e08-3b57328b2c02/

NB: The AIP store date is not updated, but the file path is (an Elasticsearch issue on top of the AIP reingest?)

@ross-spencer ross-spencer added the Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result. label Feb 9, 2018
@ross-spencer ross-spencer changed the title Re-ingest process from Uncompressed AIP to Compressed 7Z with BZip does not create zip Reingest process from Uncompressed AIP to Compressed 7Z with BZip does not create zip Apr 4, 2018
ross-spencer commented May 17, 2018

I am seeing this in my work today. It seems the .7z persists up until Verify AIP, but then it might be disappearing during Store AIP. See here for the exception output by Archivematica:

image
Command:

Module storeAIP_v0.0
"/api/v2/location/default/AS/" "/var/archivematica/sharedDirectory/currentlyProcessing/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9.7z" "35f7048f-cfed-4dbc-a3be-9c724809b1d9" "uncomp_3" "AIP-REIN"

Stdout:

Storage service created AIP-REIN:
{u'error_message': u"[Errno 21] Is a directory: u'/var/archivematica/sharedDirectory/www/AIPsStore/35f7/048f/cfed/4dbc/a3be/9c72/4809/b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9'",
 u'traceback': u'Traceback (most recent call last):\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 220, in wrapper\n    response = callback(request, *args, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 460, in dispatch_detail\n    return self.dispatch(\'detail\', request, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 483, in dispatch\n    response = method(request, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 1472, in put_detail\n    updated_bundle = self.obj_update(bundle=bundle, **self.remove_api_resource_names(kwargs))\n\n  File "/src/storage_service/locations/api/resources.py", line 723, in obj_update\n    bundle = self.obj_update_hook(bundle, **kwargs)\n\n  File "/src/storage_service/locations/api/resources.py", line 761, in obj_update_hook\n    aip_subtype=aip_subtype)\n\n  File "/src/storage_service/locations/models/package.py", line 2112, in finish_reingest\n    premis_agents=premis_agents, aip_subtype=aip_subtype)\n\n  File "/src/storage_service/locations/models/package.py", line 805, in _create_pointer_file_write_to_disk\n    self.fetch_local_path(), checksum_algorithm).hexdigest()\n\n  File "/src/storage_service/common/utils.py", line 285, in generate_checksum\n    with open(file_path, \'rb\') as f:\n\nIOError: [Errno 21] Is a directory: u\'/var/archivematica/sharedDirectory/www/AIPsStore/35f7/048f/cfed/4dbc/a3be/9c72/4809/b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9\'\n'}

Stderr:

storeAIP.py: DEBUG     2018-05-17 14:54:51,468  urllib3.connectionpool:_new_conn:208:  Starting new HTTP connection (1): archivematica-storage-service
storeAIP.py: DEBUG     2018-05-17 14:54:51,522  urllib3.connectionpool:_make_request:396:  http://archivematica-storage-service:8000 "GET /api/v2/pipeline/ca8ffeea-c513-43fc-bcdb-b623876b7a5d/ HTTP/1.1" 200 None
storeAIP.py: DEBUG     2018-05-17 14:54:51,540  urllib3.connectionpool:_new_conn:208:  Starting new HTTP connection (1): archivematica-storage-service
storeAIP.py: DEBUG     2018-05-17 14:54:51,612  urllib3.connectionpool:_make_request:396:  http://archivematica-storage-service:8000 "GET /api/v2/location/?pipeline__uuid=ca8ffeea-c513-43fc-bcdb-b623876b7a5d&purpose=CP&offset=0 HTTP/1.1" 200 None
storeAIP.py: DEBUG     2018-05-17 14:54:51,735  urllib3.connectionpool:_new_conn:208:  Starting new HTTP connection (1): archivematica-storage-service
storeAIP.py: DEBUG     2018-05-17 14:54:51,778  urllib3.connectionpool:_make_request:396:  http://archivematica-storage-service:8000 "GET /api/v2/pipeline/ca8ffeea-c513-43fc-bcdb-b623876b7a5d/ HTTP/1.1" 200 None
storeAIP.py: INFO      2018-05-17 14:54:51,781  archivematica.common:create_file:305:  Creating file with {'current_path': 'uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9.7z', 'related_package_uuid': None, 'agents': [('agent', {'xsi:schema_location': 'info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd', 'version': '2.2'}, ('agent_identifier', ('agent_identifier_type', u'preservation system'), ('agent_identifier_value', u'Archivematica-1.7')), ('agent_name', u'Archivematica'), ('agent_type', u'software')), ('agent', {'xsi:schema_location': 'info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd', 'version': '2.2'}, ('agent_identifier', ('agent_identifier_type', u'repository code'), ('agent_identifier_value', u'test')), ('agent_name', u'test'), ('agent_type', u'organization')), ('agent', {'xsi:schema_location': 'info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd', 'version': '2.2'}, ('agent_identifier', ('agent_identifier_type', u'Archivematica user pk'), ('agent_identifier_value', u'1')), ('agent_name', u'username="test", first_name="", last_name=""'), ('agent_type', u'Archivematica user'))], 'current_location': '/api/v2/location/default/AS/', 'size': 20864, 'origin_path': u'currentlyProcessing/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9.7z', 'origin_pipeline': u'/api/v2/pipeline/ca8ffeea-c513-43fc-bcdb-b623876b7a5d/', 'uuid': '35f7048f-cfed-4dbc-a3be-9c724809b1d9', 'package_type': 'AIP', 'origin_location': u'/api/v2/location/8ecc7d58-036c-46cc-9a88-daaadfd6dab6/', 'aip_subtype': 'Archival Information Package', 'events': [('event', {'xsi:schema_location': 'info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd', 'version': '2.2'}, ('event_identifier', ('event_identifier_type', 'UUID'), ('event_identifier_value', u'62fa3519-fa5f-4c65-bc76-3cbaa1de5599')), ('event_type', u'compression'), ('event_date_time', 
'2018-05-17T14:31:06.276515+00:00'), ('event_detail', u'program=7z; algorithm=bzip2; version=p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,4 CPUs)\n'), ('event_outcome_information', ('event_outcome', u''), ('event_outcome_detail', ('event_outcome_detail_note', u'Standard Output=""; Standard Error=""'))), ('linking_agent_identifier', ('linking_agent_identifier_type', u'preservation system'), ('linking_agent_identifier_value', u'Archivematica-1.7')), ('linking_agent_identifier', ('linking_agent_identifier_type', u'repository code'), ('linking_agent_identifier_value', u'test')), ('linking_agent_identifier', ('linking_agent_identifier_type', u'Archivematica user pk'), ('linking_agent_identifier_value', u'1'))), ('event', {'xsi:schema_location': 'info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd', 'version': '2.2'}, ('event_identifier', ('event_identifier_type', 'UUID'), ('event_identifier_value', u'501c91ae-da42-43d6-abf3-0972abe4804e')), ('event_type', u'fixity check'), ('event_date_time', '2018-05-17T14:54:48.260486+00:00'), ('event_detail', u'program="python, bag"; module="hashlib.sha256()"'), ('event_outcome_information', ('event_outcome', u'Pass'), ('event_outcome_detail', ('event_outcome_detail_note', u'All 4 checksums generated at start of transfer match those generated by BagIt (bag).'))), ('linking_agent_identifier', ('linking_agent_identifier_type', u'preservation system'), ('linking_agent_identifier_value', u'Archivematica-1.7')), ('linking_agent_identifier', ('linking_agent_identifier_type', u'repository code'), ('linking_agent_identifier_value', u'test')), ('linking_agent_identifier', ('linking_agent_identifier_type', u'Archivematica user pk'), ('linking_agent_identifier_value', u'1')))]}
storeAIP.py: DEBUG     2018-05-17 14:54:51,794  urllib3.connectionpool:_new_conn:208:  Starting new HTTP connection (1): archivematica-storage-service
storeAIP.py: DEBUG     2018-05-17 14:54:55,041  urllib3.connectionpool:_make_request:396:  http://archivematica-storage-service:8000 "PUT /api/v2/file/35f7048f-cfed-4dbc-a3be-9c724809b1d9/ HTTP/1.1" 500 None
storeAIP.py: INFO      2018-05-17 14:54:55,044  archivematica.mcp.client.storeAIP:store_aip:178:  Storage service created AIP-REIN:
{u'error_message': u"[Errno 21] Is a directory: u'/var/archivematica/sharedDirectory/www/AIPsStore/35f7/048f/cfed/4dbc/a3be/9c72/4809/b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9'",
 u'traceback': u'Traceback (most recent call last):\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 220, in wrapper\n    response = callback(request, *args, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 460, in dispatch_detail\n    return self.dispatch(\'detail\', request, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 483, in dispatch\n    response = method(request, **kwargs)\n\n  File "/usr/local/lib/python2.7/site-packages/tastypie/resources.py", line 1472, in put_detail\n    updated_bundle = self.obj_update(bundle=bundle, **self.remove_api_resource_names(kwargs))\n\n  File "/src/storage_service/locations/api/resources.py", line 723, in obj_update\n    bundle = self.obj_update_hook(bundle, **kwargs)\n\n  File "/src/storage_service/locations/api/resources.py", line 761, in obj_update_hook\n    aip_subtype=aip_subtype)\n\n  File "/src/storage_service/locations/models/package.py", line 2112, in finish_reingest\n    premis_agents=premis_agents, aip_subtype=aip_subtype)\n\n  File "/src/storage_service/locations/models/package.py", line 805, in _create_pointer_file_write_to_disk\n    self.fetch_local_path(), checksum_algorithm).hexdigest()\n\n  File "/src/storage_service/common/utils.py", line 285, in generate_checksum\n    with open(file_path, \'rb\') as f:\n\nIOError: [Errno 21] Is a directory: u\'/var/archivematica/sharedDirectory/www/AIPsStore/35f7/048f/cfed/4dbc/a3be/9c72/4809/b1d9/uncomp_3-35f7048f-cfed-4dbc-a3be-9c724809b1d9\'\n'}
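
The traceback ends in `generate_checksum` calling `open(file_path, 'rb')` on a path that is a directory, which is what raises errno 21. The following is a minimal sketch of that failure mode on a POSIX system, not the Storage Service code: `generate_checksum` here is a hypothetical stand-in with an assumed signature, and `checksum_if_file` is an illustrative guard rather than a proposed fix.

```python
import hashlib
import os


def generate_checksum(file_path, algorithm="sha256"):
    """Hash one file in chunks (hypothetical stand-in, assumed signature).

    Like the call in the traceback, this opens `file_path` directly, so on
    POSIX it raises an OSError with errno 21 when given a directory.
    """
    digest = hashlib.new(algorithm)
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest


def checksum_if_file(path, algorithm="sha256"):
    """Return the hex digest for a regular file, or None for a directory.

    Illustrative guard only: it makes the uncompressed-AIP case visible
    instead of letting open() fail.
    """
    if os.path.isdir(path):
        return None
    return generate_checksum(path, algorithm).hexdigest()
```

Calling `generate_checksum` on the uncompressed AIP directory named in the error message would fail in the same way, which is consistent with the stored package path still pointing at the original directory rather than at the new .7z.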

@ross-spencer

Tested in 1.10.

  • Problem does not exist on RPM test server.
  • Problem still exists in Docker deployment.

Both on stable/1.10.x
