Merge pull request #310 from IATI/develop
Merge Develop into main
odscjames authored Jan 10, 2024
2 parents d4dd7be + d97a63f commit 4b63883
Showing 3 changed files with 20 additions and 6 deletions.
12 changes: 11 additions & 1 deletion README.md
@@ -1,6 +1,16 @@
 # IATI Refresher

-A Python application which has the responsibility of tracking IATI data from around the Web and refreshing the core IATI software's data stores.
+# Summary
+
+Product | IATI Refresher
+--- | ---
+Description | A Python application which has the responsibility of tracking IATI data from around the Web and refreshing the core IATI software's data stores.
+Website | None
+Related | [datastore-search](https://github.com/IATI/datastore-search), [validator-web](https://github.com/IATI/validator-web)
+Documentation | Rest of readme
+Technical Issues | See https://github.com/IATI/refresher/issues
+Support | https://iatistandard.org/en/guidance/get-support/
+

 Its responsibilities include:

10 changes: 7 additions & 3 deletions src/library/lakify.py
@@ -91,14 +91,18 @@ def process_hash_list(document_datasets):
 if identifiers:
     id_hash = utils.get_hash_for_identifier(
         clean_identifier(identifiers[0]))
+
+    # XML
     activity_xml = etree.tostring(activity, encoding='utf-8')
-    activity_json = recursive_json_nest(activity, {})
     act_blob_client = blob_service_client.get_blob_client(
-        container=config['ACTIVITIES_LAKE_CONTAINER_NAME'], blob='{}.xml'.format(id_hash))
+        container=config['ACTIVITIES_LAKE_CONTAINER_NAME'], blob='{}/{}.xml'.format(doc_id, id_hash))
     act_blob_client.upload_blob(activity_xml, overwrite=True)
     act_blob_client.set_blob_tags({"dataset_hash": file_hash})
+
+    # JSON
+    activity_json = recursive_json_nest(activity, {})
     act_blob_json_client = blob_service_client.get_blob_client(
-        container=config['ACTIVITIES_LAKE_CONTAINER_NAME'], blob='{}.json'.format(id_hash))
+        container=config['ACTIVITIES_LAKE_CONTAINER_NAME'], blob='{}/{}.json'.format(doc_id, id_hash))
     act_blob_json_client.upload_blob(
         json.dumps(activity_json, ensure_ascii=False).replace(
             '{http://www.w3.org/XML/1998/namespace}', 'xml:').encode('utf-8'),
4 changes: 2 additions & 2 deletions src/library/solrize.py
@@ -143,7 +143,7 @@ def process_hash_list(document_datasets):
 for fa in flattened_activities[0]:
     hashed_identifier = utils.get_hash_for_identifier(
         fa['iati_identifier'])
-    blob_name = '{}.xml'.format(hashed_identifier)
+    blob_name = '{}/{}.xml'.format(file_id, hashed_identifier)

     try:
         blob_client = blob_service_client.get_blob_client(
@@ -165,7 +165,7 @@ def process_hash_list(document_datasets):
 raise SolrizeSourceError('Could not identify charset for blob: ' + blob_name +
     ', file hash: ' + file_hash + ', iati-identifier: ' + fa['iati_identifier'])

-json_blob_name = '{}.json'.format(hashed_identifier)
+json_blob_name = '{}/{}.json'.format(file_id, hashed_identifier)

 try:
     json_blob_client = blob_service_client.get_blob_client(
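The naming change made in both files can be summarised with a minimal sketch. It is an illustration, not code from the repository: `activity_blob_names` and `legacy_activity_blob_names` are hypothetical helpers, the sample values are made up, and the sketch assumes that `doc_id` in lakify.py and `file_id` in solrize.py refer to the same document identifier, which the diff suggests but does not state.

```python
def legacy_activity_blob_names(id_hash):
    """Blob names as built before this commit (hypothetical helper)."""
    return {
        "xml": "{}.xml".format(id_hash),
        "json": "{}.json".format(id_hash),
    }


def activity_blob_names(doc_id, id_hash):
    """Blob names as built after this commit (hypothetical helper).

    doc_id is assumed to be the document identifier that lakify.py calls
    doc_id and solrize.py calls file_id.
    """
    return {
        "xml": "{}/{}.xml".format(doc_id, id_hash),
        "json": "{}/{}.json".format(doc_id, id_hash),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(legacy_activity_blob_names("abc123"))
    print(activity_blob_names("doc-1", "abc123"))
```

Because the writer (lakify.py) and the reader (solrize.py) build the name independently, both format strings have to change together, as they do in this commit. With the document identifier as a prefix, the same activity identifier appearing in two documents appears to map to two distinct blobs rather than one shared name.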
