Commit

fix wacky apostrophes
gremau committed Dec 17, 2024
1 parent 59c1bdc commit 42185ed
Showing 2 changed files with 6 additions and 6 deletions.
8 changes: 4 additions & 4 deletions guide-special-cases/code.qmd
@@ -88,7 +88,7 @@ Example 2: Minimum recommended codemeta.json example for unnamed projects.
}
```

- Example 3: sample otherEntity metadata for example 2’s codemeta.json.
+ Example 3: sample otherEntity metadata for example 2's codemeta.json.

```xml
<otherEntity>
@@ -170,11 +170,11 @@ Example 4: A more complete CodeMeta example for named projects. Example taken fr

When archiving software, we strongly recommend including a user guide with installation and usage instructions if such would not already be apparent to the typical user. Take into account that the user might not have access to certain inputs that the software/scripts require. Include when feasible at least some example data, and configure the script so that it is ready to run with the example data.

- Aside from the software/code itself and its dependencies, other pieces of information may be important should a user wish to reproduce results, such as the operating system and version and the system locale. Include this information in the data package’s methods/methodStep/description. For certain tools, there are ways to easily generate this information, e.g., a call to sessionInfo() in the R console. If the system outputs this information in a standardly formatted plain text file, that might be included as an otherEntity.
+ Aside from the software/code itself and its dependencies, other pieces of information may be important should a user wish to reproduce results, such as the operating system and version and the system locale. Include this information in the data package's methods/methodStep/description. For certain tools, there are ways to easily generate this information, e.g., a call to sessionInfo() in the R console. If the system outputs this information in a standardly formatted plain text file, that might be included as an otherEntity.
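
For tools without a built-in equivalent of sessionInfo(), the same details can be gathered with a short script. The following is a minimal sketch in Python and is not part of the guide itself: the function name `describe_environment` and the output file name `session_info.txt` are hypothetical, and the choice of fields is an assumption about what a user reproducing the results would need.

```python
import locale
import platform
import sys


def describe_environment() -> str:
    """Return the OS, Python version, and locale as plain text, one item per line."""
    lines = [
        f"Operating system: {platform.system()} {platform.release()}",
        f"Platform: {platform.platform()}",
        f"Python version: {sys.version.split()[0]}",
        f"Locale: {locale.getlocale()}",
        f"Preferred encoding: {locale.getpreferredencoding(False)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical output file name; the resulting plain text file could be
    # archived alongside the code, e.g., described as an otherEntity.
    with open("session_info.txt", "w", encoding="utf-8") as handle:
        handle.write(describe_environment())
```

The output is deliberately plain text so that it can either be pasted into the methods/methodStep/description text or archived as a separate file.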

### Linking code and data

- There are a few solutions for providing explicit machine-readable linkages between different entities/packages (the distinction between code/data doesn’t matter too much here). For most cases we recommend the simplest approach, which is to use the methods/methodStep/description element of EML. More advanced users may wish to utilize the other solutions described herein.
+ There are a few solutions for providing explicit machine-readable linkages between different entities/packages (the distinction between code/data doesn't matter too much here). For most cases we recommend the simplest approach, which is to use the methods/methodStep/description element of EML. More advanced users may wish to utilize the other solutions described herein.

#### Descriptive approach

@@ -192,7 +192,7 @@ Nested under methods/methodStep, dataSource elements describe other data package

Large community-backed tools or proprietary software such as ArcGIS Pro or Microsoft Excel do not need to be archived. However, if they have had any impact on the final data (e.g., ArcGIS Pro was used to modify spatial rasters), the EML methods section should describe the routines performed. Within the data package, indicate linkage to external software as follows.

- * Briefly describe the software/code and its relationship to the data in EML’s methods/methodStep/description element.
+ * Briefly describe the software/code and its relationship to the data in EML's methods/methodStep/description element.
* Names of all software used. Include both the common acronym and the full spelling.
* The URL(s) to all models/software used. Stable, persistent URLs pointing to exact version(s) are preferable, rather than generic links such as a project homepage. If the archived model has a DOI, then include a full citation to the model in the methods/methodStep/description text. The exception to this is when referencing tools such as Excel that have achieved global household name status.
* Broadly, the system setup used, if relevant.
4 changes: 2 additions & 2 deletions guide-special-cases/images-and-docs.qmd
@@ -177,12 +177,12 @@ Table 2: Data packages in EDI providing examples of best practices from this doc

### Considerations for digitizing documents

- Following are some general considerations and recommendations for digitizing paper or other 'hard-copy' documents for archival. This is not meant to be an exhaustive list. For further and more detailed information, please refer to the U.S. National Archives and Records Administration (NARA)’s _[Technical Guidelines for Digitizing Archival Materials for Electronic Access](https://www.archives.gov/files/preservation/technical/guidelines.pdf)_.
+ Following are some general considerations and recommendations for digitizing paper or other 'hard-copy' documents for archival. This is not meant to be an exhaustive list. For further and more detailed information, please refer to the U.S. National Archives and Records Administration (NARA) _[Technical Guidelines for Digitizing Archival Materials for Electronic Access](https://www.archives.gov/files/preservation/technical/guidelines.pdf)_.

* **Effort.** The decision to digitize documents, as well as the digitization method, involves trade-offs in the accessibility and ease of using particular hardware and/or software technologies, the quality of the digitization, and the overall effort spent. Digitization efforts may be significant, for example, when dealing with a large number of documents requiring meaningful file names, text recognition, and/or high resolution for improved accessibility.
* **Equipment.** Instruments for digitizing hard-copy documents range from high resolution scanners (less accessible, less user-friendly, more expensive, better quality) to smartphone cameras (ubiquitous, easy-to-use, lower quality). For example, taking a smartphone image in the field may be utilized for quick and easy digitization of field notes.
* **Document resolution and file size.** This is an important consideration that should be guided by the content and purpose of the document. Detailed paper maps should probably be scanned at high resolution and large file size, while field sheets may not need as much detail.
* **Optical Character Recognition (OCR):** When digitizing documents that include text, we recommend using scanning or other software with OCR capabilities (e.g., Adobe, ABBYY, Tesseract) to convert the text into machine readable characters so that the documents are searchable and thus, more usable. OCR does not work well for handwritten text, older fonts, or documents with busy backgrounds (speckled, dirty, faded, etc.).
- * **Sensitive Information and Human Subjects:** Regardless of the digitization method, one should be mindful of sensitive information that shouldn’t be archived or otherwise redacted (e.g., photographs of human subjects, field notebooks containing personal messages, gate combinations, and/or telephone numbers). In all cases in which human subjects are involved, Institutional Review Board (IRB) restrictions must be heeded. A signed IRB consent form for the associated research project represents a contract between researcher and human subject. It is important to note that IRB restrictions can differ among research studies within the same project. For further information, see the [EDI Data Initiative Data Policy](https://edirepository.org/about/edi-policy#sensitive-data).
+ * **Sensitive Information and Human Subjects:** Regardless of the digitization method, one should be mindful of sensitive information that shouldn't be archived or otherwise redacted (e.g., photographs of human subjects, field notebooks containing personal messages, gate combinations, and/or telephone numbers). In all cases in which human subjects are involved, Institutional Review Board (IRB) restrictions must be heeded. A signed IRB consent form for the associated research project represents a contract between researcher and human subject. It is important to note that IRB restrictions can differ among research studies within the same project. For further information, see the [EDI Data Initiative Data Policy](https://edirepository.org/about/edi-policy#sensitive-data).

While transcription is a digitization method that can be performed on certain types of documents (e.g., audio/video recordings, field notebooks) and can enhance search capabilities, transcript generation requires substantially more effort than other digitization methods, and is prone to error. Moreover, in the case where the original documents contain drawings, transcripts may be incomplete or otherwise inaccurate. _Thus, we recommend digitizing documents by other means, using the considerations described above._
