Remove references to FITS #476

Merged
merged 2 commits into from
Sep 12, 2024
@@ -393,13 +393,6 @@ Some of the default preservation action rules can take considerable processing
time and resources. We have found the following rules useful to change in some
cases.

-**Turn off default characterization rule:** `FITS`_ is used to characterise files
-that don't have a recognised file format. Executing this rule takes processing
-time and adds raw output to the METS file that can be low value for some
-formats. For example, in scientific datasets with large numbers of generic text
-files, or binary files created by instruments in scientific experiments, the
-output can be verbose without being useful.
-
**Reduce number of image characterization rules:** Archivematica has rules
defined for all image and audio-visual formats to use ExifTool, Mediainfo and
ffprobe for characterisation. Using multiple tools ensures as much
@@ -432,5 +425,4 @@ to create (and check) than the alternatives (e.g. SHA-256).
.. _Archivematica user forum: https://groups.google.com/forum/#!forum/archivematica
.. _Dashboard: https://github.com/artefactual/archivematica/tree/6ead2083f7bdd8b10ca76d41a7bff9c5aee23eb3/src/dashboard/install
.. _benchmarking: https://www.itforarchivists.com/siegfried/benchmarks
-.. _FITS: https://projects.iq.harvard.edu/fits/home
.. _Database is locked: https://docs.djangoproject.com/en/1.8/ref/databases/#database-is-locked-errors
@@ -85,7 +85,7 @@ information, see :ref:`Advanced <advanced>`.
the Percona and MariaDB alternatives.

Some of the tools run by Archivematica require Java to be
-installed (primarily Elasticsearch and fits). On Ubuntu 22.04, Open JDK 8
+installed (primarily Elasticsearch). On Ubuntu 22.04, Open JDK 8
is used, but Open JDK 11 is the default. It is possible to use Oracle Java 8
instead.

@@ -50,8 +50,6 @@ sudo service archivematica-mcp-client restart
sudo service archivematica-storage-service start
sudo service archivematica-dashboard restart
sudo service nginx restart
-sudo systemctl enable fits-nailgun
-sudo service fits-nailgun start

sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
@@ -100,8 +100,6 @@ sudo -u root sed -i 's/^Example//g' /etc/clamd.d/scan.conf

sudo -u root systemctl enable archivematica-mcp-client
sudo -u root systemctl start archivematica-mcp-client
-sudo -u root systemctl enable fits-nailgun
-sudo -u root systemctl start fits-nailgun
sudo -u root systemctl enable clamd@scan
sudo -u root systemctl start clamd@scan
sudo -u root systemctl restart archivematica-dashboard
1 change: 0 additions & 1 deletion admin-manual/maintenance/maintenance.rst
@@ -502,7 +502,6 @@ Other services that Archivematica depends on are:
* ElasticSearch
* Gearman
* MySQL (Ubuntu) or MariaDB (Rocky Linux)
-* Nailgun
* Nginx

Each service can be started/stopped/restarted with:
14 changes: 0 additions & 14 deletions getting-started/overview/external-tools.rst
@@ -66,12 +66,6 @@ to identify the file formats of digital objects.

**License**: `Apache License Version 2.0`_

-`File Information Tool Set (FITS)`_
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-File format identification and validation software integration tool.
-
-**License**: `GNU Lesser General Public License`_
-
`Gearman`_
^^^^^^^^^^
Gearman provides a generic application framework to farm out work to other
@@ -120,12 +114,6 @@ data for video and audio files.

**License**: `BSD (2-clause)`_

-`Nailgun`_
-^^^^^^^^^^
-A client, protocol, and server for running Java programs from the command line.
-
-**License**: `Apache License Version 2.0`_
-
`NFS-common`_
^^^^^^^^^^^^^
Network File System Access - allows access to files on network storage devices.
@@ -203,7 +191,6 @@ The Unarchiver is an archive unpacker program.
.. _ExifTool: https://exiftool.org/index.html
.. _FFmpeg: http://ffmpeg.org/
.. _fido: https://github.com/openpreserve/fido/
-.. _File Information Tool Set (FITS): https://projects.iq.harvard.edu/fits
.. _Gearman: http://gearman.org/
.. _Ghostscript: https://www.ghostscript.com/
.. _GNU Affero General Public License (AGPL): https://www.gnu.org/licenses/agpl-3.0.en.html
@@ -219,7 +206,6 @@ The Unarchiver is an archive unpacker program.
.. _JHOVE: https://github.com/openpreserve/jhove
.. _MediaConch: https://mediaarea.net/MediaConch
.. _MediaInfo: https://mediaarea.net/en/MediaInfo
-.. _Nailgun: https://github.com/facebook/nailgun
.. _NFS-common: https://linux-nfs.org
.. _p7zip: http://p7zip.sourceforge.net/
.. _Python-lxml: https://lxml.de/
9 changes: 4 additions & 5 deletions getting-started/troubleshooting/error-handling.rst
@@ -178,9 +178,9 @@ The user may choose to continue processing the SIP despite any normalization
errors.

The user may choose to redo normalization, as well. For instance, if
-the user chose to normalize based on FITS-JHOVE results and experienced
+the user chose to normalize based on one tool's results and experienced
failures, the user may wish to redo normalization and choose to normalize
-based on FITS-DROID results instead.
+based on another tool's results instead.

.. figure:: images/Normdropdown-10.*
:align: center
@@ -269,10 +269,9 @@ Other common error behaviours
Below is a list of common errors that, like normalization, will produce an
error report but will not fail the transfer.

-#. Characterize and extract metadata: if FITS processing fails, the
+#. Characterize and extract metadata: if the characterization tool fails, the
micro-service will fail and the transfer will continue processing.
-Similarly, if a tool within FITS fails, like JHOVE, you will see the pink
-error bar but be able to continue processing.
+You will see the pink error bar but be able to continue processing.

#. Remove thumbs.db file: if Archivematica is unable to remove a thumbs.db
file, the microservice will fail and the SIP will continue processing.
21 changes: 8 additions & 13 deletions user-manual/preservation/preservation-planning.rst
@@ -178,10 +178,9 @@ Tools
^^^^^

Archivematica acts as a wrapper for many open source tools used to carry out
-preservation actions. These include digital preservation specific tools like
-`FITS`_, used for extracting technical metadata from files, as well as tools for
-handling different file formats like `Inkscape`_, which is a design program used
-to handle vector images.
+preservation actions. These include digital preservation specific tools (such as
+those to carry out file format identification and characterization) as well as
+generalist tools used for actions such as normalization.

The full list of tools can be accessed in the left-hand sidebar by selecting
**Tools** under the *Format policy registry* heading.
@@ -503,13 +502,11 @@ Archivematica has four characterization tools available upon installation. Which
tool will run on a given file depends on the type of file, as determined by
the identification tool.

-The default characterization tool is FITS; it will be used if no specific
-characterization rule exists for the file being scanned. It is possible to
-create new default characterization commands, which can either replace FITS or
-run alongside it on every file.
+It is possible to create new default characterization commands, which can run
+on every file.

Depending on the type of the file being scanned, one or more of these tools may
-be called instead of FITS.
+be called.

* `FFprobe <FFprobe_>`_, a characterization tool built on top of the same core as
FFmpeg, the normalization software used by Archivematica.
@@ -539,9 +536,8 @@ For more information about writing a command, see :ref:`Writing commands
Characterization rules
^^^^^^^^^^^^^^^^^^^^^^

-A characterization rule must be created to connect a characterizatio command to
-a particular format. Note that formats that do not have a rule will be
-characterized by FITS by default.
+A characterization rule must be created to connect a characterization command to
+a particular format.

For more information about creating a rule, see :ref:`Changing rules
<changing-rules>` above.
@@ -1087,7 +1083,6 @@ You do not need to create rules for verification.
.. _Ghostscript: https://www.ghostscript.com/
.. _ps2pdf: https://www.ps2pdf.com/
.. _Archivematica issues repo: https://github.com/archivematica/Issues
-.. _FITS: https://projects.iq.harvard.edu/fits/home
.. _JHOVE: https://openpreservation.org/products/jhove/
.. _MediaConch documentation: https://mediaarea.net/MediaConch/Documentation/HowToUse
.. _MediaConchOnline: https://mediaarea.net/MediaConchOnline/
8 changes: 2 additions & 6 deletions user-manual/transfer/forensic.rst
@@ -45,11 +45,8 @@ in the processing configuration by setting the *Extract packages* job to "No".

If you would like to extract forensic image formats anyway, you can set the
*Extract packages* job to "Yes". We recommend testing this at scale to ensure
-that it is a viable workflow for your deployment. One scalability option that
-can help to mitigate the processing load caused by extraction is turning off
-`FITS`_, which is the default characterization tool that will run on every
-extracted file. For more information, see :ref:`Preservation Action Rules
-<disable-fpr-rules>` on the Scalability page.
+that it is a viable workflow for your deployment. For more information, see
+:ref:`Preservation Action Rules <disable-fpr-rules>` on the Scalability page.

Delete packages after extraction
++++++++++++++++++++++++++++++++
@@ -160,5 +157,4 @@ desired, by using the backlog arrangement functionality in Archivematica.


.. _Bulk Extractor: https://github.com/simsong/bulk_extractor/wiki
-.. _FITS: https://projects.iq.harvard.edu/fits/home
.. _fiwalk: https://forensicswiki.xyz/wiki/index.php?title=Fiwalk
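
Since this change removes FITS and Nailgun mentions from many files, a small scan can help confirm nothing was missed. The sketch below is not part of the pull request. It assumes the documentation is a local tree of ``.rst`` files, and the names ``LEFTOVER``, ``find_leftovers`` and ``scan_tree`` are hypothetical. The pattern also matches the ordinary English word "fits", so any hits need a manual look.

```python
import re
from pathlib import Path

# Assumed pattern: "fits" or "nailgun" as a standalone word, in any case.
# Lookarounds are used instead of \b so that link targets such as
# ".. _FITS:" (where "_" counts as a word character) still match.
LEFTOVER = re.compile(r"(?<![A-Za-z])(fits|nailgun)(?![A-Za-z])", re.IGNORECASE)


def find_leftovers(text):
    """Return (line_number, stripped_line) pairs that mention FITS or Nailgun."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if LEFTOVER.search(line)
    ]


def scan_tree(root):
    """Scan every .rst file under root and collect (path, line_number, line) hits."""
    hits = []
    for path in sorted(Path(root).rglob("*.rst")):
        for number, line in find_leftovers(path.read_text(encoding="utf-8")):
            hits.append((str(path), number, line))
    return hits
```

Calling ``scan_tree(".")`` from the root of a checkout would list any remaining candidate lines; an empty list suggests the removal is complete for the patterns assumed here.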