Data Model

Martin Wahnschaffe edited this page Oct 25, 2022 · 4 revisions

General

SORMAS is based on the IDSR (Integrated Disease Surveillance and Response) framework and thus comes with quite specific processes for case surveillance, contact tracing and similar tasks.

Because of this, an explicit data model is needed that defines the core entities and their relations, as well as essential fields such as disease, report date, responsible region and classification.

UUID

Universally Unique Identifiers (UUIDs) are used to uniquely identify all entities. The 128-bit number is generated with the corresponding Java UUID implementation. Generation is system-independent, so no central allocation system is required. The UUIDv4 variant uses random numbers as its source. The Java implementation draws them from SecureRandom, which is seeded with an unpredictable value, so collisions are extremely unlikely. In SORMAS, this number is represented as a Base32-encoded string (e.g. TFRGBU-UVB25S-VSUAFH-QLYFKKOA). The first 6 characters are used as a short ID in tables and other overviews.
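The encoding described above can be sketched as follows. This is a minimal illustration, not the SORMAS implementation: the class and method names are hypothetical, and the RFC 4648 Base32 alphabet and the 6-6-6-8 grouping are assumptions inferred from the example string.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Illustrative sketch only; names are hypothetical and not part of the SORMAS API.
final class Base32UuidSketch {

    // RFC 4648 Base32 alphabet (A-Z, 2-7); an assumption consistent with the example string.
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Encodes the 128 bits of a UUID as 26 Base32 characters, grouped 6-6-6-8.
    static String encode(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();

        StringBuilder sb = new StringBuilder(29);
        int buffer = 0;        // bits read but not yet emitted
        int bitsInBuffer = 0;
        for (byte b : bytes) {
            buffer = (buffer << 8) | (b & 0xFF);
            bitsInBuffer += 8;
            while (bitsInBuffer >= 5) {
                bitsInBuffer -= 5;
                sb.append(ALPHABET.charAt((buffer >> bitsInBuffer) & 0x1F));
            }
            buffer &= (1 << bitsInBuffer) - 1; // keep only the leftover bits
        }
        // 128 bits are 25 full 5-bit groups plus 3 leftover bits; pad the last group with zeros.
        if (bitsInBuffer > 0) {
            sb.append(ALPHABET.charAt((buffer << (5 - bitsInBuffer)) & 0x1F));
        }
        // Insert separators from right to left so earlier indexes stay valid.
        sb.insert(18, '-').insert(12, '-').insert(6, '-');
        return sb.toString();
    }

    // The short ID is the first 6 characters of the encoded string.
    static String shortId(String encoded) {
        return encoded.substring(0, 6);
    }

    public static void main(String[] args) {
        String id = encode(UUID.randomUUID()); // UUIDv4, backed by SecureRandom
        System.out.println(id + " (short ID: " + shortId(id) + ")");
    }
}
```

Because 128 is not a multiple of 5, the last character carries only 3 bits of information, which is why the encoded string is 26 characters long rather than 25.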

To be replaced with a standard UUID: https://github.com/hzi-braunschweig/SORMAS-Project/issues/7086

Report & creation date

An important distinction in the system is between the report date and the creation date. The three core entities Case, Contact and Event each have a report date (reportDate), which represents when the information became known to those responsible. Independently of this, every data object in the system has a creation date and a change date and, as a rule, an assigned creating user. These system-maintained values cannot be edited by the user.
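A minimal sketch of this distinction, assuming a simplified entity: only reportDate is named in this page, all other class, field and method names are hypothetical, and the dates are made up for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;

// Hypothetical, simplified entity; apart from reportDate, the names are
// assumptions and are not taken from the SORMAS code base.
final class ReportedEntitySketch {

    private final Instant creationDate; // set once by the system, never edited
    private final String creatingUser;  // user who created the data object
    private Instant changeDate;         // maintained by the system on every change
    private LocalDate reportDate;       // editable: when the information became known

    ReportedEntitySketch(String creatingUser, LocalDate reportDate, Instant now) {
        this.creatingUser = creatingUser;
        this.reportDate = reportDate;
        this.creationDate = now;
        this.changeDate = now;
    }

    // Only the report date can be changed by a user; the change date follows automatically.
    void setReportDate(LocalDate reportDate, Instant now) {
        this.reportDate = reportDate;
        this.changeDate = now;
    }

    LocalDate getReportDate() { return reportDate; }
    Instant getCreationDate() { return creationDate; }
    Instant getChangeDate() { return changeDate; }
    String getCreatingUser() { return creatingUser; }

    public static void main(String[] args) {
        // The case became known on 20 October but was entered five days later.
        Instant entered = Instant.parse("2022-10-25T09:00:00Z");
        ReportedEntitySketch entity =
            new ReportedEntitySketch("surveillance.officer", LocalDate.of(2022, 10, 20), entered);
        System.out.println("reported: " + entity.getReportDate()
            + ", created: " + entity.getCreationDate());
    }
}
```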

Disease & country specific data

Fields that are only relevant for a subset of the available diseases use the @Diseases annotation to define those diseases. Fields that are only relevant for a subset of countries use the @HideForCountries or @HideForCountriesExcept annotation.
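The annotation-based approach can be sketched as follows. The annotation names match the ones mentioned above, but everything here is a local stand-in defined for this example: the annotation attributes, the Disease values, the field names and the isVisible helper are assumptions and may differ from the actual SORMAS code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;

// Local stand-ins for this sketch only; the real SORMAS types may look different.
enum Disease { CORONAVIRUS, EVD, MEASLES }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface Diseases { Disease[] value(); }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface HideForCountries { String[] countries(); }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface HideForCountriesExcept { String[] countries(); }

// Hypothetical DTO with illustrative field names.
class SymptomsSketch {
    String onsetDate;                               // no annotation: always relevant

    @Diseases({ Disease.EVD })
    String bleeding;                                // only relevant for EVD

    @HideForCountries(countries = { "de" })
    String hiddenInGermany;                         // hidden for the listed countries

    @HideForCountriesExcept(countries = { "de" })
    String germanyOnly;                             // hidden everywhere except the listed countries
}

final class FieldVisibilitySketch {

    // Evaluates the annotations of one field for a given disease and country code.
    static boolean isVisible(Class<?> type, String fieldName, Disease disease, String country) {
        Field field;
        try {
            field = type.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown field: " + fieldName, e);
        }
        Diseases diseases = field.getAnnotation(Diseases.class);
        if (diseases != null && !Arrays.asList(diseases.value()).contains(disease)) {
            return false;
        }
        HideForCountries hide = field.getAnnotation(HideForCountries.class);
        if (hide != null && Arrays.asList(hide.countries()).contains(country)) {
            return false;
        }
        HideForCountriesExcept except = field.getAnnotation(HideForCountriesExcept.class);
        if (except != null && !Arrays.asList(except.countries()).contains(country)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isVisible(SymptomsSketch.class, "bleeding", Disease.MEASLES, "de"));
    }
}
```

Keeping the rules as metadata on the field means a generic component can decide visibility without disease-specific or country-specific branches in the UI code.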

For enums there are currently two approaches:

  • Deprecated: Provide a getValues method that returns the values applicable for a disease and country.
    Example: ContactProximity
  • Use the CustomizableEnum class to define instance-specific (e.g. country-specific) values that can also be disease-specific.
    Example: DiseaseVariant

We are looking for ways to make it easier to individualize the data model for countries. One option may be to use openEHR to define and store the part of the data model that is not essential for the SORMAS processes.

Core Data

An important basic principle is that apart from infrastructure data and personal data, all essential data in the system is event-based. A case, for example, does not represent a person, but the fact that a person has contracted an infection at a certain point in time.

Those events are the core parts of the data model. The relevant data entities are displayed in dark blue in the following diagram:

[Figure: core data model diagram] Source

Person

The person represents a human being in the system. This includes basic personal data, such as name, date of birth and gender, as well as contact details, occupation information and more. A person's data can be recorded, for example, in the context of contact with a sick person, and can subsequently be used if the person becomes sick themselves.

Case (core)

A case represents that a person is or could be ill with an infectious disease at a certain point in time. In addition to the classification of the case, symptoms, epidemiological data, hospitalisations and more are comprehensively documented.

Sample & Test

A sample represents a laboratory specimen, such as a throat swab. One or more tests with results can be stored for a laboratory sample. A sample and its tests can also be created from a LabMessage. LabMessages are sent by or pulled from an external system (e.g. DEMIS) and semi-automatically assigned to a case, contact or event participant via an inbox view.

Contact (core) & Visit

A contact represents that a person had contact with a case at a certain time. Different types of contact are distinguished (e.g. 15-minute facial contact) and divided into categories. For contacts, follow-up is carried out in the form of daily calls for the duration of the incubation period. During such a call or visit, the symptoms of the person are recorded. If necessary, a case can be created from a contact - both then refer to the same person.
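The relation between contact, case and person can be sketched as follows. All types, fields and the fromContact method are hypothetical simplifications for illustration and not the SORMAS entity classes.

```java
import java.time.LocalDate;

// Hypothetical, heavily simplified types; not the SORMAS entities.
final class PersonSketch {
    final String uuid;
    final String name;

    PersonSketch(String uuid, String name) {
        this.uuid = uuid;
        this.name = name;
    }
}

final class ContactSketch {
    final PersonSketch person;       // the person who had the contact
    final String sourceCaseUuid;     // the case this person had contact with
    final String disease;
    final LocalDate lastContactDate;

    ContactSketch(PersonSketch person, String sourceCaseUuid, String disease, LocalDate lastContactDate) {
        this.person = person;
        this.sourceCaseUuid = sourceCaseUuid;
        this.disease = disease;
        this.lastContactDate = lastContactDate;
    }
}

final class CaseSketch {
    final PersonSketch person;
    final String disease;
    final LocalDate reportDate;

    CaseSketch(PersonSketch person, String disease, LocalDate reportDate) {
        this.person = person;
        this.disease = disease;
        this.reportDate = reportDate;
    }

    // The resulting case reuses the contact's person instead of creating a new one.
    static CaseSketch fromContact(ContactSketch contact, LocalDate reportDate) {
        return new CaseSketch(contact.person, contact.disease, reportDate);
    }

    public static void main(String[] args) {
        PersonSketch person = new PersonSketch("person-1", "Jane Doe");
        ContactSketch contact =
            new ContactSketch(person, "source-case-1", "CORONAVIRUS", LocalDate.of(2022, 10, 20));
        CaseSketch resultingCase = fromContact(contact, LocalDate.of(2022, 10, 25));
        System.out.println(resultingCase.person == contact.person); // both refer to the same person
    }
}
```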

Event (core), Event Participant (core) & Action

An event represents an occurrence in the real world (e.g. a concert or a flight) that is related to the spread of an infectious disease. An event can also represent a local outbreak. All persons related to the event are documented as event participants in SORMAS. An event participant can in turn result in a case, which then refers to the same person. Actions document measures that were taken in the context of the event, for example the closure of a facility or informing the population about a local outbreak.

Immunization (core)

Documents whether a person has an immunization for a disease - either based on one or multiple vaccinations or on recovering from the disease (related case).

Travel entry (core)

Represents a trip a person has made that is relevant with respect to a certain disease (e.g. travel to a COVID risk region).

Task

Tasks are used to manage and document work to be completed by SORMAS users. This can be, for example, the identification of contacts of a case. Tasks have a due date and a responsible user. They can be created with reference to a case, contact or event, or as a general task.

Infrastructure Data

[Figure: infrastructure data diagram] Source

Area, Region, District, Community

These four data types represent the regional division of a country. Typically, the allocation is based on areas of responsibility within the health system. For example, in Germany the region corresponds to a federal state, the district to a county and the community to a municipality. The area is an optional superordinate group (e.g. North).
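The hierarchy can be sketched as a chain of parent references with an optional area on the region. The class names, fields and the path helper are hypothetical, and the place names are placeholders.

```java
import java.util.Optional;

// Hypothetical, simplified infrastructure types for illustration only.
final class AreaSketch {
    final String name;
    AreaSketch(String name) { this.name = name; }
}

final class RegionSketch {
    final String name;
    private final AreaSketch area; // optional superordinate group, may be null

    RegionSketch(String name, AreaSketch area) {
        this.name = name;
        this.area = area;
    }

    Optional<AreaSketch> getArea() { return Optional.ofNullable(area); }
}

final class DistrictSketch {
    final String name;
    final RegionSketch region; // every district belongs to exactly one region

    DistrictSketch(String name, RegionSketch region) {
        this.name = name;
        this.region = region;
    }
}

final class CommunitySketch {
    final String name;
    final DistrictSketch district; // every community belongs to exactly one district

    CommunitySketch(String name, DistrictSketch district) {
        this.name = name;
        this.district = district;
    }

    // Walks up the hierarchy and renders it from the top level down.
    String path() {
        RegionSketch region = district.region;
        String prefix = region.getArea().map(a -> a.name + " / ").orElse("");
        return prefix + region.name + " / " + district.name + " / " + name;
    }

    public static void main(String[] args) {
        RegionSketch region = new RegionSketch("Federal state X", new AreaSketch("North"));
        DistrictSketch district = new DistrictSketch("County Y", region);
        System.out.println(new CommunitySketch("Municipality Z", district).path());
    }
}
```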

Continent, Subcontinent, Country

Cross-country representation of territories.

Facility

A facility represents an institution. This can be a medical facility, such as a hospital or a laboratory, or any other type of facility, such as a nursing home.

Campaign Data

The SORMAS "Campaign" feature is a new feature that was implemented with the US CDC as project partner. The prototype is used in Afghanistan to conduct polio vaccination campaigns.

A campaign represents, for example, a vaccination campaign lasting several days, in which teams go from house to house and carry out vaccinations. Data is collected on the vaccinations carried out, but this can vary greatly from campaign to campaign.

SORMAS allows input forms to be defined using JSON so that the desired data can be collected. The collected data is also stored in a JSON-type column.

To evaluate the data, a dashboard with diagrams can also be freely defined using JSON.