resource identifiers

Peter Van den Bosch edited this page Sep 7, 2021 · 7 revisions

Scope

Establish guidelines on how to design new resource identifiers (string, integer, URI, UUID, ...).

This includes codes, which can be seen as a special type of identifier:

  • they have an exhaustive list of possible values that doesn't change frequently over time
  • each code identifies a concept (example: a country, a gender, ...).

Related issue: https://github.com/belgif/rest-guide/issues/81

Requirements

A resource identifier should be stable over time, i.e. should not change when properties of the resource change, nor should it point to different resources over time.

Furthermore, one or more of the following requirements may apply when designing resource identifiers, depending on the use case:

  • easy to memorize (e.g. textual identifier like problem types)
  • input by user (e.g. web form, over phone/mail)
    • easy to type (ignore special separator chars, difference between lower/capital case), limited length
    • validation of typing errors, e.g. by checksum, fixed length, ...
    • hint on format to recognize purpose of identifier based on its value
  • printable (length restricted)
  • evolvable structure
  • generate identifiers at multiple independent sources (e.g. prefix by source, uuids, ...)
  • stable across different deployment environments (e.g. problem type codes)
  • hide any business information (e.g. no sequential number that indicates number of resources created)
  • easy to represent in URL parameter
  • sortable (for technical reasons e.g. pagination)
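
The "input by user" requirements above (ignoring separators and case, detecting typing errors with a checksum) can be sketched as follows. This is only an illustration: the 10-digit layout with two trailing mod-97 check digits and the function names normalize and has_valid_check_digits are assumptions made for this example, not something prescribed by this page.

```python
import re


def normalize(raw: str) -> str:
    """Drop separators (whitespace, dots, dashes, slashes) and upper-case letters,
    so that differently typed inputs map to one canonical form."""
    return re.sub(r"[\s.\-/]", "", raw).upper()


def has_valid_check_digits(identifier: str) -> bool:
    """Hypothetical scheme: 8 base digits followed by 2 check digits,
    where check = 97 - (base mod 97)."""
    if not re.fullmatch(r"\d{10}", identifier):
        return False  # the fixed length already catches many typing errors
    base, check = int(identifier[:8]), int(identifier[8:])
    return check == 97 - base % 97


typed = "0123.456.749"          # as a user might enter it in a web form
canonical = normalize(typed)    # "0123456749"
print(canonical, has_valid_check_digits(canonical))
```

With this scheme, swapping the last two digits ("0123456794") is rejected, while the separators a user types are simply ignored.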

Current Guidelines

The identity key is preferably a natural business identifier, uniquely identifying the business resource. If such key does not exist, a surrogate or technical key (like UUID) can be used.

[...]

on enum values

A fixed list of possible values of a property can be specified using enum. However, this may make it harder to change the list of possible values, as client applications will often depend on the specified list e.g. by using code generation.

enum SHOULD only be used when the list of values is unlikely to change or when changing it has a big impact on clients of the API.

Enumerated string values SHOULD be declared in lowerCamelCase, just as property names.

[...]

When defining the type for a property representing a numerical code or identifier:

  • if the values constitute a list of sequentially generated codes (e.g. gender ISO code), type: integer SHOULD be used. It is RECOMMENDED to further restrict the format of the type (e.g. format: int32).
  • if the values are of fixed length or not sequentially generated, type: string SHOULD be used (e.g. Ssin, EnterpriseNumber). This prevents leading zeros from being dropped.

When using a string data type, each code SHOULD have a unique representation, e.g. don’t allow representations both with and without leading zeros or spaces for a single code. If possible, specify a pattern with a regular expression restricting the allowed representations.
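
A minimal sketch of why fixed-length codes are better kept as strings, and of how a pattern can enforce a single representation. The 10-digit pattern and the name is_canonical are assumptions chosen for illustration only.

```python
import re

# Assumed pattern: exactly 10 digits, leading zeros allowed (illustrative only).
CODE_PATTERN = re.compile(r"\d{10}")


def is_canonical(value: str) -> bool:
    """Accept only the single allowed representation: no spaces, no missing zeros."""
    return CODE_PATTERN.fullmatch(value) is not None


# Treating the code as an integer silently drops the leading zero.
print(int("0712345678"))             # 712345678

# Kept as a string, only one representation passes validation.
print(is_canonical("0712345678"))    # True
print(is_canonical("712345678"))     # False
print(is_canonical("0712 345 678"))  # False
```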

Prior art

Zalando

https://opensource.zalando.com/restful-api-guidelines/#228 MUST use URL-friendly resource identifiers: [a-zA-Z0-9:._\-/]* [228]

https://opensource.zalando.com/restful-api-guidelines/#144 SHOULD only use UUIDs if necessary [144]

  • lists usability disadvantages of UUIDs
  • UUIDs should be avoided when not needed for large scale id generation. Instead, for instance, server side support with id generation can be preferred
  • Please be aware that sequential, strictly monotonically increasing numeric identifiers may reveal critical, confidential business information, like order volume, to non-privileged clients.
  • In any case, we should always use string rather than number type for identifiers. This gives us more flexibility to evolve the identifier naming scheme. Accordingly, if used as identifiers, UUIDs should not be qualified using a format property.

PayPal

https://github.com/paypal/api-standards/blob/master/api-style-guide.md#resource-identifiers

APIs MUST NOT use the database sequence number as the resource identifier. A UUID, Hashed Id (HMAC based) is preferred as a resource identifier. Resource IDs SHOULD try to use either Resource Identifier Characters or ASCII characters. There SHOULD NOT be any ID using UTF-8 characters. Enumeration values can be used as sub-resource IDs. String representation of the enumeration value SHOULD be used.
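
One possible reading of "Hashed Id (HMAC based)" is sketched below: the public identifier is derived from the internal sequence number with a keyed hash, so clients cannot infer how many resources exist. This is not PayPal's actual scheme; the key, the truncation length and the name public_id are assumptions for this example.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real service would load it from a secret store.
SECRET_KEY = b"replace-with-a-real-secret"


def public_id(sequence_number: int, length: int = 16) -> str:
    """Derive an opaque, URL-friendly identifier from an internal sequence number."""
    message = str(sequence_number).encode("ascii")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Truncating keeps the identifier short but raises the collision risk.
    return digest[:length]


print(public_id(1))
print(public_id(2))
```

Because an HMAC cannot be reversed, the service would have to store the derived value next to the internal key to look resources up again.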

Australian Government

https://api.gov.au/standards/national_api_standards/definitions.html#resource-identifier

As long as the identifier is unique within your application it can be any string of characters or numbers.

Google

https://cloud.google.com/apis/design/resource_names#resource_id

API services should use URL-friendly resource IDs when feasible.

Vlaanderen

A resource identifier should be a single unique key (if necessary concatenated by the application) with which exactly one resource can be retrieved.

The unique key must be semantically coherent. This means that the same unique key may not be used to refer to different objects over time. The state of the object it refers to may change, however. Do not use a URI as resource identifier. GUIDs or long integers are preferred. Existing business domain identifiers such as ISBN numbers, CAPA key, national register numbers, ... are also a good choice.

Motivation: URIs as resource identifiers are long and require URL encoding, which makes the API harder to read and use.

[ ... ]

To expose identifiers on the web, we refer to the URI standard for persistent identifiers.

Heroku

https://geemus.gitbooks.io/http-api-design/content/en/responses/provide-resource-uuids.html

Give each resource an id attribute by default. Use UUIDs unless you have a very good reason not to. Don’t use IDs that won’t be globally unique across instances of the service or other resources in the service, especially auto-incrementing IDs.
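
As a small illustration of the UUID option, the sketch below generates a random (version 4) UUID with Python's standard library and keeps it as a plain string, in line with the Zalando advice above to type identifiers as strings. The function name new_resource_id is a placeholder.

```python
import uuid


def new_resource_id() -> str:
    """Return a random version 4 UUID in its canonical 36-character string form."""
    return str(uuid.uuid4())


resource_id = new_resource_id()
print(resource_id)       # a different value on every run
print(len(resource_id))  # 36
```

Such identifiers can be generated at multiple independent sources without coordination, at the cost of being hard to memorize or type.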

Options

  • string-based
    • UUID
    • URI
    • enum of textual values (for codes)
    • unconstrained, regexp pattern, fixed/max length, ...
  • integer
    • unconstrained, min/max value, ...

New rule (draft)

Where a business domain identifier already exists for the resource (e.g. ISBN number, CAPA key, social security identification number, ...), it should be used. When designing the structure of new codes and identifiers, we advise applying these guidelines:

  • use a string-based format: uuid, uri, string enum or other, according to requirements of the identifier
    • probably impossible to have a single format that covers all requirements
    • (include possible list of requirements of this wiki page in REST guide)
  • multiple string representations of a single code/identifier shouldn't be allowed (leading zeros, spaces)
  • No business meaning should be attributed to parts of the identifier. This should be captured in separate data fields. Parts with a technical meaning, like a checksum, are allowed.
    • note: some identifiers in the guide have been constructed with parts with some business meaning (e.g problem type identifiers), but no application logic should depend on this
  • a regexp may be specified if helpful for input validation or as a hint of the structure, but it shouldn't be so restrictive that the format can no longer evolve
  • for codes:
    • string values SHOULD be declared in lowerCamelCase, just as property names. (even if not listed in enum)
    • A fixed list of possible values can be specified using enum. However, this may make it harder to change the list of possible values, as client applications will often depend on the specified list e.g. by using code generation. enum SHOULD only be used when the list of values is unlikely to change or when changing it has a big impact on clients of the API.
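
The lowerCamelCase rule for code values can be checked mechanically, also for codes that are not listed in an enum. The sketch below is an approximation under stated assumptions: the regular expression only verifies that a value starts with a lower-case letter and contains nothing but ASCII letters and digits, and the name is_lower_camel_case is made up for this example.

```python
import re

# Approximation: lower-case first letter, then only ASCII letters and digits.
LOWER_CAMEL_CASE = re.compile(r"[a-z][a-zA-Z0-9]*")


def is_lower_camel_case(value: str) -> bool:
    """Return True if the code value looks like lowerCamelCase."""
    return LOWER_CAMEL_CASE.fullmatch(value) is not None


for code in ["badRequest", "BadRequest", "bad_request", "bad-request"]:
    print(code, is_lower_camel_case(code))
```

Only the first value passes. A check like this could run in a test or linting step, without restricting the list of values the way an enum does.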