Merge pull request #1806 from trade-tariff/BAU-redeploy-staging
BAU: Lint markdown files
willfish authored Apr 10, 2024
2 parents a169d7b + f6bb61b commit 0fbb8cf
Showing 7 changed files with 30 additions and 25 deletions.
1 change: 0 additions & 1 deletion docs/adr/2023-02-07_goods-nomenclature-nested-set.md
@@ -22,4 +22,3 @@ Move to accessing the Goods Nomenclatures hierarchy via a modified nested set pa
* We can remove the slow overnight generation of the Headings cache in Elasticsearch, replacing it with direct querying of the database.
* Our interpretation of the hierarchy when presented with invalid data via CDS or Taric may change, eg if indent levels are incorrect.
* We have the tools to optimise other parts of the codebase, such as Additional Code search.

8 changes: 4 additions & 4 deletions docs/adr/2023-06-28_add_cache_headers.md
@@ -18,17 +18,17 @@ Caching of backend responses will be enabled in two phases - first internally on

### Phase 1

The backend will start setting headers to control caching. These will have a very short TTL to allow for quick updating, and will utilise ETags to allow for very quick (~2ms) `304 Not Modified` responses from the backend for the majority of requests which have already been seen by the caching client.

The backend will not set cache headers on non-GET/HEAD requests.

Some endpoints (eg News) will require custom ETags based on the relevant data lifecycle.

The backend's Cache-Control header is overwritten by the frontend, then ignored
by the CDN anyway, so this should not have any downstream impact.
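
A minimal sketch of what this header-setting behaviour could look like as a Rails controller concern (the module name, TTL constant, and wiring are illustrative assumptions, not the backend's actual code):

```ruby
# Illustrative sketch only - names are hypothetical.
module SetsCacheHeaders
  extend ActiveSupport::Concern

  TTL = 2.minutes # very short, so clients revalidate quickly via ETags

  included do
    before_action :set_cache_headers
  end

  private

  def set_cache_headers
    # Non-GET/HEAD requests get no cache headers
    return unless request.get? || request.head?

    # Sets Cache-Control: public, max-age=120
    expires_in TTL, public: true
  end
end
```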

The Frontend will have an HTTP cache added to its API client, which means it will
in turn cache the API responses it receives from the backend (where the
backend's response headers direct it to).

### Phase 2
@@ -40,4 +40,4 @@ The frontend will mark its own responses as no-store to prevent caching
After the above has been deployed, the CDN will be updated to follow the caching headers from its upstream:

* In the case of web access this should always be 'no-store'
* In the case of proxied API access from the backend, this should be the ETagged behaviour dictated by the backend
22 changes: 11 additions & 11 deletions docs/caching.md
@@ -13,11 +13,11 @@ We utilise caching to avoid doing repeated work whilst presenting a largely read
Working from 'inside' out, we cache at multiple levels

* Backend uses a Redis backed Rails cache to store **some** API responses
  * this avoids repeated and sometimes expensive database queries
  * cleared after a Tariff sync occurs
* Backend sets HTTP cache headers instructing its clients how its responses may be cached
  * by default this is set to cache for 2 minutes, then revalidate for anything older
  * a response is valid unless a backend Deployment or a Sync has happened
* Frontend uses these cache headers to control how it caches responses in its API client
* CDN ignores the cache headers and caches anything under `/api`, eg `/api/v2/sections.json` for 30 minutes
* CDN does not cache HTML pages from the frontend
@@ -26,13 +26,13 @@ Working from 'inside' out, we cache at multiple levels

Our Rails cache is backed by Redis on the AWS servers, and an in-memory cache for local development.

Some high load API endpoints are manually cached by writing the API response to the Rails cache prior to delivery, eg in `CachedCommodityService`. Requests will check for a cached response and deliver it if present; if not, they will render and store the response.
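
The underlying pattern is essentially `Rails.cache.fetch`; a simplified sketch (the class body, key format, and `serialize` helper are illustrative, not the actual service code):

```ruby
# Simplified sketch of the fetch-or-render pattern described above.
class CachedCommodityService
  def call(commodity, as_of)
    # The block only runs on a cache miss; its result is stored
    # so subsequent requests skip the expensive rendering work.
    Rails.cache.fetch("commodity-#{commodity.id}-#{as_of}") do
      serialize(commodity) # hypothetical rendering helper
    end
  end
end
```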

These cached responses, along with any other contents of Rails.cache, are cleared by the background job after we download our daily data update from CDS / Taric.

_Note: the in-memory cache used in local development is cleared automatically when the application is restarted._

Headings and Subheadings are pre-cached for the current day, ie generated ahead of time and written to the Rails.cache. This is done because the API outputs for Headings and Subheadings can be generated from the same set of loaded data, meaning it only takes a couple of minutes to pre-render _every_ heading and subheading response.

These responses are pre-cached for the following day at 10pm, and then regenerated again on the day after the Tariff sync (if one occurs).

@@ -51,7 +51,7 @@ end

## HTTP caching

Whereas the Rails cache holds responses on the server, HTTP caching works by telling the HTTP client (or a proxy in the middle) what responses can be cached, and it is up to the HTTP client to perform that caching.

Server controls, Client implements. It combines 2 concepts:

@@ -66,7 +66,7 @@ This is currently set to **2 minutes** and is set via a constant in the `EtagCac

### Response Validity

Determining whether a response stored by the client is still valid and can continue to be used. This is controlled via the combination of the `Last-Modified` header and, more crucially, the `ETag` header.

An ETag is a hashed identifier for the response contents; it is passed back to the HTTP server during the HTTP request, and the server determines whether the response is still valid.
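
In Rails this validation is commonly expressed with `fresh_when`; a hedged sketch (the controller and ETag inputs are assumptions, not the backend's actual concern):

```ruby
class SectionsController < ApplicationController
  def index
    sections = Section.all

    # Computes an ETag from the given object and compares it with the
    # client's If-None-Match header; on a match Rails halts rendering
    # and returns an empty 304 instead of the full body.
    fresh_when etag: sections
  end
end
```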

@@ -88,7 +88,7 @@ When any of the above change, then the ETag changes and the HTTP client will dow

### Alternative: Caching for a fixed period of time

You can force a controller to only cache its responses for a fixed period of time, after which the full response will be rendered

* if this matches what the client already had then it is determined to be valid and an empty `304` is returned
* if it doesn't then a regular `200` is returned
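
A hedged sketch of forcing a fixed window in a controller (the controller name, helper, and duration are illustrative):

```ruby
class NewsItemsController < ApplicationController
  def index
    # Clients may reuse their stored copy without revalidating until the
    # window expires; after that the response is re-rendered in full and
    # a matching ETag still allows an empty 304.
    expires_in 30.minutes, public: true

    render json: news_items_payload # hypothetical helper
  end
end
```
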
@@ -137,9 +141,9 @@ We have 2 http clients we control - our Frontend app and our CDN

### Frontend

Our API requests happen via Faraday and we include Faraday's HTTP caching plugin. This plugin follows the cache headers defined by our backend, described above.

In practical terms, this means something like a call to `Commodity.find('1234567890')` will request all the Sections from the backend.

A subsequent call under 2 minutes later will return the cached response. A call _after 2 minutes_ will re-request data from the backend.
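
For illustration, wiring Faraday's `faraday-http-cache` middleware into a client might look like this (the URL and store are assumptions; the real configuration lives in the frontend codebase):

```ruby
require 'faraday'
require 'faraday/http_cache'

connection = Faraday.new(url: 'https://backend.example.com') do |builder|
  # Honours the Cache-Control/ETag headers set by the backend: fresh
  # responses are served from the store, stale ones are revalidated
  # with If-None-Match and only refreshed on a full 200.
  builder.use :http_cache, store: Rails.cache, logger: Rails.logger
  builder.adapter Faraday.default_adapter
end

connection.get('/api/v2/sections') # a repeat call within the TTL hits the cache
```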

2 changes: 1 addition & 1 deletion docs/exchange_rates.md
@@ -13,6 +13,6 @@ It has happened too often that data has had to be adjusted in the DB

Average rates are calculated based on the live countries in the last 12 months from the date selected in the `ExchangeRates::AverageExchangeRatesService` class. Normally the worker class `AverageExchangeRatesWorker` will run on the 31st March and 31st December. It will select all the countries that have had a live rate for the last year by working out the end of the month for the date selected *(eg if the service is run on the 12th May then it will use the 31st May for that year, going back to the 1st of June the previous year, gathering all country and currency pairings)*. This solves the issue where a country might have multiple currencies in one year and we have to display the average for every currency that country has had, even if it was only for one day.
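
A hedged sketch of the windowing arithmetic described above (the method name is illustrative, not the actual service code):

```ruby
require 'date'

# Given the run date, return the 12-month averaging window.
def average_rate_window(run_date)
  window_end = Date.new(run_date.year, run_date.month, -1) # end of the selected month
  window_start = (window_end << 12) + 1 # first day of the same month a year earlier
  window_start..window_end
end

average_rate_window(Date.new(2024, 5, 12))
# spans 2023-06-01 through 2024-05-31, matching the 12th May example above
```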

You can then navigate to <https://www.trade-tariff.service.gov.uk/exchange_rates/average> and the latest data will be available to view online, along with files.

You can check the exchange rates for the last year by running this command: `ExchangeRateCurrencyRate.by_type(ExchangeRateCurrencyRate::AVERAGE_RATE_TYPE).where(validity_end_date: Date.new(2023,12,31)).all`, changing the date to the end of the period you are checking for (this example uses the end of December 2023).
14 changes: 10 additions & 4 deletions docs/goods-nomenclature-nested-set.md
@@ -64,6 +64,7 @@ The primary index used for hierarchy lookups is `depth` + `position`.
`validity_start_date` and `validity_end_date` are not in the index because once Postgres has filtered by `depth` and `position`, there are not many records to walk through, so there is not much performance benefit. There would be a memory cost though, because every entry in the index would also require 2 dates that will be much larger than the `int` + `bigint` for `depth` + `position`.

There are also 2 other indexes

* `goods_nomenclature_sid` - this allows for efficient JOINs to the goods_nomenclatures table
* `oid` (unique) - this is the `oid` from the indents table. Refreshing a materialized view concurrently (ie without blocking reads from the view) requires the materialized view to have a unique index.
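
A hedged sketch of how these indexes might be declared (raw SQL in a Sequel migration; index names are illustrative and the real migration lives in the backend repo):

```ruby
Sequel.migration do
  up do
    run 'CREATE INDEX tree_nodes_depth_position_idx ON tree_nodes (depth, position)'
    run 'CREATE INDEX tree_nodes_gn_sid_idx ON tree_nodes (goods_nomenclature_sid)'
    # REFRESH MATERIALIZED VIEW ... CONCURRENTLY requires a unique index
    run 'CREATE UNIQUE INDEX tree_nodes_oid_idx ON tree_nodes (oid)'
  end
end
```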

@@ -74,12 +75,14 @@
### Ancestors

These can be queried by fetching the maximum `position` at every `depth`, where:

* `depth` is less than the `depth` of the origin `tree_node` record
* and the `position` is less than the `position` of the origin record
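
In Postgres this maps naturally onto `DISTINCT ON`; a hedged Sequel sketch (the model and method names are assumptions based on the description above):

```ruby
# One ancestor per level: the max position at each shallower depth.
def ancestor_nodes(origin)
  TreeNode
    .where { (depth < origin.depth) & (position < origin.position) }
    .distinct(:depth)                      # Postgres DISTINCT ON (depth)
    .order(:depth, Sequel.desc(:position)) # keeps the max position per depth
end
```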

### Next sibling

The `tree_nodes` record with

* same `depth` as the origin record
* the lowest `position` that is still greater than the origin record's `position`

@@ -88,19 +91,22 @@ The `tree_nodes` record with
_Note: Due to how we read the tree this is less useful than next sibling_

The `tree_nodes` record with

* same `depth` as the origin record
* and has the highest `position` that is still less than the origin record's `position`

### Children

This is every `tree_nodes` record where:

* the child node's `depth` is exactly 1 greater than the `depth` of the origin record
* and the child node's `position` is greater than the `position` of the origin `tree_nodes` record
* and the child node's `position` is less than the `position` of the next sibling of the origin record

### Descendants

This is every `tree_nodes` record where:

* the descendant node's `depth` is greater than the `depth` of the origin record
* and the descendant node's `position` is greater than the `position` of the origin `tree_nodes` record
* and the descendant node's `position` is less than the `position` of the next sibling of the origin record
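
A hedged Sequel sketch of this window (names are assumptions; the children query is the same with `depth` pinned to `origin.depth + 1`):

```ruby
# Every node strictly between the origin and its next sibling.
def descendant_nodes(origin, next_sibling_position)
  TreeNode
    .where { depth > origin.depth }
    .where { position > origin.position }
    .where { position < next_sibling_position }
    .order(:position)
end
```
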
@@ -135,9 +141,9 @@ GoodsNomenclature records loaded for one relationship are often relevant to other

* `ancestors` also populates `parent` on self and all ancestors
* `descendants` also populates
  * `parent` for all descendants
  * `children` for self plus all descendants
  * `ancestors` for all descendants _if_ self already has ancestors loaded

The above means you can get a nice recursive tree of children, so in the following example the first line will generate 2 queries and the second line will generate 0 queries.

@@ -205,7 +211,7 @@ If you need to eager load relationships below measures, you'll need to duplicate

```
MEASURE_EAGER = {
  measures: [:measure_type,
             { measure_conditions: :measure_condition_code }]
}
Chapter.actual
```
2 changes: 1 addition & 1 deletion docs/reporting.md
@@ -45,4 +45,4 @@ The code used to generate the various reports is held under [app/lib/reporting](
The actual reports generated by each worker can be seen in the relevant worker class; see

* [Daily Reports Worker](https://github.com/trade-tariff/trade-tariff-backend/blob/main/app/workers/report_worker.rb)
* [Differences Report Worker](https://github.com/trade-tariff/trade-tariff-backend/blob/main/app/workers/differences_report_worker.rb)
6 changes: 3 additions & 3 deletions docs/rules_of_origin.md
@@ -2,15 +2,15 @@

## Rules of Origin API

The primary API endpoint for rules of origin is

```
/rules_of_origin_schemes/<commodity_code>/<country_code>
```

This will return a JSON-API response containing a list of all applicable schemes and their rules which are relevant to the `commodity_code`.

There are also:

```
/rules_of_origin_schemes/<commodity_code>
@@ -92,8 +92,8 @@ for Northern Ireland,
/roo_schemes_xi/articles/<scheme_code>/<article_name>
```


## How to validate the RoO data files

Two rake tasks are available to validate the RoO files:

```bash
```
