diff --git a/docs/adr/2023-02-07_goods-nomenclature-nested-set.md b/docs/adr/2023-02-07_goods-nomenclature-nested-set.md index 189d0aec2..ccd529d94 100644 --- a/docs/adr/2023-02-07_goods-nomenclature-nested-set.md +++ b/docs/adr/2023-02-07_goods-nomenclature-nested-set.md @@ -22,4 +22,3 @@ Move to accessing the Goods Nomenclatures hierarchy via a modified nested set pa * We can remove the slow overnight generation of the Headings cache in Elastic Search - replacing with direct querying of the database. * Our interpretation of the hierarchy when presented with invalid data via CDS or Taric may change, eg if indent levels are incorrect. * We have the tools to optimise other parts of the codebase such as Additional Code search - diff --git a/docs/adr/2023-06-28_add_cache_headers.md b/docs/adr/2023-06-28_add_cache_headers.md index b489061dd..45eab93ed 100644 --- a/docs/adr/2023-06-28_add_cache_headers.md +++ b/docs/adr/2023-06-28_add_cache_headers.md @@ -18,17 +18,17 @@ Caching of backend responses will be enabled in two phases - first internally on ### Phase 1 -The backend will start setting headers to control caching. These will have a very short TTL to allow for quick updating will utilise ETags to allow for very quick (~2ms) HEAD 304 responses from the backend for the majority of requests which have already been seen by the caching client. +The backend will start setting headers to control caching. These will have a very short TTL to allow for quick updating, and will utilise ETags to allow for very quick (~2ms) HEAD 304 responses from the backend for the majority of requests which have already been seen by the caching client. Any non-GET or HEAD requests will not set cache headers. -Some endpoints (eg News) will require custom ETags based on the the relevant data lifecycle. +Some endpoints (eg News) will require custom ETags based on the relevant data lifecycle.
The backend's Cache-Control header is overwritten by the frontend, then ignored by the CDN anyway, so this should not have any downstream impact. The Frontend will have an HTTP cache added to its api client, which means it will -in turn cache the api responses it is receiving from the backend (where the +in turn cache the api responses it is receiving from the backend (where the backend's response headers direct it to). ### Phase 2 @@ -40,4 +40,4 @@ The frontend will mark its own responses as no-store to prevent caching After the above has been deployed the CDN will be updated to follow the caching headers from its upstream -; * In the case of web access this should always be 'no-store' -* In the case of proxied API access from the backend, this should be the ETagged behaviour dictated by the backend +* In the case of proxied API access from the backend, this should be the ETagged behaviour dictated by the backend diff --git a/docs/caching.md b/docs/caching.md index d6ff83100..bb266ba10 100644 --- a/docs/caching.md +++ b/docs/caching.md @@ -13,11 +13,11 @@ We utilise caching to avoid doing repeated work whilst presenting a largely read Working from 'inside' out, we cache at multiple levels * Backend uses a Redis backed Rails cache to store **some** API responses - - this avoids repeated and sometimes expensive database queries - - cleared after a Tariff sync occurs + * this avoids repeated and sometimes expensive database queries + * cleared after a Tariff sync occurs * Backend sets HTTP cache headers instructing its clients how its responses may be cached - - by default this is set to cache for 2 minutes then revalidate for anything older. - - A response is valid unless a backend Deployment or a Sync has happened + * by default this is set to cache for 2 minutes then revalidate for anything older.
+ * A response is valid unless a backend Deployment or a Sync has happened * Frontend uses these cache headers to control how it caches responses in its API client * CDN ignores the cache headers and caches anything under `/api`, eg `/api/v2/sections.json` for 30 minutes * CDN does not cache HTML pages from the frontend @@ -26,13 +26,13 @@ Working from 'inside' out, we cache at multiple levels Our rails cache is backed by Redis on the AWS servers, and an in-memory cache for local development. -Some high load API endpoints are manually cached by writing the API response to the rails cache prior to delivery, eg in `CachedCommodityService`. Requests will check for a cached response, and deliver this if present and if not, will render and store the response. +Some high load API endpoints are manually cached by writing the API response to the rails cache prior to delivery, eg in `CachedCommodityService`. Requests will check for a cached response and deliver it if present; if not, they will render and store the response. These cached responses, along with any other contents of Rails.cache, are cleared by the background job after we download our daily data update from CDS / Taric. _Note: the in-memory cache used in local development is cleared automatically when the application is restarted._ -Headings and Subheadings are pre-cached for the current day, ie generated ahead of time and written to the Rails.cache. This is done because the API outputs for Headings and Subheadings can be generated from the same set of loaded data meaning it only takes a couple of minutes to pre-render _every_ heading and subheading response. +Headings and Subheadings are pre-cached for the current day, ie generated ahead of time and written to the Rails.cache. This is done because the API outputs for Headings and Subheadings can be generated from the same set of loaded data, meaning it only takes a couple of minutes to pre-render _every_ heading and subheading response.
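The check-then-store pattern described above can be sketched in plain Ruby. This is an illustration only: `Store` is a stand-in for `Rails.cache` and `render_response` for the real serializer output, not the actual `CachedCommodityService` code.

```ruby
require 'json'

# Stand-in for Rails.cache: fetch returns the cached value if present,
# otherwise runs the block, stores the result, and returns it.
class Store
  def initialize
    @data = {}
  end

  def fetch(key)
    return @data[key] if @data.key?(key)

    @data[key] = yield
  end

  # Mirrors the post-sync clearing done by the background job.
  def clear
    @data.clear
  end
end

store = Store.new
renders = 0

# Stand-in for rendering a full API response (an expensive operation).
render_response = lambda do
  renders += 1
  JSON.generate(data: { id: '0101' })
end

first  = store.fetch('commodity-0101') { render_response.call } # renders and stores
second = store.fetch('commodity-0101') { render_response.call } # served from cache
```

After a sync clears the cache, the next request renders and stores again.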
These responses are pre-cached for the following day at 10pm, and then regenerated again on the day after the Tariff sync (if one occurs). @@ -51,7 +51,7 @@ end ## HTTP caching -Where as the Rails Cache holds responses on the server, HTTP caching works by telling the HTTP client (or a proxy in the middle) what responses can be cached and it is up to the HTTP client to perform that caching. +Whereas the Rails Cache holds responses on the server, HTTP caching works by telling the HTTP client (or a proxy in the middle) what responses can be cached, and it is up to the HTTP client to perform that caching. Server controls, Client implements. It combines 2 concepts @@ -66,7 +66,7 @@ This is currently set to **2 minutes** and is set via a constant in the `EtagCac ### Response Validity -Determining whether a response stored by the client is still valid and can continue to be used. This is controlled via the combination of the `Last-Modified` header and more crucially the `ETag` header. +Determining whether a response stored by the client is still valid and can continue to be used. This is controlled via the combination of the `Last-Modified` header and, more crucially, the `ETag` header. An ETag is a hashed identifier for the response contents; it is passed back to the HTTP server during the HTTP request, and the server determines whether the response is still valid.
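ETag revalidation can be sketched as follows. Note this is a minimal illustration: the SHA1-of-body scheme and the method names here are assumptions for the example, not the backend's actual implementation.

```ruby
require 'digest'

# An ETag is a hashed identifier derived from the response contents.
def etag_for(body)
  %("#{Digest::SHA1.hexdigest(body)}")
end

# On revalidation the client sends its stored ETag back (If-None-Match)
# and the server compares it against the ETag of the current response.
def revalidate(current_body, if_none_match)
  if if_none_match == etag_for(current_body)
    [304, nil]          # unchanged: empty response, client reuses its copy
  else
    [200, current_body] # changed: full response is returned
  end
end

body = '{"data":{"id":"0101"}}'
tag  = etag_for(body)

status_unchanged, _ = revalidate(body, tag)        # contents unchanged -> 304
status_changed, _   = revalidate(body + ' ', tag)  # contents changed   -> 200
```

The cheap 304 path is what makes revalidating every couple of minutes affordable.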
@@ -88,7 +88,7 @@ When any of the above change, then the ETag changes and the HTTP client will dow ### Alternative: Caching for a fixed period of time -You can force a controller to only cache its responses for a fixed period of time, after which the full response will be rendered +You can force a controller to only cache its responses for a fixed period of time, after which the full response will be rendered * if this matches what the client already had then it is determined to be valid and an empty `304` returned * if it doesn't then a regular `200` is returned @@ -137,9 +137,9 @@ We have 2 http clients we control - our Frontend app and our CDN ### Frontend -Our API requests happen via Faraday and we include Faraday's HTTP caching plugin. This plugin follows the defined by our backend described above. +Our API requests happen via Faraday and we include Faraday's HTTP caching plugin. This plugin follows the cache headers defined by our backend, described above. -In practical terms, this means something like a call to `Commodity.find('1234567890')` will request all the Sections from the backend. +In practical terms, this means something like a call to `Commodity.find('1234567890')` will request all the Sections from the backend. A subsequent call under 2 minutes later will return the cached response. A call _after 2 minutes_ will re-request data from the backend diff --git a/docs/exchange_rates.md b/docs/exchange_rates.md index ce3c966b4..a64de6781 100644 --- a/docs/exchange_rates.md +++ b/docs/exchange_rates.md @@ -13,6 +13,6 @@ It has happened too often that data has had to be adjusted in the DB Average rates are calculated based on the live countries in the last 12 months from the date selected in the `ExchangeRates::AverageExchangeRatesService` class. Normally the worker class `AverageExchangeRatesWorker` will run on the 31st March and 31st Dec.
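The Frontend's 2-minute freshness window described above can be simulated in plain Ruby. This is an illustration of the behaviour only, not Faraday's actual `http_cache` middleware; the class and its injected clock are invented for the example.

```ruby
# Responses younger than max-age (120s) are served from the local cache;
# anything older triggers a fresh request to the backend.
MAX_AGE = 120

class CachingClient
  def initialize(&fetcher)
    @fetcher = fetcher
    @cache = {} # url => [stored_at, body]
  end

  # `now:` is an injected clock (seconds) so the example is deterministic.
  def get(url, now:)
    stored_at, body = @cache[url]
    return body if stored_at && (now - stored_at) < MAX_AGE

    fresh = @fetcher.call(url)
    @cache[url] = [now, fresh]
    fresh
  end
end

requests = 0
client = CachingClient.new { |url| requests += 1; "response for #{url}" }

client.get('/api/v2/sections', now: 0)   # first call: fetched from backend
client.get('/api/v2/sections', now: 60)  # under 2 minutes: served from cache
client.get('/api/v2/sections', now: 180) # after 2 minutes: re-requested
```

In the real stack, the re-request after 2 minutes is usually a cheap ETag revalidation rather than a full download.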
It will select all the countries that have had a live rate for the last year by working out the end-of-month date for the date selected *(eg. if the service is run on the 12th May then it will use 31st May for that year, going back to the 1st of June of the previous year, gathering all country and currency pairings).* This solves the issue where a country might have multiple currencies in one year and we have to display the average for every currency that country has had, even if it's just for one day. -You can then navigate to https://www.trade-tariff.service.gov.uk/exchange_rates/average and the latest data will be available to view online plus files. +You can then navigate to https://www.trade-tariff.service.gov.uk/exchange_rates/average and the latest data will be available to view online, plus files. You can check the exchange rates for the last year by running this command: `ExchangeRateCurrencyRate.by_type(ExchangeRateCurrencyRate::AVERAGE_RATE_TYPE).where(validity_end_date: Date.new(2023,12,31)).all` Changing the date to the end of the period you are checking for (this example uses end of Dec 2023) diff --git a/docs/goods-nomenclature-nested-set.md b/docs/goods-nomenclature-nested-set.md index 19b18b976..e377c257e 100644 --- a/docs/goods-nomenclature-nested-set.md +++ b/docs/goods-nomenclature-nested-set.md @@ -64,6 +64,7 @@ The primary index used for hierarchy lookups is `depth` + `position`. `validity_start_date` and `validity_end_date` are not in the index because once Postgres has filtered by `depth` and `position`, there are not many records to walk through so there is not much performance benefit. There would be a memory cost though, because every entry in the index would also require 2 dates that would be much larger than the `int` + `bigint` for `depth` + `position`. There are also 2 other indexes + * `goods_nomenclature_sid` - this allows for efficient JOINs to the goods_nomenclatures table * `oid` (unique) - this is the `oid` from the indents table.
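The 12-month averaging window described in the exchange rates section above can be sketched with stdlib `Date` arithmetic. The method name is invented for the example; the real calculation lives in `ExchangeRates::AverageExchangeRatesService`.

```ruby
require 'date'

# Take the end of the month of the selected date, then go back to the
# 1st of the following month in the previous year.
def averaging_window(selected)
  window_end   = Date.new(selected.year, selected.month, -1) # end of that month
  window_start = (window_end << 12) + 1                      # 12 months back, plus a day
  window_start..window_end
end

# eg run on the 12th May 2024: window is 1st June 2023 to 31st May 2024
window = averaging_window(Date.new(2024, 5, 12))
```

Any country/currency pairing live for even a single day inside this range is included in the average.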
Refreshing a materialized view concurrently (ie without blocking reads from the view) requires the materialized view to have a unique index. @@ -74,12 +75,14 @@ There are also 2 other indexes ### Ancestors These can be queried by fetching the maximum `position` at every `depth`, where -; + * `depth` is less than the `depth` of the origin `tree_node` record * and the `position` is less than the `position` of the origin record ### Next sibling The `tree_nodes` record with + * same `depth` as the origin record * the lowest `position` that is still greater than the origin record's `position` @@ -88,12 +91,14 @@ The `tree_nodes` record with _Note: Due to how we read the tree this is less useful than next sibling_ The `tree_nodes` record with + * same `depth` as the origin record * and has the highest `position` that is still less than the origin record's `position` ### Children This is every `tree_nodes` record where -; + * the child node's `depth` is exactly 1 greater than the `depth` of the origin record * and the child node's `position` is greater than the `position` of the origin `tree_nodes` record * and the child node's `position` is less than the `position` of the next sibling of the origin record @@ -101,6 +106,7 @@ This is every `tree_nodes` record where -; ### Descendants This is every `tree_nodes` record where -; + * the child node's `depth` is greater than the `depth` of the origin record * and the child node's `position` is greater than the `position` of the origin `tree_nodes` record * and the child node's `position` is less than the `position` of the next sibling of the origin record @@ -135,9 +141,9 @@ GoodsNomenclature records loaded for one relationship are often relevant to othe * `ancestors` also populates `parent` on self and all ancestors * `descendants` also populates - * `parent` for all descendants - * `children` for self plus all descendants - * `ancestors` for all descendants _if_ self already has ancestors loaded + * `parent` for all descendants + *
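The ancestor/sibling/children rules described above can be expressed as a small pure-Ruby model. The real queries run in SQL against the `tree_nodes` materialized view; the `Node` struct and sample rows here are invented to illustrate the `depth` + `position` logic.

```ruby
Node = Struct.new(:sid, :depth, :position)

# Simplified stand-ins for tree_nodes rows, ordered by position.
ROWS = [
  Node.new('A', 1, 1),
  Node.new('B', 2, 2),
  Node.new('C', 3, 3),
  Node.new('D', 3, 4),
  Node.new('E', 2, 5),
].freeze

# Ancestors: max position at each shallower depth with a smaller position.
def ancestors(origin)
  ROWS.select { |n| n.depth < origin.depth && n.position < origin.position }
      .group_by(&:depth)
      .map { |_, nodes| nodes.max_by(&:position) }
      .sort_by(&:depth)
end

# Next sibling: same depth, lowest position greater than the origin's.
def next_sibling(origin)
  ROWS.select { |n| n.depth == origin.depth && n.position > origin.position }
      .min_by(&:position)
end

# Children: depth exactly 1 greater, position between the origin's
# position and that of its next sibling.
def children(origin)
  bound = next_sibling(origin)&.position || Float::INFINITY
  ROWS.select do |n|
    n.depth == origin.depth + 1 &&
      n.position > origin.position &&
      n.position < bound
  end
end

origin = ROWS[1] # node 'B'
ancestors(origin).map(&:sid) # ancestors of B
children(origin).map(&:sid)  # children of B
```

Descendants follow the same bounds as children but with any greater depth instead of exactly one level down.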
`children` for self plus all descendants + * `ancestors` for all descendants _if_ self already has ancestors loaded The above means you can get a nice recursive tree of children, so in the following example the first line will generate 2 queries and the second line will generate 0 queries. @@ -205,7 +211,7 @@ If you need to eager load relationships below measures, you'll need to duplicate ``` MEASURE_EAGER = { - measures: [:measure_type, + measures: [:measure_type, { measure_conditions: :measure_condition_code }] } Chapter.actual diff --git a/docs/reporting.md b/docs/reporting.md index 25062964f..67568aa02 100644 --- a/docs/reporting.md +++ b/docs/reporting.md @@ -45,4 +45,4 @@ The code used to generate the various reports is held under [app/lib/reporting]( The actual reports generated by each of the workers can be seen in the relevant workers, see * [Daily Reports Worker](https://github.com/trade-tariff/trade-tariff-backend/blob/main/app/workers/report_worker.rb) -* [Differences Report Worker](https://github.com/trade-tariff/trade-tariff-backend/blob/main/app/workers/differences_report_worker.rb) \ No newline at end of file +* [Differences Report Worker](https://github.com/trade-tariff/trade-tariff-backend/blob/main/app/workers/differences_report_worker.rb) diff --git a/docs/rules_of_origin.md b/docs/rules_of_origin.md index d70273a73..1bc1d91c5 100644 --- a/docs/rules_of_origin.md +++ b/docs/rules_of_origin.md @@ -2,7 +2,7 @@ ## Rules of Origin API -The primary API endpoint for rules of origin is +The primary API endpoint for rules of origin is ``` /rules_of_origin_schemes// @@ -10,7 +10,7 @@ The primary API endpoint for rules of origin is This will return a JSON-API response containing a list of all applicable schemes and their rules which are relevant to the `commodity_code` -There are also; +There are also: ``` /rules_of_origin_schemes/ @@ -92,8 +92,8 @@ for Northern Ireland, /roo_schemes_xi/articles// ``` - ## How to validate the RoO data files + Two rake
tasks are available to validate the RoO files: ```bash