Remove ORG_CURIE_TO_MAPIT_AREA_TYPE mapping #2246

Open
chris48s opened this issue Sep 19, 2024 · 0 comments

We have a script called import_divisionsets_from_csv.py, which imports divisionsets from a CSV:

https://github.com/DemocracyClub/EveryElection/blob/master/every_election/apps/organisations/management/commands/import_divisionsets_from_csv.py

Our CSV contains a 3-letter local authority code.
One of the things this script does is try to look up our area in this big dict

ORG_CURIE_TO_MAPIT_AREA_TYPE = {

to work out what sort of organisation it is.

The reason we want to know this is so that we can then look that up in

PARENT_TO_CHILD_AREAS = {
"DIS": ["DIW"],
"MTD": ["MTW"],
"CTY": ["CED"],
"LBO": ["LBW"],
"CED": ["CPC"],
"UTA": ["UTW", "UTE"],
"NIA": ["NIE"],
"COI": ["COP"],
"LGD": ["LGE"],
}

to work out what sort of divisions we're importing.

There are 2 problems with this:

  1. Because our organisations are primarily defined in the DB but this lookup is defined in code, every time we add an organisation to our DB, we also have to edit the code. This is a bit silly.
  2. In the case where the organisation is a UTA, its sub-divisions can be either UTW or UTE. We have no way to know which, so we just pick UTW. UTW is the most common case, but this means we are sometimes wrong. There are probably a handful of areas where we have assigned divisions UTW when they should be UTE. This will probably surface when we look at back-porting GSS codes onto areas, and we'll need to fix it.
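
To make both problems concrete, here is a minimal sketch of the two-step lookup described above. It is not the script's actual code: the function name `guess_division_type` and the two organisation keys are made up for illustration, and the real ORG_CURIE_TO_MAPIT_AREA_TYPE dict is much larger and may be keyed differently.

```python
# Hypothetical stand-in entries. The real dict is far bigger and its
# keys may not look like this.
ORG_CURIE_TO_MAPIT_AREA_TYPE = {
    "local-authority:AAA": "DIS",
    "local-authority:BBB": "UTA",
}

# Copied from the mapping quoted in this issue.
PARENT_TO_CHILD_AREAS = {
    "DIS": ["DIW"],
    "MTD": ["MTW"],
    "CTY": ["CED"],
    "LBO": ["LBW"],
    "CED": ["CPC"],
    "UTA": ["UTW", "UTE"],
    "NIA": ["NIE"],
    "COI": ["COP"],
    "LGD": ["LGE"],
}


def guess_division_type(org_curie):
    """Hypothetical helper mirroring the behaviour described above."""
    # Problem 1: an organisation that exists in the DB but not in this
    # dict raises KeyError until someone edits the code.
    org_type = ORG_CURIE_TO_MAPIT_AREA_TYPE[org_curie]
    # Problem 2: for UTA there are two candidates and nothing to choose
    # between them, so the first one (UTW) always wins.
    return PARENT_TO_CHILD_AREAS[org_type][0]
```

With these stand-in entries, any UTA organisation resolves to UTW whether or not its divisions are really UTE, and an organisation missing from the dict fails outright.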

I suggest that we should move to storing this in the DB. We need to define 2 fields: one to store the area code for the organisation itself, and another to store the area code that child divisions of this organisation have (let's call them boundaryline_area_code and boundaryline_children_area_code). I think you could make an argument for putting these either on the Organisation model itself or on the OrganisationGeography. Both fields need to allow blank values because not every single Organisation object will have one of these codes. Most will, but not all. To avoid duplication, I think we can also say we should only populate boundaryline_children_area_code when boundaryline_area_code is UTA (i.e. there is not a one-to-one mapping) and just defer to looking the others up based on a mapping:

PARENT_TO_CHILD_AREAS = {
    "DIS": "DIW",
    "MTD": "MTW",
    "CTY": "CED",
    "LBO": "LBW",
    "CED": "CPC",
    "NIA": "NIE",
    "COI": "COP",
    "LGD": "LGE",
}
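
As a rough sketch of how the lookup could work once those two fields exist: the `Org` dataclass below is only a stand-in for whichever model ends up holding the fields (Organisation or OrganisationGeography), and `child_area_code` is a hypothetical helper name, not something that exists in the codebase. Raising an error when nothing can be resolved is also an assumption about the desired behaviour.

```python
from dataclasses import dataclass

# One-to-one mapping proposed above. UTA is deliberately absent.
PARENT_TO_CHILD_AREAS = {
    "DIS": "DIW",
    "MTD": "MTW",
    "CTY": "CED",
    "LBO": "LBW",
    "CED": "CPC",
    "NIA": "NIE",
    "COI": "COP",
    "LGD": "LGE",
}


@dataclass
class Org:
    """Stand-in for the model that would carry the two new fields."""

    boundaryline_area_code: str = ""  # blank allowed
    boundaryline_children_area_code: str = ""  # only set for UTA


def child_area_code(org):
    """Hypothetical helper: work out the area code of an org's divisions."""
    # An explicitly stored child code (the UTA case) takes priority.
    if org.boundaryline_children_area_code:
        return org.boundaryline_children_area_code
    # Otherwise fall back to the one-to-one mapping.
    try:
        return PARENT_TO_CHILD_AREAS[org.boundaryline_area_code]
    except KeyError:
        raise ValueError(
            "Cannot determine child area code for "
            f"boundaryline_area_code={org.boundaryline_area_code!r}"
        ) from None
```

Under this sketch, a UTA with no boundaryline_children_area_code fails loudly instead of silently defaulting to UTW, which is the behaviour this issue is trying to get away from.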

So then this job is going to break down into the following tasks:
