Dimension mismatch between technosphere and biosphere when exporting matrices #178

Open
Michael-ljn opened this issue Sep 8, 2024 · 9 comments

Comments

@Michael-ljn

Michael-ljn commented Sep 8, 2024

Hi @romainsacchi,

When exporting matrices with the REMIND model (SSP2-PkBudg500), there is a dimension mismatch: there are more activities in the A matrix than in the B matrix. I tried to find the source of the issue but could not trace it back. I just updated to the latest version, 2.1.3.

A: (28475, 28475)
B: (4709, 28462)

@romainsacchi
Collaborator

Hi @Michael-ljn, can you paste here the script used? I'll try to re-run it on my end.

@Michael-ljn
Author

Michael-ljn commented Sep 9, 2024

Hi @romainsacchi,

I have been looking into it but don't have time to propose a fix. The issue comes from the update() method and in particular with one of these updates:

ndb.update("trucks")
ndb.update("two_wheelers")
ndb.update("cars")
ndb.update("buses")

ndb.write_db_to_matrices("Technosphere2")

The database was defined as follows:

ndb= NewDatabase(
        scenarios = [
            {"model":"REMIND", "pathway":"SSP2-PkBudg500", "year":2025,},
        ],        
        source_db="ecoinvent-3.9.1-cutoff",
        source_version="3.9.1",
        key='xxxxxxxxx',
        biosphere_name="ecoinvent-3.9.1-biosphere",
        keep_source_db_uncertainty=True,
        keep_imports_uncertainty=True)

@romainsacchi
Collaborator

romainsacchi commented Sep 9, 2024

When running:

ndb= NewDatabase(
    scenarios = [
    {"model":"REMIND", "pathway":"SSP2-PkBudg500", "year":2025,},
    ],
    source_db="ecoinvent-3.9.1-cutoff",
    source_version="3.9.1",
    key='tUePmX_S5B8ieZkkM7WUU2CnO8SmShwmAeWK9x2rTFo=',
    biosphere_name="ecoinvent-3.9.1-biosphere",
    keep_source_db_uncertainty=True,
    keep_imports_uncertainty=True
)
ndb.update()
ndb.write_db_to_matrices()

(meaning, all sectors), I get correct shapes.
Not sure why I don't get exactly the same shape as you though.

Screenshot 2024-09-09 at 19 53 55

@romainsacchi
Collaborator

Can you provide an exact case/script that leads to the error?

@Michael-ljn
Author

Michael-ljn commented Sep 9, 2024

Hi @romainsacchi,

Apologies, I mixed up the scenarios and the dimensions while trying to locate where the issue is coming from. The dimensions I provided are for one of the SSP-PkBudg500 scenarios; I can't find the exact one. But the issue actually happens in all scenarios and models.

First, I start with a fresh environment, installing brightway (Mac ARM install process) and all dependencies, and then run pip install premise==2.1.3. I then run the following:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

import bw2data as bd
import bw2io as bi
import ecoinvent_interface
import pickle
from premise import *
from datapackage import Package

name = "premise"
ei_name = "ecoinvent-3.9.1-cutoff"
bd.projects.set_current(name)
if "biosphere3" in bd.databases:
    print(f"biosphere3 has already been imported.")
elif ei_name in bd.databases:
    print(f"{ei_name} has already been imported.")      
else:
    bi.import_ecoinvent_release(version="3.9.1",system_model="cutoff",username="xxxxxxxx",password="xxxx")
    
clear_cache()
ndb = NewDatabase(
        scenarios = [
            {"model":"REMIND", "pathway":"SSP1-PkBudg500", "year":2025,},
        ],        
        source_db="ecoinvent-3.9.1-cutoff",
        source_version="3.9.1",
        key='xxxxxx',
        biosphere_name="ecoinvent-3.9.1-biosphere",
        keep_source_db_uncertainty=True,
        keep_imports_uncertainty=True
    )
ndb.update()
ndb.write_db_to_matrices("test_update_all")

Opening the produced CSV files for A and B clearly shows the mismatch.

Screenshot 2024-09-10 at 07 21 47
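
For readers who want to check the exported files without opening them by hand, here is a minimal sketch that summarizes an export folder using only the standard library. It assumes the layout implied by the loading script later in this thread: `;`-delimited index files with no header row, and `;`-delimited coordinate files with one header row and integer indices in the first two columns. The function names are hypothetical and are not part of premise.

```python
import csv
from pathlib import Path


def count_index_rows(path):
    """Count the non-empty rows of a ';'-delimited index file (assumed to have no header)."""
    with open(path, newline="", encoding="utf-8") as f:
        return sum(1 for row in csv.reader(f, delimiter=";") if row)


def max_coordinates(path):
    """Return the largest value in each of the first two columns of a coordinate file.

    The first row is assumed to be a header and is skipped.
    Returns (-1, -1) if the file has no data rows.
    """
    first = second = -1
    with open(path, newline="", encoding="utf-8") as f:
        rows = csv.reader(f, delimiter=";")
        next(rows, None)  # skip the header row
        for row in rows:
            if len(row) < 2:
                continue
            first = max(first, int(float(row[0])))
            second = max(second, int(float(row[1])))
    return first, second


def summarize_export(directory):
    """Collect index sizes and largest coordinates for the four exported files."""
    d = Path(directory)
    return {
        "A_index_rows": count_index_rows(d / "A_matrix_index.csv"),
        "B_index_rows": count_index_rows(d / "B_matrix_index.csv"),
        "A_max_coords": max_coordinates(d / "A_matrix.csv"),
        "B_max_coords": max_coordinates(d / "B_matrix.csv"),
    }
```

If the export is consistent, no coordinate that refers to an activity should be greater than or equal to the number of rows in `A_matrix_index.csv`.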

Now, running the following:

clear_cache()
ndb = NewDatabase(
        scenarios = [
            {"model":"REMIND", "pathway":"SSP1-PkBudg500", "year":2025,},
        ],        
        source_db="ecoinvent-3.9.1-cutoff",
        source_version="3.9.1",
        key='xxxxxxxxx',
        biosphere_name="ecoinvent-3.9.1-biosphere",
        keep_source_db_uncertainty=True,
        keep_imports_uncertainty=True
    )
ndb.update("electricity")
ndb.update("fuels")
ndb.update("heat")
ndb.update("emissions")
ndb.update("external")
ndb.update("biomass")
ndb.update("dac")
ndb.update("cement")
ndb.update("steel")
ndb.write_db_to_matrices("test_update_everything_no_vehicules")

Everything is good.

Screenshot 2024-09-10 at 07 30 34

Lastly, when running:

clear_cache()
ndb = NewDatabase(
        scenarios = [
            {"model":"REMIND", "pathway":"SSP1-PkBudg500", "year":2025,},
        ],        
        source_db="ecoinvent-3.9.1-cutoff",
        source_version="3.9.1",
        key='xxxxxxxxx',
        biosphere_name="ecoinvent-3.9.1-biosphere",
        keep_source_db_uncertainty=True,
        keep_imports_uncertainty=True
    )
ndb.update("trucks")
ndb.update("two_wheelers")
ndb.update("cars")
ndb.update("buses")
ndb.write_db_to_matrices("test_update_vehicules")

The mismatch happens again. It's a process of elimination, a bit long, but I guess you might know which one of the updates might be causing that.

Screenshot 2024-09-10 at 07 36 33
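
The process of elimination described above could be automated along these lines. This is only a sketch: `shapes_match` is a hypothetical callback that the user would have to write, for example by creating a fresh `NewDatabase`, calling `ndb.update(sector)` for each sector it receives, exporting with `write_db_to_matrices`, and comparing the resulting shapes. The helper itself does not call premise.

```python
def find_failing_sectors(sectors, shapes_match):
    """Test each sector on its own and return those whose export is inconsistent.

    `shapes_match` is a user-supplied (hypothetical) callable that takes a list
    of sector names, runs the export for exactly those sectors, and returns
    True when the A and B dimensions agree.
    """
    failing = []
    for sector in sectors:
        # Each sector is tested in isolation, so one run per sector is needed.
        if not shapes_match([sector]):
            failing.append(sector)
    return failing


# Example call, using the sector names from this thread:
# find_failing_sectors(["trucks", "two_wheelers", "cars", "buses"], shapes_match)
```

Testing sectors in isolation will not catch a mismatch that only appears when several updates are combined; in that case the callback would need to be given larger groups of sectors.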

@romainsacchi
Collaborator

I still have correct shapes when running only:

ndb.update("trucks")
ndb.update("two_wheelers")
ndb.update("cars")
ndb.update("buses")

Screenshot 2024-09-10 at 09 00 56

@romainsacchi
Collaborator

Maybe let's look at the script you use to load the matrices. What shape do you get when running this?

from scipy import sparse
#from pypardiso import spsolve <-- use pypardiso if you use an Intel chip, it's much faster!
from scipy.sparse.linalg import spsolve
from pathlib import Path
from csv import reader
import numpy as np

fp="/Users/romain/Documents/export/remind/SSP2-PkBudg500/2025"

# the directory to the set of files produced by premise
DIR = Path(fp) 

# creates dict of activities <--> indices in A matrix
A_inds = dict()
with open(DIR / "A_matrix_index.csv", 'r') as read_obj:
    csv_reader = reader(read_obj, delimiter=";")
    for row in csv_reader:
        A_inds[(row[0], row[1], row[2], row[3])] = row[4]

A_inds_rev = {int(v):k for k, v in A_inds.items()}

# creates dict of bio flow <--> indices in B matrix
B_inds = dict()
with open(DIR / "B_matrix_index.csv", 'r') as read_obj:
    csv_reader = reader(read_obj, delimiter=";")
    for row in csv_reader:
        B_inds[(row[0], row[1], row[2], row[3])] = row[4]
        
B_inds_rev = {int(v):k for k, v in B_inds.items()}

# create a sparse A matrix
A_coords = np.genfromtxt(DIR / "A_matrix.csv", delimiter=";", skip_header=1)
I = A_coords[:, 0].astype(int)
J = A_coords[:, 1].astype(int)
A = sparse.csr_matrix((A_coords[:,2], (J, I)))

# create a sparse B matrix
B_coords = np.genfromtxt(DIR / "B_matrix.csv", delimiter=";", skip_header=1)
I = B_coords[:, 0].astype(int)
J = B_coords[:, 1].astype(int)
B = sparse.csr_matrix((B_coords[:,2] * -1, (I, J)), shape=(A.shape[0], len(B_inds)))

print(A.shape)
print(B.shape)

@romainsacchi
Collaborator

@Michael-ljn have you tried the above example?

@Michael-ljn
Author

Hi @romainsacchi,

I tried on my MacBook and I get the same mismatch as in the screenshots above. I also tried on a different Windows laptop and got the right dimensions, so I guess it is related to my MacBook. I haven't had the chance to look into it further, but I will in two weeks from now.
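
Since the two machines give different results, one way to narrow this down might be to compare the index files exported on each of them. The sketch below assumes the same file layout as the loading script above (`;`-delimited rows with four descriptive fields followed by an index) and uses hypothetical function names; it lists the entries that appear in one index file but not in the other.

```python
import csv


def load_index_keys(path):
    """Read a ';'-delimited index file and return the set of its 4-field keys.

    Rows with fewer than five fields are ignored. UTF-8 encoding is an
    assumption about how the files were written.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return {tuple(row[:4]) for row in csv.reader(f, delimiter=";") if len(row) >= 5}


def diff_index_files(path_a, path_b):
    """Return (only_in_a, only_in_b) as sorted lists of 4-field keys."""
    keys_a = load_index_keys(path_a)
    keys_b = load_index_keys(path_b)
    return sorted(keys_a - keys_b), sorted(keys_b - keys_a)
```

Running `diff_index_files` on the `A_matrix_index.csv` files from the MacBook and the Windows laptop would show which activities, if any, exist in only one of the two exports.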
