Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support scenario s3 only #158

Merged

Conversation

kmarinushkin
Copy link
Collaborator

@kmarinushkin kmarinushkin commented Oct 1, 2022

Support OECM (S3 separate) benchmark

The changes touch only OECM benchmark for separate S3 scope. The default OECM benchmark for S1S2 scope is not affected.

To see the changes in GUI:

  • in the drop-down list "Select climate scenario", select "OECM (Scope 3) 1.5 degC"
  • click anywhere on the page, to make selection confirmed by the GUI
  • wait for the new temperature score to calculate and load

For this particular bemchmark, only S3 matters.
If following commits, we use the absence of S1S2
as an indicator of S3-scope calculations

Signed-off-by: Kirill Marinushkin <[email protected]>
@kmarinushkin
Copy link
Collaborator Author

kmarinushkin commented Oct 1, 2022

@MichaelTiemannOSC i don't know, how to set reviewers for the PR. Maybe i don't have permissions, to do that. In any case, i ask you to become a reviewer.
Please also try this branch on your machine, i am curious if what you will see is what you expect.
I am open for feedback on both low-level and high-level views

P.S. the count of modified lines in this PR is giant, but it's only because of the first commit

@MichaelTiemannOSC
Copy link
Contributor

I'll enhance your permissions in the GitHub org!

@kmarinushkin
Copy link
Collaborator Author

kmarinushkin commented Oct 2, 2022

(Edited) Resolved in the latest commit:

So perhaps a better terminology cleanup would be to describe this as OECM (S3 separate) benchmark. We don’t need to be so verbose everywhere, but don’t want you to be surprised if/when you talk about scenarios when you mean benchmarks.

"""
get the projected productions for list of companies in ghg_scope12
:param company_sector_region_info: DataFrame with at least the following columns :
ColumnsConfig.COMPANY_ID, ColumnsConfig.GHG_SCOPE12, ColumnsConfig.SECTOR and ColumnsConfig.REGION
:param scope: benchmark scope for projections
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here: scope projections for production make no sense.

A long time ago, the code had significant confusion around naming where things that were emissions were labeled as production, and vice versa. I fixed most of that, but believe this is a vestigial case. Emissions and Emissions intensity have scopes. Production, by itself, has no scope.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agree. We will track this discussion in a different comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked for, but not found, where we finalized the discussion about using a scope parameter for get_company_projected_production. In any case, unless you have a really great example of where production output is somehow scope-dependent, I'm going to revert all these. Note that in your test case (test_base_providers.py:test_get_projected_production where you pass EScope.S1S2 to one calculation and EScope.S3 to a second calculation), both result in the same answer (which you also remark in the comments). Because there will never be any difference, it's just much cleaner and simpler to remove that parameter from that functional chain, which I will do in my subsequent merge.

Scope to calculate is usually S1S2, unless benchmark doesn't specify it.
The second candidate is S3
If scope to calculate is S3, `DataWarehouse` shouldn't merge S3 into S1S2.

Signed-off-by: Kirill Marinushkin <[email protected]>
@MichaelTiemannOSC
Copy link
Contributor

Please don't merge yet. The approval was intended to apply only to a given commit, not the whole PR.

@MichaelTiemannOSC MichaelTiemannOSC dismissed their stale review October 13, 2022 07:00

The review was issued prematurely (GitHub UX not understood)

examples/quick_template_score_calc.ipynb Outdated Show resolved Hide resolved
examples/quick_template_score_calc.ipynb Outdated Show resolved Hide resolved
examples/quick_template_score_calc.ipynb Outdated Show resolved Hide resolved
@@ -608,12 +619,13 @@ def update_graph(
fig1.update_layout(legend=dict(orientation="h", yanchor="bottom", y=1, xanchor="center", x=0.5))

# Covered companies analysis
coverage = filt_df[['company_id', 'ghg_s1s2', 'cumulative_target']].copy()
scope_column = 'ghg_' + EI_bm.scope_to_calc.name.lower()
coverage = filt_df[['company_id', scope_column, 'cumulative_target']].copy()
zeroE = Q_(0, 't CO2')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in the following lines, the EXISTENCE of GHG_SCOPE12 is tested (which is value) but not the value (other than NaN, zero, or non-zero). Per earlier comments, that's fine--this is a way of telling whether data in a row exists. But any use of GHG_SCOPE12 as a value for actual weighting or computations would be highly suspect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, what's wrong in these lines?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We clarified that this use of GHG_SCOPE12 is valid, but any use other than initializing projections is not really value (such as for a weighting).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelTiemannOSC
i found this article (https://sciencebasedtargets.org/resources/legacy/2020/09/Temperature-Rating-Methodology-V1.pdf)
it suggests (page 22), that GHG weights only apply only for S1S2S3.
the table 5 shows, that S1S2 goes w/o weights, and S3 goes w/o weights.

Are you sure we need to apply weights for this PR?
Looks like it's for S1S2S3 only

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I additionally looked through the os-climate internal document "BUILDING OPEN SOURCE PORTFOLIO ALIGNMENT CAPABILITIES
METHODOLOGY & PROCESS DOCUMENTATION".

On page 15, it says:

Temperature weighting: The two separately calculated temperature scores can be transparently combined using a (potentially) company-specific probability measure p(c) for the estimated likelihood that the target will be reached:
𝐼𝑇𝑅(𝑐) = 𝑝(𝑐) ∗ 𝐼𝑇𝑅(𝑐, 𝑡𝑎𝑟𝑔𝑒𝑡) + (1 – 𝑝(𝑐)) ∗ 𝐼𝑇𝑅(𝑐, 𝑡𝑟𝑎𝑗𝑒𝑐𝑡𝑜𝑟𝑦)

AFAIU, it confirms, that weights should apply to combined score S1S2S3, but not to individual scored S1S2 and S3.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on 2 references i posted earlier, i removed the commit, which earlier introduced weights to S1S2 and S3.
I think, there was a misunderstanding of the methodology, which from my perspective only recommends weights for combined score S1S2S3.
I understand that you are sure, that weight need to apply any way. If you still think that's the case, please reference some source of methodology, which convinces you in necessity of such move.

I left the adapted unit tests, so, we now have a better test coverage of S3 calculations.

Sanity check: for the company CBRE with high S3 emissions, the notebook s1s2_s3_calc.ipynb now calculates high S3 temperature score, as we might expect.

@@ -2,7 +2,7 @@
"benchmark_temperature": "1.5 delta_degC",
"benchmark_global_budget": "396 Gt CO2",
"is_AFOLU_included": false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S1S2S3 is 'AllScopes'. Also the TPI benchmarks are based on fully blended S1S2S3 numbers. TPI has a different global budget, different numbers, and different decarbonization pathways, but an internally consistent logic that we can follow and use. When a company reports S1, S2, and (categorized) S3 numbers, the tool can blend them to get a valid calc. Alas, there's no way to judge whether the S1+S2 components are heavy or light with respect to the S3 component, but it should still be possible to show how the two do stack up against the S1S2S3 number.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you suggest me to change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting you change 'AllScopes' to 'S1S2S3' both in the code and the benchmark. The OECM data processing in itr-data-pipeline should be able to generate a proper S1S2S3 benchmark (which would save you the trouble of creating it from two separate benchmarks). I'll be AFK for an hour, then can answer back.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We earlier discussed, that production bm are unrelated to scope.
"AllScopes" is an attempt to highlight this fact.
Changing to "S1S2S3" may be confusing, because it's a standalone "scope" within our logic across the tool.
Let's find a more suitable name for this item.

"ScopeIndependent"? "General"? "All"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed into "AnyScope".
This might be more clear

Modify production benchmark input to `AnyScope`

Signed-off-by: Kirill Marinushkin <[email protected]>
Add support for `scope` argument where necessary.
Give benchmark scope_to_calc as argument.

Signed-off-by: Kirill Marinushkin <[email protected]>
Use benchmark scope_to_calc for:

* creation of `TemperatureScore` object
* selection of scope content in the table in GUI

Signed-off-by: Kirill Marinushkin <[email protected]>
This allows us to calculate temperature score for companies,
which provided S3 data.
Otherwise, we will get an exception
"The value for S3 is missing for the following companies"

Signed-off-by: Kirill Marinushkin <[email protected]>
The change only touches GUI, where the term 'scenario' was used
for Emissions Intensity benchmark.

This could confuse users

Signed-off-by: Kirill Marinushkin <[email protected]>
GitHub token was required for this notebook, but never used.
At the same time, not all ITR users have a GitHub token,
which made this notebook not executable for them.

When i removed it, it's absense didn't effect execution
of the notebook. Which means, that it can be removed safely,
and make the notebook more accessible for ITR users.

Signed-off-by: Kirill Marinushkin <[email protected]>
This w/a resolves the exception:

~~~~
KeyError: "[('Europe', 'Construction Buildings', <EScope.S3: 'S3'>)]
not in index"
~~~~

TODO: Remove this w/a, when associated EI benchmarks become available

Signed-off-by: Kirill Marinushkin <[email protected]>
The notebook separately loads EI benchmarks for S1S2 and S3,
and separately calculates temperature scores for selected scopes.

The output is 2 separate tables with separately calculated scores

Signed-off-by: Kirill Marinushkin <[email protected]>
@MichaelTiemannOSC MichaelTiemannOSC merged commit 4461d4a into os-climate:develop Nov 29, 2022
@@ -57,14 +63,17 @@ def setUp(self) -> None:
self.base_warehouse = DataWarehouse(self.base_company_data, self.base_production_bm, self.base_EI_bm)
self.company_ids = ["US0079031078",
"US00724F1012",
"FR0000125338"]
"FR0000125338",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note for the future: this is an example of test data that we should re-sync with our sample public dataset. It's not helpful, as a test case, to have random ISINs and random production, emission, and intensity numbers that don't tie back to anything meaningful, unless the test data has been carefully constructed to trigger a particular code path. This is not an example of that; in the future we should fix this bit of code we inherited. Issue filed: #167

@@ -107,6 +107,12 @@ def __init__(self, EI_benchmarks: IEIBenchmarkScopes,
self._EI_benchmarks = EI_benchmarks
self.temp_config = tempscore_config
self.column_config = column_config
# Benchmark's scope, for which we calculate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes I made in template-v2-scopes change the paradigm from "what is scope you are trying to calculate, I'll give it to you" to "here are all the scope calculations relevant to the benchmark". In this way, the ITR code doesn't have to guess how the benchmark should be interpreted, it just gives all the answers and the caller decides how to aggregate the scopes. In this bit of code there's an internal logic prioritizing scopes S1+S2 over S3 if the first case is available. But that logic may not be correct for all cases (or cases where S1+S2+S3 should be computed). Best to let the caller aggregate (as the benchmark intends).

self.root = os.path.dirname(os.path.abspath(__file__))
self.company_json = os.path.join(self.root, "inputs", "json", "fundamental_data.json")
self.benchmark_prod_json = os.path.join(self.root, "inputs", "json", "benchmark_production_OECM.json")
self.benchmark_EI_json = os.path.join(self.root, "inputs", "json", "benchmark_EI_OECM_PC.json")
self.benchmark_EI_json = os.path.join(self.root, "inputs", "json", eibm_filename)
Copy link
Contributor

@MichaelTiemannOSC MichaelTiemannOSC Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later in this function, at new lines 48-49, the company data is re-parsed. But the fundamental company data (in JSON format) specifies intensity and target projections only for S1S2 data. With no S3 projections, the S3 budget cannot be calculated in the aggregation step. Copying the S1S2 data and declaring it as S3 data breaks other tests (because it changes other cumulative emissions calculations). We can create a separate Scope 3-only company data file with only Scope 3 data...but what, really, should we be testing? I think the answer can be found in the template-v2-scope changes of PR #166 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants