Financials Extraction from XBRL #74

Closed
dgunning opened this issue Jul 27, 2024 · 15 comments
Labels
enhancement New feature or request

Comments

@dgunning
Owner

Version 3 of XBRL financials

  1. Rewrite Financial extraction from XBRL
  2. Create a comprehensive test harness
  3. Document Financials V3
dgunning added the enhancement label Jul 27, 2024
@dgunning
Owner Author

dgunning commented Jul 27, 2024

Copied from Issue 73
For some 10Q imports, some facts are missing when querying the facts table.

For example, in the latest 10Q (Q2 2024) for $GD, the 10Q contains rows for Costs of Products and Services (us-gaap:CostOfGoodsAndServicesSold), but this fact is never loaded into the facts table in Edgar-tools and does not appear in the printed income statement.

The same applies to the "us-gaap:InterestIncomeExpenseNet" and "us-gaap:OtherNonoperatingIncomeExpense" facts.

This may be related to these fields having a role of "http://fasb.org/us-gaap/role/ref/legacyRef", while most of the facts that do get loaded have a role of "http://www.xbrl.org/2003/role/disclosureRef".
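
One way to confirm which concepts are missing is to compare the concepts you expect from the filing against those present in the loaded facts. This is only a sketch: `missing_concepts` is a hypothetical helper, and `loaded_facts` is an illustrative stand-in for the rows of the facts table (the real table's column names may differ).

```python
# Concepts named in the report above that should appear in the 10-Q.
EXPECTED = [
    "us-gaap:CostOfGoodsAndServicesSold",
    "us-gaap:InterestIncomeExpenseNet",
    "us-gaap:OtherNonoperatingIncomeExpense",
    "us-gaap:Revenues",
]

# Illustrative stand-in for the loaded facts table (values are made up).
loaded_facts = [
    {"concept": "us-gaap:Revenues", "value": 100},
    {"concept": "us-gaap:OperatingIncomeLoss", "value": 10},
]


def missing_concepts(expected, facts):
    """Return the expected concepts that never appear in the loaded facts."""
    loaded = {fact["concept"] for fact in facts}
    return sorted(set(expected) - loaded)


print(missing_concepts(EXPECTED, loaded_facts))
```

Running a check like this per filing would make it easier to tell whether the missing facts share a common property such as the reference role.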

[Image]

Progress so far:
[Image: GDFinancials]

@dgunning
Owner Author

dgunning commented Jul 27, 2024

Note from https://github.com/emestee
Hey,

If this helps, here are the entry points from the FASB taxonomy that group the line items in the mandatory filing statements:

https://xbrlview.fasb.org/yeti/resources/yeti-gwt/Yeti.jsp#tax~(id~174*v~10231)!con~(id~5267870)!net~(a~3474*l~832)!lang~(code~en-us)!path~(g~99043*p~0)!rg~(rg~32*p~12)

@dgunning
Owner Author

dgunning commented Aug 4, 2024

@emestee what do you know about standardized statements vs as-reported statements? Do you know what defines the standard concepts that all companies include in their statements?

@amitgandhinz

Circling back here as I've been playing with the new upgrades. Thanks for this; it looks like a lot of work went into the rewrite.

Were you able to pull in the productMembers as referenced here: #66 (comment)?

I have been trying to pull in the concepts that feed into Revenue, but I still can't figure out how to do that correctly.

This is the code snippet I have (e.g. for the AAPL XBRL instance):

# ----
# Extract detailed revenue items
# ----
# `instance` (the parsed XBRL instance) and `latest_date` (a date) are defined earlier
revenue_sources = []

rev_dimensions = instance.dimensions
rev_dimension_value = rev_dimensions['srt:ProductOrServiceAxis']
facts = rev_dimension_value.get_facts()

period_date_str = latest_date.strftime('%Y-%m-%d')

# Get the facts for the latest period. Both conditions are combined into one
# boolean mask instead of chaining two indexers, which pandas warns about.
latest_period_facts = facts[(facts['end_date'] == period_date_str) & (facts['duration'] == '3 months')]

for index, row in latest_period_facts.iterrows():
    if row.concept.startswith("us-gaap:Revenue"):
        print(row.value, row.concept, row.dimensions)


Running this against AAPL Q2 10Q, I get the following facts:

61564000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'us-gaap:ProductMember'}
24213000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'us-gaap:ServiceMember'}
39296000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:IPhoneMember'}
7009000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:MacMember'}
7162000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:IPadMember'}
8097000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:WearablesHomeandAccessoriesMember'}

The problem is that us-gaap:ProductMember actually represents the total of all the individual product lines, so I end up double counting the product values.

How do you determine that the aapl:* members are part of ProductMember, while ServiceMember stands on its own without any nesting?

I want this product breakdown to work for any 10-Q so I can present where a company's revenue comes from.
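
One possible approach, sketched under assumptions: in XBRL Dimensions, member nesting is expressed by `domain-member` arcs in the filing's definition linkbase, so a member that is the parent of other reported members can be treated as a subtotal and skipped. The sample linkbase below is hypothetical and heavily trimmed, and `member_children` and `leaf_members` are illustrative helpers, not part of edgartools. Whether a given filing actually nests its members this way has to be checked against its own linkbase.

```python
import xml.etree.ElementTree as ET

LINK = "{http://www.xbrl.org/2003/linkbase}"
XLINK = "{http://www.w3.org/1999/xlink}"
DOMAIN_MEMBER = "http://xbrl.org/int/dim/arcrole/domain-member"

# Hypothetical, trimmed definition linkbase used only for illustration.
SAMPLE_LINKBASE = """\
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:definitionLink xlink:type="extended" xlink:role="http://example.com/role/Revenue">
    <link:loc xlink:type="locator" xlink:href="us-gaap.xsd#us-gaap_ProductMember" xlink:label="product"/>
    <link:loc xlink:type="locator" xlink:href="aapl.xsd#aapl_IPhoneMember" xlink:label="iphone"/>
    <link:loc xlink:type="locator" xlink:href="aapl.xsd#aapl_MacMember" xlink:label="mac"/>
    <link:definitionArc xlink:type="arc" xlink:arcrole="http://xbrl.org/int/dim/arcrole/domain-member" xlink:from="product" xlink:to="iphone"/>
    <link:definitionArc xlink:type="arc" xlink:arcrole="http://xbrl.org/int/dim/arcrole/domain-member" xlink:from="product" xlink:to="mac"/>
  </link:definitionLink>
</link:linkbase>
"""


def member_children(linkbase_xml):
    """Map each parent concept to the set of its domain-member children."""
    root = ET.fromstring(linkbase_xml)
    children = {}
    for link in root.iter(LINK + "definitionLink"):
        # Locator labels are only meaningful within their own extended link.
        label_to_concept = {}
        for loc in link.findall(LINK + "loc"):
            fragment = loc.get(XLINK + "href").split("#", 1)[1]
            # "us-gaap_ProductMember" -> "us-gaap:ProductMember"
            label_to_concept[loc.get(XLINK + "label")] = fragment.replace("_", ":", 1)
        for arc in link.findall(LINK + "definitionArc"):
            if arc.get(XLINK + "arcrole") != DOMAIN_MEMBER:
                continue
            parent = label_to_concept[arc.get(XLINK + "from")]
            child = label_to_concept[arc.get(XLINK + "to")]
            children.setdefault(parent, set()).add(child)
    return children


def leaf_members(members, children):
    """Keep only members that are not a parent of another reported member."""
    reported = set(members)
    return [m for m in members if not (children.get(m, set()) & reported)]


children = member_children(SAMPLE_LINKBASE)
reported = ["us-gaap:ProductMember", "us-gaap:ServiceMember",
            "aapl:IPhoneMember", "aapl:MacMember"]
print(leaf_members(reported, children))
```

Summing only the leaf members avoids counting the ProductMember subtotal on top of its product lines, while ServiceMember is kept because nothing is nested under it.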

@dgunning
Owner Author

You can see the dimensions using the dimensions attribute:
[Image: instance_dimensions]

And you can query by dimensions:

[Image: query_facts]

@emestee

emestee commented Aug 15, 2024

> @emestee what do you know about standardized statements vs as-reported statements? Do you know what defines the standard concepts that all companies include in their statements?

I actually don't know, I imagine it is SEC regulation derived from federal law and incorporating FASB rules. I also don't think it should matter. SEC filings are validated upon submission and can be assumed to be compliant with technical requirements (otherwise the SEC parser will reject the filing). You should not assign any specific meaning to any items in the statement, other than their relationships to parent items, if any.

@dgunning
Owner Author

I think I want to add a parameter that switches between as_reported (the rows and labels that the company wants to show) and standard (a common set of values and labels that all companies are required to report).

I think Bloomberg operates like that, no?
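
A minimal sketch of what such a switch could look like, assuming each statement row carries its XBRL concept. The function name `statement_rows`, the `standard` flag, the `STANDARD_LABELS` mapping and the row values are all hypothetical, not the library's API. A real mapping would need to cover far more concepts.

```python
# Hypothetical mapping from reported concepts to a common label set.
STANDARD_LABELS = {
    "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax": "Revenue",
    "us-gaap:Revenues": "Revenue",
    "us-gaap:CostOfGoodsAndServicesSold": "Cost of Revenue",
}

# Illustrative rows: (concept, label chosen by the company, value).
ROWS = [
    ("us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax", "Net sales", 100),
    ("us-gaap:CostOfGoodsAndServicesSold", "Cost of sales", 60),
    ("xyz:CustomLineItem", "Company-specific item", 5),
]


def statement_rows(rows, standard=False):
    """Return (label, value) pairs, as reported or with standard labels."""
    if not standard:
        # As reported: keep every row with the company's own label.
        return [(label, value) for _, label, value in rows]
    result = []
    for concept, _, value in rows:
        standard_label = STANDARD_LABELS.get(concept)
        if standard_label is not None:
            # Standard: keep only rows that map to a common concept.
            result.append((standard_label, value))
    return result


print(statement_rows(ROWS))
print(statement_rows(ROWS, standard=True))
```

In this sketch the standard view silently drops unmapped rows; another reasonable choice would be to collect them under an "Other" line so totals still reconcile.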

@amitgandhinz

That's kind of what I'm doing: I use this library to pull in the data, but then standardize things into my own fields.

You previously had a version of that in your old income statement code, but the challenge is getting a mapping of all the fields into something.

@amitgandhinz

> You can see the dimensions using the dimensions attribute: [Image: instance_dimensions]
>
> And you can query by dimensions:
>
> [Image: query_facts]

So with this, I am already getting the aapl:* dimensions. But see how there is also 'us-gaap:ProductMember' in the srt:ProductOrServiceAxis. This ProductMember happens to be the sum of all the aapl dimensions. Meanwhile, the ServiceMember in this example doesn't have sub-items. Is there a way to know if a dimension is a total of other sub-items? (When you pull up the XBRL viewer on the SEC site, the items are indented, so I assume the info is somewhere.)

My use case is to be able to pull the product revenue sources generically for all 10-K/10-Q imports, so not necessarily Apple-specific.

@Ahmedmagdy31

> That's kind of what I'm doing: I use this library to pull in the data, but then standardize things into my own fields.
>
> You previously had a version of that in your old income statement code, but the challenge is getting a mapping of all the fields into something.

I'm struggling with this standardization now. Would you please share the approach or some code to give me an idea? I'm new to financial data and the SEC, but I want to standardize financial statements for many companies. @amitgandhinz

@Colem19

Colem19 commented Sep 11, 2024

The new version is working much better on the financials statements! Thanks a lot for that!

I think there is a small thing that could be highly improved in the cash flow statement. If I look at Apple, it seems like a lot of the lines are positive instead of being negative (like Share repurchases).

@david08-08

@Colem19 @dgunning The statements are great; Dwight has done a great job. You are right though, good eye on identifying this issue. As an example, I pulled ULTA's most recent 10-Q via the TENQ-Class and identified a few more line items that should be negative, should be positive, or are in reverse order (meaning 2024 should be positive and 2023 should be negative). @dgunning do you anticipate that this will be fixed, or is it something that is fixable?
[Image: Screenshot 2024-09-11 at 9 21 54 AM]

@david08-08

@Colem19 @dgunning To be clear I highlighted in yellow what should be negative for both years. And I commented what should be reversed and what should be positive.

@dgunning
Owner Author

I verified, and most values have the right sign.
[Image: UltraBeautyCashFlow]

Deferred Income Tax does not match the filing table in the HTML.

The raw value in the XBRL instance file does not match, and I suspect there is a bug in the XBRL calculation file.

[Image: UltraBeautyXbrl]

The calculation files usually have a weight of -1 for negative values, but it is missing in this case.
[Image: weight]

There is not much the code can do in this case.
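
To illustrate why a missing weight leaves the sign wrong, here is a small sketch that assumes, as described above, that the display sign is derived by multiplying each raw value by its calculation weight. The `signed_values` helper, the second concept name and all numbers are hypothetical.

```python
def signed_values(raw_values, weights):
    """Apply calculation weights; a concept with no weight keeps its raw sign."""
    return {
        concept: value * weights.get(concept, 1.0)
        for concept, value in raw_values.items()
    }


# Illustrative raw values as they might appear in an instance file.
raw_values = {
    "us-gaap:PaymentsForRepurchaseOfCommonStock": 100,
    "xyz:ItemWithMissingWeight": 25,
}

# Only the first concept has a -1 weight in the (hypothetical) calculation file.
weights = {"us-gaap:PaymentsForRepurchaseOfCommonStock": -1.0}

print(signed_values(raw_values, weights))
```

The first item is flipped to a negative outflow, while the second stays positive because nothing in the calculation data says otherwise. That is the situation described here.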

@dgunning
Owner Author

dgunning commented Nov 6, 2024

Closing as fixed

dgunning closed this as completed Nov 6, 2024
6 participants