-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize getting Price data #16
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @scientes,
thank you for your hard work. I really appreaciate the time you took to develop the grouped price data import.
I took quite some time, too, to think about your implementation. Some of my ideas might be overkill, bullshit or need some further tuning. I hope, that you don't feel overrun by all of this and that we can stay in a dialog to get this into CoinTaxman. Let me know, how you feel about my review.
Some of my comments are my personal recommandation, you don't have to bend for them. Feel free to stay with your coding style.
As you already indicated, I would like to see some documentation, docstrings on important functions and type annotations.
Lets rock this together :)
I sadly do not have the time to work on this further this month, but ill try and finish it to the beginning of the next month. |
@provinzio i just pushed a PoC for a graph based approach. this approach is atm a standalone and not yet integrated into the actual program. Also it does not look pretty and could have been coded better at some places. But i wanted to share my current progress so you could have a first look at it and find stuff i missed. Things i know that need improvement:
But on the upside This should make the price calculation multi exchange ready and when implemented properly should provide ample fallback paths so that the price calculation should always work, even with every fiat currency. |
I like the general idea and would love to see the end result. I wonder why cctx does not support anything like this. Looks like a basic feature for me.
Are we able, to get all pairs with one request or something similar? The algorithm might profit from fetching a lot of data and preprocessing it. |
well yes in a way i can fetch information per exchange on which pairs it offers but i have not found a way to determine per pair from when to when it was active, and because of that i currently request the weekly candles per pair to determine the timeframe in which the pair was traded as a side bonus i get the volume information per week |
well cctx is supposed to be a libary which standartises the apis for the exchanges for the end user nothing more nothing less and i suppose that most exchanges do not provide the data which describes from when to when a pair was active. i could however sort the pairs/paths by current volume but that does not really help. also the volume feature is not really needed and is a small side bonus to the actual purpose which is to find a price for every trade, which i think that my current implementation can achieve |
The current implementation is tested with my data and works for me. the intial pair takes some time to find the right path when we have old data (2017) but due to caching after the initial pair it should be pretty fast |
@scientes could you rebase your branch before I review? |
i hope i didn't mess anything up. i do not rebase that often^^ |
after rebase i get this error: File "/home/bastian/Documents/git/CoinTaxman/src/main.py", line 20, in <module>
from book import Book
File "/home/bastian/Documents/git/CoinTaxman/src/book.py", line 25, in <module>
import misc
File "/home/bastian/Documents/git/CoinTaxman/src/misc.py", line 40, in <module>
L = TypeVar("L", bound=list[Any])
TypeError: 'type' object is not subscriptable |
What is your python version? This shouldn't be a problem with 3.9 |
well yeah code switched to 3.8.5 without me noticing |
I am surprised that I can not see the workflow checks directly in this PR.. You should run at least black and isort. It would be awesome if you can fix the mypy errors, which help incredibly to find bugs related to wrong types. |
Well i get Emails that the Pipelines are running and failing (flake8 and mypy) |
I don't really get how you combined the branches. Rebasing should work with |
Well i just used the rebase function in vs Code. And after that it showed 11 pulls from origin dunno why |
Ah i missed the force and thats why the merge 😄 |
The differences i Do Not understand |
I'd really would like to keep a clean commit history. But I guess it started already when you merged the newest changes into your main instead of rebasing (a40e2ef). But I don't want you to have a lot of work because of my stupid ideals. While fixing I even found an unwanted merging error. I fixed the history in my Branch You could do (from your branch If you have more data right now: there are different ways how we could bring them togther. For example: Commit all your changes to Please excuse my sturdiness. I hope I dont scare you off, but we might profit from a clean commit history later on. Edit: Btw you can compare the branches with the excellent VS Code Extension Git Tree compare to see for yourself. |
i think git doesnt like the branch name @scientes eToro and main work on reset but @scientes fails with
edit: forgot to fetch from all remotes (not only my fork) |
warning if exchange for CSV export is not in found in CCTX list
Hey @scientes, for the Console output (for each entry):
However, if you replace BTC by e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):
Have you experienced something similar? Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)? Inaccurate price data in
Dummy CSV export file:
|
Shouldn't be. The weekly calls are only for looking for the time period in which the pair Was traded on the platform. For the actual data 1m candles are used.
Weird normally ccxt should convert the Symbols automatically to the exchange Variant. But maybe theres some Bug or wrong Implementation on Our side there Ill Look into it tomorrow |
this error is extremely weird and kraken specific i think:
as you can see the time frame requested and the timestamp of the candle which is reported back vary significantly. more interesting is that the timestamp reported back is not much older than 12 hours to execution time |
https://www.freqtrade.io/en/stable/exchanges/#historic-kraken-data
using trades instead of candles was another approach i wanted to implement in the future as an alternative anyways but I'd rather not do this in this PR. it has taken to long anyways |
@provinzio i think the current state is merge ready. what do you think? |
Would be great to have this available in a merged fashion together with the latest fix for kraken (#100) on main. Currently, I get also stuck on my analysis of binance trades (recursion problem and some buy/sell issue). |
I try to work on it as soon as possible, this is definitly an awesome PR. I appreciate everybody who helps reviewing the code. Feel free to give your feedback as well. |
* progress bar output: Reduced to one line * flake8 * variable and file naming * detailed warning if price already exists in database * do not preload prices for Kraken
Just noticed that ccxt has a very fast release cycle and has almost hourly package updates. I'd suggest unpinning the ccxt version in the requirements.txt package so that when Cointaxman is installed, the newest version of ccxt is always installed. |
Just to let you know, I've tested this PR with my Binance/Coinbase logs and it worked flawlessly :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @scientes,
I am very sorry that my review is taking so long.
Your work on this branch is impressive, but also comprehensive.
I'll try my best, to give your work a good review.
I merged origin/main into this branch so that we keep everything up to date. :)
Please have a look at the left over TODOs in the code and the open conversations.
@@ -516,6 +522,70 @@ def get_price( | |||
|
|||
return price | |||
|
|||
def get_missing_price_operations( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to query the prices for all operation types, e.g. CoinLend and CoinLendEnd.
Although I have a local database with all necessary prices setup, the tool is fetching the same prices on every run.
2022-02-06 12:32:52,967 price_data DEBUG found path over BTC/EUR (binance)
2022-02-06 12:32:51,186 price_data INFO getting data from 16-May-2021 (02:42) to 16-May-2021 (02:42) for BTC
2022-02-06 12:32:51,526 price_data DEBUG found path over BTC/EUR (binance)
2022-02-06 12:32:52,625 price_data INFO getting data from 17-May-2021 (02:46) to 17-May-2021 (02:46) for BTC
On every run, this function returns, that the prices for the operation CoinLend
is missing, the log output states, that the prices are fetched.. but on the next run, the prices are fetched again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be sufficient to drop all elements of operations
that are coin lend, coin lend end, staking, staking end etc. at the beginning of the function? Theoretically this could be different for each country, so it might make sense to differentiate here before dropping the elements.
@@ -60,3 +60,4 @@ def IS_LONG_TERM(buy: datetime, sell: datetime) -> bool: | |||
DATA_PATH = Path(BASE_PATH, "data") | |||
EXPORT_PATH = Path(BASE_PATH, "export") | |||
FIAT = FIAT_CLASS.name # Convert to string. | |||
EXCHANGES = ["binance", "coinbasepro"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment for the EXCHANGES
variable. Whats the benefit of the variable? Why do we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OHLCV feature is not supported for all exchanges (e.g. Kraken, it works without error, but the returned data is inaccurate and not useable). So I see two options: Define all supported (and tested) exchanges in a list, or define exchanges that should not be used for this feature. I would go with the first option (which is how it is implemented now).
Closes #14
This tries to fetch the price data in batches from an supported exchange (must be supported by ccxt)
if this fails for certain datapoints it does nothing and thus lets the current implementation try to fetch the price data.
The current state works, but is not very well documented and may leave room for improvement in certain areas.
This was only tested using my personal dataset of transactions with Binance.
Fetching Batches from multiple exchanges is not Supported yet.
Also produces very much output in the logs due to ccxt on logging.DEBUG. This should maybe be fixed.
Suggestions Welcome