[BUG]gQuant/plugins/gquant_plugin/notebooks/cuIndicator/indicator_demo.ipynb does not work #155

Open
complyue opened this issue Jan 19, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@complyue

Describe the bug

The cuIndicator demo notebook has several issues that prevent its results from being reproduced.

Steps/Code to reproduce bug

First, an identified issue and possible fix:

#154 (comment)

Then

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [16], in <module>
     17     return df.query('datetime<@end_date and datetime>=@beg_date')
     19 indicator_lists = ['Accumulation Distribution', 'ADMI', 'Average True Range', 'Bollinger Bands',
     20                    'Chaikin Oscillator', 'Commodity Channel Index', 'Coppock Curve', 'Donchian Channel',
     21                    'Ease of Movement', 'EWA', 'Force Index', 'Keltner Channel', 'KST Oscillator', 'MA', 'MACD',
     22                    'Mass Index', 'Momentum', 'Money Flow Index', 'On Balance Volume', 'Parabolic SAR',
     23                    'Rate of Change', 'RSI', 'Stochastic Oscillator D', 'Stochastic Oscillator K', 'TRIX',
     24                    'True Strength Index', 'Ultimate Oscillator', 'Vortex Indicator',]
---> 26 task_stocks_list = [task_stock_symbol]
     27 task_stocks_graph = TaskGraph(task_stocks_list)
     28 list_stocks = task_stocks_graph.run(outputs=['stock_symbol.stock_name'])[0].to_pandas().set_index('asset_name').to_dict()['asset']

NameError: name 'task_stock_symbol' is not defined

(I tried assigning a value to that variable, but further strange errors occurred, so it would be better if someone familiar with the notebook took a look.)

Expected behavior

The notebook should be reproducible.

Environment overview (please complete the following information)

Environment details

N/A

Additional context

#154

@complyue complyue added the bug Something isn't working label Jan 19, 2022
@avolkov1
Contributor

A lucky guess took me a step forward w.r.t. indicator_demo.ipynb, if I change:

task_load_csv_data = {
    TaskSpecSchema.task_id: "load_csv_data",
    TaskSpecSchema.node_type: "CsvStockLoader",
    TaskSpecSchema.conf: {"file": "../data/stock_price_hist.csv.gz"},
    TaskSpecSchema.inputs: {}
}

To:

task_load_csv_data = {
    TaskSpecSchema.task_id: "load_csv_data",
    TaskSpecSchema.node_type: "CsvStockLoader",
    TaskSpecSchema.conf: {"file": "../data/stock_price_hist.csv.gz"},
    TaskSpecSchema.inputs: {},
    TaskSpecSchema.module: 'greenflow_gquant_plugin.dataloader',
}

Then it'll fail with Exception: Cannot find the Node Class:SortNode instead of Exception: Cannot find the Node Class:CsvStockLoader.

So is this the way codeful graph nodes are supposed to be written? Should I report a bug against indicator_demo.ipynb and fix it somehow?

In general, yes: the module should be set explicitly, like this:

task_load_csv_data = {
    TaskSpecSchema.task_id: "load_csv_data",
    TaskSpecSchema.node_type: "CsvStockLoader",
    TaskSpecSchema.conf: {"file": "../data/stock_price_hist.csv.gz"},
    TaskSpecSchema.inputs: {},
    TaskSpecSchema.module: 'greenflow_gquant_plugin.dataloader'
}

task_sort = {
    TaskSpecSchema.task_id: "sort",
    TaskSpecSchema.node_type: "SortNode",
    TaskSpecSchema.conf: {"keys": ['asset', 'datetime']},
    TaskSpecSchema.inputs: {"in": "load_csv_data.cudf_out"},
    TaskSpecSchema.module: 'greenflow_gquant_plugin.transform'
}

task_stock_symbol = {
    TaskSpecSchema.task_id: "stock_symbol",
    TaskSpecSchema.node_type: "StockNameLoader",
    TaskSpecSchema.conf: {"file": "../data/security_master.csv.gz"},
    TaskSpecSchema.inputs: {},
    TaskSpecSchema.module: 'greenflow_gquant_plugin.dataloader'
}

But greenflow is supposed to find the node automatically, without the module being specified explicitly, as long as some plugin provides the node. It looks like this automatic lookup is broken right now. I would have to debug and fix greenflow.
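Until the automatic lookup works again, the module can be filled in by a small helper rather than by editing every task spec by hand. The following is only a minimal sketch: `with_module`, `NODE_MODULES`, and the key strings `"type"` and `"module"` are hypothetical stand-ins (in the notebook the keys would be `TaskSpecSchema.node_type` and `TaskSpecSchema.module`), and the mapping contains only the three node types listed above.

```python
# Stand-in key names (assumption): in the notebook, pass
# TaskSpecSchema.node_type and TaskSpecSchema.module instead.
NODE_TYPE_KEY = "type"
MODULE_KEY = "module"

# Node type -> module, copied from the three task specs above.
NODE_MODULES = {
    "CsvStockLoader": "greenflow_gquant_plugin.dataloader",
    "SortNode": "greenflow_gquant_plugin.transform",
    "StockNameLoader": "greenflow_gquant_plugin.dataloader",
}


def with_module(task_spec, node_modules=NODE_MODULES,
                type_key=NODE_TYPE_KEY, module_key=MODULE_KEY):
    """Return a copy of task_spec with the module set if it is missing."""
    spec = dict(task_spec)
    if module_key not in spec:
        node_type = spec.get(type_key)
        if node_type not in node_modules:
            raise KeyError(f"no module known for node type {node_type!r}")
        spec[module_key] = node_modules[node_type]
    return spec
```

In the notebook this could be called as `with_module(task_sort, type_key=TaskSpecSchema.node_type, module_key=TaskSpecSchema.module)`. A spec that already sets a module is returned unchanged, and an unknown node type raises instead of guessing.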

I had trouble running this indicator demo notebook regardless. "bqplot" was giving me trouble with plotting, and there is a path lookup in the "cuIndicator/viz" package:

load_modules(os.getenv('MODULEPATH')+'/rapids_modules/')
from rapids_modules.cuindicator import . . .

Replace every occurrence of "rapids_modules" with "greenflow_gquant_plugin", which I think is the correct way to do it, and remove the call load_modules(os.getenv('MODULEPATH')+'/rapids_modules/').
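The replacement can also be applied to the notebook file programmatically, since an .ipynb file is plain JSON. This is a rough sketch, not a tested migration script: `rewrite_source`, `rewrite_notebook`, and `rewrite_file` are hypothetical helper names, and the sketch assumes that every code line containing a `load_modules(` call should be dropped entirely.

```python
import json

OLD_PKG = "rapids_modules"
NEW_PKG = "greenflow_gquant_plugin"


def rewrite_source(source):
    """Rewrite the source of one code cell, given as a single string."""
    kept = []
    for line in source.splitlines(keepends=True):
        if "load_modules(" in line:
            continue  # drop the MODULEPATH-based loader call
        kept.append(line.replace(OLD_PKG, NEW_PKG))
    return "".join(kept)


def rewrite_notebook(nb):
    """Rewrite all code cells of a notebook dict in place and return it."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # Cell source may be stored as a string or as a list of lines.
        as_list = isinstance(src, list)
        text = rewrite_source("".join(src) if as_list else src)
        cell["source"] = text.splitlines(keepends=True) if as_list else text
    return nb


def rewrite_file(path):
    """Load a notebook from path, rewrite it, and save it back."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    rewrite_notebook(nb)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
```

Markdown cells are left untouched, so any prose that mentions "rapids_modules" still has to be updated by hand. Running `rewrite_file` on a copy of indicator_demo.ipynb first is the safer choice.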

Alternatively, to get it working without modifying any code, I made a symbolic link in "gQuant/plugins/gquant_plugin/modules":

rapids_modules -> <absolute_path_to>/gQuant/plugins/gquant_plugin/greenflow_gquant_plugin/

Then I started Jupyter Lab from the directory "gQuant/plugins/gquant_plugin":

# MODULEPATH corresponds to: "gQuant/plugins/gquant_plugin/modules"
export MODULEPATH=${PWD}/modules
jupyter lab --ip=0.0.0.0 # etc...

Even with all the fixes, I couldn't get bqplot to work. It's not plotting correctly in that notebook.
