ValueError: all the input array dimensions except for the concatenation axis must match exactly #6

Daiiszuki · 2023-04-20T12:23:41Z

I'm trying to download the training data using 0_dl_trainval_data.py. It looks like the data downloaded successfully but the pre-processing causes the error. Specifically price_array = np.hstack([price_array, df[df.tic == tic][['close']].values]) causes the error.

What are some potential causes?
What are some potential solutions?
Wouldn't it be better to isolate the data downloading and cleaning?


Daiiszuki · 2023-04-21T13:18:14Z

At first I suspected the cause was that I was requesting data from 2017, so the different tickers had different numbers of records. I tried a more recent start date (2020-01-01) and now I'm getting:

 main()
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/0_dl_trainval_data.py", line 97, in main
    data_from_processor, price_array, tech_array, time_array = process_data()
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/0_dl_trainval_data.py", line 52, in process_data
    data_from_processor, price_array, tech_array, time_array = DataProcessor.run(
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/processor_Binance.py", line 81, in run
    price_array, tech_array, time_array = self.df_to_array(data, if_vix)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/processor_Binance.py", line 186, in df_to_array
    price_array = np.hstack([price_array, df[df.tic == tic][['close']].values])
  File "<__array_function__ internals>", line 200, in hstack
  File "/usr/local/lib/python3.9/dist-packages/numpy/core/shape_base.py", line 370, in hstack
    return _nx.concatenate(arrs, 1, dtype=dtype, casting=casting)
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 1359835 and the array at index 1 has size 1359834
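
One way to check whether the tickers really contribute different row counts is to count rows per `tic` before stacking. This is a minimal sketch on toy data: only the `tic` and `close` columns are confirmed by the traceback, while the `time` column, the sample values, and the `rows_per_ticker` helper are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def rows_per_ticker(df: pd.DataFrame) -> pd.Series:
    """Return how many rows each ticker contributes (hypothetical helper)."""
    return df.groupby("tic").size()


# Toy data: BTCUSDT has three candles, ETHUSDT is missing the middle one.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 00:01", "2023-01-01 00:02",
        "2023-01-01 00:00", "2023-01-01 00:02",
    ]),
    "tic": ["BTCUSDT"] * 3 + ["ETHUSDT"] * 2,
    "close": [1.0, 2.0, 3.0, 10.0, 30.0],
})

counts = rows_per_ticker(df)
print(counts)

# Same selection pattern as in df_to_array: one (n_rows, 1) column per ticker.
btc = df[df.tic == "BTCUSDT"][["close"]].values
eth = df[df.tic == "ETHUSDT"][["close"]].values

hstack_failed = False
try:
    np.hstack([btc, eth])  # row counts differ (3 vs 2), so this raises
except ValueError as err:
    hstack_failed = True
    print("hstack failed:", err)
```

If the counts differ by even one row, as in the traceback above (1359835 vs 1359834), the stacking step cannot succeed.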

Daiiszuki · 2023-04-21T13:21:54Z

configured as

# General Training Settings
#######################################################################################################
#######################################################################################################

trade_start_date = '2023-02-01 00:00:00'
trade_end_date = '2023-04-19 00:00:00'

SEED_CFG = 2390408
TIMEFRAME = '1m'
H_TRIALS = 50
KCV_groups = 5
K_TEST_GROUPS = 2
NUM_PATHS = 4
N_GROUPS = NUM_PATHS + 1
NUMBER_OF_SPLITS = nCr(N_GROUPS, N_GROUPS - K_TEST_GROUPS)

print(NUMBER_OF_SPLITS)

no_candles_for_train = 1111200
no_candles_for_val = 250000
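
As a quick check of the split count: assuming `nCr` is the usual binomial coefficient (the helper itself is not shown in this thread), the settings above give 10 splits, which agrees with the `SPLITS 10` line in the log posted further down.

```python
from math import comb

NUM_PATHS = 4
K_TEST_GROUPS = 2
N_GROUPS = NUM_PATHS + 1  # 5 groups

# Assumption: nCr(n, r) == math.comb(n, r), i.e. "n choose r".
number_of_splits = comb(N_GROUPS, N_GROUPS - K_TEST_GROUPS)
print(number_of_splits)  # 10
```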

Daiiszuki · 2023-04-21T14:09:55Z

I've tried using the default configuration and it works with no issue, so I guess it's the no_candles_for_train that's causing this

What are the valid date ranges for each timeframe?

I assumed it was January 2020 if using Binance?
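
For reference, the candle counts translate directly into a date range. The sketch below assumes the download window simply reaches back `no_candles_for_train + no_candles_for_val` candles from `trade_start_date`. That is an assumption about the script, although it reproduces the TRAIN_START_DATE printed in the later log for 500000 + 125000 one-minute candles. `window_start` is a made-up helper name.

```python
import pandas as pd


def window_start(trade_start: str, n_train: int, n_val: int,
                 minutes_per_candle: int = 1) -> pd.Timestamp:
    """Earliest timestamp needed, assuming the window ends at trade_start."""
    total_minutes = (n_train + n_val) * minutes_per_candle
    return pd.Timestamp(trade_start) - pd.Timedelta(minutes=total_minutes)


# Sizes from the later log: 500000 train + 125000 validation candles.
start_small = window_start("2023-02-01 00:00:00", 500_000, 125_000)
print(start_small)  # 2021-11-23 23:20:00

# Sizes from the configuration that failed.
start_large = window_start("2023-02-01 00:00:00", 1_111_200, 250_000)
print(start_large)
```

Under that assumption, every ticker would need one-minute data for the whole window. A ticker with gaps, or one that has no data at the start of the window, would return fewer rows than the others.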

Daiiszuki · 2023-04-22T08:32:31Z

So after trying different values for the number of training candles, I found that around 300,000 was the maximum that worked, and I was able to download with that. But now I get the following error:

ValueError: No trials are completed yet.

Full trace:


10
TRAIN_START_DATE:  2021-11-23 23:20:00
VAL_END_DATE:  2023-01-31 23:59:00

Starting CPCV optimization with:
drl algorithm:        ppo
name_test:            model
gpu_id:               0 


##### Launched hyperparameter optimization with CPCV  #####

TIMEFRAME                   1m
TRAIN SAMPLES               500000
TRIALS NO.                  50
N                           5
K test groups               2
SPLITS                      10


TRAIN SAMPLES               500000
VAL_SAMPLES                 125000
TRAIN_START_DATE            2021-11-23 23:20:00
TRAIN_END_DATE              2022-11-06 04:39:00
VAL_START_DATE              2022-11-06 04:40:00
VAL_END_DATE                2023-01-31 23:59:00 

TICKER LIST                 ['XRPUSDT', 'BTCUSDT', 'ETHUSDT', 'BNBUSDT', 'HBARUSDT', 'UNIUSDT'] 

/usr/local/lib/python3.9/dist-packages/optuna/samplers/_tpe/sampler.py:282: ExperimentalWarning: ``multivariate`` option is an experimental feature. The interface can change in the future.
  warnings.warn(
[I 2023-04-22 07:46:37,255] A new study created in memory with name: no-name-2f2030fe-f672-4585-95e6-bc15be720c90

LOADING DATA FOLDER:  ./data/1m_625000 

No. Train Samples: 374994 

| Arguments Remove cwd: ./train_results/cwd_tests/model_CPCV_ppo_1m_50H_625k
################################################################################
ID     Step    maxR |    avgR   stdR   avgS  stdS |    expR   objC   etc.
[W 2023-04-22 07:46:53,455] Trial 0 failed with parameters: {'learning_rate': 0.03, 'batch_size': 512, 'gamma': 0.99, 'net_dimension': 1024, 'target_step': 937500, 'eval_time_gap': 60, 'break_step': 45000.0, 'lookback': 1, 'norm_cash': 0.000244140625, 'norm_stocks': 0.00390625, 'norm_tech': 3.0517578125e-05, 'norm_reward': 0.0009765625, 'norm_action': 10000} because of the following error: ValueError('operands could not be broadcast together with shapes (6,) (10,) ').
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/optuna/study/_optimize.py", line 200, in _run_trial
    value_or_values = func(trial)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/1_optimize_cpcv.py", line 328, in obj_with_argument
    return objective(trial, name_test, model_name, cwd, res_timestamp, gpu_id)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/1_optimize_cpcv.py", line 269, in objective
    sharpe_bot, sharpe_eqw, drl_rets_tmp = train_and_test(trial, price_array, tech_array, train_indices,
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/function_train_test.py", line 33, in train_and_test
    train_agent(price_array,
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/function_train_test.py", line 72, in train_agent
    agent.train_model(model=model,
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/drl_agents/elegantrl_models.py", line 86, in train_model
    train_and_evaluate(model)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/train/run.py", line 40, in train_and_evaluate
    trajectory = agent.explore_env(env, target_step)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/drl_agents/agents/AgentPPO.py", line 74, in explore_one_env
    next_s, reward, done, _ = env.step(get_a_to_e(ten_a)[0].numpy())
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/environment_Alpaca.py", line 113, in step
    for index in np.where(actions < -self.minimum_qty_alpaca)[0]:
ValueError: operands could not be broadcast together with shapes (6,) (10,) 
[W 2023-04-22 07:46:54,924] Trial 0 failed with value None.
Traceback (most recent call last):
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/1_optimize_cpcv.py", line 360, in <module>
    optimize(name_test, name_model, gpu_id)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/1_optimize_cpcv.py", line 341, in optimize
    study.optimize(
  File "/usr/local/lib/python3.9/dist-packages/optuna/study/study.py", line 425, in optimize
    _optimize(
  File "/usr/local/lib/python3.9/dist-packages/optuna/study/_optimize.py", line 66, in _optimize
    _optimize_sequential(
  File "/usr/local/lib/python3.9/dist-packages/optuna/study/_optimize.py", line 174, in _optimize_sequential
    callback(study, frozen_trial)
  File "/content/drive/MyDrive/DE-FI DE-GEN/FinRL_Crypto-master/1_optimize_cpcv.py", line 98, in save_best_agent
    if study.best_trial.number != trial.number:
  File "/usr/local/lib/python3.9/dist-packages/optuna/study/study.py", line 159, in best_trial
    return copy.deepcopy(self._storage.get_best_trial(self._study_id))
  File "/usr/local/lib/python3.9/dist-packages/optuna/storages/_in_memory.py", line 250, in get_best_trial
    raise ValueError("No trials are completed yet.")
ValueError: No trials are completed yet.

Any suggestions?

Daiiszuki · 2023-04-22T14:02:57Z

New evidence suggests that the issue might be the number of ticker symbols. I will try to download 10 tickers, as in the example configuration, instead of 6. Fingers crossed.

Please, I think any amount of your input would be a major help

Daiiszuki · 2023-04-26T05:59:42Z

Definitely the number of tickers, but I still think the issue needs attention, so I'll leave it open.
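
The `(6,) (10,)` in the broadcast error fits this: six actions, one per downloaded ticker, compared against a ten-element array. A small reproduction follows, assuming `minimum_qty_alpaca` is a fixed-length array sized for the example's ten tickers. The values and the `check_lengths` helper are invented for illustration.

```python
import numpy as np

tickers = ["XRPUSDT", "BTCUSDT", "ETHUSDT", "BNBUSDT", "HBARUSDT", "UNIUSDT"]
actions = np.zeros(len(tickers))   # shape (6,), one action per ticker
minimum_qty = np.full(10, 0.001)   # shape (10,), invented values

broadcast_failed = False
try:
    np.where(actions < -minimum_qty)  # same comparison as in env.step
except ValueError as err:
    broadcast_failed = True
    print(err)


def check_lengths(n_tickers: int, min_qty: np.ndarray) -> None:
    """Hypothetical early check that fails with a clearer message."""
    if len(min_qty) != n_tickers:
        raise ValueError(
            f"expected {n_tickers} minimum quantities, got {len(min_qty)}"
        )
```

Calling a check like this when the environment is built would surface the mismatch before training starts.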

mehdicauche · 2023-10-31T12:56:33Z

The error you are encountering, ValueError: all the input array dimensions except for the concatenation axis must match exactly, occurs because the arrays you are trying to concatenate with np.hstack() have different sizes along dimension 0 (i.e., different numbers of rows).

In this context, this could mean that different tickers have a different number of data points (candles) in the DataFrame, causing the mismatch during concatenation.
Solution:

You might want to ensure that each ticker has the same number of data points before attempting to concatenate, for example by modifying the df_to_array method to handle this.
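
One possible way to do this, offered only as a sketch and not as the project's actual fix: pivot the long DataFrame to one column per ticker and keep only the timestamps that are present for every ticker. It assumes the DataFrame has `time`, `tic` and `close` columns. `aligned_close_array` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def aligned_close_array(df: pd.DataFrame, tickers: list):
    """Return (price_array, time_array) restricted to timestamps shared by all tickers."""
    # One row per timestamp, one column per ticker; duplicates keep the last value.
    wide = df.pivot_table(index="time", columns="tic", values="close", aggfunc="last")
    # Fix the column order, then drop timestamps where any ticker is missing.
    wide = wide.reindex(columns=tickers).dropna(how="any")
    return wide.to_numpy(), wide.index.to_numpy()


# Toy data: ETHUSDT is missing the 00:01 candle.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 00:01", "2023-01-01 00:02",
        "2023-01-01 00:00", "2023-01-01 00:02",
    ]),
    "tic": ["BTCUSDT"] * 3 + ["ETHUSDT"] * 2,
    "close": [1.0, 2.0, 3.0, 10.0, 30.0],
})

price_array, time_array = aligned_close_array(df, ["BTCUSDT", "ETHUSDT"])
print(price_array.shape)  # (2, 2)
```

Dropping rows discards data for the tickers that did have a candle at that time. Forward-filling the gaps instead would keep more rows at the cost of repeating stale prices, so which is preferable depends on how the environment uses the array.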

mehdicauche · 2023-10-31T12:57:25Z

I'd like to update the code, but I am not very familiar with contributing to open source projects.

Daiiszuki mentioned this issue Apr 21, 2023

Colab installation Issue #5

Open

YangletLiu added the bug label Apr 22, 2023
