Error only occurs in theano backend when kur train cifar.yml #78

Open
EmbraceLife opened this issue May 16, 2017 · 1 comment
@EmbraceLife (Contributor)

I added some code to my own copy of kur. It runs fine on deepgram/kur/examples/cifar.yml, but it fails on my own cifar.yml below:

---

settings:

  # Where to get the data
  cifar: &cifar
    url: "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
    checksum: "6d958be074577803d12ecdefd02955f39262c83c16fe9348329d7fe0b5c001ce"
    path: "/Users/Natsume/Downloads/data_for_all/cifar"

  # Backend to use
  backend:
    name: keras
    backend: tensorflow
    # name: pytorch
    # there is a problem with even receptive-field sizes and the 'same' border mode for pytorch convolutions

  # Hyperparameters
  cnn:
    kernels: [64, 32]
    size: [2, 2]
    strides: [1, 1]

# The model itself.
# This is parsed immediately after the "parameters" block.
model:
  - input: images
    sink: yes # sink makes this input layer accessible as a layer output
  - convolution:
      kernels: 64
      size: [2,2]
      strides: [1,1]
      border: valid
  - activation:
      type: leakyrelu # or: relu
      alpha: 0.01 # if alpha is absent or None, the default is 0.3
    # make this activation layer accessible
    sink: yes
    name: conv_layer1
  - convolution:
      kernels: 32
      size: [2,2]
      strides: [1,1]
      border: valid
  - activation:
      # see container.parse() and _parse_core() for how `sink` and `name` are handled
      type: leakyrelu # or: relu
      alpha: 0.01 # if alpha is absent or None, the default is 0.3
    sink: yes
    name: conv_layer2
  - flatten:
  - dense: 10
    sink: yes
    name: dense1
  - activation: softmax
    name: labels # is this the output rather than the labels of the inputs?

train:
  data:
    - cifar:
        <<: *cifar
        parts: [1, 2, 3, 4]
  provider:
    batch_size: 32
    num_batches: 1
  log: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar-log
  epochs:
    number: 2
    mode: additional
  stop_when:
    epochs: 1 # null or infinite to train forever
    elapsed:
      minutes: 10
      hours: 0
      days: 0
      clock: all # time measured: all | train | validate | batch
    mode: additional # additional | total; with total, elapsed above is the total training time including history

  hooks:
    - plot_weights:
        # plot and save the layers
        layer_names: [images, conv_layer1, conv_layer2, dense1] # work on both so far
        plot_every_n_epochs: 1
        plot_directory: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar_plot_weights
        weight_file: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.best.valid.w
        with_weights:
          - ["convolution", "kernel"]
          - ["convolution", "weight"]
          - ["dense", "kernel"]
          - ["dense", "weight"]
        # animate only specified layers and weights
        # layer names are sometimes convolution_0, sometimes convolution.0
        animate_layers: [images, convolution.0, conv_layer1, convolution.1, conv_layer2, dense1]
    - plot: # the folder must exist first
        loss_per_batch: /Users/Natsume/Downloads/temp_folders/demo_cifar/plot1.png
        loss_per_time: /Users/Natsume/Downloads/temp_folders/demo_cifar/plot2.png
        throughput_per_time: /Users/Natsume/Downloads/temp_folders/demo_cifar/plot3.png
  weights: # are the folders below created automatically?
    initial: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.best.valid.w
    best: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.best.train.w
    last: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.last.w
  checkpoint:
    path: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar-checkpoint
    batches: 500 # batches, samples, epochs, minutes: each, if present, must be an integer (not a string, not null, not None)
    samples: 1000
    epochs: 1
    minutes: 1000
    validation: no
  optimizer:
    name: adam
    learning_rate: 0.001

validate:
  data:
    - cifar:
       <<: *cifar
       parts: 5
  provider:
    num_batches: 1
  weights: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.best.valid.w
  hooks:
    - output: # the folder and file must exist first
        path: /Users/Natsume/Downloads/temp_folders/demo_cifar/output.pkl
        format: pickle


test: &test
  data:
    - cifar:
       <<: *cifar
       parts: test
  weights: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.best.valid.w
  provider:
    num_batches: 10

evaluate:
  <<: *test
  destination: /Users/Natsume/Downloads/temp_folders/demo_cifar/cifar.results.pkl


loss:
  - target: labels
    name: categorical_crossentropy
...
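Note: as posted, the settings block above selects backend: tensorflow. To reproduce the failure in the title, the backend presumably has to be switched to Theano; a minimal sketch of that change (an assumed edit, not part of the original Kurfile):

  # settings > backend, switched to the Theano backend (assumed edit)
  backend:
    name: keras
    backend: theano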

I got the following error, which only occurs with the Keras Theano backend. Could you give me some hints on how to solve it? Thanks!

[ERROR 2017-05-17 00:00:28,337 kur.model.executor:295] Exception raised during training.
Traceback (most recent call last):
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 292, in train
    **kwargs
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 729, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 116, in compile
    **kwargs
  File "/Users/Natsume/Downloads/kur/kur/backend/keras_backend.py", line 654, in compile
    compiled.trainable_weights, total_loss
  File "/Users/Natsume/Downloads/kur/kur/optimizer/optimizer.py", line 47, in optimize
    return keras_optimizer.get_updates(weights, [], loss)
  File "/Users/Natsume/Downloads/keras/keras/optimizers.py", line 381, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Users/Natsume/Downloads/keras/keras/optimizers.py", line 47, in get_gradients
    grads = K.gradients(loss, params)
  File "/Users/Natsume/Downloads/keras/keras/backend/theano_backend.py", line 1180, in gradients
    return T.grad(loss, variables)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 561, in grad
    grad_dict, wrt, cost_name)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1324, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1324, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1113, in access_term_cache
    input_grads = node.op.grad(inputs, new_output_grads)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/tensor/nnet/abstract_conv.py", line 828, in grad
    d_bottom = bottom.type.filter_variable(d_bottom)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/tensor/type.py", line 233, in filter_variable
    self=self))
TypeError: Cannot convert Type TensorType(float32, 4D) (of Variable AbstractConv2d_gradInputs{border_mode='valid', subsample=(1, 1), filter_flip=True, imshp=(None, 64, 31, 31), kshp=(32, 64, 2, 2)}.0) into Type TensorType(float64, 4D). You can try to manually convert AbstractConv2d_gradInputs{border_mode='valid', subsample=(1, 1), filter_flip=True, imshp=(None, 64, 31, 31), kshp=(32, 64, 2, 2)}.0 into a TensorType(float64, 4D).
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/bin/kur", line 11, in <module>
    load_entry_point('kur', 'console_scripts', 'kur')()
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 612, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 75, in train
    func(step=args.step)
  File "/Users/Natsume/Downloads/kur/kur/kurfile.py", line 432, in func
    return trainer.train(**defaults)
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 292, in train
    **kwargs
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 729, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/Natsume/Downloads/kur/kur/model/executor.py", line 116, in compile
    **kwargs
  File "/Users/Natsume/Downloads/kur/kur/backend/keras_backend.py", line 654, in compile
    compiled.trainable_weights, total_loss
  File "/Users/Natsume/Downloads/kur/kur/optimizer/optimizer.py", line 47, in optimize
    return keras_optimizer.get_updates(weights, [], loss)
  File "/Users/Natsume/Downloads/keras/keras/optimizers.py", line 381, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Users/Natsume/Downloads/keras/keras/optimizers.py", line 47, in get_gradients
    grads = K.gradients(loss, params)
  File "/Users/Natsume/Downloads/keras/keras/backend/theano_backend.py", line 1180, in gradients
    return T.grad(loss, variables)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 561, in grad
    grad_dict, wrt, cost_name)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1324, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1324, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 973, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1279, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/gradient.py", line 1113, in access_term_cache
    input_grads = node.op.grad(inputs, new_output_grads)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/tensor/nnet/abstract_conv.py", line 828, in grad
    d_bottom = bottom.type.filter_variable(d_bottom)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/theano/tensor/type.py", line 233, in filter_variable
    self=self))
TypeError: Cannot convert Type TensorType(float32, 4D) (of Variable AbstractConv2d_gradInputs{border_mode='valid', subsample=(1, 1), filter_flip=True, imshp=(None, 64, 31, 31), kshp=(32, 64, 2, 2)}.0) into Type TensorType(float64, 4D). You can try to manually convert AbstractConv2d_gradInputs{border_mode='valid', subsample=(1, 1), filter_flip=True, imshp=(None, 64, 31, 31), kshp=(32, 64, 2, 2)}.0 into a TensorType(float64, 4D).
@ajsyp (Collaborator) commented Jun 5, 2017

I cannot reproduce this in my environment. I basically took your Kurfile, removed the plot_weights hook and the leaky ReLU references, set the backend to Theano, and tried to run it. It worked fine. Maybe something changed in Theano? Can you do a pip freeze?
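For reference: this TypeError usually means Theano's floatX setting (float64 by default) disagrees with the float32 tensors Keras builds, so the gradient of the convolution cannot be cast back. A plausible workaround, untested against this exact setup, is to pin Theano's floatX to float32. THEANO_FLAGS and ~/.theanorc are standard Theano configuration; the grep filter below is just an illustrative way to answer the pip freeze request:

# force float32 for a single run via Theano's standard configuration flag:
THEANO_FLAGS='floatX=float32' kur train cifar.yml

# or persistently, in ~/.theanorc:
#   [global]
#   floatX = float32

# capture the installed versions ajsyp asked about:
pip freeze | grep -iE 'theano|keras|kur'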
