KeyError: 'im_info' and 'gt_boxes' when running 'example_train_slurm.sh' #4

Open
RafaelMostert opened this issue Nov 3, 2018 · 2 comments


@RafaelMostert

Running the 'demo.py' works fine, but running 'example_train_slurm.sh' returns the following key error:

Traceback (most recent call last):
  File "/local/s174/rgz_rcnn/tools/train_net.py", line 105, in <module>
    max_iters=args.max_iters, start_iter=args.start_iter)
  File "/local/s174/rgz_rcnn/tools/../lib/fast_rcnn/train.py", line 333, in train_net
    sw.train_model(sess, max_iters, start_iter=start_iter)
  File "/local/s174/rgz_rcnn/tools/../lib/fast_rcnn/train.py", line 217, in train_model
    self.net.im_info: blobs['im_info'],
KeyError: 'im_info'

The traceback refers to this code in "rgz_rcnn/lib/fast_rcnn/train.py":

210             # get one batch
211             blobs = data_layer.forward()
212 
213             # DEBUG
214             print(blobs.keys())
215             # Make one SGD update
216             feed_dict = {self.net.data: blobs['data'],
217                           self.net.im_info: blobs['im_info'],
218                          self.net.keep_prob: 0.5,
219                          self.net.gt_boxes: blobs['gt_boxes']}

I added the debug statement on line 214, which prints:

['bbox_inside_weights', 'labels', 'rois', 'bbox_targets', 'bbox_outside_weights', 'data']

This suggests that not only 'im_info' but also 'gt_boxes' is missing from the blobs returned by the data layer.

Any suggestions on what the problem might be?
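
A guard along these lines can make the failure easier to read than a bare KeyError. It is a minimal sketch, not code from rgz_rcnn: the helper name `missing_blob_keys` and the `REQUIRED_KEYS` tuple are hypothetical, and the required keys are simply the ones the feed_dict on lines 216-219 looks up in blobs.

```python
# Hypothetical helper, not part of rgz_rcnn: report which keys the training
# step needs but the data layer did not return.
REQUIRED_KEYS = ('data', 'im_info', 'gt_boxes')


def missing_blob_keys(blobs, required=REQUIRED_KEYS):
    """Return the required keys that are absent from the blobs dict, sorted."""
    return sorted(set(required) - set(blobs))


# Keys copied from the debug output above; the values are placeholders.
observed = dict.fromkeys(['bbox_inside_weights', 'labels', 'rois',
                          'bbox_targets', 'bbox_outside_weights', 'data'])
print(missing_blob_keys(observed))  # ['gt_boxes', 'im_info']
```

Calling such a check right after data_layer.forward() would report both missing keys at once instead of stopping at the first one.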

@chenwuperth
Owner

It looks like the data layer is not reading the data set properly. Before delving further into the code: how did you run 'example_train_slurm.sh'? The script was written to run on a cluster where the Python (TF) environment is pre-installed, so it won't invoke your Python virtual environment as described in the README. Would you mind sharing your version of 'example_train_slurm.sh' somewhere on git?
Thanks!

@RafaelMostert
Author

RafaelMostert commented Nov 5, 2018

Sure.
I changed 'example_train_slurm.sh' to 'example_train.sh', which results in the KeyError (see crash_log_example_train.txt for the full output):

#!/bin/bash

export CUDA_VISIBLE_DEVICES=7
source activate py2-tensorflow

RGZ_RCNN=/local/s174/rgz_rcnn

python $RGZ_RCNN/tools/train_net.py \
                    --device 'gpu' \
                    --device_id 0 \
                    --imdb rgz_2017_trainD4 \
                    --iters 80000 \
                    --cfg $RGZ_RCNN/experiments/cfgs/faster_rcnn_end2end.yml \
                    --network rgz_train \
                    --weights $RGZ_RCNN/data/pretrained_model/imagenet/VGG_imagenet.npy

The example_test_cpu.sh works fine and is adapted to look like this:

# please change to your own python virtual environment path
export CUDA_VISIBLE_DEVICES=7
source activate py2-tensorflow
RGZ_RCNN=/local/s174/rgz_rcnn

python $RGZ_RCNN/tools/test_net.py \
                    --device 'cpu' \
                    --device_id 0 \
                    --imdb rgz_2017_testD4 \
                    --cfg $RGZ_RCNN/experiments/cfgs/faster_rcnn_end2end.yml \
                    --network rgz_test \
                    --weights $RGZ_RCNN/data/pretrained_model/rgz/D4/VGGnet_fast_rcnn-80000 \
                    --comp

Irrespective of the --device flag, 'example_test_cpu.sh' always runs on the GPU:
success_log_example_test_cpu.txt
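
For context on the GPU observation: both scripts export CUDA_VISIBLE_DEVICES=7, which leaves a GPU visible to the process, and TensorFlow places ops on a visible GPU by default. How rgz_rcnn handles the --device flag internally is not shown here, so the following is only a sketch of one common way to keep a run on the CPU, namely hiding all CUDA devices before TensorFlow is imported. The helper name `hide_cuda_devices` is hypothetical.

```python
import os


def hide_cuda_devices(environ=None):
    """Set CUDA_VISIBLE_DEVICES to an empty string so CUDA sees no GPUs."""
    if environ is None:
        environ = os.environ
    environ['CUDA_VISIBLE_DEVICES'] = ''
    return environ


# This has to run before TensorFlow (or any other CUDA user) is imported,
# because the variable is read when CUDA initialises.
hide_cuda_devices()
print(repr(os.environ['CUDA_VISIBLE_DEVICES']))
```

The same effect can be had from the shell by exporting an empty CUDA_VISIBLE_DEVICES instead of 7 in the test script.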
