
About function get_filter_similar() #65

Open
BlossomingL opened this issue Jul 24, 2020 · 7 comments

Comments
@BlossomingL

Thanks for your great work!
But I have a question about the function 'get_filter_similar()'. In the paper, Eq. (2) and Eq. (3) first compute x(GM) and then find the filter(s) nearest to x(GM). In the code, however, you only compute the similarity (distance) matrix, sum over its columns, and select the filters with the smallest sums to prune, without ever computing x(GM). Is this different?

Sorry to bother you!

@he-y
Owner

he-y commented Jul 25, 2020

Thanks for your interest.
The following paragraphs explain why we do not calculate the GM directly.
[screenshot of the relevant explanation from the paper]

We instead find the filter with the most "GM" property, which reduces the computation.
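
For readers mapping this back to the code, here is a minimal sketch of the idea; the names `select_filters_near_gm`, `weight`, and `num_prune` are illustrative and not the repo's actual `get_filter_similar()` signature. It builds the pairwise distance matrix between flattened filters, sums each row, and treats the filters with the smallest sums as the ones with the most "GM" property:

    import torch

    def select_filters_near_gm(weight: torch.Tensor, num_prune: int) -> torch.Tensor:
        """Return indices of the filters with the smallest total distance to all others.

        weight:    conv weight of shape (out_channels, in_channels, kH, kW)
        num_prune: number of filters to mark for pruning
        """
        # Flatten each filter into a vector: (out_channels, in_channels * kH * kW)
        flat = weight.view(weight.size(0), -1)

        # Pairwise Euclidean distances between filters (the "similar matrix" in the code)
        dist = torch.cdist(flat, flat, p=2)

        # Summing a row gives that filter's total distance to every other filter
        total_dist = dist.sum(dim=1)

        # The filters with the smallest sums are closest to the geometric median
        return torch.argsort(total_dist)[:num_prune]

Restricting the search to the filters themselves (picking the filter that minimizes the sum of distances to all other filters) avoids solving for x(GM) over a continuous space and then finding its nearest filter, which is why the explicit x(GM) step does not appear in the code.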

@BlossomingL
Author

OK, I see. Thanks!

@BlossomingL
Author

Hi~
I have another question about training. Although this was raised in issue #10, I am still confused. Once filters are pruned, they are set to zero, and since you also set the gradients of the pruned filters to zero, they won't be updated during training. So will they be pruned again at the next pruning step?
Sorry to bother you!

@BlossomingL
Author

The paper says "The pruning operation is conducted at the end of every training epoch", but you have already pruned all layers at the beginning, so why prune again at the end of every training epoch?

@he-y
Owner

he-y commented Jul 27, 2020

Hi.
Let me explain.
First, "pruning filters" essentially means "setting them to zero".
Will the pruned filters be recovered (become non-zero) during fine-tuning? That depends on the gradients, and there are two situations:

  1. The gradients are normal and non-zero.
    Yes, the pruned filters are recovered, so further pruning is needed.
  2. The gradients are forced to zero.
    Then the pruned filters cannot be recovered, so further pruning is NOT needed. However, pruning these filters again (setting them to zero again) does not influence the results, since they are already zero.

In the FPGM experiments, we choose situation 2.
However, FPGM works in both situation 1 and situation 2. That is why "The pruning operation is conducted at the end of every training epoch": this operation handles situation 1 and does not affect situation 2.
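
A minimal sketch of situation 2, assuming illustrative helper names (`prune_filters`, `mask_gradients`) rather than the repository's actual masking code: pruning zeroes the selected filters, and masking their gradients before each optimizer step keeps them at zero, so re-pruning at the end of an epoch changes nothing.

    import torch
    import torch.nn as nn

    def prune_filters(conv: nn.Conv2d, prune_idx) -> torch.Tensor:
        """'Prune' = zero the selected output filters; return the keep-mask."""
        mask = torch.ones_like(conv.weight.data)
        mask[prune_idx] = 0.0
        conv.weight.data.mul_(mask)      # pruned filters are now exactly zero
        return mask

    def mask_gradients(conv: nn.Conv2d, mask: torch.Tensor) -> None:
        """Zero the gradients of pruned filters so optimizer.step() leaves them at zero."""
        if conv.weight.grad is not None:
            conv.weight.grad.mul_(mask)

If `mask_gradients` is called after every `loss.backward()` (situation 2), the pruned filters never move away from zero and the end-of-epoch pruning is a no-op; without it (situation 1), the end-of-epoch pruning is what pushes them back to zero.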

@BlossomingL
Author

BlossomingL commented Jul 27, 2020

Thanks for your quick reply! Got it!

@lovepan1

hi, thanks for the work. I read the questions and discussions, but I don't understand this code:

    optimizer.zero_grad()
    loss.backward()
    # Mask grad for iteration
    m.do_grad_mask()
    optimizer.step()

in pruning_imagenet.py line 281: m.do_grad_mask()
If I prune my model and don't update the pruned channels' gradients, meaning the gradients are forced to zero, do I still need to delete the channels at every pruning epoch? @BlossomingL @he-y
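
For what it's worth, here is a hedged sketch of what a `do_grad_mask`-style step usually does; the class and attribute names below (`GradMasker`, `masks`) are assumptions for illustration, not the repository's actual implementation, so please check pruning_imagenet.py for the real code. The idea is to multiply each pruned layer's gradient by a stored 0/1 mask between `loss.backward()` and `optimizer.step()`, so pruned channels receive no update:

    import torch
    import torch.nn as nn

    class GradMasker:
        """Keeps a 0/1 mask per parameter and applies it to that parameter's gradient."""

        def __init__(self, model: nn.Module):
            self.model = model
            self.masks = {}  # parameter name -> mask tensor, registered when a layer is pruned

        def register(self, name: str, mask: torch.Tensor) -> None:
            self.masks[name] = mask

        def do_grad_mask(self) -> None:
            # Call between loss.backward() and optimizer.step()
            for name, param in self.model.named_parameters():
                if name in self.masks and param.grad is not None:
                    param.grad.data.mul_(self.masks[name])

Under this reading, the pruned channels' gradients are forced to zero, which corresponds to situation 2 in the discussion above.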
