
About function get_filter_similar() #65

Open
BlossomingL opened this issue Jul 24, 2020 · 7 comments

Comments
@BlossomingL

Thanks for your great work!
But I have a question about the function 'get_filter_similar()'. In the paper, Eq. (2) and Eq. (3) first compute x(GM) and then find the filter(s) nearest to x(GM). In the code, however, you only compute the similarity (distance) matrix, sum over its columns, and select the filters with the smallest sums to prune, without ever computing x(GM). Is this different?

Sorry to bother you!

@he-y
Owner

he-y commented Jul 25, 2020

Thanks for your interest.
The following paragraphs explain why we do not calculate the GM directly.
[screenshot of the relevant explanation from the paper]

We instead find the filter with the most "GM" property, which reduces the computation.
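
For readers mapping this back to the code, here is a minimal sketch of the idea; the names `select_filters_near_gm`, `weight`, and `num_prune` are illustrative and not the repo's actual `get_filter_similar()` signature. It builds the pairwise distance matrix between flattened filters, sums each row, and treats the filters with the smallest sums as the ones with the most "GM" property:

    import torch

    def select_filters_near_gm(weight: torch.Tensor, num_prune: int) -> torch.Tensor:
        """Return indices of the filters with the smallest total distance to all others.

        weight:    conv weight of shape (out_channels, in_channels, kH, kW)
        num_prune: number of filters to mark for pruning
        """
        # Flatten each filter into a vector: (out_channels, in_channels * kH * kW)
        flat = weight.view(weight.size(0), -1)

        # Pairwise Euclidean distances between filters (the "similar matrix" in the code)
        dist = torch.cdist(flat, flat, p=2)

        # Summing a row gives that filter's total distance to every other filter
        total_dist = dist.sum(dim=1)

        # The filters with the smallest sums are closest to the geometric median
        return torch.argsort(total_dist)[:num_prune]

Restricting the search to the filters themselves (picking the filter that minimizes the sum of distances to all other filters) avoids solving for x(GM) over a continuous space and then finding its nearest filter, which is why the explicit x(GM) step does not appear in the code.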

@BlossomingL
Author

OK, I see. Thanks!

@BlossomingL
Author

Hi~
I have another question about training. Although this was raised in issue #10, I am still confused. Once filters are pruned, they are set to zero, and since you also set the gradients of the pruned filters to zero, they won't be updated during training. So will they be pruned again at the next pruning step?
Sorry to bother you!

@BlossomingL
Author

The paper says "The pruning operation is conducted at the end of every training epoch", but you have already pruned all layers at the beginning, so why prune again at the end of every training epoch?

@he-y
Owner

he-y commented Jul 27, 2020

Hi.
Let me explain.
First, "pruning filters" essentially means "setting them to zero".
Will the pruned filters be recovered (become non-zero) during fine-tuning? That depends on the gradients, and there are two situations:

  1. The gradients are normal and non-zero.
    Yes, the pruned filters are recovered, so further pruning is needed.
  2. The gradients are forced to zero.
    Then the pruned filters cannot be recovered, so further pruning is NOT needed. However, pruning these filters again (setting them to zero again) does not influence the results, since they are already zero.

In the FPGM experiments, we choose situation 2.
However, FPGM works in both situation 1 and situation 2. That is why "The pruning operation is conducted at the end of every training epoch": this operation handles situation 1 and does not affect situation 2.
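
A minimal sketch of situation 2, assuming illustrative helper names (`prune_filters`, `mask_gradients`) rather than the repository's actual masking code: pruning zeroes the selected filters, and masking their gradients before each optimizer step keeps them at zero, so re-pruning at the end of an epoch changes nothing.

    import torch
    import torch.nn as nn

    def prune_filters(conv: nn.Conv2d, prune_idx) -> torch.Tensor:
        """'Prune' = zero the selected output filters; return the keep-mask."""
        mask = torch.ones_like(conv.weight.data)
        mask[prune_idx] = 0.0
        conv.weight.data.mul_(mask)      # pruned filters are now exactly zero
        return mask

    def mask_gradients(conv: nn.Conv2d, mask: torch.Tensor) -> None:
        """Zero the gradients of pruned filters so optimizer.step() leaves them at zero."""
        if conv.weight.grad is not None:
            conv.weight.grad.mul_(mask)

If `mask_gradients` is called after every `loss.backward()` (situation 2), the pruned filters never move away from zero and the end-of-epoch pruning is a no-op; without it (situation 1), the end-of-epoch pruning is what pushes them back to zero.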

@BlossomingL
Author

BlossomingL commented Jul 27, 2020

Thanks for your quick reply! Got it!

@lovepan1

hi, thanks for the work. I read the questions and discussions, but I don't understand this code:

    optimizer.zero_grad()
    loss.backward()
    # Mask grad for iteration
    m.do_grad_mask()
    optimizer.step()

in pruning_imagenet.py line 281: m.do_grad_mask()
If I prune my model and don't update the pruned channels' gradients, meaning the gradients are forced to zero, do I still need to delete the channels at every pruning epoch? @BlossomingL @he-y
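
For what it's worth, here is a hedged sketch of what a `do_grad_mask`-style step usually does; the class and attribute names below (`GradMasker`, `masks`) are assumptions for illustration, not the repository's actual implementation, so please check pruning_imagenet.py for the real code. The idea is to multiply each pruned layer's gradient by a stored 0/1 mask between `loss.backward()` and `optimizer.step()`, so pruned channels receive no update:

    import torch
    import torch.nn as nn

    class GradMasker:
        """Keeps a 0/1 mask per parameter and applies it to that parameter's gradient."""

        def __init__(self, model: nn.Module):
            self.model = model
            self.masks = {}  # parameter name -> mask tensor, registered when a layer is pruned

        def register(self, name: str, mask: torch.Tensor) -> None:
            self.masks[name] = mask

        def do_grad_mask(self) -> None:
            # Call between loss.backward() and optimizer.step()
            for name, param in self.model.named_parameters():
                if name in self.masks and param.grad is not None:
                    param.grad.data.mul_(self.masks[name])

Under this reading, the pruned channels' gradients are forced to zero, which corresponds to situation 2 in the discussion above.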
