Unable to reproduce the results with a new model #99

Open
LogicStuff opened this issue Aug 9, 2021 · 1 comment


LogicStuff commented Aug 9, 2021

I cannot train a comparably performing model after changing all MLP activations to LeakyReLU (a change I made because of the observed issue with unchanging loss).

At first, I used the hyperparameters from scripts/run_traj.sh and tried both pooling modules (although the default 2 m neighborhood is probably not sensible for social pooling). Both eth and zara1 reached a validation ADE of ~0.7 and FDE of ~1.5 at around epoch 70, shortly before the discriminator overpowered the generator and the whole model diverged.
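For readers comparing numbers: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted timesteps, and FDE is that distance at the final timestep only. Below is a minimal sketch for a single trajectory. The function name `displacement_errors` is hypothetical, and the repository's own evaluation code may batch and normalize differently.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between prediction and ground truth at each timestep.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over all predicted timesteps
    fde = dists[-1]                # error at the final timestep only
    return ade, fde

pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

Because the final timestep usually has the largest error, FDE tends to be larger than ADE, which matches the ~0.7 and ~1.5 values quoted above.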

image

While the mechanism for selecting the best model appears robust against this convergence issue, I would have expected the loss trends to be checked more thoroughly. It is also pointless to train the model any further after substantial divergence.
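One way to avoid training past the point of divergence is a simple check on the logged losses. The sketch below is a heuristic of my own and not part of the repository: `has_diverged`, `window`, and `factor` are hypothetical names, and it assumes positive loss values recorded once per epoch.

```python
def has_diverged(g_losses, d_losses, window=5, factor=2.0):
    """Heuristic divergence check on per-epoch loss histories.

    Returns True if, comparing the last `window` epochs with the `window`
    epochs before them, the mean generator loss grew by more than `factor`
    while the mean discriminator loss fell. Assumes positive losses.
    """
    if len(g_losses) < 2 * window or len(d_losses) < 2 * window:
        return False  # not enough history to compare two windows

    def mean(values):
        return sum(values) / len(values)

    g_prev = mean(g_losses[-2 * window:-window])
    g_last = mean(g_losses[-window:])
    d_prev = mean(d_losses[-2 * window:-window])
    d_last = mean(d_losses[-window:])
    return g_last > factor * g_prev and d_last < d_prev

stable = has_diverged([1.0] * 10, [1.4] * 10)
diverging = has_diverged([1.0] * 5 + [5.0] * 5, [1.4] * 5 + [0.2] * 5)
```

A training loop could call such a check after each epoch and stop early when it returns True. The thresholds would need tuning per dataset.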

I have since experimented with many hyperparameter settings. A larger batch size (128 sequences per batch) seems to stabilize the process the most, but I still cannot get below the metric values mentioned above. Even if I manage to train the model for 200-500 epochs with stable GAN losses (and diminishing D_loss_real), the predictions suffer greatly from some kind of directional bias: the trajectories of all pedestrians try to turn to the same heading, which remains constant across different inputs.
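The directional bias described above could be quantified with a standard circular statistic, the mean resultant length of the predicted headings. This is a sketch under my own assumptions: the helper names are hypothetical, and each trajectory is assumed to be a list of (x, y) points whose heading is taken from the first to the last point.

```python
import math

def heading(traj):
    """Overall heading in radians from the first to the last (x, y) point."""
    (x0, y0), (x1, y1) = traj[0], traj[-1]
    return math.atan2(y1 - y0, x1 - x0)

def mean_resultant_length(trajs):
    """Concentration of headings: near 1 if all agree, near 0 if spread out."""
    angles = [heading(t) for t in trajs]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(c, s)

# Two pedestrians both heading east: fully concentrated headings.
same = mean_resultant_length([[(0, 0), (1, 0)], [(5, 5), (7, 5)]])
# Two pedestrians heading in opposite directions: headings cancel out.
opposite = mean_resultant_length([[(0, 0), (1, 0)], [(0, 0), (-1, 0)]])
```

Since real pedestrians in a corridor also share headings, the value for predictions is only meaningful when compared with the same statistic on the ground-truth trajectories.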

I have stopped worrying about the ADE and FDE metrics for N > 1 because of the issues discussed here. Unfortunately, with N=1, the metrics are too noisy... Perhaps I will just take the average instead of the minimum in evaluate_helper of scripts/evaluate_model.py and also focus on collisions.
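The difference between the two aggregation choices can be sketched as follows. This is not the repository's `evaluate_helper`; `aggregate_errors` is a hypothetical stand-in that takes the N per-sample errors for one sequence and collapses them to one number.

```python
def aggregate_errors(sample_errors, mode="min"):
    """Collapse the errors of N sampled predictions for one sequence.

    Hypothetical helper: "min" is best-of-N, "mean" averages all samples.
    """
    if not sample_errors:
        raise ValueError("need at least one sample")
    if mode == "min":
        return min(sample_errors)
    if mode == "mean":
        return sum(sample_errors) / len(sample_errors)
    raise ValueError(f"unknown mode: {mode!r}")

errors = [0.4, 0.9, 1.4]  # illustrative ADE values for N = 3 samples
best_of_n = aggregate_errors(errors, mode="min")
average = aggregate_errors(errors, mode="mean")
```

The minimum can only decrease as N grows, so it rewards sample diversity as much as accuracy. The mean does not have that property, and averaging over samples also reduces the noise of a single N=1 draw.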

Update: I have also tried the (correct) hyperparameters extracted from the pretrained models via scripts/print_args.py. The vast majority of independent training runs still diverge within 50 epochs. Here, I have noticed that weakening the discriminator with --d_type local helps stabilize the losses. Since the discriminator then does not capture social interactions in any way, I am now also trying higher --g_steps/--d_steps ratios.
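To illustrate what a higher --g_steps/--d_steps ratio means for the update order, here is a small sketch. It assumes that each batch is used for exactly one update and that discriminator updates come first in each cycle; both are assumptions of mine, and `step_schedule` is a hypothetical name, not a function from the repository.

```python
from itertools import cycle, islice

def step_schedule(g_steps, d_steps, n_batches):
    """List which network ('d' or 'g') is updated on each of n_batches batches.

    Assumed pattern: d_steps discriminator updates, then g_steps generator
    updates, repeated until the batches run out.
    """
    if g_steps < 0 or d_steps < 0 or g_steps + d_steps == 0:
        raise ValueError("need non-negative step counts, not both zero")
    pattern = ["d"] * d_steps + ["g"] * g_steps
    return list(islice(cycle(pattern), n_batches))

# With g_steps=2 and d_steps=1 the generator gets two of every three batches.
schedule = step_schedule(g_steps=2, d_steps=1, n_batches=6)
```

Under these assumptions, raising the ratio gives the generator more updates per discriminator update, which is another way of weakening the discriminator without removing its access to social context.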

LogicStuff changed the title from "Can anyone reproduce the results?" to "Unable to reproduce the results with a new model" on Aug 11, 2021
@liuyu9661

Hi, I have a similar problem. I tried to reproduce the results by just running 'train.py' with no changes to the model. However, the ADE and FDE are very large.
I will check my log file to see what happened.
