Error in svd when response_type="transient" #4

Open
daniel-wells opened this issue Sep 20, 2017 · 3 comments

Comments

@daniel-wells

Firstly thanks for adding the transient option!

However, when response_type="transient" (i.e. no genes are designated as switching), svd throws an error:

library(ouija)
data(example_gex)
oui <- ouija(example_gex, response_type="transient")
Error in svd(x, nu = 0, nv = k) : a dimension is zero

Presumably this is due to attempting prcomp on a matrix with no genes:

ouija/R/ouija.R

Line 170 in 1ebc4ea

pc1 <- prcomp(Y_switch)$x[,1]

Perhaps just do a PCA of the full Y?

Also, in the README the data is called synth_gex rather than example_gex.
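
For anyone wanting to check the diagnosis without installing ouija, here is a minimal sketch in base R. It assumes that Y_switch ends up with zero columns when no genes are marked as switch-like; the matrix name and its size are made up for illustration. It should trigger the same svd complaint as in the report above.

```r
# Hypothetical stand-in for an empty Y_switch: 5 cells, 0 switch-like genes.
Y_switch_empty <- matrix(numeric(0), nrow = 5, ncol = 0)

# prcomp() hands the centred matrix to svd(), which rejects a zero dimension.
msg <- tryCatch(
  {
    prcomp(Y_switch_empty)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Print whatever message was raised (or "no error" if none was).
print(msg)
```

If the message matches the one reported, the failure is in the PCA initialisation rather than in the Stan model itself.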

@kieranrcampbell
Owner

Hi Daniel,

Thanks for catching this. The reason for this behaviour is that if all your genes were transient, PC1 wouldn't correspond to anything like "true" pseudotime, whereas PCA on only the switch-like genes will still approximate pseudotime. I can envisage a few ways to fix this:

  • Don't allow all-transient datasets
  • PCA the transient genes anyway
  • Set it randomly
  • Initialise from a different pseudotime algorithm (DPT is the only one I've found that actually works when everything is transient)

In any case, Stan with HMC is normally efficient enough that initialising randomly gets you back to the "truth", whereas Stan with ADVI is obviously sensitive to init.

Open to suggestions,

Thanks,

Kieran

@daniel-wells
Author

Ah I see. I think random initialisation seems like a good option - simple and unrestrictive.
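
A rough sketch of what that fallback might look like, in case it helps the discussion. This is not ouija's actual code: init_pseudotime and is_switch are hypothetical names, and rescaling PC1 to [0, 1] is an assumption made here so that both branches return values on the same scale.

```r
# Hypothetical helper, not part of ouija: choose initial pseudotimes.
# Y: cells-by-genes expression matrix; is_switch: logical, one entry per gene.
init_pseudotime <- function(Y, is_switch) {
  stopifnot(is.matrix(Y), length(is_switch) == ncol(Y))
  n_cells <- nrow(Y)

  if (!any(is_switch)) {
    # No switch-like genes: PC1 is not a useful proxy, so draw at random.
    return(runif(n_cells))
  }

  # Otherwise use PC1 of the switch-like genes, as in the line quoted above.
  Y_switch <- Y[, is_switch, drop = FALSE]
  pc1 <- prcomp(Y_switch)$x[, 1]

  # Rescale to [0, 1] (an assumption of this sketch, see the note above).
  (pc1 - min(pc1)) / (max(pc1) - min(pc1))
}

# Toy data: 10 cells, 4 genes.
set.seed(1)
Y <- matrix(rnorm(40), nrow = 10, ncol = 4)

t_transient <- init_pseudotime(Y, rep(FALSE, 4))            # all transient
t_mixed <- init_pseudotime(Y, c(TRUE, TRUE, FALSE, FALSE))  # two switch genes
```

The drop = FALSE matters when only one gene is switch-like, since otherwise the subset collapses to a vector. As noted earlier in the thread, a random start is likely fine with HMC but may be less reliable with ADVI.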
