Enhancement suggestion - add an mlbam ID column to all by-player results #76

Closed
znmeb opened this issue Jun 12, 2015 · 9 comments

@znmeb
Contributor

znmeb commented Jun 12, 2015

I'm now at the point where I want to join openWAR results with other data. I found the 'idTT' table in openWARData, but rather than mess around with 'fuzzy joins' I'd like to just add player name columns to that table for various other datasets and index everything by mlbam ID.

I can probably figure out from the code how to do this if you can't get to it in this round of changes. Right now the tables I want most are the "shakeWAR" results and the summary that has WAR and the RAA values.
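As a rough illustration of why an exact ID key is preferable to fuzzy name matching, here is a minimal sketch of a join on a shared ID column. It is written in plain Python rather than against openWAR itself; the column names (`mlbam_id`, `name`, `WAR`, `salary`) and the rows are made-up placeholders, not the actual `idTT` or openWAR schema.

```python
# Illustrative sketch only: column names and rows are made-up placeholders,
# not the actual openWAR or idTT schema.

def join_on_id(left, right, key="mlbam_id"):
    """Inner-join two lists of dict rows on an exact match of `key`."""
    right_by_id = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_id.get(row[key])
        if match is not None:
            # Values from `left` win when both tables share a column name.
            joined.append({**match, **row})
    return joined

war_rows = [
    {"mlbam_id": 1001, "name": "Player A", "WAR": 2.5},
    {"mlbam_id": 1002, "name": "Player B", "WAR": 1.0},
]
other_rows = [
    # Same player as 1002, but spelled differently in the other dataset.
    {"mlbam_id": 1002, "name": "B. Player", "salary": 500},
    {"mlbam_id": 1003, "name": "Player C", "salary": 700},
]

joined = join_on_id(war_rows, other_rows)
print(joined)
```

The two spellings of the same player's name never need to be reconciled, because only the ID is compared.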

@beanumber
Owner

Great idea. We will add this!

@beanumber
Owner

Just added this to

summary.openWARPlayers()
summary.do.openWARPlayers()

in 7e53e60.

Does that do the trick?

@znmeb
Contributor Author

znmeb commented Jun 13, 2015

Looks good - I can't test it at the moment. I ran a 5000-sample shakeWAR over 2015 to date last night and it's still saving the output file as .rdata ;-)

@znmeb
Contributor Author

znmeb commented Jun 14, 2015

OK - it's working! Thanks!

One more enhancement - let shakeWAR work from the output of makeWAR when you only want to resample the plays. On my workstation, for 2015 to date, makeWAR takes about 1.3 minutes and each resample takes about 0.03 minutes.

When you wrote the paper, did you resample only the plays, or the models too? If I have a spare overnight I might try eight hours' worth of the 'both' option. ;-)
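To make the "plays only" idea concrete, here is a minimal bootstrap sketch in plain Python. It is not how shakeWAR is implemented; it assumes each play has already been assigned a fixed value by the fitted models (the step makeWAR performs), and the `(player_id, value)` layout and the numbers are hypothetical placeholders.

```python
import random
from collections import defaultdict

def resample_totals(plays, n_resamples, seed=0):
    """Bootstrap per-player totals by resampling plays with replacement.

    `plays` is a list of (player_id, value) pairs. The per-play values are
    treated as fixed, i.e. no model is refit inside the loop.
    """
    rng = random.Random(seed)
    players = sorted({pid for pid, _ in plays})
    totals = defaultdict(list)
    for _ in range(n_resamples):
        sample = rng.choices(plays, k=len(plays))
        sums = dict.fromkeys(players, 0.0)
        for pid, value in sample:
            sums[pid] += value
        for pid in players:
            totals[pid].append(sums[pid])
    return dict(totals)

def interval(draws, lower=0.025, upper=0.975):
    """Crude percentile interval from a list of bootstrap draws."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]

# Placeholder plays: (hypothetical player ID, hypothetical per-play value).
plays = [(1001, 0.2), (1001, -0.1), (1002, 0.4), (1002, 0.0), (1001, 0.3)]
totals = resample_totals(plays, n_resamples=200)
for pid, draws in totals.items():
    print(pid, interval(draws))
```

Resampling the models as well would mean refitting them inside the loop on every resample, which is why that option is so much slower.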

@gjm112
Collaborator

gjm112 commented Jun 15, 2015

We re-sampled both plays and models.

Gregory J. Matthews, Ph.D.
Assistant Professor
Department of Mathematics and Statistics
Loyola University Chicago
E-mail: gjm112 -at- gmail.com
Blog: StatsInTheWild.com
Art: etsy.me/1JAsYz9
Twitter: @statsinthewild http://www.twitter.com/StatsInTheWild
Twitter: @StatsClass http://www.twitter.com/statsclass

@znmeb
Contributor Author

znmeb commented Jun 16, 2015

@gjm112 Well then, I'll try that tonight. I just paid the power bill. ;-)

@gjm112
Collaborator

gjm112 commented Jun 16, 2015

It is really slow, and the variance in the models is dwarfed by the variance from resampling the plays.

@znmeb
Contributor Author

znmeb commented Jun 16, 2015

I ran 4096 plays-only resamples for 2015 to date the other night. It took about three hours (4 GHz CPU, 32 GB of RAM, probably single-threaded). This is an 8-core box, and Monte Carlo is supposed to be embarrassingly parallel.
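The "embarrassingly parallel" point can be sketched as follows: because each resample is independent, the 4096 draws can be split into chunks with their own seeds and handed to any map-style executor. This is a plain-Python illustration under assumptions, not openWAR code; the function names and the placeholder values are hypothetical, and the statistic (a resampled mean) stands in for the real per-player computation. Under ideal scaling a three-hour run split eight ways would take roughly 23 minutes, though real speedups are usually smaller.

```python
import random

def split_resamples(n_resamples, n_workers):
    """Split n_resamples into n_workers chunk sizes that differ by at most one."""
    base, extra = divmod(n_resamples, n_workers)
    return [base + (1 if i < extra else 0) for i in range(n_workers)]

def run_chunk(job):
    """Run one independent chunk; returns one resampled mean per draw."""
    values, n_draws, seed = job
    rng = random.Random(seed)  # separate seed per chunk keeps streams distinct
    n = len(values)
    return [sum(rng.choices(values, k=n)) / n for _ in range(n_draws)]

def run_all(values, n_resamples, n_workers, map_fn=map):
    """Dispatch chunks through map_fn (serial `map` by default) and merge them."""
    sizes = split_resamples(n_resamples, n_workers)
    jobs = [(values, size, seed) for seed, size in enumerate(sizes)]
    results = []
    for chunk in map_fn(run_chunk, jobs):
        results.extend(chunk)
    return results

values = [0.1, -0.2, 0.3, 0.0, 0.5]  # placeholder per-play values
draws = run_all(values, n_resamples=4096, n_workers=8)
print(len(draws))
```

To actually use several cores, `map_fn` could be the `map` method of a `concurrent.futures.ProcessPoolExecutor(max_workers=8)`, created inside an `if __name__ == "__main__":` guard; the serial default is used here so the sketch runs anywhere.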

@beanumber
Owner

@znmeb, I've added the following methods in c096eda:

shakeWAR.list()
shakeWAR.openWARPlays()

I think this achieves the functionality that you're looking for. Good suggestion!
We'll have to defer the speed issue until the next release. Note this is a duplicate of #42.
