Right now `RecListAnalysis` is good but limited: it only computes per-user metrics.
It would help standardization of evaluation procedures if we had a more coherent "analyze" (and maybe "run") tool for experiments. The first version, of course, would just be for analysis.
- Specify experiment axes instead of inferring them?
- Support global metrics
- Specify list lengths as analysis parameter
- Support metrics with additional data (novelty, etc.)
- Clean up metric interface design
- Support analysis (sig tests, CIs, distributions, etc.)
- Support results in DuckDB?
This ticket probably deserves its own epic.
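To make a few of the items above concrete, here is a rough sketch of how a per-user metric, a global metric, a list-length parameter, and a bootstrap CI could fit together. Everything in it (`hit_rate`, `catalog_coverage`, `analyze`, `bootstrap_ci`, and the dict-based data layout) is hypothetical and only for illustration; none of it is the existing `RecListAnalysis` API or a proposed final design.

```python
import random
from statistics import mean
from typing import AbstractSet, Dict, Mapping, Sequence, Tuple

# Hypothetical data layout: user id -> ranked item ids, user id -> relevant items.
Recs = Mapping[str, Sequence[str]]
Truth = Mapping[str, AbstractSet[str]]


def hit_rate(recs: Sequence[str], truth: AbstractSet[str]) -> float:
    """Per-user metric: 1.0 if any recommended item is relevant, else 0.0."""
    return 1.0 if any(item in truth for item in recs) else 0.0


def catalog_coverage(recs: Recs, n_items: int) -> float:
    """Global metric: fraction of the catalog that appears in any list."""
    seen = {item for lst in recs.values() for item in lst}
    return len(seen) / n_items


def analyze(recs: Recs, truth: Truth, n_items: int, k: int) -> Dict[str, object]:
    """Truncate every list to length k, then compute both kinds of metric."""
    truncated = {user: list(lst[:k]) for user, lst in recs.items()}
    per_user = {
        user: hit_rate(lst, truth.get(user, frozenset()))
        for user, lst in truncated.items()
    }
    return {
        "per_user": per_user,
        "hit_rate": mean(list(per_user.values())),
        "coverage": catalog_coverage(truncated, n_items),
    }


def bootstrap_ci(
    values: Sequence[float], n_boot: int = 1000, alpha: float = 0.05, seed: int = 0
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the mean of per-user values."""
    rng = random.Random(seed)
    vals = list(values)
    means = sorted(mean(rng.choices(vals, k=len(vals))) for _ in range(n_boot))
    lower = means[int(n_boot * alpha / 2)]
    upper = means[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


if __name__ == "__main__":
    recs = {"u1": ["a", "b", "c"], "u2": ["d", "a", "e"]}
    truth = {"u1": {"b"}, "u2": {"e"}}
    result = analyze(recs, truth, n_items=5, k=2)
    print(result["hit_rate"], result["coverage"])
    print(bootstrap_ci(list(result["per_user"].values())))
```

The point of the sketch is the separation: per-user metrics take one list and one user's truth, global metrics take the whole set of lists, and the list length `k` is an explicit analysis parameter rather than a property of the stored recommendations.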