Missing Generator::score_batch-like function for perplexity calculation in ctranslate2-rs #88

Open
gn64 opened this issue Jan 3, 2025 · 2 comments

gn64 commented Jan 3, 2025

Hello,
First of all, thank you so much for providing this wonderful library. It has been incredibly helpful for various use cases, and I appreciate all the work that went into creating and maintaining it.

I’m currently trying to calculate perplexity for a GPT-like model with ctranslate2-rs, using the Generator class. However, I cannot find any function equivalent to the score_batch method that the original CTranslate2 library provides for its Generator. The ctranslate2-rs crate seems to focus primarily on translation tasks, and its Generator does not appear to expose any scoring or perplexity-related methods.

Is there currently any way to compute perplexity in ctranslate2-rs, or is this functionality missing? If so, are there plans to add a score_batch-like function to the Generator class? Any guidance or workarounds for GPT-like perplexity calculations would be greatly appreciated.

@jkawamoto
Owner

Scoring functions haven't been implemented yet. However, adding score_batch to Generator doesn't look difficult, so I'll add it soon.

@jkawamoto
Owner

Generator::score_batch is now available as of v0.9.6.

Here is an example of its usage:

```rust
// Scoring the prompts.
let scores = g.score_batch(&[prompts], &ScoringOptions::default())?;
println!("{:?}", scores[0]);
```
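To get from those scores to the perplexity asked about in the original question, something like the following might work. It is only a sketch: it assumes the per-token scores in the scoring result are natural-log probabilities (as they are in upstream CTranslate2), and the `perplexity` helper is a hypothetical name, not part of ctranslate2-rs. It takes a plain slice, so you would pass in whichever field of the returned result holds the per-token scores.

```rust
/// Hypothetical helper (not part of ctranslate2-rs): computes perplexity
/// from per-token log-probabilities.
///
/// Assumption: each value is a natural-log probability of one token.
/// Perplexity is exp of the mean negative log-probability.
/// Returns `None` for an empty slice, where perplexity is undefined.
fn perplexity(token_log_probs: &[f32]) -> Option<f64> {
    if token_log_probs.is_empty() {
        return None;
    }
    // Accumulate in f64 to limit rounding error on long sequences.
    let sum: f64 = token_log_probs.iter().map(|&lp| f64::from(lp)).sum();
    let mean_neg_log_prob = -sum / token_log_probs.len() as f64;
    Some(mean_neg_log_prob.exp())
}
```

Averaging over the token count, rather than summing, keeps sequences of different lengths comparable. If the scores turn out to use a different log base or already be length-normalized, the formula would need adjusting.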
