Prediction scores #4
Hello!
Hi @emiliepicardcantin, I believe you make the assumption that because the model was trained on a binary classification task, its output is a single neuron with sigmoid activation. In fact, this model has two output neurons, on which we apply a softmax activation in order to get (pseudo) probabilities. Because this is a binary classification task, if the "negative" probability is > 0.5, then the predicted label is "negative", and if the "positive" probability is > 0.5, then the predicted label is "positive". That's why, in your examples, the output scores are always > 0.5. If you pass the `return_all_scores=True` parameter, you will get the scores for both classes. That said, the predicted label seems OK in your examples (the 3rd example is arguable), but I'd advise you to fine-tune the model for your task (as the model was only trained on movie review data).
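As a small illustration of that decision rule (just a sketch, reusing the softmax scores shown further down in the thread), the predicted label is simply the class whose probability is larger:

```python
import numpy as np

# Softmax scores for [NEGATIVE, POSITIVE], taken from the
# "J'aime le camembert" example further down in this thread.
scores = np.array([0.21667267, 0.7833273])
labels = ["NEGATIVE", "POSITIVE"]

# With two classes the scores sum to 1, so the larger one is always > 0.5
# and determines the predicted label.
predicted = labels[int(np.argmax(scores))]
print(predicted, scores.max())  # => POSITIVE 0.7833273
```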
If you bypass the pipeline:

```python
review = "J'aime le camembert"
inputs = tokenizer(review, return_tensors="tf")
model_outputs = model(inputs)
outputs = model_outputs["logits"][0]
print(outputs)  # => tf.Tensor([-0.6336924  0.65147054], shape=(2,), dtype=float32)
```

You can then manually apply the softmax:

```python
import numpy as np

def softmax(_outputs):
    maxes = np.max(_outputs, axis=-1, keepdims=True)
    shifted_exp = np.exp(_outputs - maxes)
    return shifted_exp / shifted_exp.sum(axis=-1, keepdims=True)

scores = softmax(outputs)
print(scores)  # => [0.21667267 0.7833273 ]
```

You will get the same results as with the pipeline:

```python
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
result = nlp(review, return_all_scores=True)
result
# [[{'label': 'NEGATIVE', 'score': 0.2166726142168045},
#   {'label': 'POSITIVE', 'score': 0.7833273410797119}]]
```
Thank you!
With 🤗Transformers pipelines it's very easy to get prediction scores for each class.
Originally posted by @hodhoda in #3 (comment)
1. First instantiate Tokenizer & Model
2. Then create pipeline. Do not forget to set the `return_all_scores` parameter to `True`, otherwise the pipeline will only output the probability of the predicted class.
3. Last, feed the pipeline (see the sketch below).
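Putting the three steps together, here is a minimal sketch assuming the TensorFlow auto classes from 🤗Transformers; the checkpoint name is only a placeholder assumption, substitute the model you actually use:

```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

# 1. First instantiate Tokenizer & Model
# (checkpoint name is a placeholder assumption; replace it with your own model)
model_name = "tblard/tf-allocine"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

# 2. Then create the pipeline; return_all_scores=True returns the scores of
#    both classes instead of only the predicted one
nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, return_all_scores=True)

# 3. Last, feed the pipeline
print(nlp("J'aime le camembert"))
# => one list of dicts per input, with a score for each label, as in the thread above
```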