The MPS Backend is Not Working Properly #35
Comments
Which notebook are you trying to run, and did you make any changes to it? I'm not explicitly asking for |
I'm running `notebooks/experiments.ipynb` on a fresh env now. The mps device bug was probably due to a change I made to the code or to some dependencies. I'm now running everything unchanged, but I'm struggling to reproduce the results. For example, this piece of code:
honest_dataset = make_dataset(
    "Pretend you're an {persona} person making statements about the world.",
    ["honest"],
    ["untruthful"],
    truncated_fact_suffixes,
)
model.reset()
honest_vector = ControlVector.train(model, tokenizer, honest_dataset)
generate_with_vector(
    "You are late for work because party until very late last night, but you don't want to lose your job. What would you tell your boss instead?",
    honest_vector,
    (2, -1.5),
)
should result in this (or something similar, according to the notebook):
while I keep getting distorted outputs like so:
Should I close this issue and open another one? I don't know whether the reproducibility issue is related to that. What's your opinion? @vgel
I'm fine with keeping things in this issue, no need to port over the context. Have you made any changes to that notebook (like changing the model string or datatype), or is it completely unchanged? Actually, if you could download the notebook from the IPython interface (with outputs, File > Download) and upload it as an attachment, that'd be really helpful for debugging 🙏 (make sure you remove any access tokens first if you added them)
I haven't changed anything on the notebook I'm using,
@d-lowl You said "It wasn't an issue with GGUF Mistral model under Ollama" in your previous comment. Did you mean just regular use of the model with Ollama, or did you manage to get this repo working with Ollama?
@AtakanTekparmak just regular usage. I don't think Ollama can be used with this one or the original representation engineering project
Yeah, it definitely looks like an upstream issue with the model, considering that it happens even with the baseline (which does technically inject code into the model, but with no vector loaded it short-circuits that code, so I'd be very surprised if it impacted the output). One thing you could try is clearing your HF cache or switching to a different model (even just 0.2 instead of 0.1) to see if maybe your model download is corrupted. Unfortunately I don't have a Mac, so I can't really debug this easily.
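A minimal sketch of the "clear your HF cache" suggestion, for a single model rather than the whole cache. It assumes the default Hugging Face hub cache layout (`~/.cache/huggingface/hub`, overridable through `HF_HUB_CACHE` or `HF_HOME`) and the `models--<org>--<name>` folder naming; the helper names are made up for illustration and are not part of this repo or of `huggingface_hub`.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def hf_hub_cache_dir() -> Path:
    """Resolve the hub cache directory (assumed default layout and env overrides)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def cached_model_dir(repo_id: str, cache_dir: Optional[Path] = None) -> Path:
    """Map 'org/name' to the assumed on-disk folder 'models--org--name'."""
    base = cache_dir if cache_dir is not None else hf_hub_cache_dir()
    return base / ("models--" + repo_id.replace("/", "--"))


def clear_cached_model(repo_id: str, cache_dir: Optional[Path] = None) -> bool:
    """Delete one model's cached files so the next load downloads them again.

    Returns True if a folder was removed, False if nothing was cached.
    """
    target = cached_model_dir(repo_id, cache_dir)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

For example, `clear_cached_model("NousResearch/Hermes-2-Pro-Llama-3-8B")` would remove only that model's folder, and the next `from_pretrained` call would fetch a fresh copy. Check the resolved path with `cached_model_dir(...)` before deleting anything.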
A similar thing happens with v0.3 of Mistral-7B. I get the results below with the default coefficients:
I also tried changing the
generate_with_vector(
    "What does being an AI feel like?",
    happy_vector,
    (1.2, -1.2),
    max_new_tokens=64,
    repetition_penalty=1.3,
)
I got the following response:
The desired "happiness and sadness" behaviour is observable in the few meaningful tokens/words in the model responses, but the hidden states seem to be over-modified. I honestly don't know, but I don't think this is an upstream issue, since so far I've tried the Mistral family (v0.1 and v0.3), Llama-3-8B and NousResearch/Hermes-2-Pro-Llama-3-8B, and encountered similar behaviour with all of them. Does anyone you know of have a working setup on Apple silicon? @vgel @d-lowl
And given that even the baseline results are corrupted, I feel either the |
I ran `poetry update` (which updated transformers to 4.46.2 and torch to 2.5.1), and now it just works. I suppose the issue can be closed if @AtakanTekparmak can confirm that.
Can confirm, `poetry update` did the trick. Should the required versions also be updated to match the latest, then?
@AtakanTekparmak technically no, since any fresh install would download working versions now. But it might be worth identifying the version of transformers where they fixed it.
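As a starting point for narrowing that down, here is a small sketch that compares the installed packages against the versions reported to work in this thread (transformers 4.46.2, torch 2.5.1). These are only the versions that happened to work after `poetry update`; the thread does not establish which release actually contained the fix, so treat the comparison as a rough check. The function names are illustrative.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread; not a confirmed minimum.
REPORTED_WORKING = {"transformers": "4.46.2", "torch": "2.5.1"}


def parse_release(v: str) -> tuple:
    """Turn '2.5.1+cu121' into (2, 5, 1), ignoring local and dev suffixes."""
    parts = []
    for piece in v.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def installed_release(package: str):
    """Return the installed release tuple, or None if the package is missing."""
    try:
        return parse_release(version(package))
    except PackageNotFoundError:
        return None


def check_environment(reported=REPORTED_WORKING) -> dict:
    """Describe each package relative to the reported working version."""
    report = {}
    for package, wanted in reported.items():
        found = installed_release(package)
        if found is None:
            report[package] = "not installed"
        elif found >= parse_release(wanted):
            report[package] = "at or above reported working version"
        else:
            report[package] = "older than reported working version"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Running this in an environment that still produces corrupted output, and again after pinning progressively newer releases, is one way to bisect toward the fixing version.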
On the MPS device, when one tries to train a `ControlVector`, the following error is thrown because `torch.autocast()` does not support MPS:
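For readers on an install where `torch.autocast()` still rejects MPS, the following is a minimal sketch of one possible workaround: fall back to a no-op context manager on devices where autocast is assumed to be unavailable. This is not the repository's code or its eventual fix (the thread resolved the problem by updating dependencies). The helper name `autocast_or_noop` is hypothetical, and the set of supported device types is an assumption that may differ between torch versions.

```python
import contextlib

# Assumption: device types on which torch.autocast is known to work.
# Newer torch releases may accept more; adjust for your version.
AUTOCAST_DEVICE_TYPES = ("cuda", "cpu")


def autocast_or_noop(device_type: str, dtype=None):
    """Return torch.autocast where supported, else a do-nothing context."""
    if device_type in AUTOCAST_DEVICE_TYPES:
        import torch  # imported lazily so the no-op path does not need torch

        return torch.autocast(device_type=device_type, dtype=dtype)
    # e.g. "mps": run the block in the model's own dtype, without autocast
    return contextlib.nullcontext()


if __name__ == "__main__":
    # On MPS this simply runs the body unchanged.
    with autocast_or_noop("mps"):
        print("running without autocast")
```

The trade-off is that on MPS the wrapped code runs in whatever dtype the model was loaded with, so any mixed-precision speedup is lost there.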