If you take a closer look at the design of the head Transformer layers, the MLM prediction has an explicit conditioning/dependency on the CLS vector from the backbone LM.
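To make the dependency concrete, here is a minimal sketch of how the head's input could be assembled. It assumes the head receives the CLS vector from the last backbone layer together with the token vectors from an earlier backbone layer. The function name `build_head_input` and the toy vectors are hypothetical, plain Python lists stand in for tensors, and this is not the repository's actual code.

```python
from typing import List

Vector = List[float]


def build_head_input(early_hidden: List[Vector], late_hidden: List[Vector]) -> List[Vector]:
    """Assemble the sequence fed to the head layers (illustrative only).

    Position 0 is assumed to be the CLS token in both inputs.
    """
    if len(early_hidden) != len(late_hidden):
        raise ValueError("early and late hidden states must cover the same tokens")
    late_cls = late_hidden[0]        # CLS vector from the final backbone layer
    early_tokens = early_hidden[1:]  # token vectors from an earlier backbone layer
    return [late_cls] + early_tokens


# Toy hidden states for a 3-position sequence: [CLS, tok1, tok2].
early = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]
late = [[9.0, 9.0], [8.0, 8.0], [7.0, 7.0]]

head_input = build_head_input(early, late)
print(head_input)  # [[9.0, 9.0], [0.2, 0.2], [0.3, 0.3]]
```

Under this assumption, the late-layer token vectors never reach the head, so any information from the later backbone layers that the MLM prediction needs has to pass through the CLS vector. That is the sense in which the prediction is conditioned on CLS.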
As the title says, I have already read the paper but am still confused about the conditioning on the CLS token.
What exactly is the CONdition in Condenser?