I was able to use the provided script to create a lilt-roberta-base-en variant based on https://huggingface.co/google/bigbird-roberta-base. If I can get this working, I will post it to the Hugging Face Hub.
BigBird uses the same tokenizer as RoBERTa, so there is no issue with tokenization for google/bigbird-roberta-base.
However, the following error occurs when loading the model.
```
RuntimeError: Error(s) in loading state_dict for LiltForTokenClassification:
	size mismatch for lilt.layout_embeddings.box_position_embeddings.weight: copying a param with shape torch.Size([514, 192]) from checkpoint, the shape in current model is torch.Size([4096, 192]). You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
I think this error arises when the PyTorch state dicts are merged with the following line:

```python
total_model = {**text_model, **lilt_model}
```

Here the `lilt_model` entry overrides the incoming BigBird dimension.
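The override can be reproduced with plain dictionaries. A minimal sketch with toy shapes standing in for the real tensors (the key name is taken from the traceback; the assumption that both state dicts carry that same key, and the keep-the-text-model's-shape workaround, are mine, not from the LiLT repo):

```python
# Toy stand-ins: the text (BigBird) state dict expects a 4096-entry
# position table, while the LiLT checkpoint was trained with 514.
key = "lilt.layout_embeddings.box_position_embeddings.weight"
text_model = {key: (4096, 192)}
lilt_model = {key: (514, 192)}

# In {**a, **b}, keys from b win, so the LiLT shape overrides BigBird's.
total_model = {**text_model, **lilt_model}
assert total_model[key] == (514, 192)

# One possible workaround: for keys whose shapes disagree, keep the
# text model's value so the merged dict matches the 4096-position config
# (the 514-entry layout table would then need re-initialising or resizing).
merged = dict(lilt_model)
for k, v in text_model.items():
    if k in merged and merged[k] != v:
        merged[k] = v
assert merged[key] == (4096, 192)
```

This only illustrates the merge precedence; in the real script the surviving weight would still have to be a tensor of the right shape, not just the right key.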
Would it be possible to use LiLT with BigBird-Roberta-Base models?
If so, any feedback on the best approach? What might need to change in the LiLT repository to support it?