Allow saving, loading and pushing adapter compositions together #771

Merged (5 commits) on Jan 8, 2025
docs/adapter_composition.md (2 additions, 0 deletions)
@@ -125,6 +125,8 @@ model.active_adapters = ac.Fuse("d", "e", "f")

To learn how training an _AdapterFusion_ layer works, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) from the `adapters` repo.

To save and upload the full composition setup with adapters and fusion layer in one line of code, check out the docs on [saving and loading adapter compositions](loading.md#saving-and-loading-adapter-compositions).

### Retrieving AdapterFusion attentions

Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model.
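
A minimal sketch of this, assuming `model` and `inputs` are set up as in the examples above:

```python
# Request the fusion attention scores in the forward pass
outputs = model(**inputs, output_adapter_fusion_attentions=True)
# Nested dict of attention scores, keyed by fusion name and layer
attention_scores = outputs.adapter_fusion_attentions
```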
docs/loading.md (36 additions, 0 deletions)
@@ -94,3 +94,39 @@ We will go through the different arguments and their meaning one by one:
To load the adapter using a custom name, we can use the `load_as` parameter.

- Finally, `set_active` will directly activate the loaded adapter for usage in each model forward pass. Otherwise, you have to manually activate the adapter via `set_active_adapters()`.
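
Putting these arguments together, a minimal sketch might look as follows (the Hub repository id and adapter names are purely illustrative):

```python
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
# Load an adapter from the Hub under a custom name and activate it right away
adapter_name = model.load_adapter(
    "AdapterHub/bert-base-uncased-pf-imdb",  # illustrative repository id
    load_as="sentiment",
    set_active=True,
)
```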

## Saving and loading adapter compositions

In addition to saving and loading individual adapters, you can also save, load and share entire [compositions of adapters](adapter_composition.md) with a single line of code.
_Adapters_ provides three methods for this purpose that work very similarly to those for single adapters:

- [`save_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter_setup) to save an adapter composition along with prediction heads to the local file system.
- [`load_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter_setup) to load a saved adapter composition from the local file system or the Model Hub.
- [`push_adapter_setup_to_hub()`](adapters.hub_mixin.PushAdapterToHubMixin.push_adapter_setup_to_hub) to upload an adapter setup along with prediction heads to the Model Hub. See our [Hugging Face Model Hub guide](huggingface_hub.md) for more.

As an example, this is how you would save and load an AdapterFusion setup of three adapters with a prediction head:

```python
from adapters import AutoAdapterModel, SeqBnConfig
from adapters.composition import Fuse

# Create an AdapterFusion
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter("sentiment/sst-2@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("nli/multinli@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("sts/qqp@ukp", config=SeqBnConfig(), with_head=False)
model.add_adapter_fusion(["sst-2", "mnli", "qqp"])
model.add_classification_head("clf_head")
adapter_setup = Fuse("sst-2", "mnli", "qqp")
head_setup = "clf_head"
model.set_active_adapters(adapter_setup)
model.active_head = head_setup

# Train AdapterFusion ...

# Save
model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)

# Push to Hub
model.push_adapter_setup_to_hub("<user>/fusion_setup", adapter_setup, head_setup=head_setup)

# Re-load
# model.load_adapter_setup("checkpoint", set_active=True)
```
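
Re-loading the saved setup later, e.g. into a freshly initialized model, is again a single call. A minimal sketch, using the `checkpoint` directory from the example above:

```python
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
# Restores the three adapters, the fusion layer and the prediction head,
# and activates the full composition for subsequent forward passes
model.load_adapter_setup("checkpoint", set_active=True)
```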
docs/quickstart.md (1 addition, 1 deletion)
@@ -105,7 +105,7 @@ model = AutoAdapterModel.from_pretrained(example_path)
model.load_adapter(example_path)
```

- Similar to how the weights of the full model are saved, the `save_adapter()` will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
+ Similar to how the weights of the full model are saved, [`save_adapter()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter) will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
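
As a rough sketch of what this produces on disk (directory and adapter name are illustrative):

```python
model.save_adapter("./saved_adapter", "my_adapter")
# ./saved_adapter/ then contains, roughly:
#   adapter_config.json  - the adapter configuration
#   pytorch_adapter.bin  - the adapter weights
```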

Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter:
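
A minimal sketch of these two steps (the adapter name is illustrative):

```python
model.set_active_adapters(None)     # deactivate all active adapters
model.delete_adapter("my_adapter")  # remove the adapter weights from the model
```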

src/adapters/composition.py (35 additions, 0 deletions)
@@ -1,4 +1,5 @@
import itertools
import sys
import warnings
from collections.abc import Sequence
from typing import List, Optional, Set, Tuple, Union
@@ -45,6 +46,31 @@ def parallel_channels(self):
def flatten(self) -> Set[str]:
return set(itertools.chain(*[[b] if isinstance(b, str) else b.flatten() for b in self.children]))

def _get_save_kwargs(self):
return None

def to_dict(self):
save_dict = {
"type": self.__class__.__name__,
"children": [
c.to_dict() if isinstance(c, AdapterCompositionBlock) else {"type": "single", "children": [c]}
for c in self.children
],
}
if kwargs := self._get_save_kwargs():
save_dict["kwargs"] = kwargs
return save_dict

@classmethod
def from_dict(cls, data):
children = []
for child in data["children"]:
if child["type"] == "single":
children.append(child["children"][0])
else:
children.append(cls.from_dict(child))
return getattr(sys.modules[__name__], data["type"])(*children, **data.get("kwargs", {}))


class Parallel(AdapterCompositionBlock):
def __init__(self, *parallel_adapters: List[str]):
@@ -80,12 +106,18 @@ def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], s
super().__init__(*split_adapters)
self.splits = splits if isinstance(splits, list) else [splits] * len(split_adapters)

def _get_save_kwargs(self):
return {"splits": self.splits}


class BatchSplit(AdapterCompositionBlock):
def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], batch_sizes: Union[List[int], int]):
super().__init__(*split_adapters)
self.batch_sizes = batch_sizes if isinstance(batch_sizes, list) else [batch_sizes] * len(split_adapters)

def _get_save_kwargs(self):
return {"batch_sizes": self.batch_sizes}


class Average(AdapterCompositionBlock):
def __init__(
@@ -105,6 +137,9 @@ def __init__(
else:
self.weights = [1 / len(average_adapters)] * len(average_adapters)

def _get_save_kwargs(self):
return {"weights": self.weights}


# Mapping each composition block type to the allowed nested types
ALLOWED_NESTINGS = {
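To illustrate what the new serialization above produces, here is a sketch of a round trip through `to_dict()` / `from_dict()` (adapter names are illustrative):

```python
from adapters.composition import AdapterCompositionBlock, BatchSplit, Stack

setup = Stack("a", BatchSplit("b", "c", batch_sizes=[2, 4]))
data = setup.to_dict()
# data == {
#     "type": "Stack",
#     "children": [
#         {"type": "single", "children": ["a"]},
#         {
#             "type": "BatchSplit",
#             "children": [
#                 {"type": "single", "children": ["b"]},
#                 {"type": "single", "children": ["c"]},
#             ],
#             "kwargs": {"batch_sizes": [2, 4]},
#         },
#     ],
# }
restored = AdapterCompositionBlock.from_dict(data)  # rebuilds an equivalent Stack
```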
src/adapters/hub_mixin.py (92 additions, 3 deletions)
@@ -4,6 +4,8 @@

from transformers.utils.generic import working_or_temp_dir

from .composition import AdapterCompositionBlock


logger = logging.getLogger(__name__)

@@ -35,7 +37,7 @@
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("{model_name}")
- adapter_name = model.load_adapter("{adapter_repo_name}", set_active=True)
+ adapter_name = model.{load_fn}("{adapter_repo_name}", set_active=True)
```

## Architecture & Training
@@ -66,6 +68,7 @@ def _save_adapter_card(
language: Optional[str] = None,
license: Optional[str] = None,
metrics: Optional[List[str]] = None,
load_fn: str = "load_adapter",
**kwargs,
):
# Key remains "adapter-transformers", see: https://github.com/huggingface/huggingface.js/pull/459
@@ -103,6 +106,7 @@ def _save_adapter_card(
model_name=self.model_name,
dataset_name=dataset_name,
head_info=head_info,
load_fn=load_fn,
adapter_repo_name=adapter_repo_name,
architecture_training=kwargs.pop("architecture_training", DEFAULT_TEXT),
results=kwargs.pop("results", DEFAULT_TEXT),
@@ -133,8 +137,6 @@ def push_adapter_to_hub(
Args:
repo_id (str): The name of the repository on the model hub to upload to.
adapter_name (str): The name of the adapter to be uploaded.
- organization (str, optional): Organization in which to push the adapter
- (you must be a member of this organization). Defaults to None.
datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
None.
local_path (str, optional): Local path used as clone directory of the adapter repository.
@@ -156,6 +158,8 @@
Branch to push the uploaded files to.
commit_description (`str`, *optional*):
The description of the commit that will be created
adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
Currently includes: tags, language, license, metrics, architecture_training, results, citation.

Returns:
str: The url of the adapter repository on the model hub.
@@ -190,3 +194,88 @@ def push_adapter_to_hub(
revision=revision,
commit_description=commit_description,
)

def push_adapter_setup_to_hub(
self,
repo_id: str,
adapter_setup: Union[str, list, AdapterCompositionBlock],
head_setup: Optional[Union[bool, str, list, AdapterCompositionBlock]] = None,
datasets_tag: Optional[str] = None,
local_path: Optional[str] = None,
commit_message: Optional[str] = None,
private: Optional[bool] = None,
token: Optional[Union[bool, str]] = None,
overwrite_adapter_card: bool = False,
create_pr: bool = False,
revision: str = None,
commit_description: str = None,
adapter_card_kwargs: Optional[dict] = None,
):
"""Upload an adapter setup to HuggingFace's Model Hub.

Args:
repo_id (str): The name of the repository on the model hub to upload to.
adapter_setup (Union[str, list, AdapterCompositionBlock]): The adapter setup to be uploaded. Usually an adapter composition block.
head_setup (Optional[Union[bool, str, list, AdapterCompositionBlock]], optional): The head setup to be uploaded.
datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
None.
local_path (str, optional): Local path used as clone directory of the adapter repository.
If not specified, will create a temporary directory. Defaults to None.
commit_message (:obj:`str`, `optional`):
Message to commit while pushing. Will default to :obj:`"add config"`, :obj:`"add tokenizer"` or
:obj:`"add model"` depending on the type of the class.
private (:obj:`bool`, `optional`):
Whether or not the repository created should be private (requires a paying subscription).
token (`bool` or `str`, *optional*):
The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
when running `huggingface-cli login` (stored in `~/.huggingface`). Will default to `True` if `repo_url`
is not specified.
overwrite_adapter_card (bool, optional): Overwrite an existing adapter card with a newly generated one.
If set to `False`, will only generate an adapter card if none exists. Defaults to False.
create_pr (bool, optional):
Whether or not to create a PR with the uploaded files or directly commit.
revision (`str`, *optional*):
Branch to push the uploaded files to.
commit_description (`str`, *optional*):
The description of the commit that will be created
adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
Currently includes: tags, language, license, metrics, architecture_training, results, citation.

Returns:
str: The url of the adapter repository on the model hub.
"""
use_temp_dir = not os.path.isdir(local_path) if local_path else True

# Create repo or retrieve an existing repo
repo_id = self._create_repo(repo_id, private=private, token=token)

# Commit and push
logger.info('Pushing adapter setup "%s" to model hub at %s ...', adapter_setup, repo_id)
with working_or_temp_dir(working_dir=local_path, use_temp_dir=use_temp_dir) as work_dir:
files_timestamps = self._get_files_timestamps(work_dir)
# Save adapter and optionally create model card
if head_setup is not None:
save_kwargs = {"head_setup": head_setup}
else:
save_kwargs = {}
self.save_adapter_setup(work_dir, adapter_setup, **save_kwargs)
if overwrite_adapter_card or not os.path.exists(os.path.join(work_dir, "README.md")):
adapter_card_kwargs = adapter_card_kwargs or {}
self._save_adapter_card(
work_dir,
str(adapter_setup),
repo_id,
datasets_tag=datasets_tag,
load_fn="load_adapter_setup",
**adapter_card_kwargs,
)
return self._upload_modified_files(
work_dir,
repo_id,
files_timestamps,
commit_message=commit_message,
token=token,
create_pr=create_pr,
revision=revision,
commit_description=commit_description,
)