Initial commit
Ramyyang committed Nov 20, 2024
1 parent 8592804 commit 8c61c20
Showing 391 changed files with 167,179 additions and 4 deletions.
1 change: 1 addition & 0 deletions CL_Benchmark/BoolQA/BoolQA/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/BoolQA/BoolQA/labels.json
@@ -0,0 +1 @@
["True", "False"]
1 change: 1 addition & 0 deletions CL_Benchmark/BoolQA/BoolQA/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/BoolQA/BoolQA/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/COPA/COPA/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/COPA/COPA/labels.json
@@ -0,0 +1 @@
["A", "B"]
1 change: 1 addition & 0 deletions CL_Benchmark/COPA/COPA/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/COPA/COPA/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/MultiRC/MultiRC/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/MultiRC/MultiRC/labels.json
@@ -0,0 +1 @@
["False", "True"]
1 change: 1 addition & 0 deletions CL_Benchmark/MultiRC/MultiRC/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/MultiRC/MultiRC/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/CB/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/CB/labels.json
@@ -0,0 +1 @@
["entailment", "contradiction", "neutral"]
1 change: 1 addition & 0 deletions CL_Benchmark/NLI/CB/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/CB/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/MNLI/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/MNLI/labels.json
@@ -0,0 +1 @@
["neutral", "entailment", "contradiction"]
1 change: 1 addition & 0 deletions CL_Benchmark/NLI/MNLI/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/MNLI/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/RTE/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/RTE/labels.json
@@ -0,0 +1 @@
["contradiction", "entailment"]
1 change: 1 addition & 0 deletions CL_Benchmark/NLI/RTE/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/NLI/RTE/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/QQP/QQP/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/QQP/QQP/labels.json
@@ -0,0 +1 @@
["False", "True"]
1 change: 1 addition & 0 deletions CL_Benchmark/QQP/QQP/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/QQP/QQP/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/IMDB/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/IMDB/labels.json
@@ -0,0 +1 @@
["Good", "Bad"]
1 change: 1 addition & 0 deletions CL_Benchmark/SC/IMDB/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/IMDB/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/SST-2/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/SST-2/labels.json
@@ -0,0 +1 @@
["Good", "Bad"]
1 change: 1 addition & 0 deletions CL_Benchmark/SC/SST-2/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/SST-2/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/amazon/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/amazon/labels.json
@@ -0,0 +1 @@
["very negative", "negative", "neutral", "positive", "very positive"]
1 change: 1 addition & 0 deletions CL_Benchmark/SC/amazon/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/amazon/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/yelp/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/yelp/labels.json
@@ -0,0 +1 @@
["very negative", "negative", "neutral", "positive", "very positive"]
1 change: 1 addition & 0 deletions CL_Benchmark/SC/yelp/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/SC/yelp/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/agnews/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/agnews/labels.json
@@ -0,0 +1 @@
["World", "Sports", "Business", "Science or Technology"]
1 change: 1 addition & 0 deletions CL_Benchmark/TC/agnews/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/agnews/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/dbpedia/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/dbpedia/labels.json
@@ -0,0 +1 @@
["Company", "Educational Institution", "Artist", "Athlete", "Office Holder", "Mean of Transportation", "Building", "Natural Place", "Village", "Animal", "Plant", "Album", "Film", "Written Work"]
1 change: 1 addition & 0 deletions CL_Benchmark/TC/dbpedia/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/dbpedia/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/yahoo/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/yahoo/labels.json
@@ -0,0 +1 @@
["Society & Culture", "Science & Mathematics", "Health", "Education & Reference", "Computers & Internet", "Sports", "Business & Finance", "Entertainment & Music", "Family & Relationships", "Politics & Government"]
1 change: 1 addition & 0 deletions CL_Benchmark/TC/yahoo/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/TC/yahoo/train.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/WiC/WiC/dev.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/WiC/WiC/labels.json
@@ -0,0 +1 @@
["True", "False"]
1 change: 1 addition & 0 deletions CL_Benchmark/WiC/WiC/test.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CL_Benchmark/WiC/WiC/train.json

Large diffs are not rendered by default.
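The paths above suggest a `CL_Benchmark/<group>/<dataset>/` layout in which each dataset directory holds `train.json`, `dev.json`, `test.json`, and a `labels.json` list. A minimal sketch of reading the label lists under that assumption follows; the helper names `load_labels` and `list_datasets` are hypothetical and do not come from the repository. Note that label order differs between datasets (for example RTE versus MNLI), so any label-to-index map has to be built per dataset.

```python
import json
from pathlib import Path


def load_labels(root, group, dataset):
    """Read <root>/<group>/<dataset>/labels.json.

    Returns the label list and a per-dataset label -> index map.
    Hypothetical helper; the directory layout is inferred from the paths above.
    """
    path = Path(root) / group / dataset / "labels.json"
    with path.open(encoding="utf-8") as f:
        labels = json.load(f)
    return labels, {label: i for i, label in enumerate(labels)}


def list_datasets(root):
    """Return sorted 'group/dataset' names for every directory holding labels.json."""
    root = Path(root)
    return sorted(
        p.parent.relative_to(root).as_posix()
        for p in root.glob("*/*/labels.json")
    )
```

For example, `load_labels("CL_Benchmark", "NLI", "RTE")` would be expected to return the two-element list shown for RTE, provided the checkout matches the layout assumed here.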

4 changes: 0 additions & 4 deletions README.md
@@ -1,7 +1,3 @@
# Is Parameter Collision Hindering Continual Learning in LLMs?
Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?", exploring novel approaches to address parameter collision issues in large language models for continual learning.


# Sn-LoRA

- This repo releases our implementation for the N-LoRA model.
33 changes: 33 additions & 0 deletions configs/ds_configs/eval.config
@@ -0,0 +1,33 @@
{
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"steps_per_print": 1e5,
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"wall_clock_breakdown": false
}
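Many values in this file are the string `"auto"`, which is a placeholder rather than a usable setting. The Hugging Face Trainer's DeepSpeed integration fills such placeholders from its training arguments; whether these files are consumed that way is an assumption, since the launcher is not shown in this excerpt. The sketch below lists the placeholder paths in an abridged copy of the config (the `auto_paths` helper is hypothetical). It also shows that Python's `json` module parses `1e5` as a float.

```python
import json

# Abridged copy of eval.config, kept small for illustration.
CONFIG_TEXT = """
{
  "fp16": {"enabled": "auto", "loss_scale": 0},
  "optimizer": {"type": "AdamW", "params": {"lr": "auto", "weight_decay": "auto"}},
  "gradient_accumulation_steps": "auto",
  "steps_per_print": 1e5,
  "wall_clock_breakdown": false
}
"""


def auto_paths(node, prefix=""):
    """Yield dotted key paths whose value is the literal string "auto"."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from auto_paths(value, f"{prefix}.{key}" if prefix else key)
    elif node == "auto":
        yield prefix


config = json.loads(CONFIG_TEXT)
placeholders = sorted(auto_paths(config))

for path in placeholders:
    print("needs a value from the launcher:", path)

# json parses the exponent form as a float, not an int.
print(type(config["steps_per_print"]).__name__)
```

A check like this can be run before launching to confirm that every placeholder has a corresponding command-line or trainer argument.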
38 changes: 38 additions & 0 deletions configs/ds_configs/stage0.config
@@ -0,0 +1,38 @@
{
"bfloat16": {
"enabled": "auto"
},
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 0
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"steps_per_print": 1e5
}
36 changes: 36 additions & 0 deletions configs/ds_configs/stage1.config
@@ -0,0 +1,36 @@
{
"bfloat16": {
"enabled": "auto"
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 1,
"allgather_partitions": true,
"allgather_bucket_size": 2e8,
"overlap_comm": true,
"reduce_scatter": true,
"reduce_bucket_size": 2e8,
"contiguous_gradients": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"steps_per_print": 1e5
}
48 changes: 48 additions & 0 deletions configs/ds_configs/stage2.config
@@ -0,0 +1,48 @@
{
"bfloat16": {
"enabled": "auto"
},
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 2,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
},
"allgather_partitions": true,
"allgather_bucket_size": 2e8,
"overlap_comm": true,
"reduce_scatter": true,
"reduce_bucket_size": 2e8,
"contiguous_gradients": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"steps_per_print": 1e5
}
48 changes: 48 additions & 0 deletions configs/ds_configs/stage2_llama.config
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
{
"bfloat16": {
"enabled": true
},
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 2,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
},
"allgather_partitions": true,
"allgather_bucket_size": 2e8,
"overlap_comm": true,
"reduce_scatter": true,
"reduce_bucket_size": 2e8,
"contiguous_gradients": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"steps_per_print": 1e5
}
44 changes: 44 additions & 0 deletions configs/ds_configs/stage2_without_offload.config
@@ -0,0 +1,44 @@
{
"bfloat16": {
"enabled": "auto"
},
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 2,
"allgather_partitions": true,
"allgather_bucket_size": 2e8,
"overlap_comm": true,
"reduce_scatter": true,
"reduce_bucket_size": 2e8,
"contiguous_gradients": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"steps_per_print": 1e5
}
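As listed, `stage2.config` and `stage2_without_offload.config` differ only in the `offload_optimizer` block; every other section is identical. A small sketch that makes this kind of comparison mechanical is given below. The `flatten` and `diff_configs` helpers and the `MISSING` sentinel are hypothetical, and only the `zero_optimization` sections are reproduced here.

```python
MISSING = "<missing>"  # sentinel for a key present in only one config

# zero_optimization sections copied from the two stage-2 configs above.
STAGE2 = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "allgather_partitions": True,
        "allgather_bucket_size": 2e8,
        "overlap_comm": True,
        "reduce_scatter": True,
        "reduce_bucket_size": 2e8,
        "contiguous_gradients": True,
    }
}
STAGE2_NO_OFFLOAD = {
    "zero_optimization": {
        "stage": 2,
        "allgather_partitions": True,
        "allgather_bucket_size": 2e8,
        "overlap_comm": True,
        "reduce_scatter": True,
        "reduce_bucket_size": 2e8,
        "contiguous_gradients": True,
    }
}


def flatten(node, prefix=""):
    """Flatten nested dicts into {dotted.path: leaf_value}."""
    out = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out


def diff_configs(a, b):
    """Return {path: (value_in_a, value_in_b)} for every differing leaf."""
    fa, fb = flatten(a), flatten(b)
    return {
        path: (fa.get(path, MISSING), fb.get(path, MISSING))
        for path in sorted(fa.keys() | fb.keys())
        if fa.get(path, MISSING) != fb.get(path, MISSING)
    }


differences = diff_configs(STAGE2, STAGE2_NO_OFFLOAD)
for path, (left, right) in differences.items():
    print(f"{path}: {left!r} -> {right!r}")
```

The same helper could be pointed at the full files loaded with `json.load` to confirm that no other setting drifts between variants.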
56 changes: 56 additions & 0 deletions configs/ds_configs/stage3.config
@@ -0,0 +1,56 @@
{
"bfloat16": {
"enabled": false
},
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"betas": "auto",
"eps": "auto",
"weight_decay": "auto"
}
},
"scheduler": {
"type": "WarmupLR",
"params": {
"warmup_min_lr": "auto",
"warmup_max_lr": "auto",
"warmup_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 3,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
},
"offload_param": {
"device": "cpu",
"pin_memory": true
},
"overlap_comm": true,
"contiguous_gradients": true,
"sub_group_size": 1e9,
"reduce_bucket_size": "auto",
"stage3_prefetch_bucket_size": "auto",
"stage3_param_persistence_threshold": "auto",
"stage3_max_live_parameters": 1e9,
"stage3_max_reuse_distance": 1e9,
"stage3_gather_fp16_weights_on_model_save": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"steps_per_print": 1e5,
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"wall_clock_breakdown": false
}
23 changes: 23 additions & 0 deletions configs/instruction_config.json
@@ -0,0 +1,23 @@
{
"NLI": [
{"instruction_type": "zero-shot", "instruction": "What is the logical relationship between the \"sentence 1\" and the \"sentence 2\"? Choose one from the option.\n"}
],
"QQP": [
{"instruction_type": "zero-shot", "instruction": "Whether the \"first sentence\" and the \"second sentence\" have the same meaning? Choose one from the option.\n"}
],
"SC": [
{"instruction_type": "zero-shot", "instruction": "What is the sentiment of the following paragraph? Choose one from the option.\n"}
],
"TC": [
{"instruction_type": "zero-shot", "instruction": "What is the topic of the following paragraph? Choose one from the option.\n"}
],
"BoolQA":[
{"instruction_type": "zero-shot", "instruction": "According to the following passage, is the question true or false? Choose one from the option.\n"}
],
"MultiRC":[
{"instruction_type": "zero-shot", "instruction": "According to the following passage and question, is the candidate answer true or false? Choose one from the option.\n"}
],
"WiC":[
{"instruction_type": "zero-shot", "instruction": "Given a word and two sentences, whether the word is used with the same sense in both sentence? Choose one from the option.\n"}
]
}
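Each task key maps to a list of instruction entries, and every instruction ends with "Choose one from the option.", which suggests that the label list is appended as the options. The sketch below shows one way such a prompt could be assembled. The `build_prompt` helper and the `Option:` / `Answer:` layout are assumptions for illustration, not the repository's actual format. COPA appears under `CL_Benchmark` but has no entry in this file as shown, so the lookup fails loudly for unknown tasks.

```python
# One entry copied from instruction_config.json above.
INSTRUCTIONS = {
    "SC": [
        {
            "instruction_type": "zero-shot",
            "instruction": "What is the sentiment of the following paragraph? "
                           "Choose one from the option.\n",
        }
    ],
}


def build_prompt(instructions, task, labels, text, instruction_type="zero-shot"):
    """Combine a task instruction, the label options, and the input text.

    The prompt layout here is an assumption made for illustration.
    """
    for entry in instructions.get(task, []):
        if entry["instruction_type"] == instruction_type:
            options = ", ".join(labels)
            return f"{entry['instruction']}Option: {options}\n{text}\nAnswer:"
    raise KeyError(f"no {instruction_type!r} instruction configured for task {task!r}")


prompt = build_prompt(INSTRUCTIONS, "SC", ["Good", "Bad"], "A warm, funny film.")
print(prompt)
```

Passing the labels in explicitly keeps the helper independent of where `labels.json` is read from.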