
long context #360

Open
pseudotensor opened this issue Jun 30, 2023 · 18 comments

Comments

arnocandel commented Jul 5, 2023

^

import transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# NTK-aware scaled RoPE: monkey-patch LLaMA's rotary embedding so the model
# can attend beyond its 2048-token training context without fine-tuning.
old_init = transformers.models.llama.modeling_llama.LlamaRotaryEmbedding.__init__

def ntk_scaled_init(self, dim, max_position_embeddings=2048, base=10000, device=None):
    # The patch is just these three lines:
    max_position_embeddings = 16384
    a = 8  # alpha (scaling factor)
    base = base * a ** (dim / (dim - 2))  # NTK base-change formula

    old_init(self, dim, max_position_embeddings, base, device)

transformers.models.llama.modeling_llama.LlamaRotaryEmbedding.__init__ = ntk_scaled_init
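
For context, a minimal usage sketch (mine, not from the comment): the patch must run before the model is instantiated, since the rotary embedding builds its frequency table at construction time. The model name, prompt, and generation settings below are placeholders.

# Hypothetical usage of the patch above: instantiate the model only after patching.
model_name = "huggyllama/llama-7b"  # placeholder LLaMA-family checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

long_prompt = "..."  # e.g. a ~10k-token document followed by a question
inputs = tokenizer(long_prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    generation_config=GenerationConfig(max_new_tokens=256, do_sample=False),
)
print(tokenizer.decode(output[0], skip_special_tokens=True))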

arnocandel commented:

#516 (comment): 70B with 16K context

arnocandel commented:

huggingface/text-generation-inference#529: TGI will soon have it too.

arnocandel commented:

c1ed68c

pseudotensor commented Aug 28, 2023

Things I’m Learning While Training SuperHOT | kaiokendev.github.io
https://kaiokendev.github.io/til#extending-context-to-8k

TheBloke/Llama-2-70B-chat-GPTQ · Hugging Face
https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ

NTK-Aware Scaled RoPE allows LLaMA models to have extended (8k+) context size without any fine-tuning and minimal perplexity degradation. : LocalLLaMA
https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/

Llama 2 is here - get it on Hugging Face
https://huggingface.co/blog/llama2

jquesnelle/scaled-rope
https://github.com/jquesnelle/scaled-rope

(Experimental) Add support to NTK RoPE scaling by Panchovix · Pull Request #118 · turboderp/exllama
https://github.com/turboderp/exllama/pull/118/files

Reddit image preview
https://preview.redd.it/2qdj7itsb39b1.png?width=662&format=png&auto=webp&s=464052174151b6ae8b6a9ce42b8f1acc9acabd35

How Long Can Open-Source LLMs Truly Promise on Context Length? | LMSYS Org
https://lmsys.org/blog/2023-06-29-longchat/

DachengLi1/LongChat: Official repository for LongChat and LongEval
https://github.com/DachengLi1/LongChat

Extending context size via RoPE scaling · ggerganov/llama.cpp · Discussion #1965
ggerganov/llama.cpp#1965

HuggingFace models have max_position_embeddings set incorrectly · Issue #359 · facebookresearch/llama
meta-llama/llama#359

Summary post for higher context sizes for this week. For context up to 4096, NTK RoPE scaling is pretty viable. For context higher than that, keep using SuperHOT LoRA/Merges. : LocalLLaMA
https://www.reddit.com/r/LocalLLaMA/comments/14ojd7s/summary_post_for_higher_context_sizes_for_this/
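
Side note, not from the link above: transformers 4.31+ exposes linear and dynamic-NTK RoPE scaling directly on LLaMA via the rope_scaling config, which replaces the monkeypatch earlier in this thread. A hedged sketch (checkpoint name and scaling factor are placeholders):

import torch
from transformers import AutoModelForCausalLM

# Built-in dynamic NTK RoPE scaling (transformers >= 4.31); no monkeypatch needed.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",                       # placeholder checkpoint
    rope_scaling={"type": "dynamic", "factor": 4.0},  # ~4x the trained context window
    torch_dtype=torch.float16,
    device_map="auto",
)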

Output garbled in llama2 model · Issue #510 · vllm-project/vllm
vllm-project/vllm#510

Stay on topic with Classifier-Free Guidance : LocalLLaMA
https://www.reddit.com/r/LocalLLaMA/comments/14p6p0g/stay_on_topic_with_classifierfree_guidance/

Add Classifier-Free Guidance sampling · Issue #24536 · huggingface/transformers
huggingface/transformers#24536
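
Aside (my addition): the feature requested in huggingface/transformers#24536 later landed as a classifier-free-guidance logits processor, so on recent transformers versions something along these lines should work; the exact argument names are from memory and may differ by version.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Classifier-free guidance at generation time: guidance_scale > 1 sharpens
# the model's adherence to the prompt (assumes a recent transformers release).
model_name = "gpt2"  # placeholder; any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Summarize what RoPE scaling does, staying strictly on topic:", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60, guidance_scale=1.5, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))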

tau/scrolls · Datasets at Hugging Face
https://huggingface.co/datasets/tau/scrolls/viewer/gov_report/train?row=0

Quantized LLama2 70B GPTQ 4-bit · Issue #516 · h2oai/h2ogpt
#516

Request: NTK rope support · Issue #479 · vllm-project/vllm
vllm-project/vllm#479

Add support for LLaMA-2 by zhuohan123 · Pull Request #505 · vllm-project/vllm
vllm-project/vllm#505

lmsys/longchat-13b-16k · Hugging Face
https://huggingface.co/lmsys/longchat-13b-16k

[2302.05507] Long-Context Language Decision Transformers and Exponential Tilt for Interactive Text Environments
https://arxiv.org/abs/2302.05507

LongChat/longeval at longeval · DachengLi1/LongChat
https://github.com/DachengLi1/LongChat/tree/longeval/longeval

RoPE scaling support? · Issue #464 · vllm-project/vllm
vllm-project/vllm#464

[2212.10947] Parallel Context Windows for Large Language Models
https://arxiv.org/abs/2212.10947

[2307.03172] Lost in the Middle: How Language Models Use Long Contexts
https://arxiv.org/abs/2307.03172

pseudotensor/LongChat: Official repository for LongChat and LongEval
https://github.com/pseudotensor/LongChat

openchat/openchat · Hugging Face
https://huggingface.co/openchat/openchat

How Long Can Open-Source LLMs Truly Promise on Context Length? | LMSYS Org
https://lmsys.org/blog/2023-06-29-longchat/#evaluation-toolkits-longeval

[2306.05685] Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
https://arxiv.org/abs/2306.05685

Training:
https://github.com/DachengLi1/LongChat#longchat-1
https://huggingface.co/togethercomputer/LLaMA-2-7B-32K

Data:
https://huggingface.co/datasets/togethercomputer/Long-Data-Collections
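
To make the LongEval links above concrete, here is a rough retrieval sanity check in the spirit of its line-retrieval task (my own sketch, not code from those repos; the model name, filler size, and prompt format are placeholders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal long-context retrieval check: hide a known key-value line deep in a
# long filler prompt and see whether the model can read it back.
model_name = "lmsys/longchat-13b-16k"  # placeholder long-context checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

filler = "\n".join(f"line {i}: REGISTER_{i} holds value {i * 7}" for i in range(1, 1500))
question = "\nWhat value does REGISTER_1234 hold? Answer with the number only:\n"
prompt = filler + question

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(f"prompt length: {inputs.input_ids.shape[1]} tokens")
out = model.generate(**inputs, max_new_tokens=10, do_sample=False)
answer = tokenizer.decode(out[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)
print("model answered:", answer.strip(), "| expected:", 1234 * 7)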
