[LLM][NPU] Ported sampler from Stateless to Stateful pipeline #1507
base: master
Conversation
AsyaPronina commented on Jan 8, 2025
- Ported sampler functionality from Stateless to Stateful pipeline
if (streamer_ptr && streamer_ptr->put(last_token)) {
    return results;
}
// Swap max_new_token to get_max_new_token()
Is this comment still valid? Here we don't play with max_new_tokens.
No, not valid, sorry
// Swap max_new_token to get_max_new_token()
auto sequence_group = std::make_shared<SequenceGroup>(
    0 /* request_id */, input_ids, config, 1 /* block_size */);
sequence_group->update_processed_tokens_num(input_ids.get_size());
Suggested change:
-    sequence_group->update_processed_tokens_num(input_ids.get_size());
+    sequence_group->update_processed_tokens_num(sequence_group->get_prompt_len() - output_sequence_len);
Will it work w/o SLICE_OUT?
I suppose yes. Without Slice the output length is the same as the prompt length, and hence the number of processed tokens is 0.
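To make the arithmetic explicit, here is a minimal standalone sketch; processed_tokens, prompt_len and output_sequence_len are illustrative names for this thread's discussion, not the pipeline's actual API, and the with-Slice case assumes SLICE_OUT leaves a single logits position:

#include <cstddef>
#include <iostream>

// Processed tokens as the suggestion above computes them:
// the part of the prompt not covered by the model's logits output.
std::size_t processed_tokens(std::size_t prompt_len, std::size_t output_sequence_len) {
    return prompt_len - output_sequence_len;
}

int main() {
    // With SLICE_OUT (assumption: the output is sliced to one logits position):
    std::cout << processed_tokens(18, 1) << "\n";   // 17
    // Without Slice: output length equals prompt length, so nothing counts as processed:
    std::cout << processed_tokens(18, 18) << "\n";  // 0
    return 0;
}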
I agree; however, for the first input_ids the length equals the length of the prompt, for example 18. But output_sequence_len is equal to 1024 for the first logits, as it is the output of the prefill model.
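To put numbers on it (18 and 1024 are just the example values from this comment, not fixed pipeline constants), a small sketch of what the suggested formula would yield for that first, prefill-produced logits tensor:

#include <cstdint>
#include <iostream>

int main() {
    std::int64_t prompt_len          = 18;    // length of the first input_ids
    std::int64_t output_sequence_len = 1024;  // first logits come straight from the prefill model

    // The suggested prompt_len - output_sequence_len then goes negative,
    // which is not a meaningful number of already-processed tokens.
    std::cout << prompt_len - output_sequence_len << "\n";  // prints -1006
    return 0;
}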
LGTM. Could you also enable testing of StatefulLLMPipeline in test_llm_pipeline_static.py?