From eb88f9c5e5510d99d02ca67661ac58156944d078 Mon Sep 17 00:00:00 2001
From: samsucik <ssucik@cisco.com>
Date: Mon, 22 Apr 2024 11:57:03 +0200
Subject: [PATCH] Briefly document the label re-using feature

---
 README.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/README.md b/README.md
index 2290527..d304fd5 100644
--- a/README.md
+++ b/README.md
@@ -161,6 +161,20 @@ rewriting the `postprocess` function in `prompterator/postprocess_output.py`. Th
 receive one raw model-generated text at a time and should output its postprocessed version. Both
 the raw and the postprocessed text are kept and saved.
 
+### Reusing labels for repeatedly encountered examples
+
+While iterating on your prompt over a dataset, you may find yourself annotating a model output
+that you already annotated in an earlier round. You can automatically reuse such previously
+assigned labels by toggling "reuse past labels". To speed up annotation even more, you can
+toggle "skip past label rows" so that you only go through the rows for which no previously
+assigned label was found.
+
+How this feature works:
+- Existing labels are searched for in the files currently listed in the sidebar. A match
+  requires both the `response` and the values of all input columns to match.
+- If multiple different labels are found for a given input+output combination (a sign of
+  inconsistent past annotation work), the most recent label is reused.
+
 ## Paper
 
 You can find more information on Prompterator in the associated paper: https://aclanthology.org/2023.emnlp-demo.43/