Merge branch 'master' into flairNLPgh-3488/save-column-corpus-to-files

chelseagzr · Dec 28, 2024 · 522621a
2 parents d30ca22 + 68508cc
commit 522621a
Showing 7 changed files with 15 additions and 19 deletions.
diff --git a/README.md b/README.md
@@ -23,7 +23,7 @@ document embeddings, including our proposed [Flair embeddings](https://www.aclwe
 * **A PyTorch NLP framework.** Our framework builds directly on [PyTorch](https://pytorch.org/), making it easy to
 train your own models and experiment with new approaches using Flair embeddings and classes.
 
-Now at [version 0.14.0](https://github.com/flairNLP/flair/releases)!
+Now at [version 0.15.0](https://github.com/flairNLP/flair/releases)!
 
 
 ## State-of-the-Art Models

diff --git a/docs/conf.py b/docs/conf.py
@@ -7,8 +7,8 @@
 from sphinx_github_style import get_linkcode_resolve
 from torch.nn import Module
 
-version = "0.14.0"
-release = "0.14.0"
+version = "0.15.0"
+release = "0.15.0"
 project = "flair"
 author = importlib_metadata.metadata(project)["Author"]
 copyright = f"2023 {author}"

diff --git a/docs/tutorial/intro.md b/docs/tutorial/intro.md
@@ -89,4 +89,8 @@ The output shows that the sentence "_I love Berlin and New York._" was tagged as
 
 ## Summary
 
-Congrats, you now know how to use Flair to find entities and detect sentiment!
+Congrats, you now know how to use Flair to find entities and detect sentiment!
+
+## Next steps
+
+If you want to know more about Flair, next check out [Tutorial 1](tutorial-basics/) that gives an intro into the basics of Flair!
diff --git a/docs/tutorial/tutorial-basics/basic-types.md b/docs/tutorial/tutorial-basics/basic-types.md
@@ -87,7 +87,8 @@ This print-out includes the token index (3) and the lexical value of the token (
 When you create a [`Sentence`](#flair.data.Sentence) as above, the text is automatically tokenized (segmented into words) using the [segtok](https://pypi.org/project/segtok/) library.
 
 ```{note}
-You can also use a different tokenizer if you like. To learn more about this, check out our tokenization tutorial.
+You can also use a different tokenizer by passing a different [`Tokenizer`](#flair.tokenization.Tokenizer ) to the Sentence 
+when you initialize it.
 ```
 
 

diff --git a/docs/tutorial/tutorial-training/how-to-load-prepared-dataset.md b/docs/tutorial/tutorial-training/how-to-load-prepared-dataset.md
@@ -115,7 +115,8 @@ This will print out the created dictionary:
 Dictionary with 17 tags: PROPN, PUNCT, ADJ, NOUN, VERB, DET, ADP, AUX, PRON, PART, SCONJ, NUM, ADV, CCONJ, X, INTJ, SYM
 ```
 
-#### Dictionaries for other label types
+
+### Printing label statistics
 
 If you don't know the label types in a corpus, just call [`Corpus.make_label_dictionary`](#flair.data.Corpus.make_label_dictionary) with
 any random label name (e.g. `corpus.make_label_dictionary(label_type='abcd')`). This will print
@@ -139,17 +140,6 @@ tense_dictionary = corpus.make_label_dictionary(label_type='number')
 If you print these dictionaries, you will find that the POS dictionary contains 50 tags and the number dictionary only 2 for this corpus (singular and plural).
 
 
-#### Dictionaries for other corpora types
-
-The method [`Corpus.make_label_dictionary`](#flair.data.Corpus.make_label_dictionary) can be used for any corpus, including text classification corpora:
-
-```python
-# create label dictionary for a text classification task
-from flair.datasets import TREC_6
-corpus = TREC_6()
-corpus.make_label_dictionary('question_class')
-```
-
 ### The MultiCorpus Object
 
 If you want to train multiple tasks at once, you can use the [`MultiCorpus`](#flair.data.MultiCorpus) object.
@@ -175,6 +165,7 @@ The [`MultiCorpus`](#flair.data.MultiCorpus) inherits from [`Corpus`](#flair.dat
 Flair supports many datasets out of the box. It usually automatically downloads and sets up the data the first time you
 call the corresponding constructor ID.
 The datasets are split into multiple modules, however they all can be imported from `flair.datasets` too.
+
 You can look up the respective modules to find the possible datasets.
 
 The following datasets are supported:

diff --git a/flair/__init__.py b/flair/__init__.py
@@ -34,7 +34,7 @@
     device = torch.device("cpu")
 
 # global variable: version
-__version__ = "0.14.0"
+__version__ = "0.15.0"
 """The current version of the flair library installed."""
 
 # global variable: arrow symbol

diff --git a/setup.py b/setup.py
@@ -6,7 +6,7 @@
 
 setup(
     name="flair",
-    version="0.14.0",
+    version="0.15.0",
     description="A very simple framework for state-of-the-art NLP",
     long_description=Path("README.md").read_text(encoding="utf-8"),
     long_description_content_type="text/markdown",
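
Note on the version bump: this commit changes the version string in four places (`README.md`, `docs/conf.py`, `flair/__init__.py`, `setup.py`). A minimal sketch of a consistency check is given below. It is not part of the commit or of Flair; the function names and regular expressions are hypothetical and were written only against the lines visible in this diff, and the sample strings are copied from the added lines rather than read from the repository.

```python
import re

# Hypothetical patterns, one per file touched by the version bump.
# They are assumptions derived from the changed lines in this diff only.
VERSION_PATTERNS = {
    "README.md": r"\[version (\d+\.\d+\.\d+)\]",
    "docs/conf.py": r'^version = "(\d+\.\d+\.\d+)"',
    "flair/__init__.py": r'^__version__ = "(\d+\.\d+\.\d+)"',
    "setup.py": r'^\s*version="(\d+\.\d+\.\d+)"',
}


def find_versions(contents):
    """Map each known file name to the first version string found in its text.

    `contents` maps file names to file text. A file that is missing or has
    no matching line maps to None.
    """
    found = {}
    for name, pattern in VERSION_PATTERNS.items():
        match = re.search(pattern, contents.get(name, ""), flags=re.MULTILINE)
        found[name] = match.group(1) if match else None
    return found


def versions_in_sync(contents):
    """Return True only if every file yields a version and all versions agree."""
    versions = find_versions(contents)
    return None not in versions.values() and len(set(versions.values())) == 1


# Sample text copied from the added ("+") lines of this diff.
sample = {
    "README.md": "Now at [version 0.15.0](https://github.com/flairNLP/flair/releases)!",
    "docs/conf.py": 'version = "0.15.0"\nrelease = "0.15.0"\n',
    "flair/__init__.py": '__version__ = "0.15.0"\n',
    "setup.py": 'setup(\n    name="flair",\n    version="0.15.0",\n)\n',
}

print(find_versions(sample))
print(versions_in_sync(sample))
```

To run this against a real checkout, the `sample` dictionary could be replaced with the text of each file, for example via `pathlib.Path(name).read_text(encoding="utf-8")`.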