For your convenience, we provide our estimated kappa at estimate_kappa/kappa.wiki2-train-noless-10.npy; it is the input to step 2. If you are eager to get results in step 2, just use our estimate and ignore the rest of this section.
- First, we extract contextualized word embeddings for all checkpoints. This may take a while and use quite a bit of disk space.
mkdir text
curl https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip --output text/wikitext-2-v1.zip
unzip text/wikitext-2-v1.zip -d text
awk 'NF>=10' text/wikitext-2/wiki.train.tokens > text/wiki.train.len_noLess_10.tokens  # keep sentences with at least 10 tokens
cd estimate_kappa;
./extract.sh text/wiki.train.len_noLess_10.tokens your/feature/path
python estimate_kappa.py your/feature/path ../ckpts.txt kappa.wiki2-train-noless-10.npy
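extract.sh takes care of the feature extraction above. As a rough, hedged sketch of what extracting contextualized word embeddings can look like (the model loading, mean pooling, and output layout below are illustrative assumptions, not what extract.sh actually does), one could use HuggingFace transformers:

```python
# Illustrative sketch only; the real pipeline is in extract.sh.
# Assumptions: a HuggingFace checkpoint name, sentence-level mean pooling
# (the actual scripts may store token-level features), and a .npy output.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

ckpt = "bert-large-cased"  # one of the checkpoints listed in ckpts.txt
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModel.from_pretrained(ckpt).eval()

features = []
with open("text/wiki.train.len_noLess_10.tokens") as f, torch.no_grad():
    for line in f:
        enc = tokenizer(line.strip(), return_tensors="pt", truncation=True)
        hidden = model(**enc).last_hidden_state[0]   # (num_subwords, dim)
        features.append(hidden.mean(dim=0).numpy())  # simple mean pooling (assumption)

np.save("your/feature/path/" + ckpt + ".npy", np.stack(features))
```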
Fig. 1 can be produced by:
python show_dendrogram.py kappa.wiki2-train-noless-10.npy ../ckpts.txt
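show_dendrogram.py does this for you. Purely as a hedged illustration of the idea, assuming the saved kappa is a square similarity matrix aligned with ../ckpts.txt (an assumption about the file layout), a dendrogram could be drawn with scipy:

```python
# Illustrative sketch; the actual figure comes from show_dendrogram.py.
# Assumes kappa.wiki2-train-noless-10.npy is a square similarity matrix
# whose rows/columns follow the order of ../ckpts.txt.
import numpy as np
from matplotlib import pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

kappa = np.load("kappa.wiki2-train-noless-10.npy")
names = [line.strip() for line in open("../ckpts.txt")]

dist = 1.0 - kappa                      # turn similarity into distance (assumption)
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")

dendrogram(Z, labels=names, leaf_rotation=90)
plt.tight_layout()
plt.show()
```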
To estimate kappa using only a limited number of sentences, pass --max-sentence, e.g.:
python estimate_kappa.py your/feature/path ../ckpts.txt kappa.wiki2-train-noless-10.npy --max-sentence 128
Then we can check how quickly the estimate converges w.r.t. the number of words in the probe data (Fig. 2):
python convergence.py --data wikitext2 --metric kl --iso
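convergence.py implements the actual analysis behind Fig. 2 (including its --metric kl and --iso options). As a hedged illustration of the general idea only, and not the script's exact computation, one could compare kappa estimates obtained with different sentence budgets against the full-data estimate:

```python
# Illustrative only; the real analysis is done by convergence.py.
# The partial-estimate file names below are hypothetical (e.g., outputs of
# separate runs with --max-sentence), and the relative Frobenius-norm gap
# is a stand-in metric, not the script's --metric kl.
import numpy as np

full = np.load("kappa.wiki2-train-noless-10.npy")
for n in (128, 256, 512, 1024):
    partial = np.load("kappa.wiki2-train-noless-10.max%d.npy" % n)  # hypothetical path
    gap = np.linalg.norm(partial - full) / np.linalg.norm(full)
    print("max-sentence=%d  relative difference=%.4f" % (n, gap))
```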
The train-valid-test data for all tasks can be downloaded from here. Put the unzipped data directories into probe_tasks/data. The recipe is as follows.
- First, prepare word representations for these tasks:
./prepare_contextualizer.sh $one_of_the_34_ckpt #e.g., bert-large-cased
This should take a while. A folder named contextualizers should be created; inside it are the word representations produced by each checkpoint for every task.
- Then we can run the jobs for each task (a conceptual sketch of a single probing job follows the list):
./chuncking.sh
./ner.sh
./pos-ptb.sh
./st.sh
./ef.sh
./syn-p.sh
./syn-gp.sh
./syn-ggp.sh
This will create 8 task directories under task_logs. Inside each of them are many job logs.
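Conceptually, each job trains a probe on frozen word representations for its task. As a hedged sketch only (the file layout, label format, and the use of a scikit-learn linear probe are assumptions, not what the task scripts actually run):

```python
# Conceptual sketch of a single probing job: a linear classifier on frozen
# contextual word representations. Paths and file formats are hypothetical;
# the real jobs are launched by the task scripts above.
import numpy as np
from sklearn.linear_model import LogisticRegression

base = "contextualizers/bert-large-cased/ner"      # hypothetical layout
X_train = np.load(base + "/train.features.npy")    # (num_words, dim)
y_train = np.load(base + "/train.labels.npy")      # (num_words,)
X_test = np.load(base + "/test.features.npy")
y_test = np.load(base + "/test.labels.npy")

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("test accuracy: %.4f" % probe.score(X_test, y_test))
```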
- Run
./fig3.sh
to produce Fig. 3.