To go beyond naive bag-of-words (BoW) and TF-IDF representations, we should investigate case embeddings.
This could be done at several levels, since we have the document tree: per paragraph, per subsection, per section, and of course for the whole document.
We should also create two separate embeddings, one for the descriptive features and one for the documents. The latter are only available after the judgement (by definition), so it makes no sense to use them in a system that predicts the outcome.
I am not sure which method would be the most appropriate. Maybe start with word2vec or sent2vec?
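To make the multi-level idea concrete, here is a minimal sketch of one possible baseline: average word vectors per paragraph, then average child embeddings up the tree to get subsection, section, and document vectors. Everything in it is an assumption for illustration only: the `Node` class, the `embed_tree` helper, and the toy `WORD_VECTORS` table are hypothetical, and the table stands in for vectors that would come from a trained word2vec model. Plain averaging is just one simple choice and may well not be the best one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Vector = List[float]

# Hypothetical toy vectors; in practice these would come from a
# trained word2vec (or similar) model.
WORD_VECTORS: Dict[str, Vector] = {
    "court": [1.0, 0.0],
    "applicant": [0.0, 1.0],
    "violation": [1.0, 1.0],
}


@dataclass
class Node:
    """One level of the document tree (hypothetical structure)."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def mean(vectors: List[Vector]) -> Optional[Vector]:
    """Component-wise mean, or None when there is nothing to average."""
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_text(text: str) -> Optional[Vector]:
    """Average the vectors of known words; unknown words are skipped."""
    tokens = text.lower().split()
    return mean([WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS])


def embed_tree(node: Node, out: Dict[str, Optional[Vector]]) -> Optional[Vector]:
    """Embed a node as the mean of its own text and its children's embeddings.

    Every node's embedding is stored in `out` under its label, so each
    level of the tree ends up with its own vector.
    """
    parts: List[Vector] = []
    own = embed_text(node.text)
    if own is not None:
        parts.append(own)
    for child in node.children:
        child_vec = embed_tree(child, out)
        if child_vec is not None:
            parts.append(child_vec)
    vec = mean(parts)
    out[node.label] = vec
    return vec


# Toy document: two sections, three paragraphs.
doc = Node("document", children=[
    Node("section 1", children=[
        Node("p1", "court violation"),
        Node("p2", "applicant"),
    ]),
    Node("section 2", children=[
        Node("p3", "the court"),
    ]),
])

embeddings: Dict[str, Optional[Vector]] = {}
embed_tree(doc, embeddings)
for label, vec in embeddings.items():
    print(label, vec)
```

Note that averaging child embeddings weights every child equally, whatever its length. Weighting by token count, or using a sentence-level model such as sent2vec at the paragraph level, would be reasonable alternatives. The descriptive features could be embedded separately and kept apart from these text vectors.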