Can we use our own data set to train the models and predict our own test set? #14
You can; just make sure the input format remains the same.
@Sshanu Thanks so much for your quick response! Please allow me to ask one more question. Since I am using word embeddings trained on my specific corpus instead of your provided GloVe embeddings, how can I get my embeddings into the same format as the GloVe embedding file in your data folder and use them in your LSTM model? All the best!
I first extracted the words and stored them in a list named vocab, then extracted the word embeddings and stored them in a numpy array. If the 2nd word in vocab is "the", then the 2nd row of the numpy array holds the embedding for "the". I then saved both vocab and the embedding array using pickle. So you can either create a similar array and vocab, or change the code to load your embeddings.
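A minimal sketch of that layout (the pickle filenames here are hypothetical; match whatever names the repo's loading code actually expects):
```python
import pickle

# Hypothetical filenames; use whatever the loading code expects.
with open("vocab.pkl", "rb") as f:
    vocab = pickle.load(f)            # list of words, e.g. ["<unk>", "the", ...]
with open("word_embedding.pkl", "rb") as f:
    word_embedding = pickle.load(f)   # numpy array of shape (len(vocab), dim)

# The invariant: row i of the array is the embedding of vocab[i].
idx = vocab.index("the")
vector = word_embedding[idx]
print(vector.shape)                   # (dim,)
```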
@xinxu1018 That is so informative! Thanks a lot!
@Sshanu It works! Thanks a lot! Can I ask a follow-up question? If I want to classify relations between multi-word terms (in your case the pairs are one-word terms), how should I preprocess the sentences before the dependency-path extraction step? Do you have any suggestions? One approach I am considering is to join the words within a multi-word term with underscores (e.g. "system configuration" becomes "system_configuration"), treat it as a one-word term, and then follow your procedure. Not sure if it will work. Do you have any ideas? Many thanks!
You can try that approach, but its shortcoming is that you won't have a word embedding for the joined multi-word term. An alternative: build the dependency tree, then pick as the entity whichever of the two words sits below the other in the tree; the information about the other word will still be computed by the LSTM-tree. Then, instead of using only the features of the LCA, entity1, and entity2 from the LSTM-tree for relation classification, also use the LSTM-tree features of the other word.
I did not work on relation classification or any related field after this project; it was my first project in NLP, which is why my knowledge of relation classification and extraction is limited.
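A rough sketch of picking the lower of two words from the dependency tree (spaCy is used here purely for illustration, not because the project uses it; the sentence and token indices are made up):
```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The system configuration controls the runtime behaviour.")

# Hypothetical two-word term: tokens 1 and 2 ("system configuration").
w1, w2 = doc[1], doc[2]

# The word whose ancestors include the other word sits lower in the tree;
# use it as the entity and keep the other word's LSTM-tree features too.
if w2 in w1.ancestors:
    entity, other = w1, w2
else:
    entity, other = w2, w1

print(f"entity: {entity.text}, extra-feature word: {other.text}")
```
Here "system" is a compound dependent of "configuration", so "system" would be chosen as the entity and "configuration" would contribute the extra features.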
@Sshanu Thanks a lot! Hope everything goes well for you!
@Sshanu Could you please provide your word_embd_wiki file? I cannot find the embedding file in the data folder you provided. Thanks for your help! Best,
My Google Drive is full; please share a folder with me, and I will upload the word_embed file there.
@Sshanu How can I share a folder with you? What's your address? Sorry, I am new here!
@Sshanu Hi Sshanu, I just shared a Google Drive folder to the email you provided in your GitHub profile. Not sure if I am doing it right! Many thanks!
Oh, please share it with [email protected]
@Sshanu I have shared it to your Gmail. Please check, and many thanks!
@Sshanu Hi Sshanu, got your shared file! You helped me a lot! I am just wondering, do you have the original code that was used to split the embedding file into the vocab and word_embedding arrays? Then I can convert my own trained embeddings into the format your method expects. Could you please share that code? Thanks again!
I don't have that file. If you have trouble generating the exact same file, simply store the embeddings in a numpy array and the vocabulary in a list; my code will work afterwards.
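A sketch of doing that for embeddings stored in the standard GloVe text format (one "word v1 v2 … vd" line per word); the file names below are placeholders, not the repo's actual ones:
```python
import pickle
import numpy as np

# Placeholder path to your own embeddings in GloVe text format.
embedding_path = "my_embeddings.txt"

vocab, vectors = [], []
with open(embedding_path, encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip().split(" ")
        vocab.append(parts[0])
        vectors.append(np.asarray(parts[1:], dtype=np.float32))

# Row i of the matrix corresponds to vocab[i].
word_embedding = np.stack(vectors)    # shape (len(vocab), dim)

# Placeholder output names; match whatever the training code loads.
with open("vocab.pkl", "wb") as f:
    pickle.dump(vocab, f)
with open("word_embedding.pkl", "wb") as f:
    pickle.dump(word_embedding, f)
```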