forked from allenai/deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)

mrbot-ai/deep_qa

WARNING

This is unreleased code! We're at the pre-alpha stage, things could change, and there are still a lot of rough edges. This grew out of some research code, and we think it'll be pretty useful generally, but we're still working on making it easily usable by people outside of our group. Feel free to submit issues for problems that arise so that we're aware of them, but we're not to the point of having a supported release yet.

DeepQA

DeepQA is a library for doing high-level NLP tasks with deep learning, particularly focused on various kinds of question answering. DeepQA is built on top of Keras and TensorFlow, and can be thought of as a better interface to these systems that makes NLP easier.

Specifically, this library provides the following benefits over plain Keras / TensorFlow:

  • It is hard to get NLP right in Keras. There are a lot of issues around padding sequences and masking that are not handled well in the main Keras code, and we have well-tested code that does the right thing for, e.g., computing attentions over padded sequences, padding all training instances to the same lengths (possibly dynamically by batch, to minimize computation wasted on padding tokens), or distributing text encoders across several sentences or words.
  • We provide a nice, consistent API around building NLP models in Keras. This API has functionality around processing data instances, embedding words and/or characters, easily getting various kinds of sentence encoders, and so on. It makes building models for high-level NLP tasks easy.
  • We provide a nice interface to training, validating, and debugging Keras models. It is very easy to experiment with variants of a model family, just by changing some parameters in a JSON file. For example, the particulars of how words are represented, either with fixed GloVe vectors, fine-tuned word2vec vectors, or a concatenation of those with a character-level CNN, are all specified by parameters in a JSON file, not in your actual code. This makes it trivial to switch the details of your model based on the data that you're working with.
  • We have implemented a number of state-of-the-art models, particularly focused around question answering systems (though we've dabbled in models for other tasks, as well). The actual model code for these systems is typically 50 lines or less.
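To make the first point above concrete, here is a minimal sketch in plain Python of two of the padding issues it mentions: padding a batch only to the length of its own longest sequence, and computing attention weights that give padded positions zero weight. This is not DeepQA's implementation; the names pad_batch and masked_softmax and the padding index 0 are assumptions made for illustration.

```python
import math

PAD = 0  # assumed padding index (hypothetical, not taken from DeepQA)


def pad_batch(sequences, pad_value=PAD):
    """Pad every sequence to the longest length found in this batch.

    Returns the padded sequences and a parallel 0/1 mask (1 = real token).
    """
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return padded, mask


def masked_softmax(scores, mask):
    """Softmax over `scores` that assigns zero weight wherever `mask` is 0."""
    kept = [s for s, m in zip(scores, mask) if m]
    if not kept:
        # Nothing to attend to: return all zeros rather than dividing by zero.
        return [0.0] * len(scores)
    top = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: the second sequence has two padded positions, which get no attention.
padded, mask = pad_batch([[4, 7, 9], [5]])
weights = masked_softmax([0.2, 1.5, 3.0], mask[1])
```

A plain softmax over the second sequence would put most of its weight on the padded position with score 3.0; the mask is what prevents that.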

Using DeepQA

To train or evaluate a model using DeepQA, the recommended entry point is the run_model.py script. That script takes one argument, the path to a parameter file. You can see example parameter files in the examples directory, and you can get a sense of which parameters are available by looking through the documentation.

Actually training a model will require input files, which you need to provide. We have a companion library, DeepQA Experiments, which was originally designed to produce input files and run experiments, and can be used to generate required data files for most of the tasks we have models for. We're moving towards putting the data processing code directly into DeepQA, so that DeepQA Experiments is not necessary, but for now, getting training data files in the right format is most easily done with DeepQA Experiments.

Running a Model

Running a model in DeepQA is straightforward. Once you have specified the model parameters in a JSON file, you can run the following:

from deep_qa import run_model, evaluate_model, load_model

# Train a model given a json specification
run_model("/path/to/json/parameter/file")


# Load a model given a json specification
loaded_model = load_model("/path/to/json/parameter/file")
# Do some more exciting things with your model here!


# Evaluate a pretrained model on some test data specified in the json parameters.
predictions = evaluate_model("/path/to/json/parameter/file")

We have provided some example JSON specifications in the example_experiments directory.

Implementing your own models

To implement a new model in DeepQA, you need to subclass TextTrainer. There is documentation on what is necessary for this; see in particular the Abstract methods section. For a simple example of a fully functional model, see the simple sequence tagger, which has about 20 lines of actual implementation code.

To train, load, and evaluate models that you have written yourself, pass the model class as an additional argument to the functions above and remove the model_class parameter from your JSON specification. For example:

from deep_qa import run_model
from .local_project import MyGreatModel

# Train a model given a json specification (without a "model_class" attribute).
run_model("/path/to/json/parameter/file", model_class=MyGreatModel)
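If you already have a parameter file that names a model_class, a small helper like the one below can produce a copy without it. This is only a sketch: it assumes model_class is a top-level key in the JSON file, and write_spec_without_model_class is a hypothetical helper, not part of DeepQA.

```python
import json


def write_spec_without_model_class(src_path, dst_path):
    """Copy a JSON parameter file, dropping a top-level "model_class" key.

    Assumes "model_class" sits at the top level of the specification.
    Returns the removed value, or None if the key was absent.
    """
    with open(src_path) as src:
        params = json.load(src)
    removed = params.pop("model_class", None)
    with open(dst_path, "w") as dst:
        json.dump(params, dst, indent=2)
    return removed
```

The copy written to dst_path can then be passed to run_model together with the model_class argument, as in the example above.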

If you're doing a new task, or a new variant of a task with a different input/output specification, you probably also need to implement an Instance type. The Instance handles reading data from a file and converting it into numpy arrays that can be used for training and evaluation. This only needs to happen once for each input/output spec.
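As a rough illustration of what an Instance is responsible for, the sketch below parses one line of input and converts it into fixed-length index lists. It is not the DeepQA Instance API: the class ToyTextInstance, the tab-separated "text, then integer label" line format, and the padding and unknown-word indices are all assumptions, and plain Python lists stand in for the numpy arrays a real Instance would produce.

```python
class ToyTextInstance:
    """Hypothetical stand-in for an Instance: one text and one integer label."""

    def __init__(self, tokens, label):
        self.tokens = tokens
        self.label = label

    @classmethod
    def read_from_line(cls, line):
        """Parse a line of the assumed form: text, a tab, then an integer label."""
        text, label = line.rstrip("\n").split("\t")
        return cls(text.lower().split(), int(label))

    def as_padded_indices(self, vocab, max_len, pad_index=0, unk_index=1):
        """Map tokens to vocabulary indices, then truncate or pad to max_len."""
        indices = [vocab.get(tok, unk_index) for tok in self.tokens][:max_len]
        padded = indices + [pad_index] * (max_len - len(indices))
        return padded, self.label


# Usage with a tiny hypothetical vocabulary; "sat" is out of vocabulary.
instance = ToyTextInstance.read_from_line("The cat sat\t1\n")
inputs, label = instance.as_padded_indices({"the": 2, "cat": 3}, max_len=5)
```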

Organization

DeepQA is organised into the following main sections:

  • common: Code for parameter parsing, logging and runtime checks.
  • contrib: Related code for experiments, plus layers, models and features that are generally untested.
  • data: Indexing, padding, tokenisation, stemming, embedding and general dataset manipulation happens here.
  • layers: The bulk of the library. Use these Layers to compose new models. Some of these Layers are very similar to what you might find in Keras, but altered slightly to support arbitrary dimensions or correct masking.
  • models: Frameworks for different types of task. These generally all extend the TextTrainer class, which provides training capabilities to a DeepQaModel. We have models for Sequence Tagging, Entailment, Multiple Choice QA, Reading Comprehension and more. Take a look at the READMEs under models for more details; each task typically has a README describing the task definition.
  • tensors: Convenience functions for writing the internals of Layers. Will almost exclusively be used inside Layer implementations.
  • training: This module does the heavy lifting for training and optimisation. We also wrap the Keras Model class to give it some useful debugging functionality.

The data and models sections are, in turn, structured according to what task they are intended for (e.g., text classification, reading comprehension, sequence tagging, etc.). This should make it easy to see if something you are trying to do is already implemented in DeepQA or not.

Implemented models

DeepQA has implementations of state-of-the-art methods for a variety of tasks. Here are a few of them:

Reading comprehension

Entailment

Memory networks

Datasets

This code allows for easy experimentation with the following datasets:

Note that the data processing code for most of this currently lives in DeepQA Experiments, however.

Contributing

If you use this code and think something could be improved, pull requests are very welcome. Opening an issue is ok, too, but we're a lot more likely to respond to a PR. The primary maintainer of this code is Matt Gardner, with a lot of help from Pradeep Dasigi (who was the initial author of this codebase), Mark Neumann and Nelson Liu.

A note on issues: we are a very small team, and our focus is on getting research done, not on building this library. We do not have anyone dedicated full-time to maintaining and improving this code. As such, we generally do not have bandwidth to solve your problems. Sorry. The code is well tested and works on our continuous integration server, so if the tests do not pass in your environment, there is something wrong with your environment, not the code. Please only submit issues after having made sure the tests pass, and trying to figure out the issue yourself. In your issue, explain clearly what the problem is and what you tried to do to fix it, including commands run and full stack traces. You're a whole lot more likely to actually get help that way.

License

This code is released under the terms of the Apache 2 license.
