MedaCy seeks to create a unified platform to streamline research efforts in medical text mining while also providing an interface to easily apply models to real-world problems. As a result, contributions to medaCy are often direct by-products of active research projects. However, without the contributions, bug fixes, bug reports, and suggestions of practitioners, medaCy could not grow and thrive.
This contribution guide is designed to inform:
- Researchers on how they can efficiently use medaCy to make their work more accessible to practitioners.
- Practitioners on how they can tune medaCy's cutting-edge functionality to their specific application.
Please search existing issues before posting an issue or bug report; your problem may already be solved! If your search comes up empty, congratulations, you may have something to contribute!
At its most basic, one can fork medaCy, clone their fork, and develop in their favorite text editor. However, some up-front setup effort goes a long way toward streamlining the contribution process and staying organized. This section details a suggested setup for efficient development, testing, and experimentation with medaCy using PyCharm.
Assumptions of this section:
- You are working in a UNIX-based operating system.
- Part 2 assumes you have PyCharm Professional installed. PyCharm Professional is provided with the JetBrains University License. (This is not strictly necessary, but the useful Remote Host feature is disabled in the Community Edition.)
Part 1: Development Installation
- If you are shaky with git - this link provides an excellent description of the branching model medaCy follows to organize contributions. Read it.
- Fork medaCy and copy the clone link.
- On your machine, ensure you have Python 3 installed. Set up a virtual environment and activate it.
- Run the bash commands `python --version` and `pip list`. Upgrade pip to the latest version as suggested. Your Python version should be above 3.4 and your installed packages should be few in number. If either condition does not hold, return to the previous step.
- In a directory separate from the one created by the virtual environment set-up command, clone your fork of medaCy.
- While inside your cloned fork, ensure you are on at least the development branch or a branch of the development branch. This can be verified by running `git status`, and branching can be done with `git checkout <branch-name>`.
- Run `pip install -e .` to install medaCy in editable mode inside your virtual environment. This will take several minutes while dependencies install; medaCy stands on the shoulders of giants! Errors you are likely to encounter here involve the installation of SciPy and NumPy. Search for the error messages online, as they are usually fixable by installing a few extra dependencies.
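The checks described in the steps above (Python version, a sparse virtual environment, and a successful install) can also be scripted. The following is a minimal sketch, not part of medaCy itself. It assumes Python 3.8 or newer so that `importlib.metadata` is available, and it assumes the installed distribution is named `medacy`. The function names `in_virtualenv` and `environment_report` are hypothetical.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onward


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the system installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def environment_report(min_version=(3, 5)):
    """Collect the facts the installation steps ask you to verify."""
    names = [dist.metadata.get("Name") for dist in metadata.distributions()]
    installed = sorted(name for name in names if name)
    return {
        # "above 3.4" is interpreted here as 3.5 or newer
        "python_ok": sys.version_info[:2] >= min_version,
        "in_virtualenv": in_virtualenv(),
        "package_count": len(installed),
        # Assumption: the distribution is registered under the name "medacy"
        "medacy_installed": any(n.lower() == "medacy" for n in installed),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running the script before `pip install -e .` should report a small `package_count`, and running it again afterwards should report `medacy_installed: True` if the assumed distribution name is correct.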
Part 2: Developing with PyCharm
PyCharm can streamline development efforts, especially if you are developing locally and running medaCy on a remote machine for model building.
Work is currently being done to achieve full unit test coverage, but core functionality has been extensively tested. After installing medaCy for development, you can run the tests in two ways:
- For quick testing, run `python setup.py test`.
- For more fine-grained testing on individual files with colorful log output, run `pytest -s tests/tools/test_data_manager.py -o log_cli=True --log-cli-level=INFO`. This will show log output during tests and allow you to adjust the logging level for the test file being run. Read the pytest documentation for details.
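To illustrate what the log options do, here is a minimal sketch of a test file that emits log messages. It is not taken from medaCy's test suite. The helper `count_tokens` and the test name are hypothetical and exist only to give the logger something to report.

```python
import logging

logger = logging.getLogger(__name__)


def count_tokens(text):
    """Hypothetical helper under test: count whitespace-separated tokens."""
    return len(text.split())


def test_count_tokens():
    text = "Patient was given 50 mg of aspirin"
    # Shown live when pytest runs with log_cli enabled at INFO level
    logger.info("Counting tokens in %r", text)
    count = count_tokens(text)
    # Hidden at INFO level; visible with --log-cli-level=DEBUG
    logger.debug("Token count: %d", count)
    assert count == 7
```

Saved under a name pytest collects (for example a hypothetical `tests/test_example.py`), running it with `-o log_cli=True --log-cli-level=INFO` should display the INFO message while the test runs, and lowering the level to `DEBUG` should reveal the second message as well.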