Releases: lszeremeta/molstruct
Development Build
Molstruct 3.0.0: Strong 💪 but light as a feather
The new and better Molstruct 3.0.0 is now available! Below you will find an overview of the most important changes since the last release.
🚀 New features
In the new version of Molstruct, you will find the following new features. You must try them out!
Predefined presets
To make your work easier, Molstruct has built-in preset support. Thanks to this, you do not have to set everything manually. You just select the appropriate preset and it's ready. The presets are flexible: if you want to change, e.g., the column names selected for a preset, you can do so. At the moment, you can use the DrugBank open preset. There are plans to add more in the future. Any suggestions are welcome!
Support for multiple values
Sometimes you may find more than one value in a particular cell. Molstruct is ready for this, and you can change the value delimiter if you need to with the new -vd VALUE_DELIMITER, --value-delimiter VALUE_DELIMITER option. Multiple value support works for all formats.
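As a rough illustration of what multiple value support means, the sketch below splits one cell into separate values. The split_values helper and the "|" delimiter are made up for this example. They are not taken from Molstruct's code or its defaults.

```python
def split_values(cell: str, delimiter: str) -> list:
    """Split one CSV cell into its individual values (hypothetical helper)."""
    # Strip whitespace around each value and skip empty fragments,
    # e.g. the one left behind by a trailing delimiter.
    return [part.strip() for part in cell.split(delimiter) if part.strip()]


# The delimiter is chosen for the example only.
print(split_values("aspirin | acetylsalicylic acid | ASA", "|"))
# prints ['aspirin', 'acetylsalicylic acid', 'ASA']
```

Each resulting value would then be written out as a separate property value in the chosen output format.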
Subject type selector
Now you can select the preferred subject type for all output formats. You can use the -s, --subject option for this. Supported subject types are iri, uuid, and bnode. If you don't know what to choose, you can leave the default subject type (iri) and not use -s, --subject at all.
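To show how the three subject types differ, here is a minimal sketch. The make_subject function, the way the molecule number is appended, and the "_:b" blank node label are assumptions for illustration. The identifiers Molstruct actually generates may look different.

```python
import uuid

# Default base IRI as quoted in these release notes.
DEFAULT_BASE = "https://example.com/molecule#entity"


def make_subject(subject_type: str, index: int, base: str = DEFAULT_BASE) -> str:
    """Return a subject identifier for the molecule at position `index` (illustrative only)."""
    if subject_type == "iri":
        # Assumption: the molecule number is appended to the base IRI.
        return f"{base}{index}"
    if subject_type == "uuid":
        # A random UUID written as a URN, e.g. urn:uuid:....
        return uuid.uuid4().urn
    if subject_type == "bnode":
        # Blank node label; the "_:b" prefix is an assumption for this example.
        return f"_:b{index}"
    raise ValueError(f"unsupported subject type: {subject_type}")


for kind in ("iri", "uuid", "bnode"):
    print(kind, make_subject(kind, 0))
```

An IRI gives every molecule a stable, linkable address, a UUID gives a unique name without a web address, and a blank node is only meaningful inside the generated document.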
Base subject selector
You can also set your own IRI base for the iri subject type. You can override the default one ('https://example.com/molecule#entity') with the -b, --base option if you want. For each base IRI with #, an additional id attribute is added to the HTML output formats.
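The relation between a base IRI containing # and an HTML id attribute can be sketched as follows. The html_id_for helper and the exact id format (fragment followed by the molecule number) are assumptions. These notes only state that an id attribute is added.

```python
from urllib.parse import urldefrag


def html_id_for(base: str, index: int):
    """Return an HTML id derived from the fragment of `base`, or None if `base` has no '#'."""
    if "#" not in base:
        return None
    # urldefrag splits "https://example.com/molecule#entity" into the URL
    # without the fragment and the fragment ("entity").
    fragment = urldefrag(base).fragment
    # Assumption: the id is the fragment followed by the molecule number.
    return f"{fragment}{index}"


print(html_id_for("https://example.com/molecule#entity", 0))  # prints entity0
print(html_id_for("https://example.com/molecule/", 0))        # prints None
```

With such an id in place, the IRI of a molecule can point directly at its element on the page.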
Dataset type support
Dataset type support has been added for JSON-LD HTML, JSON-LD, RDFa, and Microdata formats. Thanks to this, the generated datasets can be even more visible to search engines, e.g. in Google Dataset Search. Read more on the Google Developers page.
📈 Improvements
The new version brings some improvements. Below you will find the most important of them.
Output format improvements
The generation of all output formats has been rewritten. The code responsible for this is much clearer. You may notice minor fixes to the output formats.
Better documentation
In this release, the README has been enhanced with a Quick start section, which should make it easier for you to get started with Molstruct. Additionally, some of the README content has been moved to the project Wiki. You can find useful information there as well.
🐛 Bug Fixes
There are no infallible people, just as there is no perfect code. The following bugs have been found and fixed in this release.
Add missing image for JSON-LD outputs
It turned out that the JSON-LD formats did not have image property support. This has been corrected. In addition, the code responsible for handling MolecularEntity profile properties has been rewritten to reduce the chance of a similar problem in the future.
💔 Breaking changes
These changes are not backward compatible.
jsonld-html is now jsonldhtml
The jsonld-html output format has been renamed to jsonldhtml. You no longer need to type the extra hyphen.
-s SMILES is now -sm SMILES
You can now use -sm SMILES, --smiles SMILES instead of -s SMILES, --smiles SMILES to define a column name for SMILES. -s is reserved for the new subject type option.
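If you have scripts written for Molstruct 2.x, the two renames above can be applied mechanically. The migrate_args helper below is a hypothetical sketch, not part of Molstruct. It assumes the arguments are already split into tokens and that no column name happens to be exactly -s or jsonld-html.

```python
def migrate_args(args: list) -> list:
    """Rewrite Molstruct 2.x command-line tokens to their 3.0.0 spelling (hypothetical helper)."""
    renamed = {
        "-s": "-sm",                  # SMILES column flag; -s now selects the subject type
        "jsonld-html": "jsonldhtml",  # output format name without the hyphen
    }
    # Tokens that are not in the mapping are passed through unchanged.
    return [renamed.get(token, token) for token in args]


old_args = ["-f", "jsonld-html", "-s", "SMILES"]
print(migrate_args(old_args))
# prints ['-f', 'jsonldhtml', '-sm', 'SMILES']
```

The long form --smiles is unchanged, so scripts that already use it only need the format name updated.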
additionalType is no longer supported
This option requires specialized knowledge to be successfully used. In order not to create additional confusion, this option is no longer available.
🕶️ Changes under the hood
The changes under the hood don't affect you directly, but you might find them interesting.
Rewritten GitHub Actions workflows
Workflows in GitHub Actions have been rewritten for better clarity and understanding.
Below are just some of the additional changes:
- GitHub Actions builds the container images and pushes them to Docker Hub instead of having them built on Docker Hub,
- The cache is used when building the Docker image,
- Jobs wait for concurrent jobs if needed.
If you are interested, you can see the workflows for the Molstruct project.
Code simplification
In this release, the Molstruct code is even clearer and more readable. The code that generates all formats and manages the supported properties of the MolecularEntity profile has been rewritten.
This changelog contains only the most significant changes. Below you will find a list of all commits since the last release.
Commits
- [1cc38c2]: Add missing IMAGE for JSON-LD outputs (Łukasz Szeremeta)
- [a50699e]: Remove Python 3.3 and 3.4 from python-package.yml (Łukasz Szeremeta)
- [66fbb8b]: Introduce presets (Łukasz Szeremeta)
- [50eeb44]: Multivalue support and better DrugBank preset (Łukasz Szeremeta)
- [8444acc]: Remove unncessary rdf from context in jsonld output (Łukasz Szeremeta)
- [0962246]: Remove unnecessary additionalType from jsonld context (Łukasz Szeremeta)
- [0218d6c]: Return back to -d for description column (Łukasz Szeremeta)
- [b7d70cb]: http://example.com/molecule/ -> http://example.com/molecule# (Łukasz Szeremeta)
- http://example.com/molecule# -> http://example.com/molecule#entity (Łukasz Szeremeta)
- [c1ef89c]: Add id for molecule div tag if # in baseURI (Łukasz Szeremeta)
- [06b2741]: baseURI -> subject-base (Łukasz Szeremeta)
- [5fc306c]: Add urn:uuid option (Łukasz Szeremeta)
- [8fd1182]: Generate img tag for molecule's images (Łukasz Szeremeta)
- [d6581ae]: Remove / from img tag (Łukasz Szeremeta)
- [86cdc06]: Number molecules starting from 0 instead of 1 (Łukasz Szeremeta)
- [95fff6b]: itemprop fix in Microdata output (Łukasz Szeremeta)
- [c5460ae]: Add bnode subject type option (Łukasz Szeremeta)
- [090c8c3]: --subject-base -> --base in README.md (Łukasz Szeremeta)
- [a0c8b29]: SUBJECT_BASE -> BASE in README.md (Łukasz Szeremeta)
- [d395c3c]: Add schema:Dataset (Łukasz Szeremeta)
- [8d0b850]: jsonld_html -> jsonldhtml (Łukasz Szeremeta)
- [7db9412]: http -> https (Łukasz Szeremeta)
- [346a81c]: Better jsonldhtml indents (Łukasz Szeremeta)
- [3c92cb4]: https -> http for Google Rich Results Test (Łukasz Szeremeta)
- [4160ff9]: MolecularEntitly type -> profile with version (Łukasz Szeremeta)
- [7fcc638]: MolecularEntity typo fix in README (Łukasz Szeremeta)
- [eab0de1]: Add structured data link (Łukasz Szeremeta)
- [9ae62c2]: Limit fix after change numeration from 0 (Łukasz Szeremeta)
- [099fb3b]: drugbank -> drugbank-open with description (Łukasz Szeremeta)
- [b095a44]: Don't escape quotes if not needed (Łukasz Szeremeta)
- [2593f94]: Improve README (Łukasz Szeremet...
Molstruct 2.0.1: The better escaper
In this patch release, you will find even better HTML and JSON escaping. Molstruct has always tried to keep you from worrying about whether your data contains characters that should not appear in the output document. Support for some new cases has been added in this release. In addition, you will also find some really small improvements in the README and help.
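To see why escaping matters, consider a value that contains characters with a special meaning in HTML and JSON. The sketch below uses only the Python standard library and a made-up sample value. It does not reproduce Molstruct's own escaping code.

```python
import html
import json


def escape_for_html(value: str) -> str:
    """Escape &, <, > and quotes so the value is safe in HTML text and attributes."""
    return html.escape(value, quote=True)


def escape_for_json(value: str) -> str:
    """Return the value as a JSON string literal, with quotes and backslashes escaped."""
    return json.dumps(value)


sample = 'Tom & "Jerry" <b>'
print(escape_for_html(sample))  # prints Tom &amp; &quot;Jerry&quot; &lt;b&gt;
print(escape_for_json(sample))  # prints "Tom & \"Jerry\" <b>"
```

Without this step, a stray quote or angle bracket in a CSV cell could break the generated markup.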
Commits
- [c3a4e89]: Add section link in README (Łukasz Szeremeta)
- [ae19db7]: URL type -> URL (Łukasz Szeremeta)
- [2f34523]: URL type -> URL in main.py (Łukasz Szeremeta)
- [56b5f16]: MIT license -> MIT License (Łukasz Szeremeta)
- [8743efb]: Improved html escape (Łukasz Szeremeta)
- [d97823d]: Reformat outputs.py file (Łukasz Szeremeta)
- [e7d998a]: Python 3.9.0-rc.1 -> 3.9 in python-package.yml (Łukasz Szeremeta)
- [1e8d67d]: Bump to 2.0.1 (Łukasz Szeremeta)
Molstruct 2.0.0: It's easy to swim! 🐋
Molstruct is now more convenient to use from the command line. Not only has the format selection changed, but the ability to specify the base URI of a molecule has also been added. You also get improved documentation. The arguments are now assigned to groups, so you can quickly find the ones you need. Do you love Docker? Molstruct is now also available as a lightweight Docker image. You must check it out! What else? Unit tests have been added to test Molstruct even more accurately. Besides, in this release you will find a lot of small changes that make Molstruct even better. This all sounds pretty good, doesn't it?
Commits
- [ff9aff7]: Add raw-svg logos (Łukasz Szeremeta)
- [cdb5d0e]: Handling errors regarding file read (Łukasz Szeremeta)
- [95c84ea]: Add PyPI and Codacy badge to README.md (Łukasz Szeremeta)
- [1f63be1]: stderr for errors (Łukasz Szeremeta)
- [7e34820]: Replace PyPI version badge in README.md (Łukasz Szeremeta)
- [11ce0c4]: Update MEgen repo link (Łukasz Szeremeta)
- [93bce11]: Add --version (Łukasz Szeremeta)
- [5474c88]: Bump version to 1.0.2 (Łukasz Szeremeta)
- [bbfbed7]: -f/--format argument instead of separate arguments (Łukasz Szeremeta)
- [b68f858]: Better indent for jsonldhtml script end tag (Łukasz Szeremeta)
- [ac407d2]: Add ability to define base URL of molecule (Łukasz Szeremeta)
- [f60aaa1]: URL for PyPI badge (Łukasz Szeremeta)
- [8c762b9]: base URL of molecule -> base URI of molecule (Łukasz Szeremeta)
- [ddcea75]: Argument groups for better readability (Łukasz Szeremeta)
- [0d42de1]: Improve help for file argument (Łukasz Szeremeta)
- [ebe0b17]: Better arguments description readability in README (Łukasz Szeremeta)
- [3320bb3]: Add missing * in README (Łukasz Szeremeta)
- [9ac9c8e]: Wrap URL by angle brackets in README (Łukasz Szeremeta)
- [36a17eb]: Add missing docstring in init.py (Łukasz Szeremeta)
- [f741e48]: Add missing blank line in init.py (Łukasz Szeremeta)
- [1e580d2]: Update Codacy badge (Łukasz Szeremeta)
- [07018e1]: Docker image and improved README (Łukasz Szeremeta)
- [3e71473]: Merge branch 'master' of github.com:lszeremeta/molstruct (Łukasz Szeremeta)
- [6a4534d]: Add example with mount of current working directory (Łukasz Szeremeta)
- [34ae634]: Use nonroot baseImage (Łukasz Szeremeta)
- [39ab46d]: Rewrite Docker sentence (Łukasz Szeremeta)
- [334db2c]: Split long sentence into two separate sentences (Łukasz Szeremeta)
- [3b697e0]: molstruct -> Molstruct (Łukasz Szeremeta)
- [b6e034d]: More molstruct -> Molstruct (Łukasz Szeremeta)
- [40c9d56]: Update metadata (Łukasz Szeremeta)
- [31e1f18]: Merge branch 'master' of github.com:lszeremeta/molstruct (Łukasz Szeremeta)
- [4895333]: Rewrite keywords to string list (Łukasz Szeremeta)
- [9a5322c]: Add Docker image size badge (Łukasz Szeremeta)
- [1794de6]: Badges reorder (Łukasz Szeremeta)
- [a5da5bb]: Add tests, shorter output functions names (Łukasz Szeremeta)
- [e69a968]: GitHub Action: Test with pytest (Łukasz Szeremeta)
- [d7ba112]: Add always good path to test file (Łukasz Szeremeta)
- [007ed70]: Improve docstrings based on PEP 257 (Łukasz Szeremeta)
- [42deefc]: Improve tag match pattern for tagged-release.yml (Łukasz Szeremeta)
- [a39b8ff]: Additional improve docstrings based on PEP 257 (Łukasz Szeremeta)
- [0fb78d2]: Improve output tests (Łukasz Szeremeta)
- [82a9089]: Add missing "build" in README (Łukasz Szeremeta)
- [4b305cb]: Add Docker Hub link in main description (Łukasz Szeremeta)
- [85c99ee]: Better Molstruct description in README (Łukasz Szeremeta)
- [f21b537]: Reformat test_outputs.py (Łukasz Szeremeta)
- [45a9b07]: Rewrite README to better readability (Łukasz Szeremeta)
- [7992483]: Additional documantation improvements (Łukasz Szeremeta)
- [6c5f8b4]: Use yield in fixture (Łukasz Szeremeta)
- [f1f122d]: Add version change management (Łukasz Szeremeta)
Molstruct 1.0.1: This is definitely good time to struct something!
In this small, additional release, you will find a README with minor improvements that make getting started with Molstruct more comfortable. In addition, the PNG logos in the project repository have been optimized to take up less space without losing quality. More in the Commits section below.
About Molstruct
Molstruct is a new Python tool. With it, you can easily convert chemical data from CSV files to structured data. Structured data are additional data placed on websites. They are not visible to ordinary internet users, but can be easily processed by machines. There are 3 formats that we can use to save structured data: JSON-LD, RDFa and Microdata. Molstruct supports them all and uses the MolecularEntity type.
Contributions
If you have an idea to improve Molstruct, or if you've noticed a bug, don't hesitate to get involved. If you are new to open source contributions, read How to Contribute to Open Source.
Commits
Molstruct 1.0.0: Good time to struct something!
Say hello to Molstruct!
Molstruct is a new Python tool. With it, you can easily convert chemical data from CSV files to structured data. Structured data are additional data placed on websites. They are not visible to ordinary internet users, but can be easily processed by machines. There are 3 formats that we can use to save structured data: JSON-LD, RDFa and Microdata. Molstruct supports them all and uses the MolecularEntity type.
If you have an idea to improve Molstruct, or if you've noticed a bug, don't hesitate to get involved. If you are new to open source contributions, read How to Contribute to Open Source.