Bco ro edits #3

Open
wants to merge 5 commits into
base: main
8 changes: 4 additions & 4 deletions docs/tutorial/execution_domain.md
Original file line number Diff line number Diff line change
@@ -7,15 +7,15 @@ sort: 5

The `execution_domain` should refer to the actual workflow script being executed.

This is a bit of a challenge in this example as we have not bundled the `*.nf` file in the BCO, but ran it by refernece `nf-core/chipseq` which Nextflow then retrieved from GitHub. The web page <https://nf-co.re/chipseq> gives great information for humans, but is in HTML and not executable by workflow engines.
This is a bit of a challenge in this example as we have not bundled the `*.nf` file in the BCO, but ran it by reference `nf-core/chipseq` which Nextflow then retrieved from GitHub. The web page <https://nf-co.re/chipseq> gives great information for humans, but is in HTML and not executable by workflow engines.

Taking into consideration the `-revision 1.2.2` we then navigate from <https://nf-co.re/chipseq> to <https://github.com/nf-core/chipseq>, select the [tag 1.2.2](https://github.com/nf-core/chipseq/tree/1.2.2) and find <https://github.com/nf-core/chipseq/blob/1.2.2/main.nf> - but again this is HTML, so we use the **Raw** button to find <https://raw.githubusercontent.com/nf-core/chipseq/1.2.2/main.nf>.
Taking into consideration the `-revision 2.0.0` we then navigate from <https://nf-co.re/chipseq> to <https://github.com/nf-core/chipseq>, select the [tag 2.0.0](https://github.com/nf-core/chipseq/tree/2.0.0) and find <https://github.com/nf-core/chipseq/blob/2.0.0/main.nf> - but again this is HTML, so we use the **Raw** button to find <https://raw.githubusercontent.com/nf-core/chipseq/2.0.0/main.nf>.

This can then be described in the BCO in the `script` array; for `script_driver` we use `nextflow` as it matches the command line (note: there is currently no registry of known `script_driver` values).

```json
"execution_domain": {
"script": ["https://raw.githubusercontent.com/nf-core/chipseq/1.2.2/main.nf"],
"script": ["https://raw.githubusercontent.com/nf-core/chipseq/2.0.0/main.nf"],
"script_driver": "nextflow"
}
```
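
As a minimal sketch of the navigation described above, the raw `main.nf` URL can be assembled from the pipeline name and the `-revision` value, then placed in the `execution_domain`. The function names `raw_workflow_url` and `execution_domain` are hypothetical helpers for illustration and are not part of any BCO or Nextflow tooling; the URL pattern is the one shown in the text.

```python
import json


def raw_workflow_url(pipeline: str, revision: str, script: str = "main.nf") -> str:
    """Hypothetical helper: build the raw GitHub URL of a workflow script at a given tag."""
    return f"https://raw.githubusercontent.com/{pipeline}/{revision}/{script}"


def execution_domain(pipeline: str, revision: str) -> dict:
    """Hypothetical helper: assemble the two execution_domain fields discussed above."""
    return {
        "script": [raw_workflow_url(pipeline, revision)],
        # "nextflow" matches the command used on the command line
        "script_driver": "nextflow",
    }


if __name__ == "__main__":
    # Print the fragment for the pipeline and revision used in this tutorial
    fragment = {"execution_domain": execution_domain("nf-core/chipseq", "2.0.0")}
    print(json.dumps(fragment, indent=2))
```

Deriving the URL from the revision keeps the `script` entry in step with the `-revision` flag, so a version bump only needs to be made in one place.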
@@ -31,4 +31,4 @@ A challenge here is that we have not indicated how the workflow engine itself sh
}
```

In one way this is more useful, as it directly executable - at least if the Conda [environment.yml](environment.yml) has been activated. On the other side `run.sh` provides absolutely no details about the data analysis performed, and as the purpose of the BCO is to submit a workflow, we instead show the `main.nf` that lists the individual steps, matching the `pipeline_steps` section of the BCO.
In one way this is more useful, as it is directly executable - at least if the Conda [environment.yml](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) has been activated. On the other hand, `run.sh` provides absolutely no details about the data analysis performed, and as the purpose of the BCO is to submit a workflow, we instead show the `main.nf` that lists the individual steps, matching the `pipeline_steps` section of the BCO.
4 changes: 2 additions & 2 deletions docs/tutorial/prerequisites.md
@@ -15,7 +15,7 @@ Below we'll use [Conda](https://conda.io/), which can be installed for all major

### Install Conda

First follow instructions for [installing conda](https://bioconda.github.io/user/install.html#install-conda).
First follow instructions for [installing conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html).

For Linux:

@@ -75,4 +75,4 @@ including installing the expected version of Java.

As this example lets Nextflow download workflow dependencies with Conda,
you can instead install and use [Docker](https://www.docker.com/)
or [Singularity](https://sylabs.io/docs/) containers.
or [Singularity](https://sylabs.io/docs/) containers.
8 changes: 4 additions & 4 deletions docs/tutorial/rocrate.md
@@ -140,13 +140,13 @@ Rather than use `creator` with software agents, [RO-Crate provenance](https://ww

## License

[Licensing](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright) can in RO-Crate be assigned to any data entity, allowing an RO-Crate to have a mix of licenses for different files, compared to BCO which can only provide an overall license.
[Licensing](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright) in RO-Crate can be assigned to any data entity, allowing an RO-Crate to have a mix of licenses for different files, compared to BCO which can only provide an overall license.

Each `license` identifier can thus be expanded. In this case <https://github.com/nf-core/chipseq/blob/1.2.1/LICENSE> is the specific instance of the MIT license with _(c) copyright_ inserted. To classify it as MIT license, ideally [SPDX identifiers]() should be used (see also [schemaorg/suggestions-questions-brainstorming#251](https://github.com/schemaorg/suggestions-questions-brainstorming/issues/251).
Each `license` identifier can thus be expanded. In this case <https://github.com/nf-core/chipseq/blob/2.0.0/LICENSE> is the specific instance of the MIT license with _(c) copyright_ inserted. To classify it as the MIT license, ideally [SPDX identifiers]() should be used (see also [schemaorg/suggestions-questions-brainstorming#251](https://github.com/schemaorg/suggestions-questions-brainstorming/issues/251)).

```json
{
"@id": "https://github.com/nf-core/chipseq/blob/1.2.1/LICENSE",
"@id": "https://github.com/nf-core/chipseq/blob/2.0.0/LICENSE",
"@type": "CreativeWork",
"name": "MIT License",
"identifier": "https://spdx.org/licenses/MIT"
@@ -164,4 +164,4 @@ Following [RO-Crate documentation](https://www.researchobject.org/ro-crate/1.1/c

## Workflow entity

_TODO_: <https://www.researchobject.org/ro-crate/1.1/workflows.html>
_TODO_: <https://www.researchobject.org/ro-crate/1.1/workflows.html>
4 changes: 2 additions & 2 deletions docs/tutorial/running.md
@@ -54,7 +54,7 @@ Min Consensus Reps : 1
```

The workflow will take a while to run. If you previously skipped ahead, now go back to create the [skeleton BCO](#skeleton-bco)
The workflow will take a while to run. If you previously skipped ahead, now go back to create the [skeleton BCO](https://biocompute-objects.github.io/bco-ro-crate/tutorial/starting.html#skeleton-bco)

Some workflow systems require explicit inputs, while others have them declared as part of the workflow or the workflow config. Nextflow has both options; in this case we used its [`test` profile](https://github.com/nf-core/chipseq/blob/1.2.2/conf/test.config) to pick minimal inputs suitable for testing.

@@ -332,4 +332,4 @@ This form uses [ARCP URIs inside the RO-Crate](https://www.researchobject.org/ro

```json
{"uri": "arcp://uuid,9b309ebd-6dfb-4c6d-983b-56b91fca6e06home/data/results/genome/genome.fa.include_regions.bed"},
```
```
2 changes: 1 addition & 1 deletion docs/tutorial/starting.md
@@ -235,7 +235,7 @@ We'll start by describing the RO-Crate itself under the `./` Dataset, including
}
```

Already you will notice some differences from the BCO. The `name` could match the `provenance_domain/name` of the BCO - but as the BCO focus more on the workflow and the Dataset includes all the files we've changed it to include `"Workflow run of.."`. However if your RO-Crate did not include workflow results, then the two could have the same title. `description` allow us to provide a longer description - comparable to BCO's `usability_domain` which we'll populate later, but again decribing the whole dataset.
Already you will notice some differences from the BCO. The `name` could match the `provenance_domain/name` of the BCO - but as the BCO focuses more on the workflow and the Dataset includes all the files, we've changed it to include `"Workflow run of.."`. However, if your RO-Crate did not include workflow results, then the two could have the same title. `description` allows us to provide a longer description - comparable to BCO's `usability_domain` which we'll populate later, but again describing the whole dataset.

The reason these fields are mainly at dataset level is that we can further describe individual files and resources later as separate [data entities](https://www.researchobject.org/ro-crate/1.1/data-entities.html). Therefore here the `author` of the dataset is <https://orcid.org/0000-0001-9842-9718>, the ORCID identifier for Stian, as he ran the workflow and gathered (most of) the files, and `license` of the dataset (the whole folder) can be different from the license of the workflow. If need be `license`, `author` etc. can be different on the `ro-crate-metadata.json` entity if someone else made this JSON.
