Edit README by adding table of contents and update overview, notices and disclaimers #498

Merged
merged 8 commits into from
Jan 15, 2025
155 changes: 110 additions & 45 deletions README.md
@@ -1,26 +1,31 @@
# CDCgov GitHub Organization Open Source Project Template
# Table of Contents
[1. Overview](#1-overview)
- [The Problem](#the-problem)
- [The Solution](#the-solution)
- [Future Considerations](#future-considerations)

**Template for clearance: This project serves as a template to aid projects in starting up and moving through clearance procedures. To start, create a new repository and implement the required [open practices](open_practices.md), train on and agree to adhere to the organization's [rules of behavior](rules_of_behavior.md), and [send a request through the create repo form](https://forms.office.com/Pages/ResponsePage.aspx?id=aQjnnNtg_USr6NJ2cHf8j44WSiOI6uNOvdWse4I-C2NUNk43NzMwODJTRzA4NFpCUk1RRU83RTFNVi4u) using language from this template as a Guide.**
[2. Notices](#2-notices)
- [2.1 Privacy Standard Notice](#21-privacy-standard-notice)
- [2.2 Records Management Standard Notice](#22-records-management-standard-notice)
- [2.3 Domestic Copyright Protection Notice](#23-domestic-copyright-protection-notice)
- [2.4 Open Source Notice](#24-open-source-notice)
- [2.5 License Standard Notice](#25-license-standard-notice)
- [2.6 GitHub Notice](#26-github-notice)
- [2.7 Contributing Standard Notice](#27-contributing-standard-notice)

**General disclaimer** This repository was created for use by CDC programs to collaborate on public health related projects in support of the [CDC mission](https://www.cdc.gov/about/organization/mission.htm). GitHub is not hosted by the CDC, but is a third party website used by CDC and its partners to share information and collaborate on software. CDC use of GitHub does not imply an endorsement of any one particular service, product, or enterprise.
[3. General Disclaimer](#3-general-disclaimer)

## Access Request, Repo Creation Request
[4. Other Related Documents](#4-other-related-documents)

* [CDC GitHub Open Project Request Form](https://forms.office.com/Pages/ResponsePage.aspx?id=aQjnnNtg_USr6NJ2cHf8j44WSiOI6uNOvdWse4I-C2NUNk43NzMwODJTRzA4NFpCUk1RRU83RTFNVi4u) _[Requires a CDC Office365 login, if you do not have a CDC Office365 please ask a friend who does to submit the request on your behalf. If you're looking for access to the CDCEnt private organization, please use the [GitHub Enterprise Cloud Access Request form](https://forms.office.com/Pages/ResponsePage.aspx?id=aQjnnNtg_USr6NJ2cHf8j44WSiOI6uNOvdWse4I-C2NUQjVJVDlKS1c0SlhQSUxLNVBaOEZCNUczVS4u).]_

## Related documents
# 1. Overview
The Intelligent Data Workflow Automation (IDWA) ReportVision Project aims to support the Office of Public Health Data, Surveillance, and Technology (OPHDST) in enhancing the ability of state, local, territorial, and tribal public health departments to manage, search, and secure critical data. As a key division of the CDC, OPHDST plays a vital role in public health infrastructure.

* [Open Practices](open_practices.md)
* [Rules of Behavior](rules_of_behavior.md)
* [Thanks and Acknowledgements](thanks.md)
* [Disclaimer](DISCLAIMER.md)
* [Contribution Notice](CONTRIBUTING.md)
* [Code of Conduct](code-of-conduct.md)
Please see the [UserGuide](/docs/user_guide.md) to get a technical overview of this project.

## Overview

Please see the [UserGuide](./user_guide.md) to get an overview of this project.
## The Problem

The exchange of public health data is hindered by outdated, manual processes. Some state, local, tribal, and territorial health departments still rely on fax, email, and physical mail to receive case data, requiring staff to manually review and re-enter information from lab reports. This labor-intensive process can take up to 20 minutes per report, and electronic data extraction remains cumbersome and error-prone, particularly when handling multiple documents. As a result, low accuracy in data ingestion impedes the ability of public health departments to efficiently process and utilize critical health data.

## Public Domain Standard Notice
This repository constitutes a work of the United States Government and is not
@@ -31,46 +36,106 @@ All contributions to this repository will be released under the CC0 dedication.
submitting a pull request you are agreeing to comply with this waiver of
copyright interest.

## License Standard Notice
The repository utilizes code licensed under the terms of the Apache Software
License and therefore is licensed under ASL v2 or later.
## The Solution

ReportVision automates reading and extracting data from lab reports, helping public health departments streamline their workflows. Built on the Tesseract OCR engine and the Microsoft Azure cloud platform, ReportVision lets teams create customizable, data-driven templates for automatically extracting and annotating multiple datasets, improving both accuracy and speed.
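
To make the idea of a data-driven extraction template concrete, here is a minimal sketch in Python. It is not ReportVision's actual implementation: the `Word`, `Region`, and `extract_fields` names, the coordinate layout, and the sample data are all hypothetical. It assumes an OCR step has already produced words with bounding boxes, and it shows how a template of labeled regions could map those words to named fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One OCR-recognized word with its bounding box (hypothetical shape)."""
    text: str
    x: float
    y: float
    w: float
    h: float


@dataclass(frozen=True)
class Region:
    """One labeled rectangle in a template (hypothetical shape)."""
    label: str
    x: float
    y: float
    w: float
    h: float

    def contains(self, word: Word) -> bool:
        # A word belongs to a region when its center falls inside the rectangle.
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        return (self.x <= cx <= self.x + self.w
                and self.y <= cy <= self.y + self.h)


def extract_fields(words, template):
    """Map each template label to the text of the words inside its region."""
    fields = {}
    for region in template:
        hits = [word for word in words if region.contains(word)]
        # Simplification: order by top edge, then left edge. Real OCR output
        # usually needs line grouping that tolerates small vertical jitter.
        hits.sort(key=lambda word: (word.y, word.x))
        fields[region.label] = " ".join(word.text for word in hits)
    return fields


# Made-up OCR output for a single page.
words = [
    Word("Jane", 120, 40, 40, 12),
    Word("Doe", 165, 40, 30, 12),
    Word("Positive", 120, 90, 60, 12),
    Word("Page", 500, 700, 30, 12),  # outside every region, so ignored
]

# Made-up template: two labeled regions drawn over the report layout.
template = [
    Region("patient_name", 100, 30, 200, 30),
    Region("test_result", 100, 80, 200, 30),
]

result = extract_fields(words, template)
```

Keeping the template as plain data means a new report layout needs only a new list of regions, with no change to the extraction logic.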

The goal is simple yet powerful: to provide jurisdictions with a "starter kit" that empowers them to rapidly build their own resources, provision scalable Azure infrastructure, or seamlessly replicate similar configurations in Amazon Web Services (AWS) or Google Cloud Platform (GCP).

With ReportVision, public health departments can move from cumbersome, error-prone processes to a highly efficient, automated workflow that supports critical decision-making with fast, reliable data.

This application gives public health departments and personnel a framework for extracting relevant data from lab reports using an Optical Character Recognition (OCR) model, which improves both the speed and accuracy of data extraction.

This source code in this repository is free: you can redistribute it and/or modify it under
the terms of the Apache Software License version 2, or (at your option) any
later version.
Check out the following video to see the updated OCR model in action and how ReportVision improves both the speed and accuracy of data extraction.

This source code in this repository is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the Apache Software License for more details.
<div align="center">
<video width="500" height="280" controls>
<source src="images-and-media/reportvision-demo.mp4" type="video/mp4">
Video Extracting Data From Lab Reports.
</video>
</div>

You should have received a copy of the Apache Software License along with this
program. If not, see http://www.apache.org/licenses/LICENSE-2.0.html
## Future Considerations

The source code forked from other open source projects will inherit its license.
The current version of the application is optimized only for PDF-based lab reports. However, as demand from public health departments and personnel continues to grow, we see significant potential to expand support for additional file formats in future updates.
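
Because only PDF input is supported today, a caller may want to reject other file types before starting extraction. The sketch below is an illustration, not code from this repository: `looks_like_pdf` is a hypothetical helper, and it relies only on the fact that a well-formed PDF file begins with the bytes `%PDF-`.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path):
    """Return True when the file at `path` starts with the PDF header bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(PDF_MAGIC)) == PDF_MAGIC
```

Checking the header bytes rather than the file extension avoids trusting a misnamed upload, though it does not validate the rest of the file.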

+ [Return to Table of Contents](#table-of-contents).

# 2. Notices

## 2.1 Privacy Standard Notice
This repository contains only non-sensitive, publicly available data and information. All material and community participation is covered by the [Disclaimer](DISCLAIMER.md) and [Code of Conduct](code-of-conduct.md).

## Privacy Standard Notice
This repository contains only non-sensitive, publicly available data and
information. All material and community participation is covered by the
[Disclaimer](DISCLAIMER.md)
and [Code of Conduct](code-of-conduct.md).
For more information about CDC's privacy policy, please visit [https://www.cdc.gov/other/privacy.html](https://www.cdc.gov/other/privacy.html).

## Contributing Standard Notice
Anyone is encouraged to contribute to the repository by [forking](https://help.github.com/articles/fork-a-repo)
and submitting a pull request. (If you are new to GitHub, you might start with a
[basic tutorial](https://help.github.com/articles/set-up-git).) By contributing
to this project, you grant a world-wide, royalty-free, perpetual, irrevocable,
non-exclusive, transferable license to all users under the terms of the
[Apache Software License v2](http://www.apache.org/licenses/LICENSE-2.0.html) or
later.
+ [Return to Table of Contents](#table-of-contents).

All comments, messages, pull requests, and other submissions received through
CDC including this GitHub page may be subject to applicable federal law, including but not limited to the Federal Records Act, and may be archived. Learn more at [http://www.cdc.gov/other/privacy.html](http://www.cdc.gov/other/privacy.html).
## 2.2 Records Management Standard Notice

## Records Management Standard Notice
This repository is not a source of government records, but is a copy to increase
collaboration and collaborative potential. All government records will be
published through the [CDC web site](http://www.cdc.gov).

## Additional Standard Notices
Please refer to [CDC's Template Repository](https://github.com/CDCgov/template) for more information about [contributing to this repository](https://github.com/CDCgov/template/blob/main/CONTRIBUTING.md), [public domain notices and disclaimers](https://github.com/CDCgov/template/blob/main/DISCLAIMER.md), and [code of conduct](https://github.com/CDCgov/template/blob/main/code-of-conduct.md).
+ [Return to Table of Contents](#table-of-contents).


## 2.3 Domestic Copyright Protection Notice

This repository is a work of the United States Government and is not subject to domestic copyright protection under 17 U.S.C. § 105. If published in the public domain within the United States, copyright and related rights worldwide will be waived through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/).

+ [Return to Table of Contents](#table-of-contents).

## 2.4 Open Source Notice

This repository is open source and follows [open practices](docs/open_practices.md). Contributors are expected to adhere to the organization's [rules of behavior](docs/rules_of_behavior.md).

+ [Return to Table of Contents](#table-of-contents).

## 2.5 License Standard Notice

The code in this repository is licensed under the Apache License 2.0 (ASL v2), or any later version at your discretion.

You are free to use, redistribute, and modify the source code under the terms of the Apache License 2.0. However, this software is distributed "as is", without any warranties of any kind, either express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the software or the use or other dealings in the software.

For full licensing details, refer to the [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0.html).

Additionally, any code forked from this open-source project will retain its original license.

+ [Return to Table of Contents](#table-of-contents).

## 2.6 GitHub Notice

GitHub is not hosted by the CDC, but is a third-party website used by CDC and its partners to share information and collaborate on software. CDC use of GitHub does not imply an endorsement of any one particular service, product, or enterprise. If you are new to GitHub, we recommend starting with this [basic tutorial](https://help.github.com/articles/set-up-git) to familiarize yourself with version control and collaboration.

+ [Return to Table of Contents](#table-of-contents).

## 2.7 Contributing Standard Notice

While we encourage continuous development of this repository's codebase, there is currently no designated department overseeing its management. If you'd like to contribute, you have two options:

1. Clone the repository and create a new repository in your organization's codebase with the changes you wish to implement.
   - This option allows you to manage the changes independently within your own organization's environment.
2. Submit a pull request and contact the CDC to inquire whether a department has been assigned to manage the repository.
   - If a CDC department is designated, you can coordinate with them for further changes.
   - _Note_: All comments, messages, pull requests, and other submissions received through CDC including this GitHub page may be subject to applicable federal law, including but not limited to the Federal Records Act, and may be archived. Learn more at [http://www.cdc.gov/other/privacy.html](http://www.cdc.gov/other/privacy.html).
   - Also see [CONTRIBUTING.md](docs/CONTRIBUTING.md) and [CDC Managed Repository Guidance](#4-cdc-managed-repository-guidance).

+ [Return to Table of Contents](#table-of-contents).

# 3. General Disclaimer

This repository was created for use by CDC programs to collaborate on public health related projects in support of the [CDC mission](https://www.cdc.gov/about/cdc/?CDC_AAref_Val=https://www.cdc.gov/about/organization/mission.htm).

+ [Return to Table of Contents](#table-of-contents).

# 4. Other Related Documents

* [Open Practices](docs/open_practices.md)
* [Rules of Behavior](docs/rules_of_behavior.md)
* [Disclaimer](docs/DISCLAIMER.md)
* [Contribution Notice](docs/CONTRIBUTING.md)
* [Code of Conduct](docs/code-of-conduct.md)
* [Review Guidelines](docs/REVIEW_GUIDELINES.md)
* [Review SLAS](docs/REVIEW_SLAS.md)
File renamed without changes.
Binary file added images-and-media/reportvision-demo.mp4
Binary file not shown.
6 changes: 0 additions & 6 deletions thanks.md

This file was deleted.
