JOSS review - paper / state of the field #38

Open
step21 opened this issue Jul 26, 2022 · 3 comments

Comments

@step21

step21 commented Jul 26, 2022

As far as I can tell, there is no review of the "state of the field" in the paper. Do no other, somewhat similar tools exist? If not, you could instead compare against existing solutions, or against using the services directly without any tool.
Also, I'm not totally sure, but a lot of the usage description might be better moved to the Readme or documentation; I don't think the paper is supposed to serve as documentation.

@rlskoeser

I agree with @step21: the paper has no review of the state of the field, and it goes into too much documentation-level detail.

I think the details of functionality, configuration and usage are beyond the kind of high-level overview that is really needed here. For example, the list of specific image formats and ways to specify images seemed irrelevant to me. Stating that the tool supports a wide range of common image formats and that there are multiple ways to provide images is probably enough here.

I am curious about the state of the field — are there any other tools like this, whether for HTR or OCR? I know about a student project on OCR benchmarking a few years ago because the student won the senior thesis prize from the Center for Digital Humanities at Princeton, where I work. Here's the write up, in case it's at all relevant: https://cdh.princeton.edu/updates/meet-the-2021-cdh-senior-thesis-prize-honorees-william-ughetta/

Are there other similar comparison tools, or do people typically stick with one API that they already have access to whether or not it is best suited to a particular set of content? Do these services not have that much traction, or are people using other tools to access them without running comparisons?

@rlskoeser

I am a little curious about the statement of need — is this tool primarily for comparing the APIs, or do you expect people to use handprint to compare the APIs and make a choice, and then continue to use handprint for their actual workflows?

I'm asking because I was struck by the last sentence in the statement of need section, that "if desired" people could use handprint for automated workflows. Do you know if people are doing so? Would handprint be as good as an API-specific tool, or would there be some kind of tradeoff in using handprint vs something else?

I'm not sure how much you need to clarify this for the paper, but it would be stronger if you did.

@mhucka
Collaborator

mhucka commented Aug 22, 2022

Thanks for the comments.

Regarding workflows: the intention is

  • People can use it in shell scripts or Python scripts (the latter using subprocesses)
  • If desired, Python programmers could also extract the code for interacting with a desired service and use it as a starting point for their own programs.
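As a rough illustration of the first point, here is a minimal sketch of calling the command-line tool from a Python script through a subprocess. The helper names `build_command` and `run_tool` are hypothetical, the service names are only examples, and the `--service` option with a comma-separated value is an assumption that should be checked against `handprint --help` before relying on it.

```python
import shutil
import subprocess


def build_command(images, services=None, executable="handprint"):
    """Assemble the argument list for one command-line run."""
    cmd = [executable]
    if services:
        # Assumption: services are passed as a comma-separated --service value.
        # Verify the exact option name with `handprint --help`.
        cmd += ["--service", ",".join(services)]
    cmd += [str(p) for p in images]
    return cmd


def run_tool(images, services=None, executable="handprint"):
    """Run the tool as a subprocess.

    Returns the process exit code, or None if the executable is not on PATH.
    """
    if shutil.which(executable) is None:
        return None
    completed = subprocess.run(
        build_command(images, services, executable), check=False
    )
    return completed.returncode


if __name__ == "__main__":
    # Illustrative only: the file name and service names are placeholders.
    print(build_command(["page1.png"], ["google", "microsoft"]))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact command before anything is sent to a service.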

Regarding people using it for automated workflows: in the past, I've had a couple of people tell me they're using it in workflows. So, yes, people seem to be using it directly.

Regarding the state of the field: I have not found any other tools that allow you to run the same input against multiple HTR services. As far as I'm aware, there are only individual sample applications, such as Amazon's sample program Textractor.

Regarding the original point by @step21 that there is no review of the "state of the field" in the paper: as far as I'm aware, that's not actually a requirement for JOSS papers.
