Training fulltext - annotation doubt #1237

martasoricetti · 2025-01-23T00:03:38Z

I’m training the full text model and I have a question. Sometimes a reference is split in two by a figure, a table or a formula.
How should I handle these situations in the annotation process?

<p>[…]<ref type="biblio">(Ramanathan et<lb/></ref></p>
 
 <figure type="table">Table 5. […] </figure> 
 
 <figure>Figure 3. […] </figure>
 
 <p><ref type="biblio">al., 2001)</ref>. However, it was argued that the use of APCADA,<lb/>[…]</p>

I tried this approach (splitting the same in-text reference pointer across two separate tags), but I don’t know if it’s the right choice.


lfoppiano · 2025-01-23T23:52:20Z

Hi @martasoricetti, I think that's the right way. However, I'm not sure how those rare cases will be reconstructed after the model extracts them. But yes, that's the right approach.

You could add xml:id / corresp attributes only to those references that are split by other elements.
Those attributes are ignored at the moment, but they might be used in the future to establish that both parts belong to the same reference.

Something like:

<p>[…]<ref type="biblio" xml:id="ref1">(Ramanathan et<lb/></ref></p>
 
 <figure type="table">Table 5. […] </figure> 
 
 <figure>Figure 3. […] </figure>
 
 <p><ref type="biblio" corresp="#ref1">al., 2001)</ref>. However, it was argued that the use of APCADA,<lb/>[…]</p>
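As an illustration only of how the xml:id / corresp link could be used later to rejoin the two halves, here is a minimal Python sketch. It is not part of GROBID, which currently ignores these attributes. The `merge_split_refs` helper, the `<text>` wrapper element and the shortened sample are assumptions made for the example, and it assumes the elements carry no XML namespace.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened copy of the annotation above, wrapped in a hypothetical <text>
# root so that it parses as a single document.
SAMPLE = """<text>
<p>[...]<ref type="biblio" xml:id="ref1">(Ramanathan et<lb/></ref></p>
<figure type="table">Table 5. [...]</figure>
<figure>Figure 3. [...]</figure>
<p><ref type="biblio" corresp="#ref1">al., 2001)</ref>. However, [...]</p>
</text>"""


def merge_split_refs(xml_text):
    """Rejoin reference fragments linked through xml:id / corresp.

    Hypothetical helper: returns a dict mapping each xml:id to the text of
    the first fragment followed by the fragments that point back to it.
    """
    root = ET.fromstring(xml_text)
    fragments = {}  # xml:id -> list of text fragments, in document order
    for ref in root.iter("ref"):
        text = "".join(ref.itertext()).strip()
        ref_id = ref.get(XML_ID)
        target = ref.get("corresp")
        if ref_id is not None:
            # First half of a split reference.
            fragments[ref_id] = [text]
        elif target is not None and target.lstrip("#") in fragments:
            # Continuation pointing back to the first half.
            fragments[target.lstrip("#")].append(text)
    return {ref_id: " ".join(parts) for ref_id, parts in fragments.items()}


print(merge_split_refs(SAMPLE))
```

References without either attribute are skipped, which matches the suggestion to add the attributes only to references that are actually split.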

lfoppiano added the labels "question", "models:fulltext" and "training guidelines" on Jan 23, 2025
