Poor page breaking with figures #114

Closed
davidchisnall opened this issue Dec 22, 2024 · 21 comments
Labels
not my bug Issue on a dependency or 3rd-party module report to SILE

Comments

@davidchisnall

Using the figure environment, I am seeing two rendering issues:

image

First, the caption begins too low, overlapping the end of the page. Second, the caption itself is split across two pages. I believe part of this is an artefact of the fact that figures are not floats (which is sad, given that SILE was specifically designed to make floats easier to support with good placement than TeX), but I think the rest of it is that the figure is not being treated as a complete box for the purpose of layout and so line breaking is allowed in the middle.

@Omikhleia
Owner

Omikhleia commented Dec 22, 2024

On the underlying SILE issues:

... given that SILE was specifically designed to make floats easier to support with good placement than TeX

I don't think this was ever true, unfortunately... A mere claim not backed up by any serious attempt. ^^

Regarding our options, well, it's not that obvious. One could rewrite these environments to use some grouping logic, as done in my ptable package (or, as a mere workaround, wrap them in a 1x1 ptable: it's overkill, but it somewhat works). That's not fully satisfying though, as the overfull page issue can still kick in and lead to even worse issues.

My feeling is that, in order to make any progress, lots of internal things have to change. It's opinionated, and I am pessimistic by nature, but IMHO, the typesetter is still highly problematic, the pagebuilder is an utter mess, the frame logic is totally broken (ouch! another claim that does not live up to expectations...), the footnote and "insertion" logic is just insane (and likely wrong, sile-typesetter/sile#619)....

So I could be tempted to say that this is not an issue for re·sil·ient...

But at the same time, I made so many attempts in 2023 at experimenting with a new typesetter and pagebuilder (I briefly alluded to it here: sile-typesetter/sile#2166 (comment)), that... well, who knows, perhaps something might come out of it... It's a deep rewrite, however, breaking compatibility with many core packages,[1] which I might end up throwing away: such a "fork" (there's no better word...) is inherently hard, and a single developer cannot do much in these areas...

Footnotes

  1. I don't understand why SILE just keeps those broken packages, hardly maintained and impeding the ability to move forward, such as "linespacing", "gutenberg", "grid" (ouch! Another one bites the dust... but it never worked https://github.com/sile-typesetter/sile/issues/1720), etc. Sometimes one has to cut old branches of a tree.

@davidchisnall
Author

Thanks for the detailed response. I think my requirements for figures are simpler than the general case for floats for two reasons:

  • I'm using a single-column layout with full-width figures, so the only degrees of freedom are move-down and move-up.
  • I am following the conventional rule that figures must appear after the first reference to them, which means that the only degree of freedom for the figure is to move it down.

I think this should allow a simple greedy LaTeX-like approach to get good(ish) results, where figures appear in one of two places:

  • Their original location.
  • The top of the next page, if the current location would result in an overfull page, with paragraphs moved up to fill in the space.

Ideally, floats would not be reordered, so if moving a figure down required moving it past another float then they'd move together and, as a last resort, produce a page of figures ( / listings / tables).
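
The greedy rule described above can be modeled in a few lines. This is only an illustrative sketch in Python, not SILE code: `Item`, `paginate`, the `"text"`/`"float"` kinds and the integer heights are all hypothetical, text boxes are treated as unbreakable, and footnotes and split tables are ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str      # "text" or "float" (hypothetical labels)
    height: int    # vertical size in arbitrary units
    label: str = ""


def paginate(items, page_height):
    """Greedy single-column placement; floats only move down, never reorder."""
    pages = [[]]
    used = 0
    pending = []   # deferred floats, kept in their original order

    def place(item):
        nonlocal used
        pages[-1].append(item)
        used += item.height

    def new_page():
        nonlocal used
        pages.append([])
        used = 0
        # Deferred floats go to the top of the new page. The first one is
        # always placed, so an oversized float cannot stall the loop.
        while pending and (used == 0 or used + pending[0].height <= page_height):
            place(pending.pop(0))

    for item in items:
        fits = used + item.height <= page_height
        if item.kind == "float":
            # Defer if it would overfill the page, or if an earlier float is
            # already waiting (floats must not overtake each other).
            if pending or (not fits and used > 0):
                pending.append(item)
            else:
                place(item)
        else:
            # Text fills the current page; break only when it no longer fits.
            while used > 0 and used + item.height > page_height:
                new_page()
            place(item)

    while pending:     # leftover floats end up on pages of their own
        new_page()
    return pages


doc = [Item("text", 4, "A"), Item("text", 4, "B"), Item("float", 5, "F1"),
       Item("text", 2, "C"), Item("text", 3, "D"), Item("float", 3, "F2")]
print([[i.label for i in page] for page in paginate(doc, 10)])
# [['A', 'B', 'C'], ['F1', 'D'], ['F2']]
```

In the example, F1 does not fit after A and B, so C is pulled up to fill the page and F1 lands at the top of the next one. A page that receives only deferred floats corresponds to the "page of figures" last resort.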

I am not familiar with the internals of SILE's page layout engine though, so I don't have a good intuition of how hard this would be to implement: is it a simple 'move and redraw' call or a complete rewrite of some core logic?

@Omikhleia
Owner

Omikhleia commented Dec 28, 2024

The top of the next page, if the current location would result in an overfull page, with paragraphs moved up to fill in the space.

But what if the "moved up" content is a table, which is page-split, and with headers repeated? The devil is always in the details ;)

(Not all tables can be considered as float, especially those spanning multiple pages anyway)

@davidchisnall
Author

But what if the "moved up" content is a table, which is page-split, and with headers repeated?

I consider tables that need to be split across pages bad style for anything that isn't a datasheet, so this would not be a problem for me. If I have tables that need splitting, they get split into something more concise.

(Not all tables can be considered as float, especially those spanning multiple pages anyway)

This is why LaTeX has both the table and tabular environments: one for the table that has a caption and floats, the other for the table data itself, which may be used outside of table and split.

@Omikhleia
Owner

... so this would not be a problem for me.

But a general solution would still have to take that case into account.

Likewise, if the legend/caption contains footnotes, these have to be moved too. The devil is in the detail.

@davidchisnall
Author

I would rather have a solution that works for some uses than a solution that works for none. If it has limitations, such as ‘no footnotes in captions in the first version’ or ‘floats must be smaller than a page’ I can work around that.

@Omikhleia
Owner

Omikhleia commented Jan 7, 2025

I'm confused. Studying the SILE typesetter and pagebuilder, and implementing something lame/limited towards that goal, is probably a matter of a weekend. That's the cool thing regarding SILE, one can hack anything one wants.

(I can't do this here though: I frequently have footnotes in captions, in my books. And split tables, with footnotes too.)

@davidchisnall
Author

Any hints about where I should look in the code? I would like to make something that works for my use case and can later be extended, and I'm happy to contribute it.

@Omikhleia
Owner

Omikhleia commented Jan 9, 2025

Another tricky point to consider: float barriers.
Notably, one probably wouldn't want a "postponed" float to be moved past the start of the next part/chapter/section/subsection. The part/chapter case would be easier (= do not move a float past a "hard" page break), but section/subsection barriers are likely trickier (as those are generally continuous).
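
A barrier rule of this kind could be modeled as a limit on how far down a float may travel. The sketch below is a hypothetical Python illustration, not SILE code: `latest_position`, the item kinds and the default barrier set are assumptions, and the stream is reduced to a flat list of kinds.

```python
HARD_BARRIERS = frozenset({"part", "chapter"})   # assumed default barrier set


def latest_position(stream, float_index, barriers=HARD_BARRIERS):
    """Return the last index the float at float_index may be moved to.

    The float may slide past ordinary items but stops before any barrier.
    """
    position = float_index
    for i in range(float_index + 1, len(stream)):
        if stream[i] in barriers:
            break
        position = i
    return position


stream = ["text", "float", "text", "section", "text", "chapter", "text"]
print(latest_position(stream, 1))                                   # 4
print(latest_position(stream, 1, HARD_BARRIERS | {"section"}))      # 2
```

Making the barrier set a parameter keeps the policy (which headings block a float) separate from the mechanism, so stricter section-level barriers can be tried without changing the search.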

@davidchisnall
Author

Good point. I'd be okay moving them across sections, but not chapter boundaries.

Do you have any example code or code in SILE I should look at to know where to start? It's been almost 20 years since I read the TeX papers and I've only been using Lua seriously for the last few months.

@Omikhleia
Owner

Omikhleia commented Jan 9, 2025

Do you have any example code or code in SILE I should look at to know where to start?

A suggestion would be to familiarize yourself with SILE's pagebuilders/base.lua first. Note what I said about it earlier, though :) = sc. "an utter mess"...

@davidchisnall
Author

Thanks. I don't suppose there's any documentation on how that's used? (Is it called on each page or on the whole document? What are the properties / methods available on the vbox things that it's generating?)

Repository owner deleted a comment from davidchisnall Jan 9, 2025
Repository owner deleted a comment from davidchisnall Jan 9, 2025
Repository owner deleted a comment from davidchisnall Jan 9, 2025
@Omikhleia
Owner

(deleted the duplicate comments)
Nope, not much documentation for this part of the code. Discussing these internals in issues is cumbersome; note that SILE also has a Gitter channel (with chat etc.): See Wiki 3rd bullet.

@davidchisnall
Author

I've never used gitter, I'll take a look.

@davidchisnall
Author

I've been poking the page builder a bit and trying to read some docs a bit more carefully, but it looks like it's the wrong level in the system. As I understand it, the page builder is passed a set of vboxes and told to try to lay them out into a page. It can't mutate the list of vboxes. To move floats, I need to call the page builder, detect if it has overflowed the frame and, if so, see if I can move one or more of the vboxes later to give a better break.

If I'm following the code correctly, this is done in the typesetter, not the page builder? It looks like the typesetter's output queue is the thing that I will need to mutate. I'll have a go at this tomorrow.

@Omikhleia
Owner

I'm confused again.
Moving floats around pages is clearly a page builder concern, whatever the current code does or does not do.
It's a matter of "separation of concerns" (Just in case - I don't know you - see, and ignore otherwise if you already know this terminology).

@davidchisnall
Author

I believe that would require re-architecting the current layering. As far as I can see, the current interface between the typesetter and the page builder does not allow the page builder to mutate the queue of vboxes, only to choose how many from the head of the queue should be used.

Abstractly, this also seems right. The page builder is building a page. It can reorder things within that page, but it cannot move things to a different page. This seems like the correct separation because the page builder is mostly a cost function. The typesetter provides it with options and asks for the cost of that choice.
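
The layering described here can be sketched as two functions. This is a simplified Python model under stated assumptions, not SILE's actual interface: `build_page` and `typeset` are hypothetical names, boxes are `(kind, height)` tuples, and the only reordering tried is pulling one text box above a float that did not fit.

```python
def build_page(queue, frame_height):
    """Page builder: report how many items from the head of the queue fit.

    It reads the queue but never mutates it.
    """
    used = 0
    count = 0
    for _kind, height in queue:
        if count > 0 and used + height > frame_height:
            break
        used += height
        count += 1
    return count


def typeset(queue, frame_height):
    """Typesetter: owns the queue and may reorder it between builder calls."""
    queue = list(queue)
    pages = []
    while queue:
        n = build_page(queue, frame_height)
        # If a float caused the break, try pulling the next text box above it.
        while (n + 1 < len(queue)
               and queue[n][0] == "float" and queue[n + 1][0] == "text"):
            trial = queue[:n] + [queue[n + 1], queue[n]] + queue[n + 2:]
            m = build_page(trial, frame_height)
            if m <= n:
                break          # the text box does not fit either; stop trying
            queue, n = trial, m
        pages.append(queue[:n])
        queue = queue[n:]
    return pages


boxes = [("text", 4), ("text", 4), ("float", 5), ("text", 2), ("text", 3)]
for page in typeset(boxes, 10):
    print(page)
```

Here the builder only answers "how many boxes from the head fit?", while every decision that moves a box to a different page stays in the caller. Because the swap only ever exchanges a float with a following text box, floats move down and never overtake one another.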

@Omikhleia
Owner

As far as I can see, the current interface between the typesetter and the page builder does not allow the page builder to mutate the queue of vboxes, only to choose how many from the head of the queue should be used

Well, it's not totally true: the page builder does mutate some things (via ugly side effects, but still). Footnotes can be split, and the remaining content is modified in the output queue (= things are split, and a part is moved to the next page).

@davidchisnall
Author

It looks (again, you know this code better than me, I've only spent a few hours reading it) as if that's still unidirectional. The typesetter never gives the page builder the full set of vboxes to play with, it just says 'try these', and then relies on the page builder to say 'these are the ones for this page, in this order, and these are the ones I couldn't handle that need to overflow'. It's been almost twenty years since I worked on a typesetting system, but that seems like the kind of split that was common then, because it makes it easy to plug in local logic in the page builder (e.g. if I wanted to have a version for old-style paper copyediting where all lines are double spaced, the page builder can easily shove an extra vspace between each vbox for copyeditor annotations, but if floats can be moved that's not its concern).

@Omikhleia Omikhleia added not my bug Issue on a dependency or 3rd-party module report to SILE labels Jan 12, 2025
@Omikhleia
Owner

Omikhleia commented Jan 12, 2025

The SILE discussion is here: sile-typesetter/sile#2211 (comment) and, as noted, there is an already existing issue: sile-typesetter/sile#458

Not a resilient.sile issue per se (though it could benefit from any improvement on SILE's side - Let's re-open a dedicated issue whenever the latter comes up with working floats).

@Omikhleia closed this as not planned Jan 12, 2025
@davidchisnall
Author

I've moved discussion over here:

sile-typesetter/sile#2211
