Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR: Add Source edit by reference endpoints #90

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

samdbmg
Copy link
Member

@samdbmg samdbmg commented Nov 4, 2024

Details

Adds an ADR proposing how a Source-level edit-by-reference endpoint might behave, along with a partial implementation to illustrate one of the options.

Draft for discussion.

Pivotal Story (if relevant)

Story URL: https://www.pivotaltracker.com/story/show/188413555

Related PRs

N/A

Submitter PR Checks

(tick as appropriate)

  • PR completes task/fixes bug
  • API version has been incremented if necessary
  • ADR status has been updated, and ADR implementation has been recorded
  • Documentation updated (README, etc.)
  • PR added to Pivotal story (if relevant)
  • Follow-up stories added to Pivotal

Reviewer PR Checks

(tick as appropriate)

  • PR completes task/fixes bug
  • Design makes sense, and fits with our current code base
  • Code is easy to follow
  • PR size is sensible
  • Commit history is sensible and tidy

Info on PRs

The checks above are guidelines. They don't all have to be ticked, but they should all have been considered.

@samdbmg samdbmg requested a review from a team as a code owner November 4, 2024 16:47
@samdbmg samdbmg marked this pull request as draft November 4, 2024 16:47
@samdbmg samdbmg removed the request for review from a team November 4, 2024 16:47
@samdbmg samdbmg force-pushed the sammg-add-source-edit-adr branch from 474d5dc to 47fa80c Compare December 13, 2024 10:58
@samdbmg samdbmg marked this pull request as ready for review December 13, 2024 10:58
@samdbmg samdbmg force-pushed the sammg-add-source-edit-adr branch from 47fa80c to 1332332 Compare December 13, 2024 10:59
@philipnbbc philipnbbc self-requested a review January 6, 2025 16:10
@philipnbbc philipnbbc self-assigned this Jan 6, 2025
Copy link
Contributor

@philipnbbc philipnbbc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the choices of 3 and 4 but think that some implementation investigation would be needed to fully understand the details and consequences of 3.

docs/adr/0024-source-level-edit.md Show resolved Hide resolved
docs/adr/0024-source-level-edit.md Outdated Show resolved Hide resolved
And the following drawback is added:
> Bad, because clients have to do significant additional work to make use of a Source timeline (identifying, de-referencing and re-mapping the relevant Flows).

### Option 3: Provide additional Flow Segment API capability for more direct by-reference operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some general questions and thoughts on this option:

  • I think in addition to using parts (time bounded) of a Flow in another Flow that defining a new Flow that is a time shift of another Flow would be useful.
  • Could the reference Flow Segment be unbounded? This would be useful in a live context and allow time shifting without resorting to time bounding.
  • The Flow Segment reference timerange may fall within media objects. This means that more information is required in a dereferenced Flow Segment to allow a client to work out which is the first or last sample in the target timerange.
  • What limitations if any should be placed on reference Flow Segments and what implementation recommendations should be given? E.g. multi levels of references to other reference Flow Segments. Are limits an implementation choice?

Copy link
Member Author

@samdbmg samdbmg Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think in addition to using parts (time bounded) of a Flow in another Flow that defining a new Flow that is a time shift of another Flow would be useful.
  • Could the reference Flow Segment be unbounded? This would be useful in a live context and allow time shifting without resorting to time bounding.

Perhaps the requirement should be "bounded before" so you can time-shift but still calculate the necessary offset? The other ways would be to add a form that has reference.timeshift instead of reference.timerange or similar, or require "bounded one end", but an unbound timeshift going backwards in time feels like an unusual requirement.

The Flow Segment reference timerange may fall within media objects. This means that more information is required in a dereferenced Flow Segment to allow a client to work out which is the first or last sample in the target timerange.

A way around this would be to treat reference.timerange like the timerange query parameter on /flows/<flowid>/segments: it selects all Flow Segments where any part falls within the range. That would probably mean the start of whatever got selected is moved to the start of the timerange given (so if you set reference.timerange to the middle of a segment the resulting shift is later than you wanted) which leads to wording to the effect that "clients SHOULD ensure the start of a reference timerange aligns with the start of a segment in the referenced Flow".
The other way is to add more info to the dereferenced Flow Segment - e.g. setting sample_offset to come out at the right point within the media object, or adding another field that sub-selects within a media object TimeRange.

What limitations if any should be placed on reference Flow Segments and what implementation recommendations should be given? E.g. multi levels of references to other reference Flow Segments. Are limits an implementation choice?
Ideally I think references-to-references get accepted in a POST and a store implementation figures out how to deference that efficiently, however there has to be a limit on that (and some mechanism to handle infinite reference loops).

We could say that "a store implementation MAY store and return a different (but functionally equivalent) list of Flow Segments" or something, so you can resolve nested references on write. You might write a reference to Flow X but Flow X is a reference to Flows Y and Z, so a GET (with ?deference=false) returns a reference to Flows Y and Z.

Reference loops are harder, because ideally we'd find them on write but that might create quite a complex query to explore the reference tree in both directions, although maybe we have to do that anyway to make reference counting work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think in addition to using parts (time bounded) of a Flow in another Flow that defining a new Flow that is a time shift of another Flow would be useful.
  • Could the reference Flow Segment be unbounded? This would be useful in a live context and allow time shifting without resorting to time bounding.

Perhaps the requirement should be "bounded before" so you can time-shift but still calculate the necessary offset? The other ways would be to add a form that has reference.timeshift instead of reference.timerange or similar, or require "bounded one end", but an unbound timeshift going backwards in time feels like an unusual requirement.

Timeshift for negative infinity would for example be needed if the Flow content is being populated in a random (asynchronous) order and you may not know or care where it starts. It would be nice to be able to simply define Flow A to be a timeshift of Flow B no matter what the bounds may be of Flow B. If there is a bound in the reference.timerange (or consequently in timerange) then the timeshift can be calculated from that and otherwise a reference.timeshift would be needed.

My current thinking is that if we allow any unboundedness in the reference Flow Segment then allowing fully unbounded seems sensible and in that case a reference.timeshift is needed if a timeshift operation is supported.

The Flow Segment reference timerange may fall within media objects. This means that more information is required in a dereferenced Flow Segment to allow a client to work out which is the first or last sample in the target timerange.

A way around this would be to treat reference.timerange like the timerange query parameter on /flows/<flowid>/segments: it selects all Flow Segments where any part falls within the range. That would probably mean the start of whatever got selected is moved to the start of the timerange given (so if you set reference.timerange to the middle of a segment the resulting shift is later than you wanted) which leads to wording to the effect that "clients SHOULD ensure the start of a reference timerange aligns with the start of a segment in the referenced Flow".
The other way is to add more info to the dereferenced Flow Segment - e.g. setting sample_offset to come out at the right point within the media object, or adding another field that sub-selects within a media object TimeRange.

I agree with you that a way around this would be to have clients treat the reference Flow Segment's timerange and reference.timerange as a query parameter and it's the client's problem to handle the case where the timerange falls within a media object. I guess adding a SHOULD that it should fall at the start and end of media objects is something that could be stated if we want to keep things simple for clients. The downside is that it might hamper reuse because the content needs to be ideally segmented.

I'm leaning towards not requiring sample_offset etc. in a reference Flow Segment because it would require the creator to download the segment and work out what the values should be. It is a choice between the creator of the reference Flow Segment figuring out what the sample_offset and/or sample_count should be for the first and last dereferenced Flow Segment, or leave it to the reader of the Flow Segment to work it out like they would need to do if running a query with timerange.

In a regular Flow Segment the timerange is for the whole media file unless sample_start and/or sample_count are set in which case the timerange is for the clipped portion of the media file. Maybe what should be added is a media_object_timerange for the whole media object which then allows a reference Flow Segment to be dereferenced to a Flow Segment that has timerange != media_object_timerange and sample_offset and/or sample_count are missing. A client would then need to work out the sample offset and count using the 2 timeranges given.

What limitations if any should be placed on reference Flow Segments and what implementation recommendations should be given? E.g. multi levels of references to other reference Flow Segments. Are limits an implementation choice?

Ideally I think references-to-references get accepted in a POST and a store implementation figures out how to deference that efficiently, however there has to be a limit on that (and some mechanism to handle infinite reference loops).

We could say that "a store implementation MAY store and return a different (but functionally equivalent) list of Flow Segments" or something, so you can resolve nested references on write. You might write a reference to Flow X but Flow X is a reference to Flows Y and Z, so a GET (with ?deference=false) returns a reference to Flows Y and Z.

Reference loops are harder, because ideally we'd find them on write but that might create quite a complex query to explore the reference tree in both directions, although maybe we have to do that anyway to make reference counting work.

Agree. Maybe another limitation a store could optionally apply is whether reference.timerange starts and end on the non-reference Flow Segment timerange, possibly also whether a non-reference Flow Segment does not cover the media object timerange and that sample_offset / sample_count are missing.

docs/adr/0024-source-level-edit.md Show resolved Hide resolved
@samdbmg samdbmg force-pushed the sammg-add-source-edit-adr branch 2 times, most recently from 52d228f to 569b780 Compare January 8, 2025 11:31
Updates the source level edit ADR to make the "Reference Form"
OpenTimelineIO example use a `MissingReference`, and discuss
preservation of metadata through workflows.
Also fixes some typos in the rest of the document.
@samdbmg samdbmg force-pushed the sammg-add-source-edit-adr branch from 569b780 to 1b305cf Compare January 8, 2025 11:32
}
]
```

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking - am I correct in my understanding that the References could be pointing to Timeranges of Flows which have no segments associated, and that would be just fine?

That could mean that, for instance, a Reference could point to a Timerange of a Flow which is "in the future" and therefore may have new segments added.

Thus a "time shifted" Flow could be created from a Timerange of a Flow such as now() until an infinite future time such that a, as-yet non-existent, future-playout-device could play out a Flow which is "still in record"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my intention, yes, although see the bottom of #90 (comment) for some of the complications that creates regarding things like referential loops (and a discussion about whether a more explicit timeshift feature is needed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants