Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outputs for provider authors #155

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

iwahbe
Copy link
Member

@iwahbe iwahbe commented Nov 8, 2023

This PR provides a new type available to infer authors: infer.Output[T]. This type can be used for any field with pulumi:"{name}" in place of T. It provides secret + unknown metadata and an unknown safe infer.Apply API modeled after pulumix as implemented in the bridge.

The file example has been converted to use the new API.


Implementors Guide

We need a generic type (Output[T]) that has a couple of properties:

  • A bijective function between resource.PropertyValue and Output[T] (when types align). This means that Output[T] needs to understand secrets and unknown values.
  • An ergonomic API for creating and consuming Output[T].
    • For creation, users need to be able to convert from:
      MyField *string `pulumi:"myField"`
      
      to
      MyField infer.Output[*string] `pulumi:"myField"`
      
      This cannot have schema changes.
    • For consumption, we expose the same "apply" based API as pulumix, with easy access to escape.

Implementation follows the needs of the new feature closely. There are two major areas of change:

  • I have written a custom mapper that is able to map from resource.PropertyValue to T for any T. This new mapper will preserve information when T is Output[U]. This mapper is part of the internal ende package, but ende could conceivably be moved outside of the infer package entirely.
  • I have written an infer.Output[T] type that fulfills the same basic contract as pulumix.Output[T]. infer.Output[T] is different in several ways:
    • It deserializes correctly with ende.
    • It has methods to escape from it's monadic model:
      • Output[T].Anchor() error blocks the thread until the output has resolved, if it can.
      • Output[T].GetKnown() (T, error) retrieves the known value (or an error in the computation chain). It panics if the value is unknowable.
      • Output[T].GetMaybeUnknown() (T, error) retrieves the known value or zero value (or an error in the computation chain).

@iwahbe iwahbe marked this pull request as draft November 8, 2023 18:26
@iwahbe
Copy link
Member Author

iwahbe commented Nov 8, 2023

As of ff4f178, assets, archives and resource references are not supported by the new mapper. We will need to add support for all of these to avoid a regression on merge.

@iwahbe iwahbe force-pushed the iwahbe/outputs-within-provider branch from ff4f178 to c3c09e4 Compare November 9, 2023 16:13
@iwahbe iwahbe requested review from t0yv0, Frassle, a team and AaronFriel November 11, 2023 01:17
@iwahbe iwahbe marked this pull request as ready for review November 11, 2023 01:19
@t0yv0
Copy link
Member

t0yv0 commented Nov 16, 2023

Per earlier conversation, my preferred end state here would be to use the same Go SDK for provider authoring as the program authoring. This may mean waiting on Generics, or using pre-Generic pulumi.Output forms from the current Go SDK. If we are indeed headed in that direction, introducing new forms here we'll need to retract them and refactor to the desired end-state some time afterward. This is a little unfortunate.

So the question becomes in light of this direction, is this still a change worth taking now? What are the intended use cases or provider we'd like to deploy this at, that motivates doing it now vs later, would that be something that can help with Docker Image perhaps? How about external ecosystem providers depending on pulumi-go-provider do we have a sense of how many there are and how much nuisance we're imposing by introducing these types first and then asking them to take a breaking change? Is the ecosystem /uptake small enough this is not a worry at the moment?

@iwahbe
Copy link
Member Author

iwahbe commented Nov 20, 2023

Per earlier conversation, my preferred end state here would be to use the same Go SDK for provider authoring as the program authoring. This may mean waiting on Generics, or using pre-Generic pulumi.Output forms from the current Go SDK. If we are indeed headed in that direction, introducing new forms here we'll need to retract them and refactor to the desired end-state some time afterward. This is a little unfortunate.

My concern here is that our pulumix.Output[T] generics have extremely different semantics then infer.Output[T] vis a vis escaping from the output model. We expose infer.Output[T].Anchor() error which forces async computations to "catch up" if possible. This is possible only because we know if every accessible infer.Output[T] can be resolved on creation, which is not the case for pulumix.Output[T].

I think more work is needed to check if we can unify these definitions. If pulumix.Output[T] was an open interface, I think we could include and then extend, but it isn't an interface at all. pulumix.OutputOf[T] is a closed interface (includes a private func).

So the question becomes in light of this direction, is this still a change worth taking now? What are the intended use cases or provider we'd like to deploy this at, that motivates doing it now vs later, would that be something that can help with Docker Image perhaps? How about external ecosystem providers depending on pulumi-go-provider do we have a sense of how many there are and how much nuisance we're imposing by introducing these types first and then asking them to take a breaking change? Is the ecosystem /uptake small enough this is not a worry at the moment?

I view this change (or one like it) as necessary for shipping a high quality docker provider (that handles unknowns well). I'll defer to the @blampe on if he is interested in adopting the provider framework (with or without this change) for Docker.

This change is opt-in and non-breaking for existing providers, so I'm not concerned about a break during adoption. I think that the ecosystem that is using this framework is willing to take breaking changes to be able to write better providers. The README.md still says:

This library is in active development, and not everthing is hooked up. You should expect breaking changes as we fine tune the exposed APIs.

@t0yv0
Copy link
Member

t0yv0 commented Nov 20, 2023

generics have extremely different semantics then infer.Output[T]

Can you elaborate that please?

AFAIK, every Output in a pulumi Go program is force-awaited before the program exists, so if you need Output<T> -> T that should be absolutely expressible.

Are there any other semantic differences?

BTW we do not need to wait for generic pulumi.Output[T] to land afaik as we can start using non-generic pulumi.Output perhaps as of today.

@t0yv0
Copy link
Member

t0yv0 commented Nov 20, 2023

@Frassle
Copy link
Member

Frassle commented Nov 20, 2023

I think Go is the one language where it is guaranteed that Outputs will resolve, because there's no lifting from promises.

@blampe
Copy link
Contributor

blampe commented Nov 21, 2023

I view this change (or one like it) as necessary for shipping a high quality docker provider (that handles unknowns well). I'll defer to the @blampe on if he is interested in adopting the provider framework (with or without this change) for Docker.

I'm definitely interested in anything we can do to make provider development easier and safer! I'll try to grab some time with you to learn more about how this could help docker.

BTW we do not need to wait for generic pulumi.Output[T] to land afaik as we can start using non-generic pulumi.Output perhaps as of today.

I'm still gathering context, but if pulumix.OutputOf[T] is satisfied by pulumix.Output[T] as well as pulumi.Output does that mean we could use pulumix.OutputOf[T] here to get the best of both worlds? You could use new generic types, and if you need to resolve the output you can still fall back to the older types. (There are some serialization implications I don't understand yet.)

@iwahbe
Copy link
Member Author

iwahbe commented Nov 21, 2023

AFAIK, every Output in a pulumi Go program is force-awaited before the program exists, so if you need Output -> T that should be absolutely expressible.

I need the user to have a Output<T> -> T available. If I can do that, then I can build most of what is necessary.

Are there any other semantic differences?

infer.Output[T] tracks the fields it depends on, not the resources it depends on. It doesn't need to track resources, since it is only used in the context of custom providers (which operating within a single field).

Field dependency is injected during deserialization and then merged during applies. It is then rehydrated during serialization.

BTW we do not need to wait for generic pulumi.Output[T] to land afaik as we can start using non-generic pulumi.Output perhaps as of today.

Assuming we taught pulumi.Output to track fields, we could do this. However, the experience for nested values is terrible. If we have generics, then simple nested inputs become easy:

type Input struct {
    Field infer.Output[Nested] `pulumi:"nested"`
}

type Nested struct {
    N1 infer.Output[string] `pulumi:"n1"`
    N2 infer.Output[int] `pulumi:"n2"`
}

Expressing nested values in pulumi.Output requires defining a Nested interface with a NestedOutput type. I view that as prohibitively painful.


I think Go is the one language where it is guaranteed that Outputs will resolve, because there's no lifting from promises.

Is that exposed anywhere? I would love to use pulumix.Output if possible. We would also need to resolve the type of tracked dependency.


I'm still gathering context, but if pulumix.OutputOf[T] is satisfied by pulumix.Output[T] as well as pulumi.Output does that mean we could use pulumix.OutputOf[T] here to get the best of both worlds? You could use new generic types, and if you need to resolve the output you can still fall back to the older types. (There are some serialization implications I don't understand yet.)

pulumix.OutputOf[T] is satisfied by pulumix.Output[T] and pulumni.Output[T], but it is not possible for infer.Output[T] to satisfy pulumix.OutputOf[T].

@Frassle
Copy link
Member

Frassle commented Nov 21, 2023

I need the user to have a Output -> T available.

That's not safe. Providers can get called with unknown values where there is no T. Further you lose all secret and dependency tracking when you try to exit the context like that.

The Go SDK has an UnsafeAwait for Output that returns a struct of Value, Known, Secret, Deps. But working with that is a pain, and I doubt anyone uses it fully correctly.

My opinion is any design where you try to exit the Output context is wrong. If there's something unexpressible with Apply we should fix Apply.

@t0yv0
Copy link
Member

t0yv0 commented Jan 22, 2024

Looks like this is still in my inbox, stalling a bit.

Throw a meeting to unblock? Or icebox for now?

Providers should have Output<T> -> T

I believe what's implied here is UnsafeAwaitOutput from the Go SDK pulumi/pulumi#10877

I'd think it is fair game for the provider implementation framework to call that one when needed, which is not the same thing as making it necessary to call for provider authors using the framework.

Does this help untie the knot here @iwahbe ?

@iwahbe
Copy link
Member Author

iwahbe commented Jan 22, 2024

@blampe I believe you are the current expert consumer of this library. Do you believe that this is worth pushing through, or should we put on hold.

I believe what's implied here is UnsafeAwaitOutput from the Go SDK pulumi/pulumi#10877

I still think that unless pulumi.OutputOf[T] is an unsealed and pluggable interface, we will need a different implementation:

In Pulumi program land, an pulumi.Output<T> is a value that may or may not resolved, and we can't know if it will. UnsafeAwaitOutput may hang during preview, because an output will never resolve. Output's track which resource they depend upon so we can build a dependency graph.

In pulumi-go-provider land, an infer.Output[T] is a value in one of three states:

  • resolved
  • unresolved (but will resolve)
  • won't resolve

For each infer.Output[T], we know which state it is in at any given time.

In addition, infer.Output[T] does track dependencies, but tracks dependent fields, not dependent resources.

@t0yv0
Copy link
Member

t0yv0 commented Jan 22, 2024

This are not very convincing arguments to me, as forking the foundational types is still a high cost.

  1. we don't know if pulumi.Output will resolve - every output in Pulumi afaik has to resolve, or you get | a hang semantic.

  2. you're saying you need to query if the Output is resolved or not right now, that can possibly be arranged

  3. possibly we need the Output a bit more extensible? I'm not sure what's the purpose of that but if it's to track secrets we have some alternatives? Also, presumably you'd need to track normal resource dependencies sometimes when e.g. implementing remote component resources so you need everything Output offers but you might be needing a small extension?

@Frassle
Copy link
Member

Frassle commented Jan 22, 2024

Inclined to agree with @t0yv0 here. Outputs in Pulumi programs will resolve unless users have done very bad things (infinite loops in apply for example).

You say infer.Output needs field tracking instead of resource tracking but given that the go-provider also needs to cater to Construct calls you need normal inputs for that, and I am not against extending normal outputs to track fields (it goes well with some ideas I've been having about making previews more detailed, it would be nice to say for example a field is unknown but will resolve to the same value as field X on resource Y).

@iwahbe
Copy link
Member Author

iwahbe commented Jan 22, 2024

  1. we don't know if pulumi.Output will resolve - every output in Pulumi afaik has to resolve, or you get | a hang semantic.

This is not true during preview.

  1. you're saying you need to query if the Output is resolved or not right now, that can possibly be arranged.

I'm saying that I need to query if the output is resolvable or not, which is different than resolved. I don't know if pulumi.OutputOf[T] can be augmented to track this, but I don't believe that it does right now.

  1. possibly we need the Output a bit more extensible? I'm not sure what's the purpose of that but if it's to track secrets we have some alternatives? Also, presumably you'd need to track normal resource dependencies sometimes when e.g. implementing remote component resources so you need everything Output offers but you might be needing a small extension?

We do not use infer.Output for component resources, we can use pulumi.Output for that.


To be able to use pulumi.OutputOf[T], I would need 2 things:

  1. An extension point to create field dependency info that is infectious. Something like:
v := unstable.NewOutputWithPayload[String, infer.dependency](unstable.RawOutput{
    Value: resource.PropertyOf("value"),
    Secret: true
    Computed: false,
}, infer.dependency{ fields: "valueField" })

// Give to user

i := pulumi.Apply1E(v, strconv.ParseInt)

return Args{I: I}

// Back in framework space

deps := unstable.OutputPayload[infer.dependency](result.I) // infer.dependency{ fields: "valueField" }
  1. I need to check if an arbitrary output is resolvable.

We need the following logic to be expressible:

if o.wait() will resolve; then o.wait() && return (o.value(), o.error, true)
else return (Zero[T](), o.error, false)

I actually think that (1) and (2) are solvable if pulumi.OutputOf[T] if we can add this interface:

// unstable/output.go -- somewhere in pulumi/pulumi

func NewOutputWithPayload[T Value, P any, PG PayloadManager[P]](output RawOutput, payload P, payload) { ... }

func GetPayload[P Payload](o internal.Output) (P, bool) { ... }

type PayloadManager[P] interface {
    func Merge(ctx context.Context, outputs []struct{p P, o RawOutput}) P
    func New(ctx context.Context, output RawOutput) P 
}

type RawOutput = internals.UnsafeAwaitOutputResult

Maybe I'm misunderstanding something and this is already possible.

@t0yv0
Copy link
Member

t0yv0 commented Jan 23, 2024

Can you give an example of an output that's not resolvable?

Everything passed to the provider is resolvable. In preview it may resolve to an unknown.

On top of this you have provider implementation that constructs resources and generates derived Outputs. I think in provider such as TF providers all I can think of is that all such outputs will be resolvable as well. They may resolve to the unknown value.

What's an unresolvable output? That would help me understand.

@iwahbe
Copy link
Member Author

iwahbe commented Jan 23, 2024

Can you give an example of an output that's not resolvable?

We might be talking past each-other. I think of this as unresolvable:

UnknownValue.Apply(f)

This is unresolved, but resolvable:

KnownValue.Apply(func(value int) int { sleep(value); return value })

If we allow something to resolve to unknown, then I agree that all values are resolvable.

@t0yv0
Copy link
Member

t0yv0 commented Jan 23, 2024

UnknownValue.Apply(f) === UnknownValue should hold in every Pulumi language I think. The UnknownValue output can be awaited and inspected.

In Go

result, err := UnsafeAwaitOutput(ctx, UnknownValue)
require.NoError(t, err)
require.False(t, result.Known)

@t0yv0
Copy link
Member

t0yv0 commented Jan 23, 2024

Another tangent, based on your explanation of what you mean by "will resolve", this logic seems to substitute zero values for unknowns, which is a massive footgun IMHO, admittedly one that is present in our prior designs. If we're designing from first principles, why is this ever needed to inject zero values?

if o.wait() will resolve; then o.wait() && return (o.value(), o.error, true)
else return (Zero[T](), o.error, false)

Could we force the provider author to model locations that might not be known with a type that explicitly models unknowns, e.g. an Input or an Output?

@Frassle
Copy link
Member

Frassle commented Jan 23, 2024

If we allow something to resolve to unknown, then I agree that all values are resolvable.

We do allow that. As @t0yv0 said all SDKs will resolve unknown.apply(f) to unknown.

I massively agree with Anton about being very careful about zero values around this.

We might be talking past each-other.

It might be this is at the point where we really ought to have a meeting about it. Provider/Core ideation is coming up, maybe enough to just be a point in that.

@blampe
Copy link
Contributor

blampe commented Jan 23, 2024

Maybe it would help to clarify the primary problem this is aimed at? I see a few potential benefits:

  1. Better plumbing of secrets throughout provider code (to prevent accidental logging, etc.).
  2. Allow providers to distinguish unset versus not-yet-resolved values during previews. (I think this is only relevant for previews?)
  3. Allow providers to schedule Apply() callbacks before things have fully resolved.
  4. ...anything else?

(1) seems like a good thing, but it also seems like we could accomplish this with existing types. For example we should be able to give the user the option to hydrate secrets as Outputs if they ask for them:

type Foo struct {
    RawSecret     string                 `pulumi:"raw,optional" provider:"secret"`
    WrappedSecret pulumix.Output[string] `pulumi:"wrapped,optional" provider:"secret"`
}

I might go further and say this deserves its own Secret type since pulumix.Output is fairly low-level and most of its functionality (e.g. Apply) isn't relevant here.

(2) I haven't run into yet, although I also haven't been focused on preview behavior. I do anticipate some gotchas around validation, and my plan is to basically special case zero-values during previews.

I do think it would be helpful to allow a user to determine if a value is unknown, although I don't think we need all the Output machinery it accomplish that. This feels similar to the "null" problem in JSON which https://github.com/aarondl/opt/ and other similar libraries help with. Similar to the example above, I could imagine an Optional helper the user could use if they need it:

type Foo struct {
    OptionalRaw    string                   `pulumi:"raw,optional"`
    Optional       Optional[string]         `pulumi:"opt,optional"`
    OptionalSecret Optional[Secret[string]] `pulumi:"secret,optional" provider:"secret"`
}

I realize you can accomplish essentially the same thing with Output but there is a lot of value in scoping the API down to something small and obvious.

(3) is a non-goal IMO. As a user I would much rather the harness handle resolving everything before invoking my code versus me needing to know or care about unresolved things (the exception being (2) above). If we do really need a micro-optimizations like this (I'm skeptical), I would prefer that to be an alternative code path instead of sacrificing simplicity in the common case.

@Frassle
Copy link
Member

Frassle commented Jan 23, 2024

my plan is to basically special case zero-values during previews.

:screaming-intensifies:

Optional is not the same as zero is not the same as unknown. These are three very distinct values, please for all that's holy can we design a system that understands that.

I get that we're working with Go and it's a hot mess for specifying anything but I'd hope we can do better than just sticking with Go's (frankly stupid) opinion that zero values are always good sensible defaults and can just be used for everything.

I might go further and say this deserves its own Secret type since pulumix.Output is fairly low-level and most of its functionality (e.g. Apply) isn't relevant here.

Inputs to providers are either plain (not secret, not unknown) or output'y (potentially unknown, potentially secret). I don't think there's any cases where we allow secret but don't allow unknown.

Allow providers to schedule Apply() callbacks before things have fully resolved.

We really need to define what we mean by resolved.

As far as I know all inputs to providers are always fully resolved. The engine does not send async values to providers, it waits for all inputs to resolve and then sends those resolved values. Even Call and Construct wait for resolution.

But when I say resolved I mean it has a value, but that value might be "unknown". I suspect most of you are using resolved for "not unknown"?

Either way, there's no value for callbacks in providers. The values either known or not and you know that as soon as you're called and it's not going to asynchronously change.

@t0yv0
Copy link
Member

t0yv0 commented Jan 23, 2024

Haha, I start observing words like holy and since religious feelings are involved we should definitely meet before writing any more :) (oh I feel those strong feeling also).

@t0yv0
Copy link
Member

t0yv0 commented Jan 23, 2024

BTW I fully expect this design space is a solution that satisfies all the comments.. And Bryce's and Fraser's comments both have merit. And Ian had some more information out of band. Take it easy folks we'll sort this one out!

@iwahbe
Copy link
Member Author

iwahbe commented Jan 23, 2024

The goal is (1) and (2) with Apply being a useful (and optional) method to operate on (1) and (2).

I have currently implemented Apply as asynchronous, but I agree that the benefits there are dubious.

I think that we have spent enough time here that this is worth a synchronous meeting. I'll write up a doc and present during IaC ideation.

my plan is to basically special case zero-values during previews.

This does convince me that a change like this is well motivated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants