Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assert failing in WebSocketProvider.connect after repo.find #92

Closed
jessegrosjean opened this issue May 12, 2024 · 6 comments · Fixed by #93
Closed

Assert failing in WebSocketProvider.connect after repo.find #92

jessegrosjean opened this issue May 12, 2024 · 6 comments · Fixed by #93
Assignees
Labels
bug Something isn't working

Comments

@jessegrosjean
Copy link
Contributor

This reproduces the problem every time for me:

func testFind() async throws {
    let repo = Repo(sharePolicy: SharePolicy.agreeable)
    let websocket = WebSocketProvider(.init(reconnectOnError: false, loggingAt: .tracing))
    await repo.addNetworkAdapter(adapter: websocket)
    try await websocket.connect(to: URL(string: "wss://sync.automerge.org/")!)
    try await repo.find(id: .init())
}

The assert failure is in WebSocketProvider.connect at assert(peeredCopy == false).

@jessegrosjean jessegrosjean changed the title Seeing assert failing when using WebSocketProvider.connect after repo.find Assert failing in WebSocketProvider.connect after repo.find May 12, 2024
@heckj heckj self-assigned this May 12, 2024
@heckj heckj added the bug Something isn't working label May 12, 2024
@heckj
Copy link
Collaborator

heckj commented May 12, 2024

Now I'm severly confused. I've got a PR that resolves the assertion failure, but as I'm testing it - with an equivalent bit of code for the pending PR (#85) that previously worked for me - now i'm very unclear on what the expected behavior is.

I'm feeling like I'm missing something here, big time. Not sure what the difference was, and I've tested against a local instance of Automerge-repo-sync-server as well as the current sync.automerge.org service.

@jessegrosjean look through the integration test I added in #93, and verify for me that it jives with what you're using - a variation of this bug with some explicit assertions embedded in it.

@heckj
Copy link
Collaborator

heckj commented May 12, 2024

Actually, I have a theory about this. On the walk home, I remembered that in earlier work I filed an issue about getting first an "Unavailable" message from the remote server, followed by a sync request with content. I don't believe that's changed, but I may be calling the document "resolved" and handing back the result before I get the sync message coming in that would say "Oh wait, I've got something". I may need to re-think how that whole flow operates with that behavior, or push on that behavior if that's sending an erroneous message that it shouldn't.

I know that doesn't leave you in a great place with how to use this, and me just as confused - not to mention a pending question of "how long do we wait to know if it's really not found or it it is?"

@jessegrosjean
Copy link
Contributor Author

@jessegrosjean look through the integration test I added in #93, and verify for me that it jives with what you're using - a variation of this bug with some explicit assertions embedded in it.

That integration test looks right to me. With that said I'm not 100% confident that what I'm doing with find really has the semantics that I want. It's a little unclear to me exactly how find is supposed to work. And hard to get into my head that my local repo and the web socket connected repo are peers... not client server.

This is also related to the forking discussion earlier. I will try to write up related thoughts/questions tomorrow.

I know that doesn't leave you in a great place with how to use this, and me just as confused - not to mention a pending question of "how long do we wait to know if it's really not found or it it is?"

I'm still just experimenting for fun when I should be working on other things, so don't worry about me :)

@heckj
Copy link
Collaborator

heckj commented May 12, 2024

I'm digging through the tests that define repo behaviors at https://github.com/automerge/automerge-repo/blob/6ef25d8f300133bfefc71b44369cc222bcb43f0c/packages/automerge-repo/test/Repo.test.ts#L207

I'm not sure I'm reading it correctly, but how it reacts to a local JS repo appears to be different from what we're seeing with the behavior that creates a new document on find with a random DocumentId. These tests aren't 1:1 with our integration tests, as they're not interacting over the network, but that's seems to be the gist.

@jessegrosjean
Copy link
Contributor Author

jessegrosjean commented May 13, 2024

This is also related to the forking discussion earlier. I will try to write up related thoughts/questions tomorrow.

I've been posting more thoughts in #79. I think the summary related to this issue is that NSDocument probably doesn't want to be using full repo.find() implementation when loading in document data. In particular I don't think we want to wait for network timeouts with peers when loading data. Instead just want to check with local storage. Probably this is all best done with a new repo method that combines some of the existing create and some of the existing find behavior.

Of course still want find to generally work :)

@heckj heckj closed this as completed in #93 May 13, 2024
@heckj
Copy link
Collaborator

heckj commented May 13, 2024

I dug around through the Javascript code and explored more of the testing, and I'm more convinced now than ever that the "create a new document" result we saw earlier was incorrect, although I'm at a loss as to what I had in place that enabled that result.

The API of find() in the javascript side returns a DocHandle, notably different from what Swift returns, in that the JS DocHandle is an object that emits events there. The internal concept of find() is that the request is transitive, and may come back at a later time, so "unavailable" is a current state, but not a final state in that scenario. As such, the dochandle emits events for "unavailable" when a peer in that chain says "Sorry, no beuno", but remembers the request and if it's peers find something, it CAN return the document. So in the JS world, it's far more like an active subscription for a possible eventual document, rather than a fully resolved endpoint.

Based on that, and continuing to poke at the Integration tests, I thought the PR was worthy to merge - it fixed the assert issue you saw, cleaned up the logic a bit, and puts a stake in the ground as to expectations.

There's plenty of more questions about what/how automerge-repo does with it's transitive peer links, and how find can/should work - in particular, per automerge/automerge-repo#343, I suspect the the JS WebSocker provider may be returning a temporary unavailable message before it's finished checking with any/all connected peers, but tracking that down it and verifying that for the JS side is writing a number of integration tests in JS to verify and provide a JS-local test scenario, so for the time being I'm leaving that alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants