Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction 2 #199

Merged
merged 12 commits into from
Oct 2, 2021
Merged

Transaction 2 #199

merged 12 commits into from
Oct 2, 2021

Conversation

nappa85
Copy link
Contributor

@nappa85 nappa85 commented Sep 26, 2021

#142

This is a HUGE rework of Transactions, that permits nested transactions, streams in transactions, etc...
There is unsafe code, not much, should be safe, but needs a lot of testing.
To avoid unsafe code we'll probably need a design rework.
This PR is based on the streams one, probably will be a mess of conflicts, I just need to know what you think about and decide how to go on

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 27, 2021

Good morning. Thank you again for the pioneering effort.

I quite liked the original transaction-1. Indeed, I am probably about to merge after more testing.
We don't have to rush on this transaction-2.
But including unsafe is a fat NO, because SQLx itself is very insistent to be #![forbid(unsafe_code)] (unless SQLite is involved for good reason), so limiting unsafe to only SQLite is probably reasonable enough, otherwise I imagine a crowd chanting with NO UNSAFE somewhen in future.

Back to square 1, was that the lifetime-hell example the root problem you were facing?

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

My first transaction implementation was inconplete, wrapper inside a Mutex it wasn't possible to have a Transaction type that didn't lived inside a pinboxed closure, and it wasn't possible to use it for streaming, because the mutable reference was limited by the Mutex.
The unsafe is there, instead of a Mutex there is an UnsafeCell that permits to leak the mutable reference where needed.
And the mutable reference is needed because of sqlx APIs, that conflict with our APIs, hence my declaration "to not use unsafe we'll need to redefine our APIs".
The unsafe is only inside DatabaseTransaction, 3 or 4 LOC at max, and should be safe because it's managed, there should be no concurrent accesses, etc...

As I already said, but maybe yesterday evening I was too tired to be expressive, this PR is to discuss the road, I think the improvement is huge and, if only UnsafeCell is the problem, I can try to find another solution, maybe with GhostCell

@tyt2y3 tyt2y3 changed the base branch from transaction-1 to transaction-2 September 27, 2021 06:36
@tyt2y3
Copy link
Member

tyt2y3 commented Sep 27, 2021

Thank you for the clarification.

I understand that.
I'd ask, would it be possible to separate the implementation of transaction vs normal connection?
Instead of an enum holding all inner connections, the InnerConnection only applies when a transaction is involved.
That way, at least for the normal connection case, it'd be as simple and safe as before.
The only complication is inside transactions, where an Inner Cell is needed.

Well, I have read the GhostCell paper a while ago, and I'd be very interested if it is indeed the solution to our current situation. Quite cool indeed!

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

Well, the normal connection has a completely safe implementation also now, the same applies to InnerConnection, UnsafeCell is only inside DatabaseTransaction.
I'll look into GhostCell as soon as I can

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

I've checked a few things:

  • I don't feel like GhostCell is a feasible solution for this problem (or maybe I haven't understood it at all).
  • I've tried to test this branch with Miri to check the soundness of the unsafe code, but Miri and async code doesn't seems to love each other.
  • Using Transanction-1 branch, I've ended up in another lifetime hell without doing anything special, I don't think you should put that in production.
  • I'm more and more convinced we need an API review, and now I'm going to explain you why.

The main problem with actual implementation, is that inside DatabaseConnection you take a connection from the pool, use it and drop it. This way you're safe from every problem, but you're forced to a "flat" structure.
By flat structure I mean you can't send out something (e.g. a transaction) that owns a sqlx connection, because your APIs always take &self, and sqlx needs a mutable reference of the connection to be able to execute queries.
Using a Mutex to be able to gain a mutable reference of an inner sqlx connection (like I've done in Transaction-1) excludes any kind of operation that would return a wrapped reference to the connection, like nested transactions or streams.

Another hidden bomb is Sqlite: Sqlite sqlx connections aren't Sync.
In this PR I've tried to let everything not be Send (you can see various #[async_trait(?Send)] around), this way sqlite won't be a problem, but in a real-life usage your futures need to be Send to run in an multithreaded runtime.

A lesson learnt from this PR is that sqlx Transactions can be easily de-structured, sqlx has a TransactionManager trait that manages what's executed on a connection to make it a transaction, therefore you don't really need a sqlx Transaction object, you just need a sqlx connection object and to know which TransactionManager implementation call.
But the problem remains, once we return a sqlx connection, we lose Sync if sqlite is enabled, and we need to obtain a mutable reference to be able to execute queries on it, while maintaining a common API with DatabaseConnection (actual ConnectionTrait)

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

Another test I've done is to impl ConnectionTrait on &DatabaseConnection and &mut DatabaseTransaction, this way we would be able to differentiate on mutability only where needed.
Unfortunately, mutable connections aren't Copy, so where we use the generic, we can't ask for Copy, and the reference would be considered consumed at first use.
More, mutable references aren't Sync.

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 27, 2021

Thank you for the hard work indeed.

Yes, we'd better scrap the 'transaction in a closure' API or only provide it as a convenient helper, but not as the universal solution.

May be we should return to the start, commit & rollback style, where 'start' returns an owned Transaction object wrapping a connection with Arc< Mutex > (thus can be mutated), whose lifetime shall not exceeds the parent pool, and automatically rollback if goes out of scope.

That'd still be 'flat'?

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

With Arcmutex you mean a normal Arc<Mutex> and cloning the Arc instead of returning a reference?

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 27, 2021

My bad for the poor formatting. Yes, otherwise there is no other way to hide the mutability I suppose.

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 27, 2021

Well, it could actually work...
I'm asking myself why I haven't thought of that myself

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 28, 2021

Hopefully it'd work.
Okay so we abandon transaction-1 for now?
I am cherry-picking the useful bits.

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 28, 2021

Hopefully it'd work.

I'm trying right now, but in fact now I remember why I didn't considered that technique: using a Mutex you can't "leak" a reference, therefore you can't have streams

Okay so we abandon transaction-1 for now?
I am cherry-picking the useful bits.

Well, this branch is based on streams that is based on transaction-1...
Maybe it's better to flatten everything and create a new PR, when we'll have a viable solution

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 29, 2021

I tried to think about that yesterday.
But I guess we might figure out how to fit streaming support after transaction is in place?
Or better the other way around?

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 29, 2021

I have an idea, maybe today I'll be able to test it, I'll let you know

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 29, 2021

Well, the only test failed is on the Rocket example, that I haven't touched, for reasons not related to my changes...
Take a look at the code, let me know if that's acceptable, I'll make some more tests during the day.
Probably you'll want me to make a new PR to clear the commit history and rebase to actual master.

DatabaseTransaction contains an Arc<Mutex<InnerConnection>> as you suggested, opening a stream on a transaction locks the transaction, so there can be deadlocks if someone opens a stream on a transaction and, before consuming it, tries to do something else with the transaction. We'll have to write it in the docs.
I had to make two different stream implementation, one with an owned connection and one with an inner mutex lock on a transaction.

There is still an uncovered path, and it's on drop of a nested transaction, actually there is an unreachable! macro, https://github.com/nappa85/sea-orm/blob/transaction-2/src/database/db_transaction.rs#L150
The problem is that the lock is async, but I'm not in an async closure because Drop isn't async.
Maybe I can use try_lock, that isn't async, and if it returns None panic because we are dropping a locked connection, but I'm a bit uncomfortable about that...

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 29, 2021

Thank you so much for the hard work!
I am going over the elaborate changeset.

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 30, 2021

My tests are all green, let me know if you've any doubt

@tyt2y3 tyt2y3 mentioned this pull request Sep 30, 2021
3 tasks
@tyt2y3
Copy link
Member

tyt2y3 commented Sep 30, 2021

Yup, I am making the final 0.2 release now.
After that, this will be rebased onto master to prepare for 0.3!
Oh, and, are we resolving Transaction + Stream at the same time?
TBH I am still studying the Mutexes.

Other than that, about the MockConnection, I hope that it can mimick the behavior, say there is a transaction with 2 statements executed on a MockConnection, the into_transaction_log will be like:

assert_eq!(
    db.into_transaction_log(),
    vec![
        Transaction::many([
            Statement::from_sql_and_values(
                DbBackend::Postgres,
                r#"SELECT "cake"."id", "cake"."name" FROM "cake" LIMIT $1"#,
                vec![1u64.into()]
            ),
            Statement::from_sql_and_values(
                DbBackend::Postgres,
                r#"SELECT "cake"."id", "cake"."name" FROM "cake""#,
                vec![]
            )
        ])
    ]
);

I am still thinking of how to implement this, perhaps a MockTransaction object that impl ConnectionTrait but store the statements within a buffer, and somehow push into the MockDatabase on commit.

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 30, 2021

Yup, I am making the final 0.2 release now. After that, this will be rebased onto master to prepare for 0.3! Oh, and, are we resolving Transaction + Stream at the same time? TBH I am still studying the Mutexes.

Yes, streams are implemented here.

I am still thinking of how to implement this, perhaps a MockTransaction object that impl ConnectionTrait but store the statements within a buffer, and somehow push into the MockDatabase on commit.

At the moment a transaction for a Mocked connection is transparent, but yes, a buffer could be a good solution, merging it on commit and dropping it on rollback

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 30, 2021

Can you explain briefly what the MutexGuard do?

@billy1624
Copy link
Member

Can you explain briefly what the MutexGuard do?

https://doc.rust-lang.org/std/sync/struct.MutexGuard.html ?

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 30, 2021

Can you explain briefly what the MutexGuard do?

MutexGuard is the owner of the mutable reference coming from Mutex, once MutexGuard is dropped, Mutex is unlocked.
https://docs.rs/futures/0.3.17/futures/lock/struct.MutexGuard.html

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 30, 2021

So, while the TransactionStream is running, the InnerConnection will be locked. And as soon as the Stream finishes, the lock will be released, am I right?

But what's the difference to QueryStream which does not require a MutexGuard?

@nappa85
Copy link
Contributor Author

nappa85 commented Sep 30, 2021

QueryStream has an owned connection inside, taken directly from the pool. Like any other operation outside a Transaction, a new connection is taken from the pool every time, used and then returned to the pool. QueryStream is little different, it keeps the connection for the stream lifetime

@tyt2y3
Copy link
Member

tyt2y3 commented Sep 30, 2021

Thank you for the explanation. It helped a lot.

@tyt2y3 tyt2y3 mentioned this pull request Oct 1, 2021
@Stumblinbear
Copy link

I know this is a pretty new PR, and is likely to undergo many changes until merge, but has thought been put into transaction isolation support?

@tyt2y3
Copy link
Member

tyt2y3 commented Oct 2, 2021

I know this is a pretty new PR, and is likely to undergo many changes until merge, but has thought been put into transaction isolation support?

Thank you for raising. Sure we will get to that.

@tyt2y3 tyt2y3 merged commit 9b9c217 into SeaQL:transaction-2 Oct 2, 2021
@tyt2y3 tyt2y3 mentioned this pull request Oct 4, 2021
3 tasks
@tyt2y3
Copy link
Member

tyt2y3 commented Oct 13, 2021

I know this is a pretty new PR, and is likely to undergo many changes until merge, but has thought been put into transaction isolation support?

See launchbadge/sqlx#481

@billy1624
Copy link
Member

We could support transaction isolation if SQLx have it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants