Sooo...
After submitting this PR #1674 I looked at the code and it seems there are more issues of this nature.
The problem is that sometimes code that holds an open connection to the DB calls another component that tries to open another connection.
As an example: `AccountRecordProcessor` is constructed with a `Connection`, yet in a call to `StorageSyncModels.localToRemoteRecord` it passes `account.getConfigurationStore()`, which will try to create a new `Connection` with `keyValueStore`.
This particular example might not be critical but there are more issues like that and under load they produce errors while working with storage now and then.
For us, it leads to complete session degradation within a week with the client being completely unable to decrypt incoming messages from known contacts.
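To make the failure mode concrete, here is a toy model of it, not signal-cli code. It assumes the database allows only one writer at a time and that a second connection waits for a busy timeout before giving up; `NestedWriterDemo`, `Work` and `write` are hypothetical names, and a non-reentrant `Semaphore` stands in for the write lock.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model only: a single permit stands in for the database's write lock. */
final class NestedWriterDemo {

    /** Hypothetical unit of work that may itself try to write. */
    @FunctionalInterface
    interface Work {
        void run() throws InterruptedException;
    }

    // Deliberately not reentrant: two separate connections do not share
    // a lock just because they are used from the same thread.
    private final Semaphore writeLock = new Semaphore(1);
    private final long busyTimeoutMillis;

    NestedWriterDemo(long busyTimeoutMillis) {
        this.busyTimeoutMillis = busyTimeoutMillis;
    }

    /** Runs the work while holding the write lock, or fails once the timeout expires. */
    void write(Work work) throws InterruptedException {
        if (!writeLock.tryAcquire(busyTimeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("busy: write lock is held by another connection");
        }
        try {
            work.run();
        } finally {
            writeLock.release();
        }
    }
}
```

Calling `demo.write(() -> demo.write(() -> {}))` fails in the inner call: the outer writer cannot release the lock until the inner one returns, so waiting longer never helps.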
I modified `Database.getConnection` to track open connections and found some more deadlocks (basically, `SQLITE_BUSY` exceptions on store modifications):
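
    // Also needs: java.lang.reflect.InvocationTargetException, java.lang.reflect.Proxy,
    // java.util.*, java.util.stream.Collectors

    // Debug-only registry of connections that were requested and not yet closed.
    // Insertion-ordered, so the most recent (failing) request is listed last.
    private static final Map<String, String> OpenConnections =
            Collections.synchronizedMap(new LinkedHashMap<>());

    public final Connection getConnection() throws SQLException {
        var id = UUID.randomUUID().toString();
        var stack = Arrays.stream(Thread.currentThread().getStackTrace())
                .map(Objects::toString)
                .collect(Collectors.joining(", "));
        logger.trace("getConnection: {} on {}", id, stack);
        OpenConnections.put(id, stack);
        final Connection connection;
        try {
            connection = dataSource.getConnection();
        } catch (SQLException e) {
            final String open;
            synchronized (OpenConnections) { // iterating a synchronizedMap requires holding its lock
                open = OpenConnections.entrySet().stream()
                        .map(kv -> kv.getKey() + ": " + kv.getValue())
                        .collect(Collectors.joining("; \n"));
            }
            logger.error("getConnection() failed for {}, possible deadlock. Open connections (the last one is failed):\n {}",
                    id, open, e);
            OpenConnections.remove(id); // the failed request never received a connection
            throw e;
        }
        return wrap(logger, id, connection);
    }

    // Wraps the connection in a proxy that unregisters it when close() is called.
    public static Connection wrap(Logger logger, String id, Connection originalConnection) {
        return (Connection) Proxy.newProxyInstance(
                originalConnection.getClass().getClassLoader(),
                new Class[]{Connection.class},
                (proxy, method, args) -> {
                    if ("close".equals(method.getName())) {
                        logger.trace("Connection.close() for {}", id);
                        OpenConnections.remove(id);
                    }
                    try {
                        return method.invoke(originalConnection, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the original SQLException to callers
                    }
                }
        );
    }
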
There are many ways to address the issue, e.g.:
- Open a connection for each operation and never hold it or pass it around.
- Use interfaces, inheritance or wrappers to hide from other components the fact that we are using an existing connection (this can be done in many ways). I have experimented with the code using this approach and so far I don't see errors in the logs. I will do more testing today/tomorrow and can submit a patch with what I have so far if you are interested.
- The best way I can come up with is an "opaque" db/store context that can provide a connection. The top-level context is constructed in jobs or on message arrival. We don't store it in `account` but pass it to all methods that require access to the DB (to avoid cluttering the code, the context can carry other useful things, like loggers or the account itself). When we need an ad-hoc connection, there is a `context.withConnection(c -> { ...})` method, and when we want to do multiple operations we use `try (var ctx = ctx.connect()) {}`, which produces a context of the same type but with an already opened connection. This allows functions, components, stores and helpers to access the DB without knowing in which context they are called (with or without an opened connection), which should keep the code simple and safe.
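A minimal sketch of what such a context could look like, to make the third option concrete. This is not existing signal-cli code: `DbContext`, `ConnectionSource`, `DbWork` and `unconnected` are hypothetical names, and `ConnectionSource` merely stands in for whatever hands out connections today (e.g. `Database.getConnection`).

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical sketch of the "opaque context" idea; all names are assumptions. */
final class DbContext implements AutoCloseable {

    /** Stand-in for the component that opens new connections. */
    @FunctionalInterface
    interface ConnectionSource {
        Connection open() throws SQLException;
    }

    /** A unit of database work that returns a result. */
    @FunctionalInterface
    interface DbWork<T> {
        T run(Connection connection) throws SQLException;
    }

    private final ConnectionSource source;
    private final Connection held;   // null while no connection is open
    private final boolean ownsHeld;  // only the owning context closes the connection

    private DbContext(ConnectionSource source, Connection held, boolean ownsHeld) {
        this.source = source;
        this.held = held;
        this.ownsHeld = ownsHeld;
    }

    /** Top-level context, created per job or per incoming message. */
    static DbContext unconnected(ConnectionSource source) {
        return new DbContext(source, null, false);
    }

    /** Returns a context of the same type that carries an open connection. */
    DbContext connect() throws SQLException {
        if (held != null) {
            // Already connected: share the connection; closing this view is a no-op.
            return new DbContext(source, held, false);
        }
        return new DbContext(source, source.open(), true);
    }

    /** Runs the work on the held connection, or on a short-lived one if none is held. */
    <T> T withConnection(DbWork<T> work) throws SQLException {
        if (held != null) {
            return work.run(held);
        }
        try (Connection connection = source.open()) {
            return work.run(connection);
        }
    }

    @Override
    public void close() throws SQLException {
        if (ownsHeld) {
            held.close();
        }
    }
}
```

In this sketch a store method only ever calls `ctx.withConnection(...)`. Whether that reuses the caller's connection or opens a short-lived one is decided by whoever created the context, so a callee cannot accidentally open a second connection while the first is still held.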
The signal-cli database implementation started with the first approach, i.e. getting a new connection for each operation. Later, when storage sync was implemented, it was extended to use existing connections so that reading from storage could be done in a single transaction, preventing partial updates in case of failures.
The current model, with some methods taking a connection and some creating a new one, is a result of that. And that has indeed been a bit error-prone ...
I'd be interested to see your changes.
Adding an additional context would probably be a bigger change, though maybe it could also be implemented with ThreadLocals or ScopedValues.
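A rough sketch of the `ThreadLocal` variant, under the assumption that all nested calls run on the same thread. `AmbientConnection`, `ConnectionSource` and `DbWork` are hypothetical names, not existing signal-cli classes. A `ScopedValue` version would follow the same shape but needs a recent JDK.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical sketch: the current thread's open connection is kept in a ThreadLocal. */
final class AmbientConnection {

    /** Stand-in for the component that opens new connections. */
    @FunctionalInterface
    interface ConnectionSource {
        Connection open() throws SQLException;
    }

    /** A unit of database work that returns a result. */
    @FunctionalInterface
    interface DbWork<T> {
        T run(Connection connection) throws SQLException;
    }

    private static final ThreadLocal<Connection> CURRENT = new ThreadLocal<>();

    private AmbientConnection() {
    }

    /** Reuses the connection already open on this thread, or opens one for the duration of the work. */
    static <T> T withConnection(ConnectionSource source, DbWork<T> work) throws SQLException {
        Connection existing = CURRENT.get();
        if (existing != null) {
            return work.run(existing); // nested call: no second connection is opened
        }
        try (Connection connection = source.open()) {
            CURRENT.set(connection);
            try {
                return work.run(connection);
            } finally {
                CURRENT.remove(); // always clear, so a closed connection is never reused
            }
        }
    }
}
```

The trade-off compared to an explicit context is that method signatures stay unchanged, but the connection in use becomes implicit and work handed off to another thread will not see it.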