Concurrency issues #31
I'd be interested to see a test case when you have a chance (maybe paste a version of it in JavaScript :) as some of us can't easily read coffee) |
Waiting on the returned promise is not practical. I'll post a testcase (in JS) later - I mostly just dumped my CoffeeScript code here because that's what my own codebase is written in, and it's better than nothing. |
@joepie91 No offense, but a major point of Node.js is for code execution to be asynchronous or non-blocking, since, by design, Node.js is single-threaded. So using the term "concurrent" is not very accurate for a single thread (unless you're talking about clusters over a multi-core system). However, I think you indirectly raise a good point: node-persist does not support "atomic transactions", which is totally expected since node-persist runs in the same thread as the application using it (whereas any other DB system has its own background service, e.g. MySQL, Redis, Postgres, Mongo, etc.). That does not mean it's not possible; I think we could implement a node service to do that, or maybe just a queue system (which would be less efficient), but it's possible. |
@joepie91 It sounds like you may want to build a queue system, and after adding to the queue, return a promise. Fulfill the promise when that item in the queue is processed. Of course, something like that could also be built into the core of node-persist, no? |
In other words, the calls to setItem could be asynchronous, but the database writes could be synchronous. |
Oh wait, a queue system is what @akhoury said. Oops. |
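The idea in the last few comments, asynchronous `setItem` calls backed by serialized writes, can be sketched with a simple promise chain. This is a hypothetical wrapper, not part of node-persist's API; `persistFn` stands in for whatever actually writes to disk:

```typescript
// Hypothetical wrapper: each setItem call chains its write onto the
// previous one, so writes never overlap even when calls do.
function makeSerializedSetter(
  persistFn: (key: string, value: unknown) => Promise<void>
) {
  let tail: Promise<void> = Promise.resolve();
  return function setItem(key: string, value: unknown): Promise<void> {
    // Start this write only after every earlier write has finished.
    const next = tail.then(() => persistFn(key, value));
    // Keep the chain alive even if a write rejects.
    tail = next.catch(() => {});
    return next;
  };
}
```

Each call still returns a promise that settles when its own write completes, so callers keep the asynchronous API while the disk sees one write at a time.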
I am referring to concurrent writes, which is a thing that Node.js absolutely does.
This is a separate concern - regardless of whether it supports atomic transactions, it should ensure that it does not corrupt the file (and currently, it corrupts the file on concurrent writes).
A batched queue would possibly be a solution; i.e. on every |
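The corruption described above comes from unsynchronized writers truncating and rewriting the same file. A deterministic illustration of the failure mode (simulated with two file descriptors; real concurrent writes interleave nondeterministically, and the path here is arbitrary):

```typescript
import { openSync, writeSync, closeSync, readFileSync, unlinkSync } from 'fs';

// Two writers that both believe they own the file, with no locking,
// leave a mix of both payloads behind.
function demonstrateCorruption(path: string): string {
  const a = openSync(path, 'w'); // writer A: long payload
  writeSync(a, 'AAAAA');         // A writes its first half...
  const b = openSync(path, 'w'); // writer B opens with 'w': truncates the file
  writeSync(b, 'BB');            // B writes its (shorter) payload
  closeSync(b);
  writeSync(a, 'AAAAA');         // ...A resumes at its old offset (5)
  closeSync(a);
  const result = readFileSync(path, 'latin1');
  unlinkSync(path);
  return result; // neither writer's payload survives intact
}
```

The resulting file is neither `AAAAAAAAAA` nor `BB` but a mix of both (with a NUL-filled gap), which is exactly the kind of partial overwrite a shrinking write sequence produces.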
when it shouldn't, fair enough. @joepie91 I like the batched-queue idea. I am currently on vacation for another month; if no one beats me to it, I might take a stab at it when I get back. |
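One way to read the batched-queue idea is to coalesce pending writes per key, so concurrent `setItem` calls for the same key collapse into a single write of the latest value. A minimal sketch under that assumption; `persistFn` and the helper names are hypothetical, not node-persist internals:

```typescript
// Hypothetical batched write queue: values queued for the same key are
// coalesced, and a single flush loop drains the queue one write at a time.
function makeBatchedWriter(
  persistFn: (key: string, value: unknown) => Promise<void>
) {
  const pending = new Map<string, unknown>();
  let flushing: Promise<void> | null = null;

  async function flush(): Promise<void> {
    while (pending.size > 0) {
      // Take the oldest queued key; its value is the latest one set.
      const [key, value] = pending.entries().next().value as [string, unknown];
      pending.delete(key);
      await persistFn(key, value);
    }
    flushing = null;
  }

  return function setItem(key: string, value: unknown): Promise<void> {
    pending.set(key, value); // later calls overwrite earlier queued values
    if (!flushing) {
      // Defer the flush one microtask so synchronous bursts coalesce.
      flushing = Promise.resolve().then(flush);
    }
    return flushing;
  };
}
```

As noted later in the thread, batching like this trades durability for fewer writes: queued values that haven't been flushed yet are lost if the process crashes.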
As a general rule, files are hard to deal with. |
This is frustrating! Using `getItemSync` is the dirty way which will block an entire application! |
At this point, I am not so sure a queue is enough; we might need some job-processing mechanism that also spawns child processes, writes a batch, and then closes them, but that seems like overkill. Also, batched writing means that if the app crashes, some write requests may get lost. And it doesn't solve the issue of different apps using the same storage dir. The best solution is to run as a separate service with the same node API to talk to, but then we're implementing a database engine, which I really don't want to do. Really, node-persist is not designed for this; it's more like the browser's localStorage. If you want a database, I think you should use a database. But going back to the original issue filed, a smart queue might solve the OP's issue ONLY. |
@akhoury I don't think we need anything more than this issue. Everyone knows what node-persist is and I don't think anybody expects it to be write-safe regarding external writes. But being write-safe within the same app, given node's asynchronous nature, looks like a necessary feature. |
I'm also having a problem with this. I chose node-persist for a small project, thinking it would be the fastest way to get up and running, and ran into a corrupted file while running tests. If we're undecided on how to address it, perhaps we could at least add a warning to the docs? Edit: It's important to stay on task here. All other concerns aside (e.g. collisions with other apps), node-persist should be able to ensure that a basic application doesn't step on its own toes when it calls setItem(). |
I wrote a solution FYI at #93 |
I've implemented serializing get and set operations per key. It is in TypeScript and requires Node 10 because it uses asynchronous generators, but you can transpile it for older versions of Node.

```typescript
import storage from 'node-persist';

function createStorage<T>(name: string) {
  interface ISetOperation {
    type: 'set';
    data: T;
  }
  interface IGetOperation {
    type: 'get';
  }
  type Union = ISetOperation | IGetOperation;

  const executor = createExecutor();
  executor.next();

  async function get() {
    return (await executor.next({ type: 'get' })).value as T;
  }

  async function set(data: T) {
    await executor.next({ type: 'set', data });
  }

  async function* createExecutor(): AsyncIterableIterator<T | undefined> {
    let result;
    for (;;) {
      const item: Union = yield result;
      switch (item.type) {
        case 'get':
          result = await storage.getItem(name);
          break;
        case 'set':
          await storage.setItem(name, item.data);
          result = undefined;
          break;
      }
    }
  }

  return { get, set };
}
```

Example of use:

```typescript
import { IHttpLog } from '~/types';
import { createStorage } from '~/storage';

const httpLogsStorage = createStorage<IHttpLog[]>('httpLogs');

export async function loadHttpLogs() {
  return await httpLogsStorage.get();
}

export async function saveHttpLogs(logs: IHttpLog[]) {
  await httpLogsStorage.set(logs);
}
```
|
Hitting the same issue. Maybe node-persist could make use of |
Also encountered this issue. I was also using |
I don't have time right now to create a PR, but in my case I apparently fixed the issue by using Bottleneck:

```typescript
import Bottleneck from 'bottleneck';
import storage from 'node-persist';
import { CacheOptions } from './types'; // defined elsewhere in my project

const limiter = new Bottleneck({
  minTime: 1,
  maxConcurrent: 1,
});

export async function createCache(cachePath: string) {
  await storage.init({
    dir: cachePath,
    ttl: 1000 * 60 * 60 * 24, // 24hrs
  });
}

export const put = limiter.wrap(
  async (key: string, value: any, opts: CacheOptions = {}) => {
    return await storage.setItem(key, value, opts as any);
  }
);

export const get = limiter.wrap(async (key: string) => {
  const value = await storage.getItem(key);
  return value;
});
```
|
When doing heavy concurrent writes/persists that progressively decrease in size, files can sometimes become corrupted as newer data only partially overwrites older data (as there is no locking mechanism in place).
I can write a testcase later if needed, as I'm currently rather busy, but I figured I'd file an issue before I would forget.
I did write a very hacky monkeypatch (in Coffeescript) that at least ensures write queueing:
It's very inefficient - it doesn't attempt to 'deduplicate' queued writes and it always persists the last known version at the time of persisting, so under heavy concurrency it'll likely end up writing identical data many times... but at least it doesn't corrupt the files on disk :)
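The original CoffeeScript patch isn't shown here. As a rough illustration only, a TypeScript sketch of the described behavior (every call queues a persist, and each queued persist writes whatever the latest in-memory value is at that moment, with no deduplication) might look like this; all names are hypothetical, not node-persist's actual internals:

```typescript
// Hypothetical reconstruction of the described write queueing: writes are
// chained so they never overlap, but each one re-persists the current value,
// so under heavy concurrency identical data may be written several times.
function patchWithWriteQueue(store: {
  data: Map<string, unknown>;
  persistKey: (key: string) => Promise<void>;
}) {
  let tail: Promise<void> = Promise.resolve();
  return function setItem(key: string, value: unknown): Promise<void> {
    store.data.set(key, value);
    // Queue one persist per call; by the time it runs, it writes the value
    // that is current at that moment (possibly already on disk).
    const next = tail.then(() => store.persistKey(key));
    tail = next.catch(() => {});
    return next;
  };
}
```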
EDIT: For clarification, these monkeypatched methods I wrote are what caused me to discover the issue:
Rapidly calling `removeListItem` several times (e.g. like would happen in a queue) would result in the aforementioned writes that progressively decrease in size.