Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend): Add graph/node id & execution id on CreditTransaction table #9217

Open
wants to merge 3 commits into
base: zamilmajdy/secrt-1014-scalability-implement-transactions-snapshotting
Choose a base branch
from

Conversation

majdyz
Copy link
Contributor

@majdyz majdyz commented Jan 8, 2025

We need to be able to determine the cost of graph/node execution.

Changes 🏗️

  • Add these columns into CreditTransaction metadata column:
    • graph_id
    • node_id
    • graph_exec_id
    • node_exec_id
    • block_id
  • Drop the blockId column and backfill the dropped value into metadata->>block_id.
  • Frequent queries on these values will require an index created on demand through a migration, depending on the use case.

@majdyz majdyz requested review from Pwuts and kcze January 8, 2025 03:19
@majdyz majdyz requested a review from a team as a code owner January 8, 2025 03:19
@github-actions github-actions bot added platform/backend AutoGPT Platform - Back end size/l labels Jan 8, 2025
Copy link

qodo-merge-pro bot commented Jan 8, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Data Migration

The migration from blockId column to metadata.block_id needs careful validation to ensure no data is lost during the transition and that all queries are updated accordingly.

async def _add_transaction(
    self,
    user_id: str,
    amount: int,
    transaction_type: CreditTransactionType,
    is_active: bool = True,
    transaction_key: str | None = None,
    metadata: Json = Json({}),
):
    async with db.locked_transaction(f"usr_trx_{user_id}"):
        # Get latest balance snapshot
        user_balance, _ = await self._get_credits(user_id)
        if amount < 0 and user_balance < abs(amount):
            raise ValueError(
                f"Insufficient balance for user {user_id}, balance: {user_balance}, amount: {amount}"
            )

        # Create the transaction
        transaction_data: CreditTransactionCreateInput = {
            "userId": user_id,
            "amount": amount,
            "runningBalance": user_balance + amount,
            "type": transaction_type,
            "metadata": metadata,
            "isActive": is_active,
            "createdAt": self.time_now(),
        }
Error Handling

The credit spending logic lacks explicit error handling for cases where credit transactions fail or insufficient credits are available.

s = input_size + output_size
t = (
    (res.end_time - res.start_time).total_seconds()
    if res.end_time and res.start_time
    else 0
)
data.data = input_data
db_client.spend_credits(data, s, t)

ALTER TABLE "CreditTransaction" DROP CONSTRAINT "CreditTransaction_blockId_fkey";

-- Update migrate blockId into metadata->"block_id"
UPDATE "CreditTransaction" SET "metadata" = jsonb_set("metadata"::jsonb, '{block_id}', to_jsonb("blockId")) WHERE "blockId" IS NOT NULL;
Copy link
Member

@Pwuts Pwuts Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model CreditTransaction {
    // ...
    metadata Json?
}

Does jsonb_set("metadata"::jsonb, ...) work if metadata is currently NULL?

Copy link
Member

@Pwuts Pwuts Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Robot says to use COALESCE("metadata"::jsonb, '{}'):

Suggested change
UPDATE "CreditTransaction" SET "metadata" = jsonb_set("metadata"::jsonb, '{block_id}', to_jsonb("blockId")) WHERE "blockId" IS NOT NULL;
UPDATE "CreditTransaction"
SET "metadata" = jsonb_set(
COALESCE("metadata"::jsonb, '{}'),
'{block_id}',
to_jsonb("blockId")
)
WHERE "blockId" IS NOT NULL;

Copy link
Member

@Pwuts Pwuts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand we can't make data out of thin air, but can we still use an actual column for node_execution_id? These columns can be optional for now, and I think we should be able to make them non-optional (and archive old CreditTransactions to a different place, e.g. a different table or some other system) after closing off the current fiscal year.

@majdyz
Copy link
Contributor Author

majdyz commented Jan 9, 2025

The transaction is not always related to the node execution. It can be a top-up, or any random adjustments. Or maybe another custom transaction. So there is no way to make it non-optional.

What's the need for the actual column? I can add an auto generate column value from the metadata column if you really need to have it materialized, e.g:

ALTER TABLE "CreditTransaction" ADD COLUMN "nodeExecId" TEXT GENERATED ALWAYS AS ("metadata"->>'node_exec_id') STORED;

And add the 'nodeExecId` in prisma schena as an actual column or even index it.
Would that suffice? But what's the need? My main concern is I don't want to add much foreign key constraints for this table, this table is going to be big with many writes.

@majdyz majdyz requested a review from Pwuts January 9, 2025 16:24
@Pwuts
Copy link
Member

Pwuts commented Jan 9, 2025

Ack on this table not being for just node execution credit transactions, I was missing that bit of context.

My concern is that "informal" relationships which aren't stored in actual columns means we can't use Prisma to query them together afaik. If we can make that work with one of the solutions you proposed I'm happy.

@majdyz
Copy link
Contributor Author

majdyz commented Jan 9, 2025

Yes, adding

  nodeExecId String? @default(dbgenerated())

on the schema, and

ALTER TABLE "CreditTransaction" ADD COLUMN "nodeExecId" TEXT GENERATED ALWAYS AS ("metadata"->>'node_exec_id') STORED;

on the migration, works, if needed in the future.

But I still think not doing it and querying directly using Prima still be a simpler choice:
https://www.prisma.io/docs/orm/prisma-client/special-fields-and-types/working-with-json-fields#filter-on-a-json-field-advanced

Unless we really have a strong use case of querying it through a relational entity, do we actually need that? can't we just query those separately?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants