Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add document hard deletion functionality to housekeeping tasks. #718

Draft
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

fourjae
Copy link
Contributor

@fourjae fourjae commented Dec 10, 2023

What this PR does / why we need it:
A housekeeping task was added to automatically physically delete documents at certain intervals.

Which issue(s) this PR fixes:

Fixes # #706

Although it does not directly solve the above issue, I think it is somewhat related to the need to improve the housekeeping structure.
I think the housekeeping structure could be improved if necessary.

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Additional documentation:


Checklist:

  • Added relevant tests or not required
  • Didn't break anything

Summary by CodeRabbit

  • New Features

    • Enhanced housekeeping functionality for managing document deletions and client deactivations with more granular control.
    • New configuration parameters for housekeeping intervals and limits, including document deletion grace periods.
    • Introduction of methods for identifying and deleting documents eligible for hard deletion based on project specifics.
  • Bug Fixes

    • Improved error handling during document deletion processes.
  • Documentation

    • Updated configuration files and sample files to reflect new housekeeping parameters and their usage.
    • Expanded test coverage to validate new configuration structures and parameters.

@fourjae fourjae marked this pull request as ready for review January 28, 2024 15:56
@krapie krapie self-requested a review January 28, 2024 16:01
Copy link

codecov bot commented Jan 29, 2024

Codecov Report

Attention: Patch coverage is 37.55459% with 143 lines in your changes missing coverage. Please review.

Project coverage is 50.45%. Comparing base (0cdb329) to head (d45f2ab).
Report is 58 commits behind head on main.

Files Patch % Lines
server/backend/housekeeping/housekeeping.go 0.00% 101 Missing ⚠️
server/backend/database/mongo/client.go 45.61% 27 Missing and 4 partials ⚠️
server/backend/database/memory/database.go 73.17% 7 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #718      +/-   ##
==========================================
- Coverage   50.71%   50.45%   -0.27%     
==========================================
  Files          70       70              
  Lines       10213    10424     +211     
==========================================
+ Hits         5180     5259      +79     
- Misses       4512     4636     +124     
- Partials      521      529       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@krapie krapie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution.
I have left some comments on naming convention and few more.

About Housekeeping Task Naming Convention

Since we now newly introducing document hard deletion, it would be better to rename existing housekeeping task "Deactivate Candidate" (deactivateCandidates) to "Client Deactivate Candidate" (ClientDeactivateCandidates) for more clarification and naming consistency with "Document Hard Deletion" (DocumentHardDeletion).

cmd/yorkie/server.go Outdated Show resolved Hide resolved
cmd/yorkie/server.go Outdated Show resolved Hide resolved
server/backend/database/database.go Outdated Show resolved Hide resolved
server/backend/database/database.go Outdated Show resolved Hide resolved
server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
server/backend/housekeeping/config.go Outdated Show resolved Hide resolved
server/backend/housekeeping/config_test.go Outdated Show resolved Hide resolved
server/backend/housekeeping/housekeeping.go Outdated Show resolved Hide resolved
server/backend/housekeeping/housekeeping.go Outdated Show resolved Hide resolved
server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
@fourjae
Copy link
Contributor Author

fourjae commented Feb 13, 2024

Thank you. All comments have been reflected, and some feature changes and comment typos have been corrected for the update.

However, I think there is one thing that needs to be changed.
If an error occurs while working with the h.database.DeleteDocument() function, there are the following differences.

Client: Returns without rollback when err occurs during operation

Memdb: If err occurs during operation, rollback and return without committing

Which of the two methods should you proceed with?

@hackerwins
Copy link
Member

In #932, we improved housekeeping so that multiple tasks can be added. It might be good to try document deletion again.

@krapie
Copy link
Member

krapie commented Jul 20, 2024

@hackerwins Should we reopen this issue?
@fourjae Please notify me if you are still working on this PR.

@fourjae
Copy link
Contributor Author

fourjae commented Jul 22, 2024

@krapie sorry for forgetting. I will modify the work to fit the changed structure.

@hackerwins
Copy link
Member

@hackerwins Should we reopen this issue?
@fourjae Please notify me if you are still working on this PR.

If it's okay for @fourjae, It would be good to continue in this PR. 🙏

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Outside diff range, codebase verification and nitpick comments (3)
server/backend/database/mongo/client.go (3)

704-729: Add logging for better traceability.

Adding logging can help trace the execution flow and identify issues more easily.

+  logging.DefaultLogger().Infof("Finding hard deletion candidates for project: %s", project.ID)

715-715: Rename variable to follow camelCase convention.

The variable name DocInfos should be docInfos to follow camelCase convention.

-  var DocInfos []*database.DocInfo
+  var docInfos []*database.DocInfo

763-791: Add logging for better traceability.

Adding logging can help trace the execution flow and identify issues more easily.

+  logging.DefaultLogger().Infof("Finding deactivate candidates, lastProjectID: %s", lastProjectID)
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d45f2ab and c171085.

Files selected for processing (17)
  • cmd/yorkie/server.go (3 hunks)
  • server/backend/database/database.go (2 hunks)
  • server/backend/database/memory/database.go (2 hunks)
  • server/backend/database/memory/database_test.go (1 hunks)
  • server/backend/database/mongo/client.go (3 hunks)
  • server/backend/database/mongo/client_test.go (1 hunks)
  • server/backend/database/testcases/testcases.go (1 hunks)
  • server/backend/housekeeping/config.go (2 hunks)
  • server/backend/housekeeping/config_test.go (1 hunks)
  • server/backend/housekeeping/housekeeping.go (3 hunks)
  • server/clients/housekeeping.go (4 hunks)
  • server/config.go (2 hunks)
  • server/rpc/server_test.go (1 hunks)
  • server/server.go (1 hunks)
  • test/bench/tree_editing_bench_test.go (1 hunks)
  • test/helper/helper.go (2 hunks)
  • test/sharding/server_test.go (1 hunks)
Files skipped from review due to trivial changes (2)
  • server/backend/database/mongo/client_test.go
  • test/bench/tree_editing_bench_test.go
Files skipped from review as they are similar to previous changes (8)
  • cmd/yorkie/server.go
  • server/backend/database/memory/database.go
  • server/backend/database/memory/database_test.go
  • server/backend/database/testcases/testcases.go
  • server/config.go
  • server/rpc/server_test.go
  • test/helper/helper.go
  • test/sharding/server_test.go
Additional comments not posted (24)
server/backend/housekeeping/config_test.go (6)

30-35: LGTM!

The new configuration fields are correctly initialized.


40-41: LGTM!

The test case correctly checks for validation errors for invalid interval values.


44-45: LGTM!

The test case correctly checks for validation errors for invalid interval values.


48-49: LGTM!

The test case correctly checks for validation errors for invalid period values.


52-53: LGTM!

The test case correctly checks for validation errors for invalid limit values.


56-57: LGTM!

The test case correctly checks for validation errors for invalid limit values.

server/backend/housekeeping/housekeeping.go (5)

32-33: LGTM!

The Housekeeping struct correctly includes the Config and scheduler fields.


51-65: LGTM!

The RegisterTask method correctly sets up the scheduler for the housekeeping tasks.


Line range hint 67-75:
LGTM!

The RegisterTask method correctly sets up the scheduler for document deletion tasks.


Line range hint 79-81:
LGTM!

The Start method correctly starts the scheduler for the housekeeping service.


Line range hint 83-91:
LGTM!

The Stop method correctly stops the scheduler and handles errors.

server/backend/housekeeping/config.go (6)

28-41: LGTM!

The new fields in the Config struct enhance the configurability of the housekeeping process.


49-53: LGTM!

The Validate method correctly validates the new configuration fields.


57-63: LGTM!

The Validate method correctly validates the interval for document deletion.


65-71: LGTM!

The Validate method correctly validates the period for document hard deletion.


72-78: LGTM!

The Validate method correctly validates the limit for client deactivation candidates.


79-85: LGTM!

The Validate method correctly validates the limit for document hard deletion candidates.

server/clients/housekeeping.go (4)

39-39: Improved parameter naming.

The parameter candidatesLimitPerProject has been renamed to clientDeactivationCandidateLimitPerProject for better clarity.


96-96: Improved parameter naming.

The parameter candidatesLimitPerProject has been renamed to clientDeactivationCandidateLimitPerProject for better clarity.


182-204: LGTM!

The FindDocumentHardDeletionCandidates function is well-implemented with proper error handling and logging. The return values are consistent with the function's purpose.


125-180: LGTM! But verify the function usage in the codebase.

The DeleteDocument function is well-implemented with proper locking and error handling. The logging provides useful information.

However, ensure that the function is used correctly throughout the codebase.

server/server.go (1)

162-206: Enhanced housekeeping task registration.

The function now registers two separate tasks for deactivating clients and deleting documents, improving control over task intervals. The logic and error handling are well-implemented.

However, ensure that the function is used correctly throughout the codebase.

Verification successful

Function usage verified.

The function RegisterHousekeepingTasks is correctly used within the codebase, specifically in the server/server.go file. The implementation and usage are consistent and error handling is appropriately managed.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `RegisterHousekeepingTasks`.

# Test: Search for the function usage. Expect: Correct usage of the updated function.
rg --type go -A 5 $'RegisterHousekeepingTasks'

Length of output: 700

server/backend/database/database.go (2)

168-173: LGTM!

The FindDocumentHardDeletionCandidatesPerProject method is well-implemented with proper error handling and logging. The return values are consistent with the method's purpose.


175-179: LGTM!

The DeleteDocument method is well-implemented with proper error handling and logging. The return values are consistent with the method's purpose.

server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
fourjae added 2 commits August 1, 2024 22:59
HousekeepingDocumentHardDeletionGracefulPeriod default hour 14h -> 336h (14day * 24h)
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c171085 and 6cc0e19.

Files selected for processing (5)
  • server/backend/database/mongo/client.go (3 hunks)
  • server/backend/housekeeping/housekeeping.go (3 hunks)
  • server/config.go (2 hunks)
  • server/config.sample.yml (1 hunks)
  • server/server.go (1 hunks)
Files skipped from review as they are similar to previous changes (4)
  • server/backend/database/mongo/client.go
  • server/backend/housekeeping/housekeeping.go
  • server/config.sample.yml
  • server/server.go
Additional comments not posted (3)
server/config.go (3)

43-45: New constants for housekeeping intervals and grace period.

The new constants provide more granular control over housekeeping tasks. Ensure that these values align with the intended housekeeping policies.


46-47: Renamed constants for candidate limits per project.

The constants have been renamed for clarity. Ensure that these changes are reflected in the rest of the codebase.


235-240: Updated newConfig function with new housekeeping constants.

The function now references the new housekeeping constants. Ensure that these changes are correctly integrated and tested.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 6cc0e19 and 7d636a3.

Files selected for processing (1)
  • server/backend/housekeeping/config_test.go (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • server/backend/housekeeping/config_test.go

@krapie krapie requested review from hackerwins and krapie August 7, 2024 06:56
@hackerwins hackerwins force-pushed the main branch 2 times, most recently from fbc6098 to 7d65a57 Compare August 9, 2024 02:00
@krapie
Copy link
Member

krapie commented Aug 23, 2024

@fourjae Sorry for the late review. I've been busy for couple of weeks.
I will look for this PR ASAP.

cc. @hackerwins

Copy link
Member

@krapie krapie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution :)

Firstly, I have left some comments about overall code conventions.

server/backend/database/database.go Outdated Show resolved Hide resolved
server/backend/database/testcases/testcases.go Outdated Show resolved Hide resolved
server/backend/database/database.go Outdated Show resolved Hide resolved
server/backend/housekeeping/config.go Outdated Show resolved Hide resolved
server/backend/housekeeping/config.go Outdated Show resolved Hide resolved
server/backend/housekeeping/housekeeping.go Outdated Show resolved Hide resolved
server/clients/housekeeping.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
server/backend/database/memory/database.go Outdated Show resolved Hide resolved
server/backend/database/mongo/client.go Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (4)
server/config.go (1)

237-237: Use String() method for consistency.

The DocumentHardDeletionGracefulPeriod field is set directly to the corresponding constant value, which is inconsistent with the other fields that use the String() method. To maintain consistency and avoid potential issues, consider updating the code as follows:

-DocumentHardDeletionGracefulPeriod:           DefaultHousekeepingDocumentHardDeletionGracefulPeriod,
+DocumentHardDeletionGracefulPeriod:           DefaultHousekeepingDocumentHardDeletionGracefulPeriod.String(),
server/backend/database/database.go (1)

175-179: LGTM, but please update the comment.

The DeleteDocuments method addition is approved. However, please update the comment at line 175 to match the method signature.

-// DeleteDocuments deletes document of the given key
+// DeleteDocuments deletes the documents specified by the given DocInfo pointers
server/backend/database/mongo/client.go (1)

230-263: Approve the implementation of DeleteDocuments.

The DeleteDocuments function is correctly implemented with proper error handling and validation.

Improve the error messages.

Consider making the error messages more descriptive:

  • When an invalid document ID is encountered, include the invalid ID in the error message.
  • When the deletion fails, include the specific error details in the returned error.
server/backend/database/testcases/testcases.go (1)

1414-1468: The code changes are approved with a suggestion to improve the test coverage.

The new test function RunDocumentHardDeletionTest is well-structured and covers the key aspects of the document hard deletion functionality. It follows the existing conventions in the file.

To further improve the test coverage, consider adding the following scenarios:

  1. Test the case when there are no candidate documents for hard deletion.
  2. Test the case when the grace period is set to a positive value.
  3. Test the case when there are multiple candidate documents for hard deletion.
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7d636a3 and 16d8749.

Files selected for processing (16)
  • cmd/yorkie/server.go (3 hunks)
  • server/backend/database/database.go (2 hunks)
  • server/backend/database/memory/database.go (2 hunks)
  • server/backend/database/memory/indexes.go (1 hunks)
  • server/backend/database/mongo/client.go (2 hunks)
  • server/backend/database/testcases/testcases.go (1 hunks)
  • server/backend/housekeeping/config.go (2 hunks)
  • server/backend/housekeeping/config_test.go (1 hunks)
  • server/backend/housekeeping/housekeeping.go (2 hunks)
  • server/clients/housekeeping.go (4 hunks)
  • server/config.go (2 hunks)
  • server/config.sample.yml (1 hunks)
  • server/rpc/server_test.go (1 hunks)
  • server/server.go (2 hunks)
  • test/helper/helper.go (2 hunks)
  • test/sharding/server_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
  • server/backend/housekeeping/housekeeping.go
Additional comments not posted (45)
server/backend/housekeeping/config_test.go (6)

30-35: LGTM!

The code changes are approved.


40-40: LGTM!

The code changes are approved.


44-44: LGTM!

The code changes are approved.


48-50: LGTM!

The code changes are approved.


51-57: LGTM!

The code changes are approved.


59-62: LGTM!

The code changes are approved.

server/backend/housekeeping/config.go (4)

28-41: LGTM!

The changes to the Config struct improve the configurability and granularity of the housekeeping service. The new fields allow for separate intervals for deactivating candidates and deleting documents, and provide limits on the number of candidates per project. The naming of the fields is clear and descriptive.


57-63: LGTM!

The new check for the DeleteDocumentsInterval field ensures that it is a valid duration, improving the robustness of the configuration validation.


65-82: LGTM!

The new checks for the DocumentHardDeletionGracefulPeriod, ClientDeactivationCandidateLimitPerProject, and DocumentHardDeletionCandidateLimitPerProject fields ensure that they have valid values, improving the robustness of the configuration validation.


97-105: LGTM!

The changes to the ParseInterval method enhance its flexibility and usability by accepting an interval as a parameter. This allows the method to be used with different interval values without modifying the struct. The updated error message provides more context about the interval being parsed.

server/config.sample.yml (4)

33-34: LGTM!

The change from Interval to DeactivateCandidatesInterval improves clarity by using a more descriptive name for the parameter.


36-37: LGTM!

The addition of the DeleteDocumentsInterval parameter introduces an interval for document deletion, which is a new housekeeping functionality.


39-40: LGTM!

The addition of the HousekeepingDocumentHardDeletionGracefulPeriod parameter specifies a grace period for hard deletion of documents, which is a new housekeeping functionality.


42-46: LGTM!

The changes in this code segment are approved:

  • The change from CandidatesLimitPerProject to ClientDeactivationCandidateLimitPerProject improves clarity by using a more descriptive name for the parameter.
  • The addition of the DocumentHardDeletionCandidateLimitPerProject parameter introduces a limit for document deletion candidates, which is a new housekeeping functionality.
server/backend/database/memory/indexes.go (1)

139-147: LGTM!

The new compound index project_id_removed_at is a valuable addition to the schema variable. It allows for efficient querying based on the combination of ProjectID and RemovedAt fields, potentially improving performance for operations that involve filtering or sorting by both fields.

The index is correctly defined using the memdb.CompoundIndex type and the appropriate field types. The index name is clear and follows the naming convention of other indexes.

The code changes are approved.

server/clients/housekeeping.go (7)

31-31: LGTM!

The code changes are approved.


32-32: LGTM!

The code changes are approved.


40-40: LGTM!

The code changes are approved.


108-114: LGTM!

The code changes are approved.


116-124: LGTM!

The code changes are approved.


127-181: LGTM!

The new DeleteDocuments function looks good. It follows the existing coding conventions and implements the document deletion functionality with proper locking and error handling.


Line range hint 184-219: LGTM!

The new FindDocumentHardDeletionCandidates function looks good. It follows the existing coding conventions and implements the logic for finding document deletion candidates across multiple projects.

server/server.go (4)

97-100: LGTM!

The code changes are approved.


101-103: LGTM!

The code changes are approved.


164-187: LGTM!

The code changes are approved.


189-214: LGTM!

The code changes are approved.

test/sharding/server_test.go (1)

81-85: LGTM!

The changes to the housekeeping configuration parameters in the TestMain function look good. The new parameters provide more granular control over document management and candidate deactivation processes, which aligns with the AI-generated summary.

The changes are in a test file and do not directly impact the production code. The new parameters are initialized with values from the helper package, which likely contains predefined constants for testing.

Overall, the changes enhance the functionality of the housekeeping configuration and do not introduce any issues or inconsistencies.

server/config.go (4)

43-47: LGTM!

The introduction of the new constants enhances the clarity and readability of the housekeeping configuration. The constants provide more specific and descriptive names, making it easier to understand their purpose in the context of document deletion and candidate deactivation processes.


48-48: Please clarify the purpose and usage of DefaultHousekeepingProjectFetchSize.

The constant DefaultHousekeepingProjectFetchSize seems to be a standalone addition and its purpose is not immediately clear from the context. Could you please provide more information on how this constant is used in the housekeeping process and why it was introduced?


235-240: LGTM!

The updates to the Housekeeping configuration in the newConfig function ensure that the logic for housekeeping operations is now aligned with the new constants introduced earlier. This change enhances the consistency and clarity of the housekeeping configuration.


43-48: Overall, the code changes look good!

The introduction of the new constants and the updates to the housekeeping configuration enhance the clarity, consistency, and maintainability of the codebase. The changes provide more specific and descriptive names for the housekeeping intervals and limits, making it easier to understand their purpose in the context of document deletion and candidate deactivation processes.

Apart from the minor nitpick regarding the consistency of setting the DocumentHardDeletionGracefulPeriod field, the code changes are well-structured and contribute positively to the overall quality of the codebase.

Also applies to: 235-240

server/rpc/server_test.go (1)

86-91: LGTM!

The changes to the housekeeping configuration parameters are well-aligned with the PR objectives and significantly enhance the functionality of the housekeeping process. The new parameters provide more granular control over the management of candidate documents and their deletion processes, likely improving the efficiency of data management.

The code changes are approved.

server/backend/database/database.go (2)

23-23: LGTM!

The import statement is approved.


167-173: LGTM!

The FindDocumentHardDeletionCandidatesPerProject method addition is approved.

cmd/yorkie/server.go (6)

44-44: Previous review comment addressed.

The variable name documentHardDeletionGracefulPeriod follows the naming convention suggested in the previous review comment.


74-76: LGTM!

The code changes are approved.


200-211: LGTM!

The code changes are approved. The flag names and descriptions follow the naming convention suggested in the previous review comment.


213-216: LGTM!

The code changes are approved. The flag name and description follow the naming convention suggested in the previous review comment.


219-223: LGTM!

The code changes are approved. The flag name and description follow the naming convention suggested in the previous review comment.


224-229: LGTM!

The code changes are approved. The flag name and description follow the naming convention suggested in the previous review comment.

test/helper/helper.go (2)

65-73: LGTM!

The code changes are approved.


242-246: LGTM!

The code changes are approved.

server/backend/database/memory/database.go (2)

676-712: LGTM!

The code changes are approved.


758-780: LGTM!

The code changes are approved.

server/backend/database/mongo/client.go (1)

710-736: LGTM!

The FindDocumentHardDeletionCandidatesPerProject function is correctly implemented and follows the requirements. It effectively identifies the documents eligible for hard deletion based on the provided project and grace period.

@fourjae
Copy link
Contributor Author

fourjae commented Sep 4, 2024

@krapie Edited to reflect all comments.

Additionally, the grace time setting for deleted documents has been declared as 2 weeks (14d*24h). How many hours would be good?

@krapie
Copy link
Member

krapie commented Sep 5, 2024

@fourjae I think 2 weeks is good for now.

@krapie krapie self-requested a review September 5, 2024 13:08
@krapie
Copy link
Member

krapie commented Sep 5, 2024

@hackerwins Sudden wonder, is this task should be in housekeeping? not background?
Seems like housekeeping is sort of an extension of GC.
Referencing: https://github.com/yorkie-team/yorkie/blob/main/design/housekeeping.md

@krapie
Copy link
Member

krapie commented Sep 11, 2024

@hackerwins Could you mind reviewing this PR instead of me please?
I think you are more suitable to review this PR since you have restructured the housekeeping module recently.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range and nitpick comments (3)
server/backend/database/memory/database.go (1)

758-780: LGTM, but consider improving the error handling.

The code changes are approved. However, consider wrapping the error returned from txn.Delete with more context using fmt.Errorf to provide better error messages.

For example, you can modify the error handling as follows:

-return 0, fmt.Errorf("fetch hard deletion candidates: %w", err)
+return 0, fmt.Errorf("delete document %s: %w", candidate.ID, err)
server/backend/database/mongo/client.go (1)

230-263: LGTM, but consider handling the case when fewer documents are deleted.

The code changes are approved. However, consider handling the case when DeleteMany deletes fewer documents than expected, which could happen if some documents were deleted concurrently by another process. You can compare len(candidates) with deletedResult.DeletedCount and log a warning if they don't match.

server/backend/database/testcases/testcases.go (1)

1596-1650: Comprehensive test implementation! A few suggestions:

The test thoroughly covers the document hard deletion flow, including setup, configuration, candidate identification, deletion, and verification steps. Well done!

A few suggestions to consider:

  1. The fetchSize variable is set but not used in the FindDocumentHardDeletionCandidatesPerProject call. Consider removing if unused.
  2. To make the test more robust, validate the candidates slice before deletion to ensure it contains the expected document(s).
  3. Consider adding a negative test case to validate behavior when no candidates are found for deletion.
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 16d8749 and 86a3f5d.

Files selected for processing (6)
  • server/backend/database/memory/database.go (2 hunks)
  • server/backend/database/memory/database_test.go (1 hunks)
  • server/backend/database/mongo/client.go (2 hunks)
  • server/backend/database/mongo/client_test.go (1 hunks)
  • server/backend/database/testcases/testcases.go (1 hunks)
  • test/complex/main_test.go (1 hunks)
Additional comments not posted (5)
server/backend/database/memory/database_test.go (1)

115-117: LGTM, but verify the implementation of RunDocumentHardDeletionTest.

The code changes are approved. The new test case follows the existing pattern and enhances the test suite by including functionality to verify the hard deletion of documents.

However, ensure that the RunDocumentHardDeletionTest function is correctly implemented and thoroughly tested to confirm that it behaves as expected. Run the following script to verify the function implementation:

Verification successful

"""

"""

Verification Successful: RunDocumentHardDeletionTest is correctly implemented.

The RunDocumentHardDeletionTest function is defined and thoroughly tests the hard deletion of documents. It is correctly integrated into the test suite, ensuring that the functionality is verified. No further action is required.

  • Location of RunDocumentHardDeletionTest implementation: server/backend/database/testcases/testcases.go starting at line 1597.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of `RunDocumentHardDeletionTest`.

# Test: Search for the function definition.
# Expect: The function is defined and has a non-empty body.
ast-grep --lang go --pattern $'func RunDocumentHardDeletionTest($_, $_) {
  $$$
}'

Length of output: 4916

server/backend/database/mongo/client_test.go (1)

131-134: LGTM, but verify the implementation of RunDocumentHardDeletionTest.

The code changes are approved. The addition of the new test case for "DocumentHardDeletion test" enhances the test suite by including functionality to verify the hard deletion of documents, which is a critical aspect of database management. The test case follows the existing pattern and is correctly implemented.

However, please ensure that the RunDocumentHardDeletionTest function, which is assumed to contain the actual test logic for document hard deletion, is properly implemented. You can use the following script to verify its implementation:

Verification successful

"""

"""

The RunDocumentHardDeletionTest function is correctly implemented.

The function is defined in server/backend/database/testcases/testcases.go and thoroughly tests the hard deletion of documents by simulating the necessary operations and using assertions to verify the expected outcomes. The test case is well-structured and follows standard Go testing practices. No further action is needed.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of the `RunDocumentHardDeletionTest` function.

# Test: Search for the function definition.
# Expect: The function should be defined and contain the test logic for document hard deletion.
ast-grep --lang go --pattern $'func RunDocumentHardDeletionTest($_, $_) {
  $$$
}'

Length of output: 4916

test/complex/main_test.go (1)

91-96: LGTM!

The code changes are approved. The addition of new configuration parameters enhances the housekeeping functionality by providing more granular control over intervals and limits related to candidate deactivation and document deletion. The renaming of CandidatesLimitPerProject to ClientDeactivationCandidateLimitPerProject improves clarity. The changes are consistent with the existing code structure and align with the PR objectives.

server/backend/database/memory/database.go (1)

676-712: LGTM!

The code changes are approved.

server/backend/database/mongo/client.go (1)

710-736: LGTM!

The code changes are approved.

Copy link
Member

@hackerwins hackerwins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. Overall looks good. I left a few comments.

A. Collections under 'documents' need to be deleted as well

Yorkie Documents are stored across multiple collections:

  • documents: Stores document metadata (creation time, number of changes, etc.)
  • changes: Stores document changes (similar to oplog in databases)
  • snapshots: Stores document snapshots (similar to snapshots in databases)
  • syncedseqs: Records how far clients have synced

Please refer to this: https://miro.com/app/board/o9J_kidjCtk=/?moveToWidget=3458764569836960605&cot=14

B. Exception cases need to be handled

In Yorkie system, document replicas exist not only on the server but also on clients. We need to consider the following scenarios:

  • When a client requests synchronization and the document has been soft-deleted on the server (check removedAt in the documents collection), the local replica should be marked as deleted.
  • But if the documents collection has been physically hard-deleted by housekeeping processes, there's no way to determine if the document was deleted.

We need to think about how to handle this second case.
For more details on document removal, please check this design document:
https://github.com/yorkie-team/yorkie/blob/main/design/document-removal.md

@window9u window9u mentioned this pull request Dec 28, 2024
3 tasks
@hackerwins hackerwins marked this pull request as draft January 4, 2025 03:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🌟 New feature or request
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

3 participants