-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DEV-6356] installation progress #3564
base: master
Are you sure you want to change the base?
Conversation
@@ -640,3 +640,19 @@ class DashboardChartGroupBy(create_time_mixin(created_at=True, updated_at=True), | |||
DashboardChartGroupBy.jira_issue_types, | |||
DashboardChartGroupBy.jira_labels, | |||
) | |||
|
|||
|
|||
class InstallationProgress(create_time_mixin(created_at=False, updated_at=False), Base): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class InstallationProgress(create_time_mixin(created_at=False, updated_at=False), Base): | |
class GitHubInstallationProgress(Base): |
I guess. If we write "github" in the description... So that we don't confuse with GitLab.
Also, create_time_mixin(created_at=False, updated_at=False)
does nothing, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
account_created = Column(TIMESTAMP(timezone=True)) | ||
fetch_started = Column(TIMESTAMP(timezone=True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is "account_created" different from "fetch_started"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more so we don`t have a "account_created" event. yet.
and for a case of re-fetch they will be different.
so for now we can cut this field I think. yes
precompute_started = Column(TIMESTAMP(timezone=True)) | ||
precompute_completed = Column(TIMESTAMP(timezone=True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is reposet and Athenian account specific. We have to move those timestamps to the repository_sets
table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this force us to have some key in the state DB for github_account_ids.
create a table for comparing github_account_id and athenian_account_id needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably mean account_github_accounts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is contains both this IDs - yes it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then yes it is
consistency_completed = Column(TIMESTAMP(timezone=True)) | ||
precompute_started = Column(TIMESTAMP(timezone=True)) | ||
precompute_completed = Column(TIMESTAMP(timezone=True)) | ||
current_status = Column(Text()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's delete this for now. I need us to create something absolutely minimal and ASAP. If we make the status, we will have to think what statuses can we have, when to update and what, consider different edge cases, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noone is forcing us to use all the columns of the table from the begining.
we can create a table "as is" and add the features to the pipeline and API when we will be ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's migrate to additional fields in the future. To avoid seduction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but how one text field is making it harder? We already have a PR to MD that uses it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addition - PR to cloud-common is ready too. and if we cut the columns it will slow us to implement changes instead of getting some features faster.
and if we delete precompute timestamps from this table we will use all the rest immediately after releasing all the features that in process now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, we can update the status field in the metadata whatever way we want, but we will not use it downstream. If it warms your heart much to update an unused field - I am not standing against 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'd still like to have that field for the bot.
Agree about precomputer, we will need a second table then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
precompute- fields dropped. account_created field dropped. others (including status) is used by actual written logic in metadata. the point is that status can have a voluntary (apart of some statuses that would be fixed) value and this can be used for some specific messages like "paused", "delayed" etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dennwc we have a field "precomputed bool" in the table "repository_sets"
so it would be right to add a couple new fields to it. and provide a logic to fill this fields from the precomputer directly (not with event handler).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to write to metadata 👍
@@ -100,6 +100,8 @@ class RepositorySet( | |||
) | |||
tracking_re = Column(Text(), nullable=False, default=".*", server_default=".*") | |||
precomputed = Column(Boolean(), nullable=False, default=False, server_default="false") | |||
precompute_started = Column(TIMESTAMP(timezone=True)) | |||
precompute_completed = Column(TIMESTAMP(timezone=True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
precompute_completed = Column(TIMESTAMP(timezone=True)) | |
precompute_finished = Column(TIMESTAMP(timezone=True)) |
Minor syntax nitpick: start-finish; begin-end; initiate-complete. I know that people don't usually care, but this is just good education and conformance to the surrounding table conventions :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and I see we have the same syntax to fix in installation_progress
🙏
consistency_completed = Column(TIMESTAMP(timezone=True)) | ||
precompute_started = Column(TIMESTAMP(timezone=True)) | ||
precompute_completed = Column(TIMESTAMP(timezone=True)) | ||
current_status = Column(Text()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, we can update the status field in the metadata whatever way we want, but we will not use it downstream. If it warms your heart much to update an unused field - I am not standing against 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @Ildyakov and @vmarkovtsev)
server/athenian/api/models/state/versions/5b3dc49a9d7b_installation_progress.py
line 20 at r2 (raw file):
def upgrade(): op.create_table( "installation_progress",
Since we are removing API-related things from the table anyway, does it make sense to move it to MD repo completely? It can be a set of added columns for github.account
, for example. This way we can add a data migration as well (set all accounts to "done" status).
this is interesting. yes the reason to use state db was precomputing. |
This change is