-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-267] Query BQ once per BP for spend report, handle errors #3166
base: develop
Are you sure you want to change the base?
Conversation
core/src/main/scala/org/broadinstitute/dsde/rawls/spendreporting/SpendReportingService.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some syntax suggestions inline, not a full review:
@@ -358,7 +358,8 @@ class SpendReportingService( | |||
} | |||
|
|||
def getAllUserWorkspaceQuery( | |||
billingProjects: Map[BillingProjectSpendExport, Seq[(GoogleProjectId, WorkspaceName)]], | |||
billingProject: BillingProjectSpendExport, | |||
workspaces: Seq[(GoogleProjectId, WorkspaceName)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the WorkspaceName
s aren't used anywhere inside this method. If you changed this method to accept a Seq[GoogleProjectId]
instead then some interior code gets a little simpler
baseQuery | ||
.replace("_PARTITIONTIME", timePartitionColumn) | ||
.replace("_BILLING_ACCOUNT_TABLE", tableName) | ||
.replace("_PROJECT_ID_LIST", "(" + workspaces.map(tuple => s""""${tuple._1.value}"""").mkString(", ") + ")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit/optional: you can use the 3-argument form of mkString
here instead; your call on which is more readable
.replace("_PROJECT_ID_LIST", "(" + workspaces.map(tuple => s""""${tuple._1.value}"""").mkString(", ") + ")") | |
.replace("_PROJECT_ID_LIST", workspaces.map(tuple => s""""${tuple._1.value}"""").mkString("(", ", ", ")")) |
} | ||
} | ||
.recoverWith { case ex => | ||
Future.successful(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log something here to show that we ran into a problem?
SpendReportingResults( | ||
acc.spendDetails ++ res.spendDetails, | ||
SpendReportingForDateRange( | ||
(BigDecimal(acc.spendSummary.cost) + BigDecimal(res.spendSummary.cost)).toString, | ||
(BigDecimal(acc.spendSummary.credits) + BigDecimal(res.spendSummary.credits)).toString, | ||
acc.spendSummary.currency, | ||
acc.spendSummary.startTime, | ||
acc.spendSummary.endTime | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider writing a +
method in the SpendReportingResults
case class, then calling it here.
if you had:
case class SpendReportingResults(spendDetails: Seq[SpendReportingAggregation],
spendSummary: SpendReportingForDateRange
) {
def +(other: SpendReportingResults): SpendReportingResults =
new SpendReportingResults(
this.spendDetails ++ other.spendDetails,
SpendReportingForDateRange(
(BigDecimal(this.spendSummary.cost) + BigDecimal(other.spendSummary.cost)).toString,
(BigDecimal(this.spendSummary.credits) + BigDecimal(other.spendSummary.credits)).toString,
this.spendSummary.currency,
this.spendSummary.startTime,
this.spendSummary.endTime
)
)
}
then this code could simply be:
combinedResults = results.flatten.reduceOption((acc, res) => acc + res)
baseQuery | ||
.replace("_PARTITIONTIME", timePartitionColumn) | ||
.replace("_BILLING_ACCOUNT_TABLE", tableName) | ||
.replace("_PROJECT_ID_LIST", "(" + workspaces.map(tuple => s""""${tuple._1.value}"""").mkString(", ") + ")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.replace("_PROJECT_ID_LIST", "(" + workspaces.map(tuple => s""""${tuple._1.value}"""").mkString(", ") + ")") | |
.replace("_PROJECT_ID_LIST", "(" + workspaces.map { case (project, _) => s""""${project.value}"""" }.mkString(", ") + ")") |
the _1
syntax is tough, using the curlies and case make it look nicer
} | ||
} | ||
.recoverWith { case ex => | ||
Future.successful(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should at least log this exception
@davidangb jinx |
Ticket: https://broadworkbench.atlassian.net/browse/CORE-267
For the consolidated spend report, instead of unioning all billing projects together into one BQ query, this PR does a separate query for each billing project. If BQ returns an error for any one of them, it is left out of the results.
As currently implemented, BPs that generate errors are essentially ignored, resulting in all these workspaces simply not showing up in the UI. Since the form of the result returned from the Rawls API has a list of SpendReportingForDateRange, it might not be very straightforward to return a result for all the workspaces in a problematic BP that indicates that there was a problem. So that would suggest that the UI would have to figure out which, if any, workspaces are missing and display an 'N/A' for them, but that leaves out some information that we do have. Reviewers, is it worth it to try to shove some indication of the problem into a
SpendReportingResult
or is it ok just to leave it to the UI?Testing:
I compared my results from the consolidated spend report API both before and after I split the query up and I believe the substance did not change. I then created a BP and gave it a false BQ dataset in its spend report configuration. After checking that this resulted in errors on the main branch, I verified that I was able to successfully get the report on this branch.
PR checklist
model/
, then you should publish a new officialrawls-model
and perform the corresponding dependency updates as specified in the README: