
cloudcost-exporter: support GCP c4a compute and local hyperdisks #337

Open · 1 of 2 tasks · jjo opened this issue Oct 28, 2024 · 7 comments
Labels: help wanted (Extra attention is needed)

---

jjo (Contributor) commented Oct 28, 2024

Follow-up from #182992.

Add GCP c4a compute and local storage support to cloudcost-exporter:

Tasks

- [x] GCP C4A compute support
- [ ] GCP local hyperdisk support
Pokom added a commit that referenced this issue Oct 30, 2024
Now that GCP has publicly unveiled the C4A family, let's update the
pricing map to handle both ARM and AMD chipsets.

- refs #337
Pokom (Contributor) commented Nov 1, 2024

Handling the ARM chipset was fairly straightforward (#341). The hyperdisk requires a bit more thinking. I pushed an MVP PR (#344) that will emit cost metrics just for the capacity. This is ideal because our initial attempts at using hyperdisks will be limited to configurations where throughput and IOPS are at the default values, which are effectively free.

@jjo posed an interesting question: how do we expose the other pricing dimensions? I did a bit of a deep dive into Grafana's internal usage of the metrics. Most of our calculations for TCO of storage boil down to this type of PromQL expression, represented in jsonnet:

```
...
# joining the pod information
* on (cluster, namespace, persistentvolumeclaim) group_left(pod) (
  # deduplicate
  group by (cluster, namespace, pod, persistentvolumeclaim) (
    kube_pod_spec_volumes_persistentvolumeclaims_info
    # filter out failed pods
    * on (cluster, namespace, pod) group_left() (
      group by (cluster, namespace, pod) (kube_pod_status_phase{phase="Running"} == 1)
    )
  )
)
# Multiply by the hourly cost per PVC.
* on (cluster, volumename) group_left() (
  label_replace(
    label_replace(
      cloudcost_gcp_gke_persistent_volume_usd_per_hour,
      "volumename", "$1", "persistentvolume", "(.*)"
    ),
    "cluster", "$1", "cluster_name", "(.*)"
  )
)
# Divide by 60 to get the per-minute cost.
/ 60
```

Note that the only calculation we're making is finding pods that are running, then joining that against the hourly cost of the persistent volumes associated with those pods. In previous iterations, the cost metric would emit the list price for a specific volume. In our current approach, we bake the total into the metric so that we don't need to put more pressure on Prometheus to make the calculations.

Because of that design decision to calculate the total hourly cost for the metric, I believe we can also add the cost of the other dimensions necessary for Hyperdisks. This would require a subtle change to our storage pricing map (`Storage map[string]float64`).

We'd need to expand that from simply being a float64 to something like:

```go
type StoragePriceDimensions struct {
    Capacity   float64
    IOPS       float64
    Throughput float64
    HA         float64
}
```
Then the disk would need a method that gets the associated metadata for each dimension and calculates the cost based upon those dimensions.
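A minimal sketch of how that method could look, assuming a hypothetical `Disk` metadata type and the 30.4 day/month convention used later in this thread (none of the names here are the exporter's actual API):

```go
package main

import "fmt"

// StoragePriceDimensions holds the per-month list price of each
// provisioned Hyperdisk dimension, matching the struct sketched above.
type StoragePriceDimensions struct {
	Capacity   float64 // USD per GiB-month
	IOPS       float64 // USD per provisioned IOPS per month
	Throughput float64 // USD per provisioned MB/s per month
	HA         float64 // USD per GiB-month for the High Availability variant
}

// Disk is a hypothetical stand-in for the disk metadata the exporter scrapes.
type Disk struct {
	SizeGiB    float64
	IOPS       float64
	Throughput float64 // MB/s
	HA         bool
}

const hoursPerMonth = 30.4 * 24 // 30.4 day/mo * 24 hr/day

// HourlyCost multiplies each provisioned dimension by its monthly list
// price and converts the monthly total to USD per hour.
func (p StoragePriceDimensions) HourlyCost(d Disk) float64 {
	capacityPrice := p.Capacity
	if d.HA {
		// Assumption: the HA variant has its own per-GiB capacity rate.
		capacityPrice = p.HA
	}
	monthly := d.SizeGiB*capacityPrice + d.IOPS*p.IOPS + d.Throughput*p.Throughput
	return monthly / hoursPerMonth
}

func main() {
	balanced := StoragePriceDimensions{Capacity: 0.083, IOPS: 0.005, Throughput: 0.042, HA: 0.1664}
	d := Disk{SizeGiB: 100, IOPS: 10000, Throughput: 1000}
	fmt.Printf("%.4f USD/hr\n", balanced.HourlyCost(d)) // ~0.1375
}
```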

What do you think @jjo?

jjo (Contributor, Author) commented Nov 1, 2024

> [...]
> We'd need to expand that from simply being a float64 to something like:
>
> ```go
> type StoragePriceDimensions struct {
>     Capacity   float64
>     IOPS       float64
>     Throughput float64
>     HA         float64
> }
> ```
>
> Then the disk would need a method that gets the associated metadata for each dimension and calculates the cost based upon those dimensions.
>
> What do you think @jjo?

💯 indeed, we'd "just" need to change cloudcost_gcp_gke_persistent_volume_usd_per_hour to feed from those extra three dimensions (set at creation time, e.g. from a Kube storage class).

In that sense, it's interesting to realize that the cost has always been "provisioned": previously along a single dimension (capacity), and now with these extra dimensions added for hyperdisks.

| dimension | per-month price |
|---|---|
| Hyperdisk Balanced provisioned space | $0.083 per GiB |
| Hyperdisk Balanced provisioned IOPS | $0.005 per IOPS provisioned |
| Hyperdisk Balanced provisioned throughput | $0.042 per MB/s provisioned |
| Hyperdisk Balanced High Availability provisioned space | $0.1664 per GiB |

Just to play with an example: a disk provisioned with 100 GiB space, 10K IOPS, 1000 MB/s throughput, and no HA (for simplicity) should, using the above table, give for cloudcost_gcp_gke_persistent_volume_usd_per_hour:

- currently it is:

  ```
  0.01137 [USD/hr] =
    100 [GiB] * 0.083 [USD/GiB/mo]
    /
    (30.4 [day/mo] * 24 [hr/day])
  ```

- the new value should be:

  ```
  0.1374 [USD/hr] =
    (100 [GiB] * 0.083 [USD/GiB/mo] + 10000 [IOPS] * 0.005 [USD/IOPS/mo] + 1000 [MB/s] * 0.042 [USD/(MB/s)/mo])
    /
    (30.4 [day/mo] * 24 [hr/day])
  ```
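A quick Go sanity check of that arithmetic:

```go
package main

import "fmt"

func main() {
	const hoursPerMonth = 30.4 * 24 // 729.6 hr/mo
	capacity := 100.0 * 0.083       // 100 GiB * 0.083 USD/GiB/mo
	iops := 10000.0 * 0.005         // 10K IOPS * 0.005 USD/IOPS/mo
	throughput := 1000.0 * 0.042    // 1000 MB/s * 0.042 USD/(MB/s)/mo

	fmt.Printf("current: %.5f USD/hr\n", capacity/hoursPerMonth)                   // ~0.01138
	fmt.Printf("new:     %.4f USD/hr\n", (capacity+iops+throughput)/hoursPerMonth) // ~0.1375
}
```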

Pokom (Contributor) commented Nov 1, 2024

Very strong plus one on the calculations. The only very minor nit to add, @jjo, is that GCP only charges these dimensions above a free tier. So it would be closer to:

```
100 [GiB] * 0.083 [USD/GiB/mo] + (10000 - freeIOPS) [IOPS] * 0.005 [USD/IOPS/mo] + (1000 - freeMBsec) [MB/s] * 0.042 [USD/(MB/s)/mo]
```

jjo (Contributor, Author) commented Nov 1, 2024

> Very strong plus one on the calculations. The only very minor nit to add, @jjo, is that GCP only charges these dimensions above a free tier. So it would be closer to:
>
> ```
> 100 [GiB] * 0.083 [USD/GiB/mo] + (10000 - freeIOPS) [IOPS] * 0.005 [USD/IOPS/mo] + (1000 - freeMBsec) [MB/s] * 0.042 [USD/(MB/s)/mo]
> ```

+1 math-wise; just be careful not to go negative on those factors (i.e. max(factor, 0)), if for whatever reason the scraped IOPS or MB/s happen to be below the free threshold.

Pokom (Contributor) commented Nov 1, 2024

> +1 math-wise; just be careful not to go negative on those factors (i.e. max(factor, 0)), if for whatever reason the scraped IOPS or MB/s happen to be below the free threshold.

Great call out! Effectively we need to do `Math.Max(configuredDimension - freeDimension, freeDimension)`.

jjo (Contributor, Author) commented Nov 1, 2024

> Great call out! Effectively we need to do `Math.Max(configuredDimension - freeDimension, freeDimension)`.

I think that's actually `Math.Max(configuredDimension - freeDimension, 0)` (i.e. no cost if not over the free tier).
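A sketch of that clamp in Go; the free-tier threshold below is a placeholder, not a confirmed GCP value:

```go
package main

import (
	"fmt"
	"math"
)

// billable returns the portion of a provisioned dimension that is actually
// charged: anything at or below the free tier contributes no cost, and the
// result never goes negative.
func billable(provisioned, free float64) float64 {
	return math.Max(provisioned-free, 0)
}

func main() {
	const freeIOPS = 3000 // placeholder free-tier threshold, not a confirmed GCP value
	fmt.Println(billable(10000, freeIOPS)) // 7000 IOPS billed
	fmt.Println(billable(2000, freeIOPS))  // 0: below the free tier, never negative
}
```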

Pokom added a commit that referenced this issue Nov 4, 2024

First attempt at handling hyperdisk balanced volumes. The primary goal was to update the pricing map logic so that it would parse hyperdisk balanced SKUs and include only the cost of capacity. Hyperdisk Balanced pricing is done in such a way that if you don't configure IOPS and throughput, you use a free tier.

In the future, IOPS, throughput, and High Availability costs need to be saved and calculated as well.

- relates to #337

Co-authored-by: JuanJo Ciarlante <[email protected]>
Pokom self-assigned this Nov 8, 2024
Pokom (Contributor) commented Dec 17, 2024

I'm going to place this back into TODO until we're closer to adopting hyperdisks. For now we handle the storage costs; the remaining task is to attribute costs for IOPS and the other dimensions. One key bit I learned from this is that we'll need to handle the SKU similarly to what we do with CPU/Memory, and write out regexes to match properly. See https://github.com/grafana/cloudcost-exporter/compare/feat/hyperdisk-price-calculator?expand=1 for where I stopped. Specifically, https://github.com/grafana/cloudcost-exporter/compare/feat/hyperdisk-price-calculator?expand=1#diff-c58feeadbe3b07fb5e4ff69378e1db9ed311c68e971e6bcdc2d91828c7226089R196-R197 is where things started to break down and the naive implementation no longer works.
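For illustration, here's the shape that regex-based SKU classification might take; the description patterns below are hypothetical, not verified against GCP's actual Hyperdisk SKU strings:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for bucketing Hyperdisk SKU descriptions by
// pricing dimension, analogous to how the exporter matches CPU/Memory SKUs.
// The real GCP description strings may differ.
var (
	iopsRe       = regexp.MustCompile(`(?i)hyperdisk balanced.*iops`)
	throughputRe = regexp.MustCompile(`(?i)hyperdisk balanced.*throughput`)
	capacityRe   = regexp.MustCompile(`(?i)hyperdisk balanced.*capacity`)
)

// dimension classifies a SKU description into the pricing dimension it covers.
func dimension(skuDescription string) string {
	switch {
	case iopsRe.MatchString(skuDescription):
		return "iops"
	case throughputRe.MatchString(skuDescription):
		return "throughput"
	case capacityRe.MatchString(skuDescription):
		return "capacity"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(dimension("Hyperdisk Balanced IOPS in Iowa")) // prints "iops" (hypothetical description)
}
```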

Pokom removed their assignment Dec 17, 2024
Pokom added the help wanted (Extra attention is needed) label Dec 17, 2024