-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctness Testing 2.0 #12
Conversation
""" | ||
Parses a Prometheus duration string and returns the duration in milliseconds. | ||
""" | ||
pattern = r'(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Durations can be negative, for example, -5h
. This wouldn't match the entirety of a negative duration.
Prometheus shows a negative time duration as one possible value in their documentation:
1s # Equivalent to 1.
2m # Equivalent to 120.
1ms # Equivalent to 0.001.
-2h # Equivalent to -7200.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
➕
assert(write_response == 200) | ||
|
||
# ensures data is ingested into Timestream | ||
time.sleep(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this and the other sleeps, can network latency have any impact on whether data is finished ingesting into Timestream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically yes, but I think it should be fair given what we can expect from their docs ("write-to-read latency is in the sub-second range."). Cross-region requests might add additional latency, perhaps we can make the network latency sleep configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making it configurable is a good idea. I think using a named constant, like SLEEP_TIME
, would be good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me once negative durations are supported and the test sleep time can be set with a named constant.
@@ -0,0 +1,289 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add License header for new files
1. Bring up containers with `docker compose up -d` | ||
2. Run tests with `docker exec -it mockmetheus pytest` | ||
|
||
### Notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a cleanup section.
1. Run the following command to execute the correctness tests: | ||
`go test -v ./correctness` | ||
1. Bring up containers with `docker compose up -d` | ||
2. Run tests with `docker exec -it mockmetheus pytest` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we run individual tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep! pytest test_correctness.py::test_write_no_data
working_dir: /app | ||
command: ["sleep", "infinity"] | ||
environment: | ||
AWS_ACCESS_KEY_ID: "XXXXXXXXX" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values should be read from a env_file
that is added to gitignore. Storing credentials in tracked files is an easy way to commit secrets.
2. Execute the following command to save the docker image as a compressed file and update the `version` appropriately: | ||
`docker save timestream-prometheus-connector-docker | gzip > timestream-prometheus-connector-docker-image-<version>.tar.gz` | ||
1. Ensure your docker version is >= `20.10.0`, or you have `docker-compose` installed | ||
2. Update `docker-compose.yml` in this directory with your AWS credentials | ||
|
||
## How to execute tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a section how to run without docker?
It uses the same protobuf definitions as Prometheus to construct snappy-encoded | ||
payloads for both remote-read and remote-write requests and responses. | ||
""" | ||
class Mockmetheus: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation for using Python rather than Go. We are doing a lot of regex and custom parsing here for Prometheus requests rather than using Prometheus libraries.
For example we can use the github.com/prometheus/prometheus/promql/parser
library for parsing Prometheus query strings. Using Prometheus libraries would also have the added benefit of being able to upgrade the dependencies for functional changes / bugs, and not mixing Go and Python without warrant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are doing a lot of regex and custom parsing here for Prometheus requests rather than using Prometheus libraries.
The primary goal of Mockmetheus is to control remote/read operations that a Prometheus client would typically perform. While our regex may not exactly match what Prometheus uses, we can ensure its accuracy through our test setups and the controlled environment we maintain. Our current regex and parsing approach is straightforward and should suffice unless there are specific edge cases or additional pre/post-processing steps that the Prometheus parser handles. If you think there are test cases we might be missing (cases that their parser can handle) please let me know and I will incorporate those tests to verify compatibility with our existing setup.
not mixing Go and Python without warrant.
What's wrong with having Go and Python in one repo? The separation is clear - our connector simply exposes two REST API endpoints, and how we test against these endpoints should not matter as long as we also have the capability of verification using these same endpoints. IOW the choice of language for correctness testing does not impact the functionality of the connector itself. As long as we can effectively test and verify the endpoints, the underlying language used for tests should be flexible. However, if consistency is a priority or if there are specific reasons to standardize on Go, I'm open to rewriting the tests in Go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although we can use any language for this implementation, we need some reason for language interop. If there is no benefit or clear purpose here for using Python, I would prefer we implement in Go. There is no need to write the parsing logic when we have the native parsing libraries available. Here is a small Go program which can parse a Prometheus query string, rather than introducing the required logic ourselves:
package main
import (
"fmt"
"log"
"github.com/prometheus/prometheus/promql/parser"
)
func main() {
query := `scrape_duration_seconds{}[5d]`
node, err := parser.ParseExpr(query)
if err != nil {
log.Fatalf("Error parsing query: %v", err)
}
PrintQuery(node)
}
func PrintQuery(node parser.Node) {
switch n := node.(type) {
case *parser.MatrixSelector:
fmt.Println("Metric name:", n.VectorSelector)
fmt.Println("Range:", n.Range)
default:
fmt.Printf("Unknown query node type: %T\n", n)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to resolve best approach for using Go vs Python.
Removed our deprecated correctness testing suite in place of a new approach for correctness testing.
Mockmetheus
in./correctness
. It's designed to mock remote operations from Prometheus using snappy-encoded payloads.FRESH_TSDB
for running tests against a freshly new TSDB. The tests are designed to work either way, but defaults to false for convenience.Running tests
./correctness
and bring up containers withdocker compose up -d
docker exec -it mockmetheus pytest
Introduced test cases for easy reference:
Additional notes:
generate_test_run_id()
- helps achieve test case independence by using an extra label in our read/write requests.prom_pb2.py
) is generated using:protoc --python_out=. prom.proto
Mockmetheus (pic for reference):