@ReubenBond is working on improvements to testing for 9.1, channel your frustration 😄 (it seems #5878 is important for testing as well).
I feel this while debugging our own flaky tests. @afscrome See #7131 to see what our trace logs look like for resources.
Been using `DistributedApplicationTestingBuilder` a bit now, and when it works well it's amazing. But when it doesn't work, it's a painful and frustrating experience.

Some of these I have captured in issues already logged in this repo, but I thought it useful to put this in one place to inform #7057
Logging in Test Frameworks isn't obvious
The docs don't do a great job of explaining how to configure logging, without which you have essentially no details on what's going on - particularly in CI - (see dotnet/docs-aspire#2096). Doc improvements would help here, perhaps with some additions to the default templates.
Even with that, all resources get dumped into the same output, so if you're troubleshooting why FOO didn't start up, you have to filter through a lot of noisy logs to find out. (And if other services make use of FOO without using `WaitFor`, those may well be spamming the log with errors because FOO isn't fully available, adding even more noise to filter through.)

I also wonder if there is value in a helper method which writes resource logs to a directory, with each resource/replica getting its own file. This directory could then be published as a build artifact, providing a way to more easily view the logs of individual resources. Probably combined with some test framework infrastructure to give each test its own directory, and possibly only publishing for failed tests. Aspire doesn't need to handle all the test-framework-specific issues, but a `LogToDirectory` building block could be useful.

`DistributedApplicationTestingBuilder` could also do with some targeted log levels, or add its own log entries based on Aspire events, e.g. default `Aspire.Hosting.ApplicationModel.ResourceNotificationService` to `Debug` level to log state changes (or subscribe to events and publish them to a logger). Ditto for health checks, including pass / fail (along with details of why they failed). (This overlaps with the "Hard to see current state of test host" section below.)

Timeouts leave you hanging
There are many failure scenarios in Aspire which will hang forever. When you have the dashboard, this works great - you can see what hasn't started and use the tools in the dashboard to browse and filter logs to understand why the bits that failed did so. `DistributedApplicationTestingBuilder` doesn't include the dashboard (and if it did, the dashboard wouldn't be usable in CI).

The current pattern to avoid this is:
While this fixes the immediate problem of hanging forever, all it does is result in the error `System.TimeoutException : The operation has timed out.`, which doesn't give you any help in working out where to look for the root cause, leaving you hanging in a different way...

One thing that I think could help here is to have native timeout support on waiting methods that can provide targeted errors when things fail.
`app.StartAsync`

Aside: I think it's surprising that `StartAsync` hangs at all - I now know enough about Aspire internals that I understand why, but it does feel like the delays are due to a leaky abstraction rather than making sense. But as long as `StartAsync` can time out, we can at least clarify what failed.

`resourceNotificationService.WaitForResourceAsync` / `resourceNotificationService.WaitForCompletion`

(Possibly also including a short snippet of the last 10 lines of stdout from the service)

`resourceNotificationService.WaitForResourceHealthyAsync`:

There are definitely further improvements / tuning that could be done on these error messages - treat these as a starting point that is many times better than `The operation has timed out`.
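
To illustrate the idea of a timeout that says what it was waiting for, here is a minimal sketch using only `Task.WaitAsync` from the BCL (.NET 6+). `WithTimeout` and its `description` parameter are hypothetical names made up for this example and are not part of Aspire. A real implementation would presumably also report resource state or recent log lines.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutExtensions
{
    // Hypothetical helper (not an Aspire API): awaits `task`, and if it has not
    // finished within `timeout`, throws a TimeoutException naming the operation.
    public static async Task WithTimeout(this Task task, TimeSpan timeout, string description)
    {
        try
        {
            // Task.WaitAsync(TimeSpan) throws TimeoutException when the timeout elapses.
            await task.WaitAsync(timeout);
        }
        catch (TimeoutException ex) when (!task.IsCompleted)
        {
            // Only rewrap timeouts raised by the wait itself, not a TimeoutException
            // thrown by the underlying task.
            throw new TimeoutException(
                $"Timed out after {timeout} waiting for: {description}", ex);
        }
    }
}
```

In a test this might be used as `await app.StartAsync().WithTimeout(TimeSpan.FromSeconds(30), "app.StartAsync");` (the 30 second value is only an example), so that a timeout at least names the call that stalled.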
These could possibly be implemented directly on `ResourceNotificationService`, although it could make more sense to give `DistributedApplicationTestingBuilder` its own version of these, optimised for test scenarios.

Hard to see current state of test host
Again, another area that falls down due to the dashboard not being available.
There is a lot of good information in the `ResourceEvent`s published by `ResourceNotificationService`, but you have to know they are there, and know how to subscribe to the `WatchAsync` event to receive them. It would be really helpful if this data could be more easily / obviously accessible within the test host.

For local dev, this could be done by a `ResourceStates` property, benefiting from some of the DebuggerDisplay work done on those fields - see #5632 (comment)

I don't think #6795 is sufficient to fix this issue, as the data is still somewhat hidden behind `WatchAsync` - I'd expect to be able to get at this data through something I can navigate to in the locals / watch windows to see the current state (i.e. a synchronous method / property), without having to go through a full-blown subscription.

I've had several test failures which I was only able to solve by using a crude state dumper like the following. I know enough about Aspire to know I can get this data out of `ResourceNotificationService`, but this data feels too useful to not be more visible.