In summary, when this intermittent bug appears, DAS does not recognize that the instance server it told to stop really did stop. If the instance stops VERY fast, the REST connection throws a (useless, unimportant) Exception, which should be completely ignored.
The impact on the customer is annoyance: the command reports a failure even though it succeeded (a false negative). The major impact is on automated tests.
It is quite possible that the customer will bump into this issue.
It is not a serious bug because the command actually succeeded but was reported as a failure. If the same command is run again, it will report that the cluster or instance is already stopped.
OTOH, it depends on one's opinion of what's serious. It gives users a bad impression of the product if they see it.
The cost to fix it is minimal. In fact, it is already fixed and waiting to go in. If it doesn't go into 4.0, it'll go into 4.0.1.
The bug appears rarely in QuickLook tests. Whether it shows up depends on the hardware.
The fix is about as simple as can be: ignore an Exception instead of failing. The changed code is simpler than the existing code.
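For readers who want to picture the shape of such a fix, here is a minimal sketch. It is not the actual GlassFish change: `StopInstanceSketch`, `StopRequest`, `stopInstance` and the polling parameters are all hypothetical names, and the follow-up check that the instance is really down is an assumption about how success could be decided once the exception is ignored.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

// Illustrative sketch only; every name here is hypothetical, not GlassFish code.
class StopInstanceSketch {

    /** Stand-in for the REST call that DAS sends to the instance. */
    interface StopRequest {
        void send() throws IOException;
    }

    /**
     * Sends the stop request, then decides success by checking whether the
     * instance is still running, not by whether the connection survived.
     */
    static boolean stopInstance(StopRequest request, BooleanSupplier isRunning,
                                int maxPolls, long pollMillis) {
        try {
            request.send();
        } catch (IOException ignored) {
            // The instance may have exited before it could reply, which
            // drops the connection. That is not a failure, so ignore it.
        }
        // Assumed verification step: poll until the instance is down.
        for (int i = 0; i < maxPolls; i++) {
            if (!isRunning.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        return !isRunning.getAsBoolean();
    }

    public static void main(String[] args) {
        // Simulates the fast-stop case: the connection breaks, the instance is down.
        boolean stopped = stopInstance(
                () -> { throw new IOException("connection reset"); },
                () -> false, 3, 10L);
        System.out.println("stopped = " + stopped);
    }
}
```

The point of the sketch is that the result comes from the instance's state rather than from the transport, so a dropped connection during a very fast shutdown no longer turns a successful stop into a reported failure.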
There is little risk. Automated tests, including QuickLook, exercise this area all the time.
No doc impact.
QA need only run their usual standard tests: start/stop/restart.
This is a core lifecycle fix.
I highly recommend allowing this into 4.0.
It is about as risk-free as you can get.