[GLASSFISH-20992] Clustered deployment sometimes fails with error "Command disable failed on server instance xxx: remote failure: Application not registered Command deploy failed."
Created: 21/Feb/14  Updated: 17/Apr/14

Status: Open
Project: glassfish
Component/s: deployment
Affects Version/s: 4.0
Fix Version/s: None

Type: Bug
Priority: Critical
Reporter: electricsam
Assignee: Hong Zhang
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Environment: Linux, JDK 1.7.0_45


Sometimes when redeploying an application, the deploy fails with the following error:

Failure: Command disable failed on server instance xxxxxx: remote failure: Application not registered Command deploy failed.

Repeated attempts always fail. Once the error has appeared, it seems that nothing can be deployed.

Sometimes the error is:

Error occurred during deployment: Keys cannot be duplicate. Old value of this key property, nullwill be retained. 

Here is the command:

asadmin deploy --force=true --deploymentorder 300 --target yyyyy 

Sometimes by trial and error I can get the deploy to work. Here is what I've tried:

1. Stop the domain, remove the "generated" and "osgi-cache" directories, and restart the domain.
2. Stop the cluster and the domain, remove the "generated" and "osgi-cache" directories from the domain and the nodes, and restart the domain and cluster.
3. Undeploy the problem app, then redeploy it. However, the app is sometimes already listed as undeployed, in which case the undeploy fails.
4. Undeploy everything that can be undeployed. Manually delete from the applications folder any applications that could not be undeployed via asadmin, and manually edit domain.xml to remove references to those apps.
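
For reference, workarounds 1 and 2 above can be sketched roughly as the shell functions below. This is only a sketch under assumptions: the domain name, cluster name, and directory paths are placeholders that do not come from this report, and the DRY_RUN switch and function names are made up for illustration. The instance directories live on the node hosts, so the cleanup there would have to be run on each node separately.

```shell
#!/bin/sh
# Sketch of workarounds 1 and 2: stop, clear "generated" and "osgi-cache",
# restart. All names and paths passed in are placeholders (assumptions).

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

clear_caches() {
    # Remove the two cache directories under one domain or instance directory.
    dir="$1"
    [ -n "$dir" ] || return 1   # refuse to run against an empty path
    run rm -rf "$dir/generated" "$dir/osgi-cache"
}

reset_domain() {
    # Workaround 1: domain only. Args: domain name, domain directory.
    run asadmin stop-domain "$1"
    clear_caches "$2"
    run asadmin start-domain "$1"
}

reset_cluster() {
    # Workaround 2: cluster and domain. Args: domain name, domain directory,
    # cluster name. clear_caches must also be run on each node host against
    # the instance directories there (not shown).
    run asadmin stop-cluster "$3"
    run asadmin stop-domain "$1"
    clear_caches "$2"
    run asadmin start-domain "$1"
    run asadmin start-cluster "$3"
}
```

With DRY_RUN left at 1, calling reset_domain with a domain name and its directory only prints the three commands it would run, which makes it easy to check the paths before anything is deleted.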

Comment by Hong Zhang [ 11/Apr/14 ]

Can you provide the test application and the set of steps for us to reproduce from our side?

Comment by electricsam [ 17/Apr/14 ]

It's not always the same app.

Let me describe our setup. We have a complex web service made up of about 60 EARs. A nightly script deploys or redeploys any apps that have changed (anywhere between 5 and 60).
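
To make the discussion concrete, a nightly job of this kind could look roughly like the sketch below. This is not our actual script. The directory layout, the checksum state files, the function names, and the DRY_RUN switch are all assumptions for illustration. The deploy call uses only the --force and --target options from the command in the description, and failures are logged with a timestamp so they can later be correlated with other events.

```shell
#!/bin/sh
# Sketch of a nightly "deploy what changed" job. Assumed layout (hypothetical):
# EAR files in one directory, one <name>.cksum state file per deployed EAR.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the deploy commands, 0 = run them

changed() {
    # Succeeds if the archive's checksum differs from the recorded one,
    # or if nothing has been recorded yet. Args: archive, state file.
    sum="$(cksum < "$1")"
    [ ! -f "$2" ] || [ "$(cat "$2")" != "$sum" ]
}

deploy_changed() {
    # Args: EAR directory, state directory, deployment target.
    for ear in "$1"/*.ear; do
        [ -e "$ear" ] || continue          # no EARs matched the glob
        name="$(basename "$ear" .ear)"
        if changed "$ear" "$2/$name.cksum"; then
            echo "+ asadmin deploy --force=true --target $3 $ear"
            if [ "$DRY_RUN" = "0" ]; then
                if asadmin deploy --force=true --target "$3" "$ear"; then
                    # Record the checksum only after a successful deploy.
                    cksum < "$ear" > "$2/$name.cksum"
                else
                    echo "deploy failed: $name at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
                fi
            fi
        fi
    done
}
```

Recording the checksum only on success means a failed app is retried on the next run, and the timestamped failure lines give a starting point for spotting patterns.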

This error occurs roughly 3-4 times a month and requires manual intervention, and possibly an outage, to resolve.

We are running a cluster of 2 nodes. The DAS and each of the nodes are on their own separate boxes.

Other than all being EARs, the apps have nothing in common. I have not been able to find a scenario that consistently reproduces the issue.

I will continue to look for patterns, but so far none are obvious.
