glassfish / GLASSFISH-18124

Cannot start cluster - synchronization fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.1.2_b16
    • Fix Version/s: None
    • Component/s: distributed management
    • Labels:
      None
    • Environment:

      ogs-3.1.2-b16.zip on solaris

      Description

      I have a cluster with two instances: one on localhost, the other on a remote Solaris machine on an SSH node. This cluster was running in the past. I can no longer start it; it fails with the following error:

      cll2: Could not start instance cll2 on node localhost-domain1 (localhost).
      Command failed on node localhost-domain1 (localhost): Previous synchronization failed at Jan 4, 2012 5:28:26 PM
      Will perform full synchronization.
      Removing all cached state for instance cll2.
      Command start-local-instance failed.
      CLI802 Synchronization failed for directory config, caused by:
      remote failure: Command timed out. Unable to acquire a lock to access the domain. Another command acquired exclusive access to the domain on Wed, 04 Jan 2012 17:28:56 PST. Retry the command at a later time.
      To complete this operation run the following command locally on host localhost from the GlassFish install location /export/home/j2eetest/3.1.2/glassfish3:
      bin/asadmin start-local-instance --node localhost-domain1 --sync normal cll2

      clt1: Could not start instance clt1 on node tuppy (tuppy).
      Command failed on node tuppy (tuppy): Previous synchronization failed at Jan 4, 2012 6:04:59 PM
      Will perform full synchroni .... msg.seeServerLog
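The error for cll2 asks for the command to be retried later, once the other command has released the domain lock. A minimal sketch of such a retry follows, assuming that asadmin reports failure through a non-zero exit status. The helper name `run_with_retry`, the attempt count, and the delay are illustrative choices, not GlassFish settings; only the command itself is taken from the message.

```python
import subprocess
import time

# Command quoted from the error message; it is meant to be run from the
# GlassFish install location on the instance's own host.
START_CMD = ["bin/asadmin", "start-local-instance",
             "--node", "localhost-domain1", "--sync", "normal", "cll2"]


def run_with_retry(cmd, attempts=3, delay_s=30.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with status 0 or the attempts run out.

    attempts and delay_s are hypothetical defaults chosen for illustration.
    Returns the number of the attempt that succeeded, or 0 if none did.
    """
    for attempt in range(1, attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait so the command holding the domain lock can finish.
            sleep(delay_s)
    return 0

# Example call (not executed here): run_with_retry(START_CMD)
```

The `run` and `sleep` parameters exist only so the loop can be exercised without a GlassFish installation.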

      I will attach a screenshot and server.log. I am not certain what is causing this issue, but I was testing Download Logs in the Admin Console before hitting it. I'm going to leave the machine intact, in case anyone wants to poke around. This issue may not be easily reproducible.

        Activity

        lidiam added a comment -

        I had another cluster with two instances on a DCOM node. After removing this cluster, I could start the solaris/ssh cluster without problems. This seems to have been caused by issue http://java.net/jira/browse/GLASSFISH-18123 - trying to download log files for a server instance on a DCOM node.

        Anissa Lam added a comment -

        Console is showing whatever backend gives back.
        Transfer to backend for evaluation.

        lidiam added a comment -

        I tried several times, but cannot reproduce it. I see messages like:

        [#|2012-01-04T21:34:23.865-0800|INFO|glassfish3.1.2|javax.enterprise.system.tools.admin.com.sun.enterprise.v3.admin.cluster|_ThreadID=286;_ThreadName=Thread-2;|Warning: Synchronization with DAS failed, continuing startup...

        but cluster starts fine after that.
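To check whether this warning recurs, server.log can be scanned for it. The sketch below splits a `[#|...|` record on `|` into the six fields visible in the line quoted above. The field names and both helper functions are hypothetical, and the code assumes each record sits on a single line, which does not hold for multi-line messages.

```python
FIELDS = ("timestamp", "level", "product", "logger", "context", "message")


def parse_record(line):
    """Split one '[#|...' log record into a dict, or return None if it does not fit."""
    body = line.strip()
    if not body.startswith("[#|"):
        return None
    if body.endswith("|#]"):
        body = body[:-3]  # drop the record terminator when present
    # Limit the split so any '|' inside the message text is preserved.
    parts = body[3:].split("|", 5)
    if len(parts) < len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def sync_warnings(lines):
    """Return parsed records whose message reports a failed DAS synchronization."""
    records = (parse_record(line) for line in lines)
    return [r for r in records
            if r and "Synchronization with DAS failed" in r["message"]]
```

For example, `sync_warnings(open("server.log"))` would list the matching records with their timestamps, given a log file at that (assumed) path.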

        Byron Nevins added a comment -

        This bug seems to be related to domain-locking and synchronization.

        Tom Mueller added a comment -

        The submitter of this issue reported that this issue cannot be reproduced.
        Marking this as "Cannot Reproduce".

        If this issue shows up again, please reopen the issue.


          People

          • Assignee:
            Tom Mueller
            Reporter:
            lidiam
          • Votes:
            0
            Watchers:
            0
