glassfish / GLASSFISH-12706

Windows: start-instance on local instance hangs in ProcessManager

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1
    • Fix Version/s: 3.1_ms04
    • Component/s: admin
    • Labels:
      None
    • Environment:

      Operating System: Windows (generic)
      Platform: PC

    • Issuezilla Id:
      12,706

      Description

      How to reproduce:

      – DAS is running –

      1. create-cluster c1
      2. create-local-instance --cluster c1 i1
      3. create-local-instance --cluster c1 i2
      4. start-cluster c1

      After a VERY long time (roughly 10 minutes?):
      asadmin> start-cluster c1
      I/O Error: Read timed out
      Command start-cluster failed.
      asadmin>
      asadmin>
      ==============
      5. list-instances --verbose shows i1 running, i2 not running
      6. start-local-instance i2

      Step 6 succeeds with no problem, so something must be broken in start-cluster.

          Activity

          Joe Di Pol added a comment -

          The command being run is "asadmin". asadmin terminates and returns.
          ProcessManager does not return.

          Byron Nevins added a comment -

          This is not a bug. It is doing precisely what it was engineered to do. No
          callers have ever needed to start a long-running process. 100% of callers want
          the class to do what it does – reliably start a process, wait for it to finish,
          return stdout and stderr and integer exit value. And do all that without
          deadlocking.

          What you are doing in start-cluster is indirectly calling ProcessManager via
          LocalAdminCommand. In order to do what you want for this new feature both of
          those classes need to be changed.

          The changes are easy but time-consuming. There is no way I can do it for you
          before MS3.

          2 choices:
          1) You can do it yourself - right now - and I can advise you on how/what to do.
          Then you'll have start-cluster working for MS3

          2) I'll set it to MS4 and work on it later this week.

          I'll assume you want #2 and will change attributes of the issue to match...

          p1->P3
          3.1->3.1_ms4

          Byron Nevins added a comment -

          Accidentally changed attributes on this issue.
          Restoring them.

          Ignore most of my comments. It must be some sort of grand-child process issue
          on Windows only.
          Nobody ever uses that class for long-running processes so it's not so surprising.
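
          For illustration only, the run-and-wait behavior described in the comments above
          (start a process, wait for it, return stdout, stderr and the exit value) can be
          sketched as follows. This is not the actual ProcessManager code; the class name
          BoundedProcessRunner, its Result type and the timeout parameter are hypothetical.
          The sketch assumes the hang comes from waiting forever on the child or on its
          output pipes, which a longer-lived grand-child may keep open, so both the wait
          and the stream-reader joins are bounded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of GlassFish: runs a command with a bounded wait.
public class BoundedProcessRunner {

    /** Outcome of one run; exitCode is -1 when the process was killed on timeout. */
    public static final class Result {
        public final int exitCode;
        public final String stdout;
        public final String stderr;
        public final boolean timedOut;

        Result(int exitCode, String stdout, String stderr, boolean timedOut) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
            this.timedOut = timedOut;
        }
    }

    // Drains one stream on a daemon thread so a full pipe buffer cannot block the child.
    private static final class Drainer extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

        Drainer(InputStream in) {
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] chunk = new byte[4096];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    synchronized (buf) {
                        buf.write(chunk, 0, n);
                    }
                }
            } catch (IOException ignored) {
                // The stream was closed; keep whatever was read so far.
            }
        }

        String text() {
            synchronized (buf) {
                return buf.toString();
            }
        }
    }

    public static Result run(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).start();
        p.getOutputStream().close(); // the child gets no stdin from us

        Drainer out = new Drainer(p.getInputStream());
        Drainer err = new Drainer(p.getErrorStream());
        out.start();
        err.start();

        // Bounded wait on the direct child instead of waiting indefinitely.
        boolean finished = p.waitFor(timeoutMillis, TimeUnit.MILLISECONDS);
        if (!finished) {
            p.destroyForcibly();
        }

        // Bounded joins: if something else still holds the pipes open, the
        // drainers never see end-of-stream, so do not wait for them forever.
        out.join(1000);
        err.join(1000);

        int exitCode = finished ? p.exitValue() : -1;
        return new Result(exitCode, out.text(), err.text(), !finished);
    }
}
```

          A caller would pass the command line as a list plus a timeout in milliseconds,
          then check timedOut before trusting exitCode. Whether the real fix (issue 12725)
          took this approach is not stated in this issue.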

          Joe Di Pol added a comment -

          With the fix to issue 12725 this should be fixed. I just need to test it to verify.

          Joe Di Pol added a comment -

          I have verified that the fix to issue 12725 has fixed this hang.
          That fix was in r38964 and r38965


            People

            • Assignee:
              Joe Di Pol
              Reporter:
              Byron Nevins
            • Votes:
              0
              Watchers:
              0
