GLASSFISH-20535

Intermittent failure in admin-devtest-trunk-windows hudson job (start-domain/restart-domain/start-instance/start-local-instance)

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0_b89_RC5
    • Fix Version/s: 4.1
    • Component/s: admin
    • Labels:
      None
    • Environment:

      Windows 7

      Description

      The admin-devtest-trunk-windows hudson job is failing intermittently with failures in the following commands:

      start-domain
      restart-domain
      start-instance (via start-cluster)
      start-local-instance

      The hudson job page is here. Recent failing jobs have been: #3000, #3002, #3003, #3005.
      http://hudson-sca.us.oracle.com/view/GF%20Trunk/job/admin-devtests-trunk-windows/

        Activity

        Tom Mueller added a comment -

        This problem has shown up in the 4.0 branch job too:

        http://hudson-sca.us.oracle.com/job/admin-devtests-4.0-windows/2/

        Byron Nevins added a comment -

        This reminds me of the Heisenberg uncertainty principle. The moment I add diag code to the portion of the tests that fail - POOF. They don't fail anymore. And something seemingly completely different fails.

        I've marked it for 4.0.1

        I don't know what the real problem is. I'll just keep adding more and better diagnosis code directly into the tests and we'll find out in due course.

        Byron Nevins added a comment -

        The failed test on the 4.0 branch is restart-domain. This issue is about a start-cluster issue.

        I suspect that it's gremlins that are only seen in automated tests. I can't ever reproduce it.

        Tom Mueller added a comment -

        Byron, John Wells suggested adding a build step to the end of the hudson job that would look for any ASMain processes that are left after the run and then run jstack on them. This might help in determining if the failure to start the server is caused by a deadlock.
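
        A minimal sketch of what such a build step could look like, assuming the JDK tools `jps` and `jstack` are on the PATH of the hudson slave and that leftover server JVMs appear in `jps -l` with a main class ending in "ASMain". The function names are hypothetical and this is not an existing part of the job:

```python
import shutil
import subprocess

# Assumption: leftover GlassFish JVMs are listed by `jps -l` with a
# main class whose name ends in "ASMain".
MAIN_CLASS_SUFFIX = "ASMain"


def parse_jps_output(jps_output, suffix=MAIN_CLASS_SUFFIX):
    """Return the PIDs from `jps -l` output whose main class ends with suffix."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split(None, 1)  # "<pid> <main class or jar>"
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip().endswith(suffix):
            pids.append(int(parts[0]))
    return pids


def dump_leftover_servers():
    """Run jstack on each leftover ASMain JVM; return {pid: thread dump text}."""
    jps = subprocess.run(["jps", "-l"], capture_output=True, text=True, check=True)
    dumps = {}
    for pid in parse_jps_output(jps.stdout):
        # The process may exit between jps and jstack, so failures are tolerated.
        result = subprocess.run(["jstack", str(pid)], capture_output=True, text=True)
        dumps[pid] = result.stdout
    return dumps


def main():
    if shutil.which("jps") is None or shutil.which("jstack") is None:
        print("jps/jstack not found on PATH; skipping thread dumps")
        return
    for pid, dump in dump_leftover_servers().items():
        print(f"===== jstack for leftover ASMain process {pid} =====")
        print(dump)
        # jstack reports detected deadlocks with a line containing this phrase.
        if "Java-level deadlock" in dump:
            print(f"*** jstack reported a deadlock in process {pid} ***")


if __name__ == "__main__":
    main()
```

        Run as the last build step, this would leave a thread dump in the console log for every server JVM that outlived the test run, which is where a startup deadlock would be visible.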

        marina vatkina added a comment -

        I'm wondering if it is the same problem as I observed on the gf-transaction-cluster-devtest since November, until I changed the tests to dynamically determine the ports grabbed by the instances in a new cluster.
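
        If port collisions are a suspect here as well, a test could check whether a port it assumes is free is already taken before it starts a server. The following is only an illustrative sketch of that check, not the change made to gf-transaction-cluster-devtest; the helper name is hypothetical:

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

        For example, calling `port_is_listening("localhost", 4848)` before start-domain would show whether a leftover process from an earlier run still holds the default admin port.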


          People

          • Assignee:
            Byron Nevins
            Reporter:
            Tom Mueller
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated: