GLASSFISH-20799

Threads stuck in ClassCopierFactoryPipelineImpl

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.2.2
    • Fix Version/s: None
    • Component/s: orb
    • Labels:
      None
    • Environment:

      CentOS 5.8
      java version "1.7.0_25"

      Description

      In one of our environments GlassFish becomes unresponsive, with threads stuck in the following WAITING state:

      "http-thread-pool-8080(1)" - Thread t@238
         java.lang.Thread.State: WAITING
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for <5def7775> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
              at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
              at com.sun.corba.ee.impl.orbutil.copyobject.ClassCopierFactoryPipelineImpl.getClassCopier(ClassCopierFactoryPipelineImpl.java:276)

      No thread in the thread dump appears to hold the lock or to be in a runnable state; all threads are waiting to acquire the lock.
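One possible reason no owner shows up: thread dumps report only synchronizers that a thread owns exclusively. The read side of a ReentrantReadWriteLock is held in shared mode, so a thread holding only the read lock is not listed as an owner. The sketch below illustrates this with the standard ThreadMXBean API. The class and method names are made up for illustration, and whether a read-lock holder is actually involved in this issue is an assumption, not something the dump proves.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper, not part of GlassFish or the ORB.
final class OwnedSynchronizers {

    /**
     * Number of ownable synchronizers the JVM attributes to the current
     * thread, or -1 if the JVM does not support this kind of monitoring.
     */
    static int countForCurrentThread() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isSynchronizerUsageSupported()) {
            return -1;
        }
        long[] ids = { Thread.currentThread().getId() };
        // lockedMonitors = false, lockedSynchronizers = true
        ThreadInfo[] infos = threads.getThreadInfo(ids, false, true);
        return infos[0].getLockedSynchronizers().length;
    }

    /** Count taken while the current thread holds only the read lock. */
    static int countWhileHoldingReadLock(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        try {
            return countForCurrentThread();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Count taken while the current thread holds the write lock. */
    static int countWhileHoldingWriteLock(ReentrantReadWriteLock lock) {
        lock.writeLock().lock();
        try {
            return countForCurrentThread();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        System.out.println("owned while reading: " + countWhileHoldingReadLock(lock));
        System.out.println("owned while writing: " + countWhileHoldingWriteLock(lock));
    }
}
```

If the two counts differ, the missing owner in the dump is consistent with a thread that holds (or leaked) the read lock. It does not rule out other causes.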

      We tried to debug the issue by adding System.out statements to the ClassCopierFactoryPipelineImpl class. The output shows the following error scenario:

      [#|2013-09-06T11:02:49.473+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=243;_ThreadName=Thread-2;|###LOCKISSUE: Acquired read lock for thread 243|#]

      [#|2013-09-06T11:02:49.474+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=243;_ThreadName=Thread-2;|###LOCKISSUE: Released read lock for thread 243|#]

      [#|2013-09-06T11:02:49.474+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=243;_ThreadName=Thread-2;|###LOCKISSUE: Acquired read lock for thread 243|#]

      [#|2013-09-06T11:02:49.474+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=243;_ThreadName=Thread-2;|###LOCKISSUE: Trying to acquire write lock for thread: 243|#]

      [#|2013-09-06T11:02:49.877+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=211;_ThreadName=Thread-2;|###LOCKISSUE: Acquired read lock for thread 211|#]

      [#|2013-09-06T11:02:49.877+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=211;_ThreadName=Thread-2;|###LOCKISSUE: Released read lock for thread 211|#]

      [#|2013-09-06T11:02:49.878+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=211;_ThreadName=Thread-2;|###LOCKISSUE: Acquired read lock for thread 211|#]

      [#|2013-09-06T11:02:49.878+0200|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=211;_ThreadName=Thread-2;|###LOCKISSUE: Released read lock for thread 211|#]

      The write lock request for thread 243 never returns. After some time the read lock requests also get stuck and the server becomes unresponsive.
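One pattern that produces a write-lock request that never returns is a lock upgrade. ReentrantReadWriteLock does not allow a thread to take the write lock while it still holds the read lock. In the log above, thread 243's last read-lock acquisition has no matching release before the write-lock request, which would fit this pattern if the instrumentation logged every release. The report does not establish that this is what happens inside ClassCopierFactoryPipelineImpl. The sketch below only demonstrates the pattern in isolation, with made-up class and method names, and uses a timed tryLock so that it terminates.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not taken from the ORB sources.
final class LockUpgradeDemo {

    /** Requests the write lock while the calling thread still holds the read lock. */
    static boolean upgradeWithoutReleasing(ReentrantReadWriteLock lock, long timeoutMillis) {
        lock.readLock().lock();
        try {
            // A plain writeLock().lock() here would park forever, because the
            // writer waits for all readers, including the caller itself.
            return tryWrite(lock, timeoutMillis);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Releases the read lock first, then requests the write lock. */
    static boolean releaseThenWrite(ReentrantReadWriteLock lock, long timeoutMillis) {
        lock.readLock().lock();
        lock.readLock().unlock();
        return tryWrite(lock, timeoutMillis);
    }

    /** Timed write-lock attempt; unlocks again immediately on success. */
    private static boolean tryWrite(ReentrantReadWriteLock lock, long timeoutMillis) {
        try {
            boolean acquired = lock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        System.out.println("upgrade while holding read lock: " + upgradeWithoutReleasing(lock, 200));
        System.out.println("release read lock, then write:   " + releaseThenWrite(lock, 200));
    }
}
```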

        Activity

        tl_shark added a comment -

        Hello,

        We have the same problem. After searching for a cause for several days, we suspect it might be the second CPU that was installed a few weeks ago. Could you tell me whether you still have this problem and whether you have several CPUs?

        At the moment we are trying to work around the problem by pinning the GlassFish process to just one of the CPUs.

        boernd added a comment -

        Hi,

        At the moment we are only experiencing this issue on one of our test environments, which runs on KVM-based virtualization. One test server has several CPUs assigned. So maybe the issue is KVM related and not a GlassFish issue at all.

        We "fixed" the issue by making the getClassCopier method in the ClassCopierFactoryPipelineImpl class synchronized and removing the ReentrantReadWriteLock usage. This probably has a significant performance impact, but we use the environment only for functional testing, so it works for us.
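The general shape of such a workaround can be sketched as follows. This is not the actual ClassCopierFactoryPipelineImpl code: the class name, the cache field, and the Factory interface are hypothetical stand-ins. Only the idea of one synchronized getClassCopier method replacing the read/write lock comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a copier cache; C is whatever copier type is cached.
final class SynchronizedCopierCache<C> {

    /** Hypothetical factory that builds a copier for a class on a cache miss. */
    interface Factory<C> {
        C create(Class<?> cls);
    }

    private final Map<Class<?>, C> cache = new HashMap<Class<?>, C>();
    private final Factory<C> factory;

    SynchronizedCopierCache(Factory<C> factory) {
        this.factory = factory;
    }

    /**
     * Lookup and insert happen under one monitor, so there is no read-to-write
     * transition at all. The cost is that every caller, including pure
     * readers, is serialized on this object.
     */
    synchronized C getClassCopier(Class<?> cls) {
        C copier = cache.get(cls);
        if (copier == null) {
            copier = factory.create(cls);
            cache.put(cls, copier);
        }
        return copier;
    }

    public static void main(String[] args) {
        SynchronizedCopierCache<String> cache = new SynchronizedCopierCache<String>(
                new Factory<String>() {
                    @Override
                    public String create(Class<?> cls) {
                        return "copier:" + cls.getSimpleName();
                    }
                });
        System.out.println(cache.getClassCopier(String.class));
    }
}
```

Serializing all callers is what makes the performance concern plausible: under load, every request that needs a copier contends for the same monitor.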

        We are upgrading KVM next week; maybe that will solve the issue.


          People

          • Assignee:
            Harshad Vilekar
            Reporter:
            boernd
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated: