GLASSFISH-10722

"Out of Memory" error seen with webservices application

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: V3
    • Fix Version/s: V3
    • Component/s: sqe-test
    • Labels:
      None
    • Environment:

      Operating System: All
      Platform: Sun

    • Issuezilla Id:
      10722

      Description

      GlassFish V3 B71 Nightly dated 10/29/09
      JDK1.6.0_16 B01

      During longevity testing of the webservices application, an "Out of Memory"
      error was observed about 22 hours into the run; it appears to be coming from
      webservices. The instance crashed, so a jstack could not be taken, but the run
      produced an hs_err_pid.log file and a core file.

      This run includes the fix for Issue 9004, which was integrated into Metro B25,
      which in turn was integrated into the GF B71 nightly build dated 10/29/09.

      [#|2009-10-30T12:33:58.871-0700|SEVERE|glassfishv3.0|javax.enterprise.resource.webservices.jaxws.servlet.http|_ThreadID=27;_ThreadName=Thread-1;|caught throwable
      java.lang.OutOfMemoryError
      at sun.misc.Unsafe.allocateMemory(Native Method)
      at sun.nio.ch.NativeObject.<init>(NativeObject.java:60)
      at sun.nio.ch.AllocatedNativeObject.<init>(AllocatedNativeObject.java:34)
      at sun.nio.ch.DevPollArrayWrapper.updateRegistrations(DevPollArrayWrapper.java:190)
      at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:163)
      at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
      at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
      at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:88)
      at com.sun.grizzly.util.SelectorFactory.selectNowAndReturnSelector(SelectorFactory.java:158)
      at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:142)
      at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:74)
      at com.sun.grizzly.http.SocketChannelOutputBuffer.flushChannel(SocketChannelOutputBuffer.java:321)
      at com.sun.grizzly.http.SocketChannelOutputBuffer.flushBuffer(SocketChannelOutputBuffer.java:386)
      at com.sun.grizzly.http.SocketChannelOutputBuffer.flush(SocketChannelOutputBuffer.java:364)
      at com.sun.grizzly.http.ProcessorTask.action(ProcessorTask.java:1172)
      at com.sun.grizzly.tcp.Response.action(Response.java:260)
      at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:378)
      at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:351)
      at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:176)
      at com.ctc.wstx.io.UTF8Writer.flush(UTF8Writer.java:99)
      at com.ctc.wstx.sw.BufferingXmlWriter.flush(BufferingXmlWriter.java:214)
      at com.ctc.wstx.sw.BufferingXmlWriter.close(BufferingXmlWriter.java:194)
      at com.ctc.wstx.sw.BaseStreamWriter.finishDocument(BaseStreamWriter.java:1685)
      at com.ctc.wstx.sw.BaseStreamWriter.writeEndDocument(BaseStreamWriter.java:585)
      at com.sun.xml.ws.message.AbstractMessageImpl.writeTo(AbstractMessageImpl.java:146)
      at com.sun.xml.ws.encoding.StreamSOAPCodec.encode(StreamSOAPCodec.java:109)
      at com.sun.xml.ws.encoding.SOAPBindingCodec.encode(SOAPBindingCodec.java:284)
      at com.sun.xml.ws.transport.http.HttpAdapter.encodePacket(HttpAdapter.java:380)
      at com.sun.xml.ws.transport.http.HttpAdapter.access$100(HttpAdapter.java:92)
      at com.sun.xml.ws.transport.http.HttpAdapter$HttpToolkit.handle(HttpAdapter.java:525)
      at com.sun.xml.ws.transport.http.HttpAdapter.handle(HttpAdapter.java:285)
      at com.sun.xml.ws.transport.http.servlet.ServletAdapter.handle(ServletAdapter.java:143)
      at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doGet(WSServletDelegate.java:151)
      at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doPost(WSServletDelegate.java:182)
      at com.sun.xml.ws.transport.http.servlet.WSServlet.doPost(WSServlet.java:76)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:754)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
      at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1523)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:188)
      at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:641)
      at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:97)
      at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:85)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:185)
      at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:332)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:233)
      at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:161)
      at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:789)
      at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:697)
      at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:952)
      at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:166)
      at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:135)
      at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:102)
      at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:88)
      at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:76)
      at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:53)
      at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:57)
      at com.sun.grizzly.ContextTask.run(ContextTask.java:69)
      at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:330)
      at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:309)
      at java.lang.Thread.run(Thread.java:619)

      #]
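
      Note: the OutOfMemoryError above is thrown from sun.misc.Unsafe.allocateMemory,
      i.e. a native-memory allocation made by the NIO selector, not a Java heap
      allocation. The sketch below is illustrative only (the class name and sizes are
      invented, and it does not reproduce the Metro/Grizzly code path); it just shows
      how native-backed allocations can fail with OutOfMemoryError while the Java heap
      still has headroom.

      import java.nio.ByteBuffer;
      import java.util.ArrayList;
      import java.util.List;

      // Direct ByteBuffers are backed by memory allocated outside the Java heap,
      // so they can trigger java.lang.OutOfMemoryError even when the heap is fine.
      // Run with e.g. -XX:MaxDirectMemorySize=64m to see the failure quickly.
      public class DirectMemoryDemo {
          public static void main(String[] args) {
              List<ByteBuffer> retained = new ArrayList<ByteBuffer>();
              try {
                  while (true) {
                      // 1 MB of native memory per iteration, never released
                      retained.add(ByteBuffer.allocateDirect(1024 * 1024));
                  }
              } catch (OutOfMemoryError oom) {
                  long heapUsedMb = (Runtime.getRuntime().totalMemory()
                          - Runtime.getRuntime().freeMemory()) / (1024 * 1024);
                  System.err.println("Native (direct) memory exhausted after "
                          + retained.size() + " MB; heap used: " + heapUsedMb + " MB");
              }
          }
      }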
      Attachments

      1. hs_err_pid1016.log (28 kB, meenap)
      2. jmap.out.7 (341 kB, Joe Fialli)
      3. jstack.out.6 (123 kB, Joe Fialli)
      4. jstack.out.7 (130 kB, Joe Fialli)

        Activity

        Joe Fialli added a comment -

        Status update:

        Test is running with the following configuration changes.

        Running with 64-bit jvm.
        Increased -Xss128k to -Xss512k // increase stack size due to 64 bit jvm.
        // Was experiencing stack overflow in server.log
        // before change.
        // in galactica.props,
        //globalvmoptions=-d64 -Xms1500m -Xmx1500m -Xss512k -XX:+DisableExplicitGC
        //-Xloggc:/export/home/glassfishv3/glassfish/domains/galactica/logs/gc.log
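
        (Illustration only, not part of the test: a small probe like the one below, run
        once with -Xss128k and once with -Xss512k, shows how the stack size bounds the
        usable frame depth; frames are larger on a 64-bit jvm, which is why the stack
        size had to be raised. The class name is invented.)

        // Counts recursion depth until the stack overflows; compare the count
        // under -Xss128k vs. -Xss512k to see the effect of the stack-size setting.
        public class StackDepthProbe {
            private static int depth = 0;

            private static void recurse() {
                depth++;
                recurse();
            }

            public static void main(String[] args) {
                try {
                    recurse();
                } catch (StackOverflowError e) {
                    System.out.println("StackOverflowError after " + depth + " frames");
                }
            }
        }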

        *********

        Changed the number of http-threads from
        64 to 24. // only 20 client threads at a time.
        // 64 http-threads was too many, especially when
        // trying to run in a 32-bit jvm.
        // in galactica.props, changed from 64 to galactica.http_threadpoolsize=24
        // recommendation would be to set min-threads in the app server to 20 and have a
        // max number of threads that matches the machine power and resources. 64 threads
        // was too many for a 32-bit jvm. Not sure what the max should be, but
        // since the test configuration is currently set to 20 concurrent clients, 24
        // threads are sufficient. There are resource overheads per http-thread,
        // complicated by the high overhead in malloc (and the memory subsequently
        // needing to be freed by finalizers) for the PKCS11 impl used by this web
        // services app.

        The run has been going for 12 hours. Process size sampled over that time shows
        promising stability.

        root@bigapp-niagara-3 # more ps.out
        PID VSZ RSS
        6693 1742656 801088
        PID VSZ RSS
        6693 1767272 929896
        PID VSZ RSS
        6693 1781616 1398128
        PID VSZ RSS
        6693 1868776 1793512
        PID VSZ RSS
        6693 1868776 1793512

        Based on dtrace of native-memory malloc calls, the theory is that there was a
        lot of PKCS11 malloc'ed data waiting to be freed by finalizers. (A jmap heap
        dump of live data showed that the only thing keeping the accumulating PKCS11
        objects in memory was finalizers, which were previously not being run frequently
        enough.) GC in the 64-bit environment seems to be running them frequently and
        keeping memory usage steady.
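
        (A hypothetical model of that pattern, with invented names and sizes, not the
        actual PKCS11 provider code: the "native" memory is released only when the
        finalizer runs, so if finalization lags behind allocation, the Java heap stays
        small while the native footprint keeps growing.)

        import java.util.concurrent.atomic.AtomicLong;

        // Each wrapper object "owns" some native memory (modeled here as a byte
        // counter) that is released only in finalize(). Until the finalizer thread
        // catches up, the counter -- like the real process size -- keeps climbing.
        public class FinalizerFreedResource {
            private static final AtomicLong nativeBytesInUse = new AtomicLong();

            private final long size;

            public FinalizerFreedResource(long size) {
                this.size = size;
                nativeBytesInUse.addAndGet(size);      // stand-in for malloc
            }

            @Override
            protected void finalize() throws Throwable {
                try {
                    nativeBytesInUse.addAndGet(-size); // stand-in for free
                } finally {
                    super.finalize();
                }
            }

            public static void main(String[] args) throws InterruptedException {
                for (int i = 0; i < 100000; i++) {
                    new FinalizerFreedResource(4096);  // immediately unreachable
                }
                System.out.println("bytes awaiting finalization: " + nativeBytesInUse.get());
                System.gc();                           // encourage GC + finalization
                Thread.sleep(1000);                    // let the finalizer thread run
                System.out.println("bytes after GC/finalization: " + nativeBytesInUse.get());
            }
        }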

        At this point, the altered test configuration appears to address the issue
        observed before, where around the 20-hour mark the process size was 3.7 GB, too
        close to the maximum process size for a 32-bit jvm.

        The jstack logs and the ps.out log are in the domains/galactica directory where
        the test is running.

        kumarjayanti added a comment -

        Reassigning to Joe since he continues to investigate and add more insights into it.

        Joe Fialli added a comment -

        Status:

        Ran the test over 72 hours at 39 to 40% cpu, with virtual memory at 1.8 GB, on a
        64-bit jvm with the following new tunings in the test (monitored with jconsole):

        • http-threads changed from 64 to 24 (the test was only using 20 client threads)
        • stack size changed from 128k to 512k (StackOverflowError on the 64-bit jvm
          without this change)

        The conclusion is that PKCS11 was creating a lot of Java objects with native
        memory associated with them, and the finalizers that freed that native memory
        were not being run frequently enough on the 32-bit jvm, thereby exceeding the
        4 GB process limit. (The test was setting aside 1.5 GB for the java heap; the
        rest of the memory was native memory allocation.)

        Confirmed via a jmap dump of live objects that the PKCS11 native-allocated
        objects reachable from java objects were only reachable via outstanding
        finalizers. Finalizers ran frequently in the 64-bit jvm.

        It is unclear why the 64-bit jvm with the above config changes was completely
        stable at a 1.8 GB virtual process size. However, this is definitely a tuning
        issue and not a native memory leak, or it would have happened in the 64-bit case
        as well. There is a chance that just lowering http-threads from 64 to 24 was the
        key: a lot of NIO buffers are retained per thread, so lowering the number of
        threads in the http container might itself have reduced native memory
        consumption significantly.
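
        (A hypothetical sketch of that per-thread cost, not Grizzly's actual
        selector/buffer caching: if every worker thread retains one direct buffer, the
        native memory held by such a cache scales linearly with the thread-pool size,
        so 64 threads cost roughly 2.7x what 24 threads do for this one cache alone.)

        import java.nio.ByteBuffer;

        public class PerThreadBufferCost {
            // per-thread cache of a 1 MB direct (native-memory) buffer
            private static final ThreadLocal<ByteBuffer> CACHE =
                    new ThreadLocal<ByteBuffer>() {
                        @Override
                        protected ByteBuffer initialValue() {
                            return ByteBuffer.allocateDirect(1024 * 1024);
                        }
                    };

            public static void main(String[] args) throws InterruptedException {
                final int threads = args.length > 0 ? Integer.parseInt(args[0]) : 24;
                Thread[] workers = new Thread[threads];
                for (int i = 0; i < threads; i++) {
                    workers[i] = new Thread(new Runnable() {
                        public void run() {
                            CACHE.get();            // populate this thread's buffer
                            try {
                                Thread.sleep(5000); // keep thread (and buffer) alive
                            } catch (InterruptedException ignored) {
                            }
                        }
                    });
                    workers[i].start();
                }
                System.out.println(threads + " threads retain about " + threads
                        + " MB of direct (native) memory in this cache");
                for (Thread w : workers) {
                    w.join();
                }
            }
        }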

        Reassigning the issue to sqe-test so that Meena can confirm that this tuning of
        the test is sufficient to avoid the native-memory OOM.

        meenap added a comment -

        The run with the 64-bit JDK1.6.0_16 has passed 24x7 successfully on the B71
        Nightly. Will verify this issue after running the test again with the same
        config on the HCF promoted build.

        meenap added a comment -

        The run with the 64-bit JDK1.6.0_16 has passed 24x3 successfully on B73.
        Marking the issue as resolved/fixed.


          People

          • Assignee:
            meenap
            Reporter:
            meenap
          • Votes:
            0
            Watchers:
            8

            Dates

            • Created:
              Updated:
              Resolved: