GLASSFISH-15347

java.lang.OutOfMemoryError: Java heap space and other failures during Nile Book Store longevity run.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1_b33
    • Fix Version/s: 3.1_b34
    • Labels:
      None
    • Environment:

      Solaris Sparc jdk 6u22

      Description

      b33 started 7-day longevity runs using the NileBookStore bigapp against a 3-node cluster on 4 SPARC Solaris machines.

      Bug:
      After a few hours of running, the following exceptions were thrown in large numbers.
      Note: Transactions succeed intermittently.

      [#|2010-12-25T13:58:04.991-0800|SEVERE|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=16;_ThreadName=Thread-1;|java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOfRange(Arrays.java:3209)
      at java.lang.String.<init>(String.java:215)
      at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
      at java.nio.CharBuffer.toString(CharBuffer.java:1157)
      at com.sun.enterprise.web.PEAccessLogValve.log(PEAccessLogValve.java:652)
      at com.sun.enterprise.web.PEAccessLogValve.run(PEAccessLogValve.java:1122)
      at java.lang.Thread.run(Thread.java:662)

      #]

      [#|2010-12-24T22:35:15.600-0800|WARNING|oracle-glassfish3.1|org.shoal.ha.cache.command.load_request|_ThreadID=16;_ThreadName=Thread-1;|LoadRequestCommand timed out while waiting for result java.util.concurrent.TimeoutException|#]

      [#|2010-12-24T22:35:15.900-0800|WARNING|oracle-glassfish3.1|org.shoal.ha.cache.command.load_request|_ThreadID=16;_ThreadName=Thread-1;|LoadRequestCommand timed out while waiting for result java.util.concurrent.TimeoutException|#]

      [#|2010-12-24T22:35:17.000-0800|WARNING|oracle-glassfish3.1|org.shoal.ha.cache.command.load_request|_ThreadID=16;_ThreadName=Thread-1;|LoadRequestCommand timed out while waiting for result java.util.concurrent.TimeoutException|#]

      [#|2010-12-24T22:35:18.530-0800|WARNING|oracle-glassfish3.1|org.shoal.ha.cache.command.load_request|_ThreadID=16;_ThreadName=Thread-1;|LoadRequestCommand timed out while waiting for result java.util.concurrent.TimeoutException|#]

      org.shoal.ha.cache.command.save|_ThreadID=16;_ThreadName=Thread-1;|Aborting command transmission for ReplicationFramePayloadCommand:1 because beforeTransmit returned false|#]
      [#|2010-12-25T13:41:39.169-0800|WARNING|oracle-glassfish3.1|ShoalLogger|_ThreadID=16;_ThreadName=Thread-1;|Error during groupHandle.sendMessage(null, /NileBookStore; size=287193|#]

      java.net.SocketException: Invalid argument
      at sun.nio.ch.Net.setIntOption0(Native Method)
      at sun.nio.ch.Net.setIntOption(Net.java:157)
      at sun.nio.ch.SocketChannelImpl$1.setInt(SocketChannelImpl.java:406)
      at sun.nio.ch.SocketOptsImpl.setBoolean(SocketOptsImpl.java:38)
      at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(SocketOptsImpl.java:284)
      at sun.nio.ch.OptionAdaptor.setTcpNoDelay(OptionAdaptor.java:48)
      at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:268)
      at com.sun.grizzly.http.SelectorThread.setSocketOptions(SelectorThread.java:1490)
      at com.sun.grizzly.http.SelectorThreadHandler.configureChannel(SelectorThreadHandler.java:91)
      at com.sun.grizzly.http.SelectorThreadHandler.onAcceptInterest(SelectorThreadHandler.java:102)
      at com.sun.grizzly.SelectorHandlerRunner.handleSelectedKey(SelectorHandlerRunner.java:300)
      at com.sun.grizzly.SelectorHandlerRunner.handleSelectedKeys(SelectorHandlerRunner.java:263)
      at com.sun.grizzly.SelectorHandlerRunner.doSelect(SelectorHandlerRunner.java:200)
      at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:132)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)

      #]

      All logs:
      http://aras2.us.oracle.com:8080/logs/gf31/gms/set_12_25_10_t_14_13_41/scenario_0001_Sat_Dec_25_14_14_09_PST_2010/

      physical location:
      /net/asqe-logs.us.oracle.com/export1/gms/gf31/gms/set_12_25_10_t_14_13_41/scenario_0001_Sat_Dec_25_14_14_09_PST_2010/
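
To get a rough sense of how often each failure above occurs in a server log, the records can be tallied by level and logger. The following is a minimal sketch, assuming the field layout visible in the quoted records ([#|timestamp|level|product|logger|thread info|message|#]). The class and method names are hypothetical and not part of GlassFish. Records whose opening "[#|" is missing, such as the truncated org.shoal.ha.cache.command.save line, are skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: tallies server log records by level and logger. */
final class LogRecordTally {

    // Layout assumed from the quoted records:
    // [#|timestamp|level|product|logger|thread info|message|#]
    // The final '|' is optional because some pasted records end with a bare "#]".
    private static final Pattern RECORD = Pattern.compile(
            "\\[#\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|(.*?)\\|?\\s*#\\]",
            Pattern.DOTALL);

    private LogRecordTally() {}

    /** Returns a count per "LEVEL logger" key, in order of first appearance. */
    static Map<String, Integer> countByLevelAndLogger(String logText) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = RECORD.matcher(logText);
        while (m.find()) {
            String key = m.group(2) + " " + m.group(4);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    /** Counts records whose message (including any stack trace) contains the given text. */
    static int countMessagesContaining(String logText, String needle) {
        int hits = 0;
        Matcher m = RECORD.matcher(logText);
        while (m.find()) {
            if (m.group(6).contains(needle)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample =
                "[#|2010-12-24T22:35:15.600-0800|WARNING|oracle-glassfish3.1|"
                + "org.shoal.ha.cache.command.load_request|_ThreadID=16;_ThreadName=Thread-1;|"
                + "LoadRequestCommand timed out while waiting for result "
                + "java.util.concurrent.TimeoutException|#]\n";
        countByLevelAndLogger(sample)
                .forEach((key, count) -> System.out.println(count + "  " + key));
    }
}
```

Calling countMessagesContaining(logText, "OutOfMemoryError") on the full server.log contents would then give a count of heap-exhaustion records to compare against the TimeoutException warnings.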

          Activity

          Mahesh Kannan added a comment -

          Closing this based on Shreedhar's comment

          shreedhar_ganapathy added a comment -

          Based on feedback from Rajiv and Sony, the issue seems to be the same as the one reported in 15231, which was seen in b33 and fixed in b34.

          Also, the heap size for the Nile app should be -Xmx1024m, based on input from Sony from runs in prior releases. The domain.xml shows the run was set at 512m.

          Please run with b34 and if you see this issue, please reopen it.
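
The mismatch between the configured 512m and the recommended -Xmx1024m can be checked mechanically before a long run starts. The following is a minimal sketch, assuming the option strings have already been read out of domain.xml by some other means. The class and method names are hypothetical, and only the k, m and g suffixes are handled.

```java
import java.util.Locale;

/** Hypothetical helper: compares a configured -Xmx option against a recommended one. */
final class HeapOptionCheck {

    private HeapOptionCheck() {}

    /** Parses options such as "-Xmx512m", "-Xmx1g" or "-Xmx1048576" into a byte count. */
    static long parseXmxBytes(String option) {
        if (option == null || !option.startsWith("-Xmx") || option.length() <= 4) {
            throw new IllegalArgumentException("Not an -Xmx option: " + option);
        }
        String value = option.substring(4).toLowerCase(Locale.ROOT);
        char unit = value.charAt(value.length() - 1);
        String digits = value.substring(0, value.length() - 1);
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1L << 10; break;
            case 'm': multiplier = 1L << 20; break;
            case 'g': multiplier = 1L << 30; break;
            default:
                if (!Character.isDigit(unit)) {
                    throw new IllegalArgumentException("Unknown size suffix in: " + option);
                }
                multiplier = 1L;   // no suffix: the value is already in bytes
                digits = value;
        }
        return Long.parseLong(digits) * multiplier;
    }

    /** True when the configured maximum heap is at least the recommended maximum heap. */
    static boolean meetsRecommendation(String configured, String recommended) {
        return parseXmxBytes(configured) >= parseXmxBytes(recommended);
    }

    public static void main(String[] args) {
        // Values from the comment above: the run used 512m, the recommendation is 1024m.
        System.out.println(meetsRecommendation("-Xmx512m", "-Xmx1024m")); // prints false
    }
}
```

A check like this only confirms the configured value. Whether 1024m is sufficient for the full 7-day run still has to be established by rerunning on b34.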

          zorro added a comment -

          7-day run against nightly build 33 stopped after 2 days with failures stated above.
          Stopping cluster failed with:
          asadmin stop-cluster clusterz1
          No response from Domain Admin Server after 600 seconds.
          The command is either taking too long to complete or the server has failed.
          Please see the server log files for command status.
          Command stop-cluster failed.

          all logs:
          http://aras2.us.oracle.com:8080/logs/gf31/gms/set_12_27_10_t_12_14_01/scenario_0001_Mon_Dec_27_12_31_14_PST_2010/

          physical location:
          /net/asqe-logs.us.oracle.com/export1/gms/gf31/gms/set_12_27_10_t_12_14_01/scenario_0001_Mon_Dec_27_12_31_14_PST_2010/


            People

            • Assignee:
              Mahesh Kannan
              Reporter:
              zorro
            • Votes:
              0
              Watchers:
              1
