MQ-116

MQ transaction reaper becomes a bottleneck under heavy XA transaction stress load

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.4u2
    • Fix Version/s: 4.5.2, 5.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      Unix, MQ44u2p0x, Single standalone MQ with JDBC persistence

      Description

      Problem Description
      -------------------
      Using a standalone, non-clustered MQ broker with JDBC persistence.
      When XA/transactional message consumers and producers
      are run under load with the default imq.txn.reapLimit,
      and the broker is started with the following MQ debug options

      -Dimq.debug.com.sun.messaging.jmq.util.timer.WakeupableTimer=true \
      -Dimq.txn.reapInterval=15 \
      -Dimq.cluster.debug.txn=true

      it is seen that

      1) The reaper thread is running continuously
      2) But the rate of transaction creation is simply too high, so
      the row count of the MQTXN41<brokername> table and the
      TransactionState instance count in the JDK heap histogram
      both keep increasing over time

      This means that the reaper prevents long runs under high load:
      transactions arrive faster than the reaper removes them, so
      according to the M/M/1 queue model the backlog is not stable.
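
      For reference, the standard M/M/1 stability condition can be written
      as below. Here \lambda (rate at which finished transactions are handed
      to the reaper) and \mu (rate at which the reaper removes them) are
      symbols introduced for illustration; neither was measured in this
      report, and treating the reaper as a single M/M/1 server is an
      approximation.

      ```latex
      \rho = \frac{\lambda}{\mu}, \qquad
      \text{stable} \iff \rho < 1, \qquad
      \mathbb{E}[N] = \frac{\rho}{1-\rho} \quad (\rho < 1)
      ```

      When \lambda > \mu there is no steady state, and the number of
      unreaped entries grows on average at about \lambda - \mu per unit
      time for as long as the load lasts.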

      Observation:
      --------------
      The table grows from the reapLimit of 500 --> 30000 --> 48401 rows
      (in a matter of minutes) using a multithreaded XA consumer
      with 50 concurrent threads.

      Sample jmap -histo:live output (note that the MQTXN41 row count
      is also around 51000).

       num  #instances    #bytes  class name
      ----------------------------------------------
         1:     269771  22938288  [Ljava.util.HashMap$Entry;
         2:     138879  14635224  [C
         3:     108637  10341744  [B
         4:     163625   7854000  java.util.LinkedHashMap
         5:       5815   6918240  [I
         6:     182597   5843104  java.util.LinkedHashMap$Entry
         7:      52442   5034432  com.sun.messaging.jmq.jmsserver.data.TransactionState
         8:     106133   4245320  java.util.HashMap
         9:      71401   4223616  [Ljava.lang.Object;
        10:     161358   3872592  java.util.HashMap$Entry
        11:     139260   3342240  java.lang.String
        12:      27900   3236560  <constMethodKlass>
        13:      52433   2936248  com.sun.messaging.jmq.jmsserver.data.TransactionInformation
        14:     108554   2605296  com.sun.messaging.jmq.jmsserver.data.TransactionUID

      Note 1: If the broker crashes, it also takes some time after
      restart to begin processing these transaction entries, and it may
      not be able to clear them if the load continues (compounding the issue) ***

      Note 2: On restart with no load, the MQTXN41 table is not cleared
      very quickly either: it is seen that roughly 1000 rows are removed
      per second.

      Testcase
      ----------
      0. Set up a single-broker, JDBC-backed MQ
      1. Refer to 6961586 for the MQ XA stress test used as the testcase
      2. Query the MQTXN41<brokername> table occasionally to see
      the growth
      3. Run jmap -histo:live and observe the growth.

        Activity

        gfuser9999 added a comment -

        Generally, it would probably be best to have the JMS service thread that is done with a transaction (especially one in the
        committed state) do the cleanup itself. If imqcmd list txn and the reapLimit feature are to be retained, I suppose the logic
        is that most of the txnReaper.addXXXTransaction calls should be synchronous,
        i.e.:

        • It still adds the TID to the translist, but
        • It should probably also do a check like
          if translist.SIZE > REAPLIMIT
          remove top of translist (not yet serviced by the reaper)
          i.e. the same logic as in the reaper, removing 1 entry
          (the earliest txn)

        That way one can have both the background reaper (still needed,
        since if there is no load, who is going to clean up old txns?)
        and some synchronous cleanup by the working JMS thread
        to avoid the slow bottleneck.

        Obviously the devil is in the details
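
        The synchronous check proposed above could look roughly like the sketch below. This is only an illustration of the idea,
        not the broker's actual reaper code: the class BoundedTxnList, its methods, and the remover callback are hypothetical
        names, and the callback merely stands in for whatever the broker does to remove a transaction from the store. The
        background reaper is not shown.

        ```java
        import java.util.ArrayDeque;
        import java.util.Deque;
        import java.util.function.Consumer;

        /** Hypothetical sketch of a size-bounded list of committed transaction IDs. */
        final class BoundedTxnList {
            private final Deque<String> committed = new ArrayDeque<>();
            private final int reapLimit;
            // Assumption: stands in for the store-level removal of one transaction.
            private final Consumer<String> remover;

            BoundedTxnList(int reapLimit, Consumer<String> remover) {
                this.reapLimit = reapLimit;
                this.remover = remover;
            }

            /** Called by the service thread that has just committed tid. */
            void addCommitted(String tid) {
                String evicted = null;
                synchronized (this) {
                    committed.addLast(tid);
                    // Over the limit: take the earliest entry off the list right away
                    // instead of leaving it for the background reaper.
                    if (committed.size() > reapLimit) {
                        evicted = committed.pollFirst();
                    }
                }
                // Do the (possibly slow) removal outside the lock.
                if (evicted != null) {
                    remover.accept(evicted);
                }
            }

            synchronized int size() {
                return committed.size();
            }
        }
        ```

        With this shape each commit removes at most one old entry, so the list never holds more than reapLimit entries once
        addCommitted returns, at the cost of one extra removal on the committing thread.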

        amyk added a comment -

        The bottleneck in the reaping thread has been fixed in 4.6 and 4.5.2. Also filed MQ-119 to improve the efficiency of the JDBC store removeTransaction:
        http://java.net/jira/browse/MQ-119

        saradak added a comment -

        Verified the fix with MQ4.5.2-b2d.

        Steps to verify the fix:

        1. Start the broker with JDBC store.
        2. Send transacted messages.
        3. Run the imqcmd dump trans -debug command multiple times and check the committedCount in the dumpOutput file.
        4. The value should not vary by large amounts between runs.

        -Sarada


          People

          • Assignee:
            amyk
            Reporter:
            gfuser9999