[mq-dev] Re: Persistent messages exceeding 1MB are lost

  • From: Amy Kang <amy.kang@...>
  • To: "Okubo, Takuji" <takujio@...>
  • Cc: dev@...
  • Subject: [mq-dev] Re: Persistent messages exceeding 1MB are lost
  • Date: Wed, 17 Jul 2013 22:29:12 -0700

Hi Takuji,

Thanks for the additional information. I can now reproduce the problem. The tentative fix you provided looks good. The 1M size is the message size boundary that determines whether a message should use its own file. As you have pointed out, the problem is caused by the 'scanned' flag not being set to true when hasMoreElements() returns false with obj == null on completion of the first directory scan. The actual 'loss' of the messages happens at the step of producing 100 messages: that is when the unexpected 2nd directory 'scan' occurs, due to the incorrect 'scanned' flag. The 2nd directory scan causes the same 'FREE' files to be added to the FilePool a 2nd time, therefore the 100 produced messages are actually stored into 50 files, not 100. The problem can also be reproduced if the count 50 is 1 and the count 100 is 2. I've filed the following JIRA for this issue:

https://java.net/jira/browse/MQ-322
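
To make the arithmetic above concrete, here is a minimal sketch of how a duplicated pool entry could halve the number of stored messages. It is not the Open MQ code: the class DoubleScanSketch, the method storedAfterRestart and the file names are hypothetical, and the sketch simply assumes that each produced message takes the next entry from the pool and overwrites whatever that file held.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a file pool; NOT the actual Open MQ FilePool.
class DoubleScanSketch {

    /**
     * Returns how many distinct message files exist after producing
     * `messages` messages, when the directory holding `freeFiles` FREE
     * files has been scanned `scans` times into the pool.
     */
    static int storedAfterRestart(int freeFiles, int messages, int scans) {
        Deque<String> pool = new ArrayDeque<>();
        // Each scan adds every FREE file to the pool; a second scan
        // therefore adds the same file names a second time.
        for (int s = 0; s < scans; s++) {
            for (int f = 0; f < freeFiles; f++) {
                pool.add("free-" + f);
            }
        }

        Map<String, Integer> disk = new HashMap<>(); // file name -> message id
        int created = 0;
        for (int m = 0; m < messages; m++) {
            // Assumption: reuse a pooled file if one is available,
            // otherwise create a brand-new file.
            String file = pool.isEmpty() ? "new-" + (created++) : pool.poll();
            disk.put(file, m); // a reused name overwrites the earlier message
        }
        return disk.size(); // one surviving message per distinct file
    }

    public static void main(String[] args) {
        System.out.println(storedAfterRestart(50, 100, 1)); // 100 (single scan)
        System.out.println(storedAfterRestart(50, 100, 2)); // 50  (double scan)
        System.out.println(storedAfterRestart(1, 2, 2));    // 1   (counts 1 and 2)
    }
}
```

Under these assumptions the single-scan case keeps all 100 messages, while the double-scan case writes two messages into each of the 50 pooled files, which matches the 100 versus 50 counts reported in the thread.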


>7.Create a destination queue. (The bug is not reproduced, if you skip this step. )
>    imqcmd create dst -t q -n JMSQueue

That's because if the destination is auto-created, the destination will be reaped at step 10, so there are no pre-existing 'FREE' message files for the destination, which is a pre-condition for the bug to occur.

Thanks,
amy

On 07/17/2013 01:07 AM, Okubo, Takuji wrote:

Hi Amy,

We've reproduced the problem using this test program.

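    MessageProducer producer = s.createProducer(dst);
    ObjectMessage msg = s.createObjectMessage();
    // 1024 x 1024 byte payload; the resulting message exceeds 1MB
    byte[] b = new byte[1024*1024];
    msg.setObject(b);
    // send `count` persistent messages, priority 4, time-to-live 3000000 ms
    for (int i = 0; i < count; i++) {
        producer.send(msg, DeliveryMode.PERSISTENT, 4, 3000000L);
    }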

Here are the more detailed steps that I followed.

1. Start GlassFish.

    asadmin start-domain

2. Set the JMS Service Type to REMOTE in the GlassFish web admin console.

3. Create a Connection Factory and a Destination Resource in the GlassFish web admin console.

4. Restart GlassFish.

    asadmin stop-domain
    asadmin start-domain

5. Clear the MQ environment.

    imqbrokerd --remove instance

6. Start the broker with default settings.

    imqbrokerd

7. Create a destination queue. (The bug is not reproduced if you skip this step.)

    imqcmd create dst -t q -n JMSQueue

8. Send 50 messages in persistent mode. Each message exceeds 1MB (1024 x 1024 bytes or more).

9. Receive all messages which were sent in step 8.

10. Restart the broker.

    imqcmd shutdown bkr
    imqbrokerd

11. Send 100 messages in persistent mode. Each message exceeds 1MB (1024 x 1024 bytes or more).

12. Check messages. The result of imqcmd list dst is 100.

---------------------------------------------------------------------------------------------
Name      Type   State    Producers        Consumers        Msgs
                          Total  Wildcard  Total  Wildcard  Count  Remote  UnAck  Avg Size
---------------------------------------------------------------------------------------------
JMSQueue  Queue  RUNNING  0      -         0      -         100    0       0      1048731.0

13. Restart the broker.

    imqcmd shutdown bkr
    imqbrokerd

14. Check messages. The result of imqcmd list dst is 50.

---------------------------------------------------------------------------------------------
Name      Type   State    Producers        Consumers        Msgs
                          Total  Wildcard  Total  Wildcard  Count  Remote  UnAck  Avg Size
---------------------------------------------------------------------------------------------
JMSQueue  Queue  RUNNING  0      -         0      -         50     0       0      1048731.0

> How did you come to that conclusion ?

We compared the results of step 12 and step 14: 100 messages are listed before the broker restart and only 50 after it.

Regards,
Takuji

*From:*Amy Kang [mailto:amy.kang@...]
*Sent:* Wednesday, July 17, 2013 2:05 AM
*To:* Okubo, Takuji
*Cc:* dev@...
*Subject:* Re: [mq-dev] Persistent messages exceeding 1MB are lost

Hi Takuji,

Thanks for reporting and looking into this.

First I'd like to reproduce the problem. I've tried several message sizes over 1M (e.g. 1024*1024*2 bytes, 1024*1024+1024 bytes), sending them (persistent) to a queue, and couldn't reproduce it: after following your steps #1-#6, 'imqcmd list dst' shows 100 messages and a receiver can receive 100 messages. Therefore I need more information.

>only 50 messages have been stored and the rest of 50 messages are lost.

How did you come to that conclusion ?

Thanks,
amy

On 07/15/2013 11:17 PM, Okubo, Takuji wrote:

    Hi there,

    We found that some persistent messages are lost in Open MQ 4.4,
    and we have confirmed that the same problem is reproduced in Open
    MQ 5.0. Our developers modified the source code to fix the
    problem, and we would appreciate it if you could check whether
    our fix is correct. We have also found a workaround for this
    problem, and we would like to check whether it is correct.

    I understand a bug report should be sent to the JIRA system.
    However, I am posting to the mailing list because I noticed that
    it can take a very long time to get a response in JIRA. If I need
    to open a case in JIRA, please let me know. Thank you very much
    for your understanding.

    *Problem:*

    Persistent messages exceeding 1MB (1024 x 1024 byte) are lost.

    *Steps to reproduce:*

    This is a typical example to reproduce the problem but this can be
    reproduced in several ways.

    1. Start message broker with default setting.

    2. Send 50 messages in persistent mode. Each message exceeds 1MB
    (1024 x 1024 byte).

    3. Receive all messages which have been sent in step 2.

    4. Restart message broker.

    5. Send 100 messages in persistent mode. Each message exceeds 1MB
    (1024 x 1024 byte).

    6. Restart message broker.

    When the above steps are performed,

    We expect 100 messages are stored after restarting, as 100
    messages have been sent in step 5.

    However, only 50 messages have been stored and the rest of 50
    messages are lost.

    We found more cases in which the problem is reproduced.

    - Execute imqcmd purge dst in step 2.

    - Messages expire in step 2 and move to the DMQ.

    *Cause:*

    We found a problem in the loading of persistent messages: the
    scan that fills the file pool is executed twice.

    *Workaround:*

    We found a workaround by specifying
    imq.persist.file.destination.message.filepool.limit=0 to turn off
    the file pooling function. Our understanding is that setting this
    parameter only disables file pool and all other functionalities
    are not affected. Is it correct?

    *Tentative Fix:*

    We have modified the source code for Open MQ 4.4 as shown in the
    attached file and confirmed that the problem is fixed. We'd
    appreciate it if you could check whether our fix is correct.

    In the RandomAccessStore file, the hasMoreElements method of the
    internal class FileEnumeration determines the value of scanned
    from itr and index at the beginning of the method.

    The value of index is increased inside the hasMoreElements
    method. We think your intention is for scanned to be set to true
    at the end, when itr is null, by executing hasMoreElements and
    nextElements repeatedly.

    However, as the hasMoreElements method is executed again only
    when it has returned true, scanned remains false once
    hasMoreElements has returned false.

    As a result of our investigation, we have added lines at the end
    of the method which set scanned to true based on itr and index.
    With this change, scanned is also set to true when false is
    returned.
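
The flag behaviour described above can be sketched in isolation as follows. This is only an illustration under assumptions, not the RandomAccessStore source or the attached patch: the class ScanFlagSketch, the applyFix switch, the pooled counter and the "FREE" name prefix are hypothetical, and the sketch tracks only an index rather than both itr and index.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the enumeration discussed above;
// NOT the Open MQ RandomAccessStore.FileEnumeration source.
class ScanFlagSketch {

    static class FileEnumeration implements Enumeration<String> {
        private final String[] entries;  // assumed directory listing
        private final boolean applyFix;  // true = also set the flag on "false"
        private int index = 0;
        private String next = null;
        boolean scanned = false;
        int pooled = 0;                  // FREE files skipped (pooled) so far

        FileEnumeration(String[] entries, boolean applyFix) {
            this.entries = entries;
            this.applyFix = applyFix;
        }

        @Override
        public boolean hasMoreElements() {
            if (next != null) {
                return true;
            }
            // The flag is evaluated at the beginning, before index advances.
            scanned = index >= entries.length;
            while (index < entries.length) {
                String entry = entries[index++];
                if (entry.startsWith("FREE")) {
                    pooled++;            // a free file yields no element
                    continue;
                }
                next = entry;
                return true;
            }
            // Sketch of the tentative fix: re-evaluate the flag before
            // returning false, since callers will not call again.
            if (applyFix) {
                scanned = index >= entries.length;
            }
            return false;
        }

        @Override
        public String nextElement() {
            if (!hasMoreElements()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }
    }

    /** Drains the enumeration the usual way and reports the final flag. */
    static boolean scannedAfterDrain(String[] entries, boolean applyFix) {
        FileEnumeration e = new FileEnumeration(entries, applyFix);
        while (e.hasMoreElements()) {
            e.nextElement();
        }
        return e.scanned;
    }

    public static void main(String[] args) {
        String[] dir = {"FREE-0", "FREE-1", "FREE-2"};
        System.out.println(scannedAfterDrain(dir, false)); // false
        System.out.println(scannedAfterDrain(dir, true));  // true
    }
}
```

In this sketch a directory whose trailing entries are all free files leaves scanned at false unless the flag is re-evaluated before returning false, which is the situation the fix addresses.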

    Regards,

    Takuji



