I don't see any memory leak in the referenced logs.
What would actually show a memory leak is the gc.log file for each
instance – in particular, lines of this format:
2009-04-22T17:04:54.270+0530: 29.593: [GC 29.593: [ParNew: 64640K->0K(65088K),
0.0240510 secs] 139590K->80009K(6143552K), 0.0243850 secs] [Times: user=0.06
sys=0.00, real=0.02 secs]
The heap occupancy after the collection is the 80009K number (the second
A->B(C) triple covers the whole heap; the first is just the ParNew young
generation). There is a jump when the instance fails, of course, but that is
to be expected. In the instance101 gc.log, for example, that value settles
down and fluctuates slightly around 3900000K. I will attach the graph that
the JDK team provided me showing that.
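To pull that post-GC figure out of a log automatically, a minimal sketch (my own regex and function name, based only on the ParNew line format shown above) could look like this:

```python
import re

# In lines like "... [ParNew: 64640K->0K(65088K), ...] 139590K->80009K(6143552K), ..."
# the second A->B(C) triple is the whole heap; B is occupancy after the collection.
# Requiring a preceding "]" skips the ParNew (young-gen) triple, which follows "ParNew: ".
HEAP_RE = re.compile(r"\]\s+(\d+)K->(\d+)K\((\d+)K\)")

def post_gc_occupancy_kb(line):
    """Return whole-heap occupancy in K after the GC event, or None if no match."""
    m = HEAP_RE.search(line)
    return int(m.group(2)) if m else None

sample = ("2009-04-22T17:04:54.270+0530: 29.593: [GC 29.593: "
          "[ParNew: 64640K->0K(65088K), 0.0240510 secs] "
          "139590K->80009K(6143552K), 0.0243850 secs] "
          "[Times: user=0.06 sys=0.00, real=0.02 secs]")
```

Plotting `post_gc_occupancy_kb` over every GC line is essentially the graph mentioned above: a leak shows up as a steadily rising floor rather than a flat fluctuation.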
The jmap output doesn't show anything much different. I'm not sure whether
the jmap output was taken with the live option or not; without the live
option the histogram also counts unreachable objects, so it is difficult to
know whether there is actually a leak. But of course, the live option forces
a full GC, and under load that pause will create some errors.
At any rate, if I look at the number of [B objects from instance101, it
ranges from 4511249 to 4629249 objects, but the counts are not monotonically
increasing. The same is true for all the key objects (ReplicationState in
particular being the SSR-related one of interest) in all the instances: they
go up a little, they go down a little, but they never climb steadily.
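That "up a little, down a little" check can be made mechanical. Here is a sketch, assuming the jmap histograms have already been parsed into plain class-name-to-count dicts (the parsing step and the sample counts below are mine; only the [B range comes from the data above):

```python
# Given per-class instance counts from successive jmap -histo snapshots,
# flag classes whose counts only ever grow -- the classic leak signature.

def strictly_growing(snapshots, class_name):
    """True if the class's count rises at every successive snapshot."""
    counts = [s.get(class_name, 0) for s in snapshots]
    return all(a < b for a, b in zip(counts, counts[1:]))

def suspect_classes(snapshots):
    """Classes worth a closer look: monotonically increasing across all snapshots."""
    names = set().union(*(s.keys() for s in snapshots))
    return sorted(n for n in names if strictly_growing(snapshots, n))

# Illustrative numbers only (the ReplicationState counts are invented):
snaps = [
    {"[B": 4511249, "ReplicationState": 1200},
    {"[B": 4629249, "ReplicationState": 1180},
    {"[B": 4550000, "ReplicationState": 1210},
]
```

With counts that fluctuate like these, `suspect_classes` comes back empty, which matches the conclusion above: nothing in the histograms grows without bound.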