[GLASSFISH-14351] [Perf] time to start domain the 2nd time is too large Created: 02/Nov/10  Updated: 03/Dec/12  Resolved: 08/Nov/10

Status: Closed
Project: glassfish
Component/s: performance
Affects Version/s: 3.1
Fix Version/s: 3.1_ms07

Type: Bug Priority: Critical
Reporter: amitagarwal Assignee: Richard S. Hall
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Windows (generic)
Platform: All


Issuezilla Id: 14,351
Tags: PSRBUG

 Description   

Startup time has regressed by huge margins in b26. We are observing around 108%
regression for regular build on Windows & 88% regression for web build on
Windows platform.



 Comments   
Comment by Tom Mueller [ 02/Nov/10 ]

Here are some timings for starting a freshly created domain on Windows:

C:\test>grep 'startup time' 3.1-b26\glassfish3\glassfish\domains\foo\logs\server
.log
[#|2010-11-02T11:31:52.547-
0500|INFO|glassfish3.1|javax.enterprise.system.core.com.sun.enterprise.v3.server|_ThreadID=15;_ThreadName=Thread-
1;|GlassFish Server Open Source Edition 3.1-b26 (26) startup time : Felix (16,422ms), startup services(812ms),
total(17,234ms)|#]
[#|2010-11-02T11:32:21.407-
0500|INFO|glassfish3.1|javax.enterprise.system.core.com.sun.enterprise.v3.server|_ThreadID=15;_ThreadName=Thread-
1;|GlassFish Server Open Source Edition 3.1-b26 (26) startup time : Felix (15,563ms), startup services(844ms),
total(16,407ms)|#]
[#|2010-11-02T11:32:33.469-
0500|INFO|glassfish3.1|javax.enterprise.system.core.com.sun.enterprise.v3.server|_ThreadID=15;_ThreadName=Thread-
1;|GlassFish Server Open Source Edition 3.1-b26 (26) startup time : Felix (2,656ms), startup services(891ms),
total(3,547ms)|#]
[#|2010-11-02T11:32:45.469-
0500|INFO|glassfish3.1|javax.enterprise.system.core.com.sun.enterprise.v3.server|_ThreadID=15;_ThreadName=Thread-
1;|GlassFish Server Open Source Edition 3.1-b26 (26) startup time : Felix (1,781ms), startup services(781ms),
total(2,562ms)|#]

Notice how the Felix startup time is over 15 seconds for the 2nd start. Normally, the time for the 2nd start is
like the 3rd and 4th starts.

Comment by Tom Mueller [ 02/Nov/10 ]

I tried this experiment:

  • unzipped b26 to a test area
  • copied felix.jar from a b25 install to the b26 test area
  • run start-domain twice

In the case, the 2nd start was fast (1.7 sec) while the first one took 15 sec.

So using the felix.jar from b25 eliminates this problem.

Comment by Sanjeeb Sahoo [ 02/Nov/10 ]

Is it really a p2 at this point of time? I beg to differ. Yes, we should
understand why this is happening so that we can eventually fix it, but fixing it
now will require yet another release of Felix which itself requires quite bit of
effort. IMO, developers start the server so many times that whether the server
started slow first two times would not change their perception. Given the time
constraint, can we change our benchmark in this release to start the server
twice to make it warm? Just my $0.02.

Comment by Tom Mueller [ 02/Nov/10 ]

I might agree if you could explain why it is doing this and why this is the only
effect and why it is a big change. Also, what are the benefits of the 3.0.5
integration. Maybe we should just revert to the previous release of Felix.
Finally, I was hoping that we could still get the cache file format change in for
GF 3.1. That would require another release of felix anyway.

Comment by Richard S. Hall [ 02/Nov/10 ]

I am seeing that the bundles are being updated in the bundle cache on the second
server start:

"FelixStartLevel"
org.apache.felix.framework.Felix.updateBundle(Felix.java:1895)
org.apache.felix.framework.BundleImpl.update(BundleImpl.java:940)
org.apache.felix.framework.BundleImpl.update(BundleImpl.java:927)
org.jvnet.hk2.osgimain.Main.update(Main.java:337)
org.jvnet.hk2.osgimain.Main.traverse(Main.java:289)
org.jvnet.hk2.osgimain.Main.start(Main.java:141)
org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:633)
org.apache.felix.framework.Felix.activateBundle(Felix.java:1822)
org.apache.felix.framework.Felix.startBundle(Felix.java:1739)
org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1143)
org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
java.lang.Thread.run(Thread.java:680)

Comment by Richard S. Hall [ 02/Nov/10 ]

After further discussion with Sahoo, this could be related to a change in the
way Felix initializes the last modified value for bundles in the bundle cache.
We will discuss this.

Comment by Richard S. Hall [ 02/Nov/10 ]

I am fairly certain I see the issue and it is a Felix issue related to changes
for cache initialization. It looks like a bundle's last modified time is not
persisted during the first execution, so the second execution detects old
bundles and updates, which causes the last modified time to be written to the disk.

Comment by Richard S. Hall [ 03/Nov/10 ]

I am going to try to get a Felix 3.0.6 release going this week to address this
issue.

Comment by Tom Mueller [ 08/Nov/10 ]

Felix 3.0.6 is integrated.

Generated at Sun Apr 26 03:48:31 UTC 2015 using JIRA 6.2.3#6260-sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.