Issue Details (XML | Word | Printable)

Key: GLASSFISH-15393
Type: Improvement Improvement
Status: Open Open
Priority: Critical Critical
Assignee: sirajg
Reporter: Harshad Vilekar
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
glassfish

[PERF] Admin Console - slow load time

Created: 30/Dec/10 12:40 PM   Updated: 18/Jan/12 07:52 AM
Component/s: performance
Affects Version/s: 3.1_b35
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments: 1. Text File dasonly-server.log (341 kB) 04/Jan/11 09:22 AM - Harshad Vilekar
2. Text File dasonly-trace.log (22 kB) 05/Jan/11 12:01 PM - Harshad Vilekar
3. Text File multinode-cluster-hudson-console.txt (5 kB) 30/Dec/10 12:40 PM - Harshad Vilekar
4. Zip Archive perf-logging.zip (1.59 MB) 04/Jan/11 09:43 AM - Jason Lee
5. Java Archive File server-logs-take2.jar (30 kB) 07/Jan/11 04:45 PM - Harshad Vilekar
6. Text File server.log (33 kB) 30/Dec/10 12:40 PM - Harshad Vilekar
7. Text File singlenode-two-instance-server-trace.log (44 kB) 05/Jan/11 12:01 PM - Harshad Vilekar
8. Text File singlenode-twoinstance-cluster-creation-log.txt (7 kB) 05/Jan/11 02:30 PM - Harshad Vilekar
9. Text File singlenode-twoinstance-server.log (21 kB) 04/Jan/11 09:22 AM - Harshad Vilekar
10. Text File twonode-threeinstance-server-trace.log (76 kB) 06/Jan/11 05:44 PM - Harshad Vilekar
11. Text File twonode-twoinstance-server-trace.log (62 kB) 06/Jan/11 07:29 PM - Harshad Vilekar

Environment:

DAS/Instances on: Solaris 10, Sparc, JDK6u23
Browser: Firefox 3.6.10, on remote Solaris 10, Sparc machine.

Issue Links:
Dependency

Tags: 3_1-exclude 3_1-need_more_info 3_1-next
Participants: Anissa Lam, Chris Kasso, Harshad Vilekar, Hong Zhang, Jason Lee, Joe Di Pol, Nazrul, Scott Fordin and sirajg


 Description  « Hide

The numbers below show the time taken by Admin Console main screen to show up in the Browser - for the first time after fresh install.

For http, the time is counted after entering the URL in the browser, and pressing return. In case of https, the time is counted after confirming security exception prompt.

1. DAS only (secure admin enabled): 30+ sec
https://host1.us.oracle.com:4848/common/index.jsf

2. Single node, two instance cluster (secure admin not enabled): 100+ sec
http://host2.us.oracle.com:4848/common/index.jsf

3. Two node, two instance cluster (secure admin enabled): 200+ sec
https://host3.us.oracle.com:4848/common/index.jsf

This is by no means any methodical performance test, but simply shows "perceived load time" on Solaris 10 Sparc machines (Niagara, T1000), that are used for day to day testing.

Attached:
Server log for #3.
Hudson cluster creation job log for #3.



Anissa Lam added a comment - 18/Jan/12 06:35 AM - edited

The issue is filed about the startup console time between DAS only and DAS with clusters.
It didn't mention whether the cluster's instances are running or not when the server startups and GUI launched. So, i run with both instances running or not.

I do not have a T1000 to run on, so i tried this on a Solaris 10 x86

=====================
This is run on a Solaris 10, x86 machine, 3.1.2 b17.

DAS only, first launch - 20sec; console-preloaded - 10 sec.

Case 1: cluster with 2 instances (both using localhost-domain1)
when instances are not running. console preloaded - 8 sec
when instances are running. console preloaded - 9 sec.

Case 3: cluster with 2 instances ( one on localhost-domain1, one on another solaris host)
when instances are not running, console-preloaded - 10sec
when instances are are running, console-preloaded - 9 sec


Joe Di Pol added a comment - 14/Jan/12 12:18 AM

We've done all the work in this area that is planned for 3.1.2. Removing the 3_1_2-review tag (see linked bugs for details).


Anissa Lam added a comment - 31/Oct/11 05:36 PM

assign go Siraj as he is leading the GUI initial performance improvement effort.


Harshad Vilekar added a comment - 23/Mar/11 09:00 PM

No need to release note this issue. I'm Including part of the offline email thread:

— From: Chris Kasso -----------
I would prefer not to Release Note this issue. It is unlikely to generate a customer call and most customers won't see the issue. There is also no workaround.
----------------------


Scott Fordin added a comment - 23/Mar/11 06:57 PM

Need more info to add issue to 3.1 Release Notes. Note at all clear what wants to be said here, either in terms of description or workaround. Is it really necessary to include in the Release Notes?


Chris Kasso added a comment - 26/Jan/11 04:17 PM

After discussing this with the submitter we decided to clear the "blocker" status.


Nazrul added a comment - 11/Jan/11 01:20 AM

Excluding from 3.1 count.


Nazrul added a comment - 10/Jan/11 05:44 PM

This issue happens on a specific hardware. Should we exclude this from 3.1?

We should spend time on this and fix it during 3.2.


Hong Zhang added a comment - 07/Jan/11 06:42 PM

Harshad: thanks for re-running the tests!
The time in the tests are pretty close to each other now, with the times slightly longer when the secure admin is enabled.
The gui application loading times are pretty close to each other too: 40.124s, 36.523s, 40,928s, 40,036s, 34,714s.
So it now seems there is no significant performance degrading from DAS-only scenario to cluster scenarios which are consistent with what were observed on other platforms.

The console accessing time do seem pretty long in the tests (105s-130s), but maybe this is expected with the machine configuration (resource constraint)?
System clock frequency: 200 MHz
Memory size: 2040 Megabytes


Harshad Vilekar added a comment - 07/Jan/11 04:45 PM

Here are the updated results (take 2) showing Admin Console load time, with clean environment:

1. all test machines now use locally installed jdk (earlier, JDK was installed in remote NFS mounted directory)
2. all test machines rebooted, with no other processes running
3. using latest nightly build
4. no apps deployed
5. browser catch cleared, unwanted plugins removed, browser restarted.
------------------
Config 2.1: DAS only - no secure admin:
Time: 105 secs
Log File: dasonly-nosec-trace.log
------------------
Config 2.2: DAS only - secure admin enabled.
Time: 110 secs
Log File: dasonly-sec-trace.log
------------------
Config 2.3: Cluster on Same host: DAS + two instances, no secure admin:
NAME HOST PORT PID CLUSTER STATE
in2 localhost 8363 1236 c1 running
in1 localhost 8107 1240 c1 running

Time taken: 105 secs
Log file: single-host-two-instance-nosec-trace.log
------------------
Config 2.4: Cluster on Same host: DAS + two instances, secure admin enabled:
NAME HOST PORT PID CLUSTER STATE
in2 localhost 8363 1873 c1 running
in1 localhost 8107 1877 c1 running

Time taken: 125 seconds
Log file: single-host-two-instance-sec-trace.log
-------------------------------------------------

Config 2.5: Cluster on two hosts (host1: DAS + in1, host2: in2), secure admin enabled:
NAME HOST PORT PID CLUSTER STATE
in2 localhost 8363 2008 c1 running
in1 intg2t1000 8107 997 c1 running

Time taken: 130 seconds
Log file: two-host-two-instance-sec-notrace.log (no trace information included)
-------------------------------------------------

Server logs for config 2.1 to 2.5 attached: server-logs-take2.jar


Hong Zhang added a comment - 07/Jan/11 07:17 AM

Harshad:
Thanks for re-run the test. In this test, the gui application loading took 37.368s with container initialization took 11.066s, and component loading in the container took 21.586s.

M1. [#|2011-01-06T19:03:01.872-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=130;_ThreadName=Thread-1;|Container : com.sun.enterprise.web.WebContainer Mark BEFORE_CONTAINER_SETUP at 3972|#]

M2. [#|2011-01-06T19:03:01.882-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=130;_ThreadName=Thread-1;|Mark CONTAINERS_SETUP_DONE at 15038|#]

M3. [#|2011-01-06T19:03:01.913-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=130;_ThreadName=Thread-1;|Container : web Mark START at 15705|#]

M4. [#|2011-01-06T19:03:01.914-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=130;_ThreadName=Thread-1;|Container : web Mark STARTED at 37291|#]


Harshad Vilekar added a comment - 06/Jan/11 07:29 PM

One more data point: twonode-twoinstance-server-trace.log

for following configuration:

asadmin list-instances --long
NAME HOST PORT PID CLUSTER STATE
in2 localhost 8363 9059 c1 running
in1 intg2t1000 8107 3083 c1 running

No apps are deployed.


Harshad Vilekar added a comment - 06/Jan/11 06:36 PM

Notes about numbers reported in twonode-threeinstance-server-trace.log:

1. Hardware config of the machine used to get these numbers:

-bash-3.00$ /usr/sbin/prtdiag
System Configuration: Sun Microsystems sun4v Sun Fire(TM) T1000
System clock frequency: 200 MHz
Memory size: 2040 Megabytes

========================= CPUs ===============================================

CPU CPU
Location CPU Freq Implementation Mask
------------ ----- -------- ------------------- -----
MB/CMP0/P0 0 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P1 1 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P2 2 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P3 3 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P4 4 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P5 5 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P6 6 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P7 7 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P8 8 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P9 9 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P10 10 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P11 11 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P12 12 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P13 13 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P14 14 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P15 15 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P16 16 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P17 17 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P18 18 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P19 19 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P20 20 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P21 21 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P22 22 1000 MHz SUNW,UltraSPARC-T1
MB/CMP0/P23 23 1000 MHz SUNW,UltraSPARC-T1

========================= IO Configuration =========================

IO
Location Type Slot Path Name Model
----------- ----- ---- --------------------------------------------- ------------------------- ---------
MB/NET0 PCIE MB /pci@7c0/pci@0/network@4 network-pci14e4,1668
MB/NET-1 PCIE MB /pci@7c0/pci@0/network network-pci14e4,1668
MB/NET-1 PCIX MB /pci@7c0/pci@0/pci@8/network network-pci108e,1648
MB/NET-1 PCIX MB /pci@7c0/pci@0/pci@8/network network-pci108e,1648
MB/PCIX PCIX MB /pci@7c0/pci@0/pci@8/scsi@2 scsi-pci1000,50 LSI,1064

===============

2. One EJB App was deployed on the cluster when these log entries are generated.


Hong Zhang added a comment - 06/Jan/11 06:03 PM

Harshad:
Thanks for the attaching the server.log of the multi-node scenario too.

Looking at the server.log, the gui application loading took 36.067s, out of the 140s of the total console accessing time.

And for the application loading part, the container initialization (8.394s) and component loading (24.442s) still took most of the time especially the component loading part.

M1.[#|2011-01-06T17:19:26.048-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|Container : com.sun.enterprise.web.WebContainer Mark BEFORE_CONTAINER_SETUP at 2481|#

M2.[#|2011-01-06T17:19:26.057-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|Mark CONTAINERS_SETUP_DONE at 10875|#]

M3.[#|2011-01-06T17:19:26.090-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|Container : web Mark START at 11558|#]

M4[#|2011-01-06T17:19:26.091-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|Container : web Mark STARTED at 36000|#]


Harshad Vilekar added a comment - 06/Jan/11 05:44 PM

Hong, I'm attaching server log - with trace enabled - for multi node cluster (scenario #3).

File name:twonode-threeinstance-server-trace.log. The relevant startup log entries begin at line 457.

asadmin list-instances --long
NAME HOST PORT PID CLUSTER STATE
in2 localhost 8363 8948 c1 running
in1 intg2t1000 8107 3011 c1 running
in3 localhost 8619 8955 c1 running
Command list-instances executed successfully.

With latest nightly build - Admin Console took about 140+ seconds to load on above config (this was tested after running some other test).


Chris Kasso added a comment - 06/Jan/11 04:14 PM

I wanted to add a data point.

I created a two node, four instance cluster:

./asadmin list-instances -l
NAME HOST PORT PID CLUSTER STATE
c1i1_sidewinder sidewinder.us.oracle.com 10048 11011 c1 running
c1i2_sidewinder sidewinder.us.oracle.com 11048 11010 c1 running
c1i1_ouch ouch.us.oracle.com 10048 13629 c1 running
c1i2_ouch ouch.us.oracle.com 11048 13631 c1 running
Command list-instances executed successfully.

The DAS is running on a S10/Sparc box (aha.us.oracle.com). The nodes are on S11 Express/x86 system. The firefox (3.6.10) browser was on the S11/x86 box.

I saw no difference in the Admin Console startup time with three different scenarios:

1) Secure Admin enabled: 67 seconds
2) Secure Admin disabled: 65 seconds
3) Secure Admin enabled with only the DAS (no nodes configured): 68 seconds

Given the age of the Solaris 10 Sparc box (SunBlade 1500), about a minute to start the Console is expected.


Hong Zhang added a comment - 05/Jan/11 06:17 PM

Harshad: I don't see anything unusual in the cluster creation steps. Tom was suggesting one possible cause is short of memory on the system, quoting his comments on issue 15412:
One thing to check is the memory on the system. If it is memory starved, everything is going to take longer, and if several local instances are started, that will consume even more memory.
The Niagra system I tested had plenty of memory (64GB or something like that).

Can you share the test system configuration, how much memory it has etc? Thanks.


Harshad Vilekar added a comment - 05/Jan/11 02:30 PM

Attaching Cluster Creation Steps: singlenode-twoinstance-cluster-creation-log.txt

Multi Node Two Instance Cluster Creation Steps are already attached: multinode-cluster-hudson-console.txt


Hong Zhang added a comment - 05/Jan/11 01:58 PM

Harshad: thanks for attaching the server.log with the tracing turned on. From your numbers, there is still significant difference between DAS-only and cluster scenarios.

1. The DAS only scenario:
The admin console application loading took 7.119s. And the main time was spent in the container initialization and loading components in web container: the container initialization took 2.874s (the time difference for mark 1 and mark 2) and the loading components in web container took 3.493 s (the time difference for mark 3 and mark 4).

M1.[#|2011-01-05T09:53:55.857-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=92;_ThreadName=Thread-1;|Container : com.sun.enterprise.web.WebContainer Mark BEFORE_CONTAINER_SETUP at 652|#]

M2.[#|2011-01-05T09:53:55.859-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=92;_ThreadName=Thread-1;|Mark CONTAINERS_SETUP_DONE at 3526|#]

M3.[#|2011-01-05T09:53:55.864-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=92;_ThreadName=Thread-1;|Container : web Mark START at 3613|#]

M4.[#|2011-01-05T09:53:55.864-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=92;_ThreadName=Thread-1;|Container : web Mark STARTED at 7106|#]

2. The single node, two cluster instance scenario:

The admin console application loading took 41.302 s. And the main time was also spent in the container initialization and loading components in web container: the container initialization took 14.341s (the time difference for mark 1 and mark 2) and the loading components in web container took 21.720s (the time difference for mark 3 and mark 4).

M1.[#|2011-01-05T11:22:58.378-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=120;_ThreadName=Thread-1;|Container : com.sun.enterprise.web.WebContainer Mark BEFORE_CONTAINER_SETUP at 4057|#]

M2.[#|2011-01-05T11:22:58.386-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=120;_ThreadName=Thread-1;|Mark CONTAINERS_SETUP_DONE at 18398|#

M3.[#|2011-01-05T11:22:58.412-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=120;_ThreadName=Thread-1;|Container : web Mark START at 19515|#]

M4.[#|2011-01-05T11:22:58.413-0800|INFO|oracle-glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=120;_ThreadName=Thread-1;|Container : web Mark STARTED at 41235|#]

Your DAS-only scenario seems to be consistent with what we observed, but your cluster scenario just seems taking much longer for every step, especially in container initialization and component loading. Can you share the exact steps of your cluster scenario, how did you create the cluster etc? And were there any other things running at the same time on the system when you tested? Tom did a testing on Niagra system today too (see issue 15412 for more details), but his numbers on DAS-only and cluster scenario (where 30 local instances were created) were more consistent with what we observed on other systems. We need to figure out what is different on your testing system. Thanks.


Harshad Vilekar added a comment - 05/Jan/11 12:01 PM

Attaching logs for scenario #1 and #2 with following option enabled in DAS domain.xml.
<jvm-options>-Dorg.glassfish.deployment.trace=true</jvm-options>

Logs for #3 are not yet available (the hosts are running other tests - so latest nightly build is not yet installed).


Hong Zhang added a comment - 04/Jan/11 06:15 PM

Harshad:
Thanks for attaching all the DAS sever.log files, those are very helpful!
The admin console application loading times in your server.log files are kind of different from the ones we observed (on linux/Mac machines): the dasonly-server.log showed 10.815s for loading console application, and the singlenode-twoinstance-server.log showed 62.399s for loading console application. For both me and Jason, the numbers were much closer than this: me on linux das-only loading time 7.974s, singlenode-two-instance 8.736s; jason das-only loading time 17.001s and singlenode-two-instance 20.134s.
I have checked in some changes to enable the deployment tracing for application loading to see the break down of the loading times (more details in 15412). And it will be very useful to see how the loading times break down in your environment. Can you please try with tomorrow's nightly build (which should have my change):
1. Right after you install the server bits in each scenario, add this jvm option to the DAS domain.xml:
<jvm-options>-Dorg.glassfish.deployment.trace=true</jvm-options>
2. Then start DAS, (create/start cluster as applicable), access console as usual.
3. The DAS server.log should have tracing messages like this which will allow us to see how the times break down:
.[#|2011-01-04T16:04:14.207-0500|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=87;_ThreadName=Thread-1;|Container : com.sun.enterprise.web.WebContainer Mark BEFORE_CONTAINER_SETUP at 878|#]
Attach the DAS server.log for all the scenarios in the issue and I will take a look to see where the extra time is spent in application loading for cluster set up.


Jason Lee added a comment - 04/Jan/11 09:43 AM

This archive contains the jars I updated to add logging to the console. Extract this in your GlassFish install root and restart your server.


Harshad Vilekar added a comment - 04/Jan/11 09:22 AM

Attached the server logs for the two scenarios
scenario #1: dasonly-server.log
scenario #2: singlenode-twoinstance-server.log


Hong Zhang added a comment - 03/Jan/11 06:11 PM

Harshad, can you please attach the DAS server.log for all the three scenarios? Want to see how the loading time of the admin gui application compare in the test environment. Thanks.


Anissa Lam added a comment - 03/Jan/11 08:28 AM

Several developer is trying to look into this. The latest finding from Jason:

On 1/3/11 8:14 AM, Jason Lee wrote:
> I added logging statements at various points in admin console adapter and installer thread. Here's what I got:
>
> ***** 09:52:40.0944: Application is not yet loaded.
> ***** 09:52:40.0945: Force REST module load
> ***** 09:52:40.0947: Starting installer thread
> ***** 09:52:40.0954: InstallThread.install()
> ***** 09:52:40.0956: building ConfigCode
> ***** 09:52:40.0957: Getting server
> ***** 09:52:40.0957: Applying changes
> ***** 09:52:41.0668: Changes applied
> ***** 09:52:41.0671: Getting ApplicationRef
> ***** 09:52:41.0672: Process application
> ***** 09:52:42.0346: Application is not yet loaded.
> ***** 09:52:59.0675: Application processed
>
> The 'Process application' and 'Application processed' entries show the greatest difference. Those messages came from these lines of code:
>
> adapter.logTime("Process application");
> habitat.getComponent(ApplicationLoaderService.class).processApplication(config, ref, log);
> adapter.logTime("Application processed");
>
> That one line of code took 18 seconds. There may be other hot spots (I'm still looking and tracing), but that seems like a good one to look at.