Issue Details (XML | Word | Printable)

Key: GLASSFISH-15599
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Bobby Bissett
Reporter: Bobby Bissett
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
glassfish

AdminConsoleConfigUpgrade fails if GrizzlyConfigSchemaMigrator not run first

Created: 18/Jan/11 02:49 PM   Updated: 25/Jan/11 06:19 AM   Resolved: 25/Jan/11 06:19 AM
Component/s: admin_gui
Affects Version/s: 3.1_ms07
Fix Version/s: 3.1_b39

Time Tracking:
Not Specified

File Attachments: 1. File fix_and_hack.diff (2 kB) 18/Jan/11 02:54 PM - Bobby Bissett
2. File patch0.diff (1 kB) 19/Jan/11 07:24 AM - Bobby Bissett
3. File patch1.diff (0.8 kB) 21/Jan/11 10:48 AM - Bobby Bissett
4. File patch2.diff (3 kB) 24/Jan/11 11:24 AM - Bobby Bissett

Environment:

Earth.


Tags: 3_1-approved 3_1-upgrade
Participants: Bobby Bissett, Chris Kasso and Nazrul


 Description  « Hide

This only happens if AdminConsoleConfigUpgrade runs before GrizzlyConfigSchemaMigrator, which I haven't seen happen in the workspace or any promoted builds. But it could happen, and if it does AdminConsoleConfigUpgrade fails with:

SEVERE: Could not upgrade security service for admin console: org.jvnet.hk2.config.TransactionFailure: Can't operate without <http-service>

The fix for this is very simple. Here's the whole fix:

— start —
Index: core/kernel/src/main/java/com/sun/enterprise/v3/admin/AdminConsoleConfigUpgrade.java
===================================================================
— core/kernel/src/main/java/com/sun/enterprise/v3/admin/AdminConsoleConfigUpgrade.java (revision 44566)
+++ core/kernel/src/main/java/com/sun/enterprise/v3/admin/AdminConsoleConfigUpgrade.java (working copy)
@@ -1,7 +1,7 @@
/*

  • DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.
    *
  • * Copyright (c) 2010 Oracle and/or its affiliates. All rights reserved.
    + * Copyright (c) 2010-2011 Oracle and/or its affiliates. All rights reserved.
    *
  • The contents of this file are subject to the terms of either the GNU
  • General Public License Version 2 only ("GPL") or the Common Development
    @@ -84,6 +84,11 @@
    @Inject
    Configs configs;

+ // This will force the Grizzly upgrade code to run before
+ // AdminConsoleConfigUpgrade runs.
+ @Inject(name="grizzlyconfigupgrade", optional=true)
+ ConfigurationUpgrade precondition = null;
+
@Override
public void postConstruct() { Config config = configs.getConfigByName("server-config"); --- end --- See this email thread for more information: http://java.net/projects/glassfish/lists/dev/archive/2011-01/message/147 Note: to reproduce this, you need to make the AdminConsoleConfigUpgrade class go first. Here's one way to do it. DO NOT COMMIT THIS CODE!!! --- start --- Index: admin/config-api/src/main/java/org/glassfish/config/support/DomainXml.java =================================================================== --- admin/config-api/src/main/java/org/glassfish/config/support/DomainXml.java (revision 44566) +++ admin/config-api/src/main/java/org/glassfish/config/support/DomainXml.java (working copy) @@ -145,7 +145,21 @@ }

protected void upgrade() {
+ /* START HACK */
+ String abc = "com.sun.enterprise.v3.admin.AdminConsoleConfigUpgrade";

+ for (Inhabitant<? extends ConfigurationUpgrade> cu : habitat.getInhabitants(ConfigurationUpgrade.class)) {
+ if (abc.equals(cu.typeName())) {
+ try { + cu.get(); // run the upgrade + }
+ catch (Throwable t) { + javax.swing.JOptionPane.showMessageDialog(null, t.getMessage()); + }
+ }
+ }
+ /* END HACK */
+
// run the upgrades...
for (Inhabitant<? extends ConfigurationUpgrade> cu : habitat.getInhabitants(ConfigurationUpgrade.class)) {
try {
— end —



Bobby Bissett added a comment - 18/Jan/11 02:54 PM

Attaching a diff of my workspace that contains the fix and the hack to reproduce the failure. Please don't commit the hack!


Bobby Bissett added a comment - 19/Jan/11 07:05 AM

I'll take this one since I have the fix already. The only remaining question is whether or not this should be fixed now, but it's simple so hopefully it will go through whether it's deemed truly severe or not.


Bobby Bissett added a comment - 19/Jan/11 07:16 AM

How bad is its impact? (Severity)
When it happens (see below), part of the admin console upgrade won't happen. I don't know what the effect of this really is, but something in the admin console won't be set up right.

How often does it happen? (Frequency)
This is identical to GLASSFISH-15576 except that I haven't seen it happen naturally. I still don't know what determines the order of objects returned from the getInhabitants call, but in the final build this module will either be run before or after the grizzly one. If before, then this error will be seen during every upgrade. Otherwise, it will never happen.

How much effort is required to fix it? (Cost)
Done and done. Diff is attached and already tested, and it's the same fix we've used in several other classes.

What is the risk of fixing it? (Risk)
None that I know of. This is Jerome's mechanism for having modules force others to run first.

Does a work around for the issue exist? Can the workaround be reasonably employed by the end user?
(Same text I used in GLASSFISH-15576.) Um, if we could figure out why promoted builds are different from workspace builds, maybe there's a workaround. The changes I've suggested are a lot simpler though, and only affect the code that runs during upgrade (no changes during runtime).

If the issue is not fixed should the issue and its workaround (if applicable) be described in the Release Notes?
Big time. This could be a major problem for users, at least the users who use the admin console. Or it may never happen, but the fix is simpler now than if we see this in the final RC build.


Bobby Bissett added a comment - 19/Jan/11 07:24 AM

Attaching final patch for code review.


Chris Kasso added a comment - 19/Jan/11 08:04 AM

Approved for 3.1


Bobby Bissett added a comment - 20/Jan/11 07:41 AM

This is fixed in revision 44632.


Bobby Bissett added a comment - 20/Jan/11 10:46 AM

This is causing a problem in the QL tests, so am reopening the issue.

AdminConsoleConfigUpgrade cannot run unless GrizzlyConfigSchemaMigrator runs first. In order for it to make GrizzlyConfigSchemaMigrator run first, it injects an instance of it.

At runtime, AdminConsoleAdapter is creating an instance of AdminConsoleConfigUpgrade manually, which then tries to create an instance of GrizzlyConfigSchemaMigrator, which fails with an NPE because there is no upgrade going on. I need to figure out why AdminConsoleAdapter is doing this.


Nazrul added a comment - 20/Jan/11 02:13 PM

Should we exclude this issue from 3.1?


Bobby Bissett added a comment - 21/Jan/11 10:50 AM

Summary:

Upgrade modules are run by iterating over an unordered collection, so if module A needs to run before module B, it only happens by luck unless module B takes steps to force A to run first. Several modules need GrizzlyConfigSchemaMigrator to run first, and to ensure this they inject an instance of GrizzlyConfigSchemaMigrator before running. This was the fix for issue GLASSFISH-15576 for instance.

This same fix should be made for AdminConsoleConfigUpgrade, but AdminConsoleConfigUpgrade is used during runtime by AdminConsoleAdapter, and if GrizzlyConfigSchemaMigrator is also run at this time it causes an NPE during the quicklook tests. What's annoying is that I can't reproduce the failure in any other way, and there isn't much time now to see what's going on with those tests that are otherwise fine.

So to work around the problem, I can explicitly make GrizzlyConfigSchemaMigrator run before any other upgrade modules. Thus, it will run before AdminConsoleConfigUpgrade without any changes to AdminConsoleConfigUpgrade. This isn't an ideal solution, but we can revisit in 3.2. I think something has to be done about this, otherwise upgrades could fail randomly, or based on the user's luck.

This patch has my change to force GrizzlyConfigSchemaMigrator to run first (getting the instance runs its postConstruct method, so getting it = running it):

http://java.net/jira/secure/attachment/27697/patch1.diff

It's not the ideal solution (it's a hack), but it's safer than trying to debug the whole admin console lifecycle given where we are now. If we go with this fix, I'll file a separate admin_gui issue to properly fix this in 3.2.


Bobby Bissett added a comment - 24/Jan/11 11:24 AM

This contains the proper fix as well as a check in the Grizzly code to avoid an NPE. Tested with QL and admin dev tests. Email sent to dev list about it, but the test has a manually-created config with no HttpService child object. This was causing the NPE, and I think the Grizzly code should be changed to handle that case.


Bobby Bissett added a comment - 25/Jan/11 06:19 AM

This is fixed in rev 44696 with the attached 'patch2' changes.