GLASSFISH-17716

[REGRESSION] Instance synchronization fails with classes in lib subdirectory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.2_dev
    • Fix Version/s: 3.1.2_dev, 4.0
    • Component/s: admin
    • Labels:
      None

      Description

      OS: solaris and Linux
      build: v3.1.2 b09

      We tried to run the security BAT test against V3.1.2, and it failed to start the cluster after running a customrealm test. Steps to reproduce the bug:

      1. Check out the SQE workspace:
      cvs co appserver-sqe/bootstrap.xml
      (CVSROOT=:pserver:cvsguest@sunsw.us.oracle.com:/m/jws)
      cd appserver-sqe
      ant -f bootstrap.xml co-security
      2. Install GF V3.1.2; do NOT start domain1 (otherwise it will cause a port conflict when running the SQE setup-cluster-profile target)
      3. Set env. variables
      S1AS_HOME <GF installation dir> (example: /export/sonia/v3/glassfishv3/glassfish)
      SPS_HOME <workspace dir> (example: /export/sonia/appserver-sqe)
      ANT_HOME <ant dir>
      JAVA_HOME <java dir>
      4. cd appserver-sqe/, run "ant setup-cluster-profile"
      5. cd appserver-sqe/pe/security/customrealm/, run "ant ee all-appservrealm"
      The test failed to start the cluster after the custom realm was configured:

      start-domain-ee:
      [exec] Deprecated syntax, instead use:
      [exec] asadmin --user admin --passwordfile /export/sonia/appserver-sqe/build-config/adminpassword.txt --echo --terse=false --host localhost --port 4848 start-cluster [options] ...
      [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/sonia/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false start-cluster --verbose=false sqe-cluster
      [exec] remote failure: clustered_instance_1: Could not start instance clustered_instance_1 on node localhost-domain1 (localhost).
      [exec]
      [exec] Command failed on node localhost-domain1 (localhost): Command start-local-instance failed.
      [exec]
      [exec] CLI802 Synchronization failed for directory lib, caused by:
      [exec] HTTP connection failed with code 500, message: Internal Error
      [exec]
      [exec] To complete this operation run the following command locally on host localhost from the GlassFish install location /export/sonia/v3/glassfish3:
      [exec]
      [exec] bin/asadmin start-local-instance --node localhost-domain1 --sync normal clustered_instance_1
      [exec] Command start-cluster failed.

      The server.log showed the following exceptions:

      [#|2011-11-11T17:48:28.758-0800|SEVERE|glassfish3.1.2|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=79;_ThreadName=Thread-2;|service exception
      java.lang.RuntimeException: java.util.zip.ZipException: duplicate entry: lib/classes/samples/security/customrealm/appserv/SimpleCustomRealm.class
      at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:243)
      at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
      at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117)
      at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:238)
      at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:833)
      at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:730)
      at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1031)
      at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228)
      at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
      at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
      at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
      at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
      at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
      at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
      at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
      at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
      at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
      at java.lang.Thread.run(Thread.java:722)
      Caused by: java.util.zip.ZipException: duplicate entry: lib/classes/samples/security/customrealm/appserv/SimpleCustomRealm.class
      at java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:215)
      at org.glassfish.admin.payload.ZipPayloadImpl$Outbound.prepareEntry(ZipPayloadImpl.java:94)
      at org.glassfish.admin.payload.ZipPayloadImpl$Outbound.writePartsTo(ZipPayloadImpl.java:100)
      at org.glassfish.admin.payload.PayloadImpl$Outbound.writeTo(PayloadImpl.java:359)
      at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:239)
      ... 17 more

      #]

      [#|2011-11-11T17:48:28.782-0800|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=91;_ThreadName=Thread-2;|CLI802 Synchronization failed for directory lib, caused by:|#]
      [#|2011-11-11T17:48:28.784-0800|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=91;_ThreadName=Thread-2;| HTTP connection failed with code 500, message: Internal Error|#]

      The server.log is attached.

        Activity

        Nithya Ramakrishnan added a comment -

        Sonia,

        Can you please retry with the latest build. There has been no change in the security code that might have caused this. Please let us know if this issue occurs in the latest build.

        Thanks
        Nithya

        sonialiu added a comment -

        Thanks for looking at the issue. I tried the latest nightly build and it still failed. I noticed that the create-auth-realm command did not work. See below:

        create-auth-realm:
        [echo] Creating auth realm realmperapp ...
        [exec] Deprecated syntax, instead use:
        [exec] asadmin --user admin --passwordfile /export/hudson/workspace/alex-linux-clu/appserver-sqe/build-config/adminpassword.txt --echo --terse=false --host localhost --port 4848 create-auth-realm [options] ...
        [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/hudson/workspace/alex-linux-clu/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false create-auth-realm --classname samples.security.customrealm.appserv.SimpleCustomRealm --property auth-type=simecustomrealm:jaas-context=simpleCustomRealm --target sqe-cluster realmperapp
        [exec] remote failure: An error occurred during replication
        [exec] FAILURE: Command create-auth-realm failed on server instance clustered_instance_1: remote failure: Creation of Authrealm realmperapp failed. com.sun.enterprise.security.auth.realm.BadRealmException: java.lang.ClassNotFoundException: samples.security.customrealm.appserv.SimpleCustomRealm not found by org.glassfish.main.security [223]
        [exec] com.sun.enterprise.security.auth.realm.BadRealmException: java.lang.ClassNotFoundException: samples.security.customrealm.appserv.SimpleCustomRealm not found by org.glassfish.main.security [223]
        [exec] Command create-auth-realm failed.

        Nithya Ramakrishnan added a comment -

        Sonia,

        A normal create-auth-realm on the cluster in 3.1.2 passes when the class jar file is dropped into the lib:

        Command create-local-instance executed successfully.
        nitkal@nithya-dell-ubuntu:~/gf-v3-img/glassfish3/glassfish/bin$ ./asadmin start-cluster c1
        Command start-cluster executed successfully.
        nitkal@nithya-dell-ubuntu:~/gf-v3-img/glassfish3/glassfish/bin$ ./asadmin create-auth-realm --target c1 --classname com.samplerealm.SampleRealm realmperapp2
        Command create-auth-realm executed successfully.

        It appears from your test case that you are trying to drop the classes for your custom realm into lib/classes. Is that so? There does not seem to be a classes directory under lib. Has this location changed in the test case recently? Can you try dropping the JAR file containing the classes into lib and repeating the test case?

        Nithya Ramakrishnan added a comment -

        Sonia reported that the custom realm classes are dropped inside <DOMAIN-DIR>/lib/classes. It appears that after a cluster start in 3.1.2, there is an error (Error 1) in syncing the lib/classes of the DAS with that of the instances.

        If the command bin/asadmin start-local-instance --node localhost-domain1 --sync normal i1 is then run, as suggested in the error message below, a further error (Error 2) is displayed.

        Both errors produce the same stack trace (pasted below the errors), which shows the root cause of the problem. Transferring to the admin team.

        Error 1:

        ./asadmin start-cluster c1
        remote failure: i1: Could not start instance i1 on node localhost-domain1 (localhost).

        Command failed on node localhost-domain1 (localhost): Command start-local-instance failed.

        CLI802 Synchronization failed for directory lib, caused by:
        HTTP connection failed with code 500, message: Internal Error

        To complete this operation run the following command locally on host localhost from the GlassFish install location /home/nitkal/gf-v3-img2/glassfish3:

        bin/asadmin start-local-instance --node localhost-domain1 --sync normal i1

        The command start-instance failed for: i1
        Command start-cluster failed.

        Error 2:

        CLI802 Synchronization failed for directory lib, caused by:
        HTTP connection failed with code 500, message: Internal Error
        Command start-local-instance failed.

        Both errors produce the same stack trace:

        [#|2011-12-02T15:55:57.914+0530|SEVERE|glassfish3.1.2|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=24;_ThreadName=Thread-2;|service exception
        java.lang.RuntimeException: java.util.zip.ZipException: duplicate entry: lib/classes/com/samplerealm/SampleLoginModule.class
        at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:246)
        at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
        at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117)
        at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:238)
        at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:833)
        at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:730)
        at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1031)
        at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228)
        at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
        at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
        at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
        at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
        at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
        at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
        at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
        at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
        at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
        at java.lang.Thread.run(Thread.java:636)
        Caused by: java.util.zip.ZipException: duplicate entry: lib/classes/com/samplerealm/SampleLoginModule.class
        at java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:192)
        at org.glassfish.admin.payload.ZipPayloadImpl$Outbound.prepareEntry(ZipPayloadImpl.java:94)
        at org.glassfish.admin.payload.ZipPayloadImpl$Outbound.writePartsTo(ZipPayloadImpl.java:100)
        at org.glassfish.admin.payload.PayloadImpl$Outbound.writeTo(PayloadImpl.java:359)
        at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:242)
        ... 17 more

        #]
        Nithya Ramakrishnan added a comment -

        Transferring to the admin team to investigate further.

        Tom Mueller added a comment -

        This is a regression due to revision 50623. This needs to be fixed in 3.1.2.
        The problem is that the ServerSynchronizer.getFileNames method is putting both
        a directory name and a file that is contained within that directory into
        the output payload, and the output payload cannot handle that.
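
The failure mode described above can be reproduced outside GlassFish with a minimal sketch. It assumes that listing both a directory and a file inside it ends up naming the same zip entry twice; how the real payload code expands directories is not shown in this issue, and the class and method names here are hypothetical. It only demonstrates that java.util.zip.ZipOutputStream rejects a repeated entry name, and that removing duplicates from the name list first avoids the exception.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of GlassFish.
class DuplicateEntryDemo {

    // Writes one empty zip entry per name and returns how many entries were written.
    static int writeEntries(Iterable<String> names) throws IOException {
        int count = 0;
        try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
            for (String name : names) {
                // putNextEntry throws ZipException if this name was already written.
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed result of listing a directory plus a file inside it:
        // the same entry name appears twice.
        List<String> names = List.of(
                "lib/classes/com/samplerealm/SampleLoginModule.class",
                "lib/classes/com/samplerealm/SampleLoginModule.class");

        try {
            writeEntries(names);
        } catch (ZipException e) {
            System.out.println("failed: " + e.getMessage());
        }

        // De-duplicating the names (keeping first-seen order) avoids the failure.
        int written = writeEntries(new LinkedHashSet<>(names));
        System.out.println("written after de-duplication: " + written);
    }
}
```

This is only an illustration of the symptom; the actual fix in revisions 51426 and 51428 may take a different approach.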

        Tom Mueller added a comment -

        Fixed on the 3.1.2 branch in revision 51426.
        Fixed on the trunk in revision 51428.

        josesuero added a comment -

        I'm using 3.1.2 build 23 and JARs are still not being copied to cluster instances. Please check.

        Tom Mueller added a comment -

        Can you please provide details about where you put the JARs and the steps you took?

        Note that if you just copy a JAR into the domain's lib directory and then restart a server, the file is not going to be copied over because the domain.xml file did not change. Either try touching the domain.xml file on the DAS or use the add-library subcommand rather than just copying the file into the domain's lib directory.
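
For the first suggestion, "touching" domain.xml means updating its last-modified time without changing its contents. A minimal sketch of that step follows, assuming the DAS is stopped or otherwise not writing the file at the same moment. The class name and the default path are hypothetical; pass the real location of domain.xml as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;

// Hypothetical helper; not part of GlassFish.
class TouchDomainXml {

    // Sets the file's last-modified time to now (like the Unix touch command
    // on an existing file) and returns the timestamp read back from disk.
    static FileTime touch(Path file) throws IOException {
        Files.setLastModifiedTime(file, FileTime.from(Instant.now()));
        return Files.getLastModifiedTime(file);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location, relative to the GlassFish directory.
        Path domainXml = Path.of(args.length > 0
                ? args[0]
                : "domains/domain1/config/domain.xml");
        System.out.println("last modified: " + touch(domainXml));
    }
}
```

On Unix-like systems the touch command does the same thing; the add-library subcommand mentioned above avoids the manual step altogether.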


          People

          • Assignee:
            Tom Mueller
            Reporter:
            sonialiu