GLASSFISH-19391

server will not start after download and domain cannot be created due to unresolvable hostname

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.2.2, 4.0_b86_RC2, 4.0
    • Fix Version/s: 4.0_b87_RC3
    • Component/s: admin
    • Labels:
      None
    • Environment:

      Fedora 17

      Description

      Fresh after download I unzip glassfish and do:

      ./asadmin start-domain

      and I get

      There is a process already using the admin port 4848 -- it probably is another instance of a GlassFish server.

      There is definitely nothing running on port 4848. I even changed the port in the GlassFish config and I still get

      There is a process already using the admin port 114849 -- it probably is another instance of a GlassFish server.

      Many other people are having the same issue:
      http://www.javaprogrammingforums.com/web-frameworks/19238-glassfish-error-process-already-using-port-4848-a.html
      http://www.java.net/forum/topic/glassfish/glassfish/glassfish-wont-start-because-port-4848-use-however-nothing-listening-port-4848
      http://www.java.net/forum/topic/glassfish/glassfish/glassfish-error-process-already-using-port-4848

          Activity

          Tom Mueller added a comment -

          Typically, the root cause of this issue is that the host name is not resolvable to an IP address.

          However, the process that GlassFish uses to check to see if a port is available is actually quite complicated, so there can be other reasons that this message is printed.

          At a minimum, the message must be rewritten, as it is possible for it to be output when there isn't another process already using the admin port. The message must be more accurate about the possible reasons for the problem. The trick will be to make the message concise while also conveying the possible reasons.

          Here is exactly what happens when the server is performing the check for which failure results in this message:

          1. StartServerHelper.checkPorts calls StartServerHelper.adminPortInUse(), which calls StartServerHelper.adminPortInUse(addresses) where addresses is a list of admin host names and port numbers. By default, this is just one entry, localhost:4848.

          2. adminPortInUse calls NetUtils.isPortFree on the host and port.

          3. isPortFree checks to see if the port is a valid port number (4848 is)

          4. isPortFree calls NetUtils.isThisMe to determine if this address represents an interface on this system. isThisMe makes a sequence of calls resulting in a call to InetAddress.getAllByName(InetAddress.getLocalHost().getHostName()). It calls this list myadds. Then it calls InetAddress.getAllByName(hostname), where hostname is "localhost" or whatever the admin host has been set to in the config, and calls this list theiradds. isThisMe then looks at the theiradds list, and if any one of them is the loopback address (based on calling isLoopbackAddress) or if the address is in the myadds list, then true is returned. Otherwise false is returned.

          5. Back in isPortFree, if isThisMe is true (which should be true for the default localhost:4848 case), it calls NetUtils.isPortFreeServer(port). Note that here the hostname is ignored. If isThisMe is false, then NetUtils.isPortFreeClient(host, port) is called.

          6. NetUtils.isPortFreeServer(port) checks to see if the port is free on three addresses: 0.0.0.0, InetAddress.getLocalHost(), and InetAddress.getByName("localhost"). It does this by calling NetUtils.isPortFreeServer(host, port) for each of them. All have to be available for this test to pass.

          7. NetUtils.isPortFreeServer(host, port) creates a ServerSocket (which internally does a bind on that socket). If that constructor throws an exception, the port is considered in use.
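The comparison in step 4 can be sketched roughly as follows. This is only an illustration of the logic as described above, not the NetUtils source: the class name IsThisMeSketch and the addr helper are made up, and the two lists are passed in as parameters, whereas the real isThisMe builds them itself with InetAddress.getAllByName.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

class IsThisMeSketch {
    /** Hypothetical helper: builds an address from literal octets, so no name lookup happens. */
    static InetAddress addr(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e); // only thrown for a wrong array length
        }
    }

    /** True if any of "their" addresses is loopback or is also one of "my" addresses. */
    static boolean isThisMe(List<InetAddress> myAdds, List<InetAddress> theirAdds) {
        for (InetAddress theirs : theirAdds) {
            if (theirs.isLoopbackAddress() || myAdds.contains(theirs)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<InetAddress> mine = Arrays.asList(addr(192, 0, 2, 10));
        System.out.println(isThisMe(mine, Arrays.asList(addr(127, 0, 0, 1))));  // true: loopback
        System.out.println(isThisMe(mine, Arrays.asList(addr(192, 0, 2, 10)))); // true: in myadds
        System.out.println(isThisMe(mine, Arrays.asList(addr(192, 0, 2, 99)))); // false: neither
    }
}
```

Note that in the flow described above, building myadds already requires InetAddress.getLocalHost() to succeed, so an unresolvable hostname can surface here before any port is tested.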

          Normally, if a server is really running on the port, this process will fail in step 7, and the current message reflects that failure. But there are several other places in this process where it can fail, and the message will be output even though there is no server running.

          The most typical case is in step 6, on the call to InetAddress.getLocalHost. If there is no entry for the host's hostname in /etc/hosts and DNS lookup on the hostname fails, then this call will throw an UnknownHostException, and the isPortFreeServer check fails.
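To make the failure mode in steps 6 and 7 concrete, here is a minimal sketch of how a lookup failure can end up looking like "port in use". It is written from the description above, not copied from NetUtils: the class PortCheckSketch, the LocalHostLookup interface (a stand-in for InetAddress.getLocalHost() so that the failure can be simulated) and the canBind helper are all assumptions, and InetAddress.getLoopbackAddress() is used where the description mentions InetAddress.getByName("localhost").

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.UnknownHostException;

class PortCheckSketch {
    /** Hypothetical stand-in for InetAddress.getLocalHost(), so a lookup failure can be simulated. */
    interface LocalHostLookup {
        InetAddress lookup() throws UnknownHostException;
    }

    /** Step 7: the bind done by the ServerSocket constructor is the real "port free" test. */
    static boolean canBind(InetAddress address, int port) {
        try (ServerSocket socket = new ServerSocket(port, 1, address)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Step 6 as described: all three binds must succeed; any exception reads as "not free". */
    static boolean isPortFreeServer(int port, LocalHostLookup localHost) {
        try {
            InetAddress wildcard = InetAddress.getByAddress(new byte[] {0, 0, 0, 0});
            return canBind(wildcard, port)
                    && canBind(localHost.lookup(), port)
                    && canBind(InetAddress.getLoopbackAddress(), port);
        } catch (UnknownHostException e) {
            // The hostname could not be resolved, yet the caller only sees "false".
            return false;
        }
    }

    public static void main(String[] args) {
        LocalHostLookup unresolvable = () -> { throw new UnknownHostException("bogusname"); };
        // Port 0 asks the OS for any free port, so the bind itself cannot collide.
        System.out.println("resolvable host:   " + isPortFreeServer(0, InetAddress::getLoopbackAddress));
        System.out.println("unresolvable host: " + isPortFreeServer(0, unresolvable));
    }
}
```

With the simulated unresolvable hostname the method reports the port as not free even though no bind on the local host address was ever attempted, which matches the misleading message in the report.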

          There are several questionable steps in this algorithm:

          a) Why does isPortFree call isPortFreeServer(port) rather than isPortFreeServer(host, port)? If it called the latter, this would avoid the call to InetAddress.getLocalHost in step 6.

          b) Why does StartServerHelper.adminPortInUse call isPortFree rather than isPortFreeServer(host, port), thereby bypassing the isThisMe call?

          Romain Grécourt added a comment -

          Could it be related to http://java.net/jira/browse/GLASSFISH-17990?

          walec51 added a comment -

          PS. My /etc/hosts file:

          127.0.0.1		localhost.localdomain localhost zrobmikompa.dev.pl
          ::1		localhost6.localdomain6 localhost6
          91.201.155.85    lupo.qcd.com
          91.201.155.85    testowy.qcd.com
          
          walec51 added a comment -

          reduced my hosts file to just:

          127.0.0.1		localhost
          

          still the same result

          Tom Mueller added a comment -

          Try running:
          nslookup `hostname`

          If this fails, then this GlassFish is not working for the same reason. Your host's hostname must be resolvable to an IP address with the current GlassFish implementation.
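The same check can be made from Java, which is closer to what the server itself does. A small sketch, assuming nothing beyond the JDK; the class name HostnameCheck is made up. One difference worth keeping in mind: nslookup queries DNS only, while InetAddress goes through the system resolver, which normally also consults /etc/hosts.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostnameCheck {
    /** Reports whether this machine's own hostname resolves, instead of throwing. */
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "resolved: " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolvable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If this prints an "unresolvable" line, the call to InetAddress.getLocalHost described earlier in this issue would fail in the same way.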

          walec51 added a comment - - edited
          [walec51@walec51-linux ~]$ nslookup `hostname`
          Server:		192.168.1.1
          Address:	192.168.1.1#53
          
          ** server can't find walec51-linux: NXDOMAIN
          
          [walec51@walec51-linux ~]$ nslookup localhost
          Server:		192.168.1.1
          Address:	192.168.1.1#53
          
          Name:	localhost
          Address: 127.0.0.1
          
          

          nslookup works fine with localhost.

          On Linux desktops the hostname is rarely in the hosts file.

          Shouldn't localhost be a fallback during startup if the hostname lookup fails?

          I use PostgreSQL, Tomcat, Jetty, JBoss and tons of other servers on my laptop. None of them has a problem starting up.

          Tom Mueller added a comment -

          The name resolution for your host name is not working fine (you got the message "** server can't find walec51-linux: NXDOMAIN")
          If you fix this, then GlassFish will work.

          You are correct that it would be better if GlassFish did not insist on the host name being resolvable in order to start. However that isn't how the 3.1.2.2 release works. I am trying to help you get the current software running.

          walec51 added a comment -

          Ok, thanks. I know how to work around this.

          I would just like to ask you to schedule this issue for some fixVersion so that people new to GlassFish don't get discouraged when they download it. I can try to hack the source a little over the weekend, but I have never worked with GlassFish's internals.

          walec51 added a comment -

          Just tested trunk of Glassfish 4 and the issue does not happen there.
          The stage distribution started without a problem in the same environment.

          Tom Mueller added a comment -

          This problem also prevents a domain from being created. The output from the create-domain command is:

          $ asadmin create-domain domain2
          Enter admin user name [Enter to accept default "admin" / no password]>
          Default port 4848 for Admin is in use. Using 53996
          Default port 8080 for HTTP Instance is in use. Using 53997
          Default port 7676 for JMS is in use. Using 53998
          Default port 3700 for IIOP is in use. Using 53999
          Default port 8181 for HTTP_SSL is in use. Using 54000
          Default port 3820 for IIOP_SSL is in use. Using 54001
          Default port 3920 for IIOP_MUTUALAUTH is in use. Using 54002
          Default port 8686 for JMX_ADMIN is in use. Using 54003
          Default port 6666 for OSGI_SHELL is in use. Using 54004
          Default port 9009 for JAVA_DEBUGGER is in use. Using 54005
          Port 53,996 is in use( com.sun.enterprise.admin.servermgmt.InvalidConfigException: Port 53,996 is in use )
          CLI130: Could not create domain, domain2
          Command create-domain failed.

          To recreate this problem, just run:

          sudo hostname bogusname

          (where bogusname is an unresolvable hostname).

          This problem is seen in GlassFish 4.

          Byron Nevins added a comment -

          After fixing:
          Notice how the message is logged only once. More precisely, it is logged once as a WARNING, and after that at FINE level every time the code is triggered; otherwise it would spew warnings.

          Bad Network Configuration. DNS can not resolve the hostname:
          java.net.UnknownHostException: bogusname.xyz.com: bogusname.xyz.com: nodename nor servname provided, or not known
          Using default port 4848 for Admin.
          Using default port 8080 for HTTP Instance.
          Using default port 7676 for JMS.
          Using default port 3700 for IIOP.
          Using default port 8181 for HTTP_SSL.
          Using default port 3820 for IIOP_SSL.
          Using default port 3920 for IIOP_MUTUALAUTH.
          Using default port 8686 for JMX_ADMIN.
          Using default port 6666 for OSGI_SHELL.
          Using default port 9009 for JAVA_DEBUGGER.
          Distinguished Name of the self-signed X.509 Server Certificate is:
          [CN=localhost,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
          Distinguished Name of the self-signed X.509 Server Certificate is:
          [CN=localhost-instance,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]

          Byron Nevins added a comment -

          1) hostname bogusname.xyz.com

          2) create-domain d2

          3) start-domain d2

          Results:
          ~/dev/main/nucleus/common/common-util> asadmin start-domain d2
          Bad Network Configuration. DNS can not resolve the hostname:
          java.net.UnknownHostException: bogusname.xyz.com: bogusname.xyz.com: nodename nor servname provided, or not known
          Waiting for d2 to start .....
          Successfully started the domain : d2
          domain Location: /Users/wnevins/glassfish4/glassfish/domains/d2
          Log File: /Users/wnevins/glassfish4/glassfish/domains/d2/logs/server.log
          Admin Port: 4848
          Command start-domain executed successfully.

          Byron Nevins added a comment -

          This is an annoying issue if the user has his hostname set wrong. It is NOT catastrophic if that is the case. But we currently treat it that way. Instead, it should be logged as a WARNING.

          Specifically, whether or not the hostname is set up correctly has nothing to do with whether or not a given port is free.

          This has a big impact on customers. The error says that, say, port 4848 is in use. The user checks and finds that nothing is using port 4848. Confusion sets in at that point.

          It is unlikely that the customer will bump into this issue. He would have to have a bad hostname, i.e. a hostname that can't be resolved by DNS. With the fix we emit a warning with the exact problem so that he can now fix it permanently.

          The cost to fix it is minimal; in fact it is already fixed and waiting to go in. If it doesn't go into 4.0 it'll go into 4.0.1.

          The fix is not too complicated. The main complication is setting it up to emit only one warning message, then fine messages after that.
          The actual fix itself is simply catching the right exception at the exact right place and swallowing it rather than turning it into a fatal error.
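The shape of such a fix might look like the sketch below. The committed change to NetUtils.java is not shown in this issue, so everything here is an assumption made for illustration: the class WarnOnceSketch, the LocalHostLookup interface (standing in for InetAddress.getLocalHost()) and the choice to return null for an unresolvable host. Only the log message text is taken from the output posted above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

class WarnOnceSketch {
    /** Hypothetical stand-in for InetAddress.getLocalHost(), so a lookup failure can be simulated. */
    interface LocalHostLookup {
        InetAddress lookup() throws UnknownHostException;
    }

    private static final Logger LOGGER = Logger.getLogger(WarnOnceSketch.class.getName());
    private final AtomicBoolean warned = new AtomicBoolean(false);

    /** WARNING on the first call, FINE on every later call. */
    Level nextLevel() {
        return warned.compareAndSet(false, true) ? Level.WARNING : Level.FINE;
    }

    /** Returns the local address, or null when the hostname cannot be resolved. */
    InetAddress localHostOrNull(LocalHostLookup lookup) {
        try {
            return lookup.lookup();
        } catch (UnknownHostException e) {
            // Swallow the exception: a bad hostname says nothing about whether a port is free.
            LOGGER.log(nextLevel(), "Bad Network Configuration. DNS can not resolve the hostname: " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        WarnOnceSketch check = new WarnOnceSketch();
        LocalHostLookup unresolvable = () -> { throw new UnknownHostException("bogusname.xyz.com"); };
        check.localHostOrNull(unresolvable); // logged at WARNING
        check.localHostOrNull(unresolvable); // logged at FINE, hidden by the default console handler
    }
}
```

A caller doing the port check could then skip the bind on the local host address when null comes back, and still test the remaining addresses.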

          There is little risk. Automated tests, including QuickLook, test this area all the time.

          No doc impact.

          QA need only run their usual standard tests.

          Tom Mueller added a comment -

          Approved for 4.0

          Byron Nevins added a comment -

          QL before the change (with a bad hostname)

          testng-summary:
          [echo] [testng] ===============================================
          [echo] [testng] QuickLookTests
          [echo] [testng] Total tests run: 117, Failures: 44, Skips: 21
          [echo] [testng] Configuration Failures: 1, Skips: 0
          [echo] [testng] ===============================================
          [echo] [testng]
          [INFO] Executed tasks

          QL After the change (still with bad hostname)

          testng-summary:
          [echo] [testng]
          [echo] [testng] ===============================================
          [echo] [testng] QuickLookTests
          [echo] [testng] Total tests run: 117, Failures: 0, Skips: 0
          [echo] [testng] ===============================================
          [echo] [testng]
          [INFO] Executed tasks
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESS

          Byron Nevins added a comment -

          Sending common/common-util/src/main/java/com/sun/enterprise/util/CULoggerInfo.java
          Sending common/common-util/src/main/java/com/sun/enterprise/util/net/NetUtils.java
          Transmitting file data ..
          Committed revision 61658.

          Done


            People

            • Assignee:
              Byron Nevins
              Reporter:
              walec51
            • Votes:
              0
              Watchers:
              2