We are looking at several other test cases that show essentially the same thing in different contexts: ServiceLocatorImpl.getService() is proving to be far slower than Habitat.getComponent().
For example, if we create and remove stateful beans in 4.0, there is about a 5-10% regression vs. 3.1.2. This all tracks back to extra time spent in getService() vs getComponent(). In this particular use case, the calls to getService come from four locations: two calls in com.sun.enterprise.naming.impl.SerialContext.<init> (the first to retrieve the ProcessEnvironment, the second to retrieve the common class loader), and two calls in com.sun.ejb.containers.BaseContainer – one in createEjbInstanceAndContext and one in injectEjbInstance, each of which is retrieving the JCDI service.
Part of the regression is that there is still some lock contention on the service LRU cache. The remaining contention could be lessened somewhat if the CacheKey were constructed before acquiring the lock, and if the CacheKey constructor computed the hash code once and stored it in an instance variable for hashCode() to return. I did that and shaved the regression down by 33% (to 6.6%). [Though to be clear, this is a pretty focused micro-benchmark, and lock contention points are greatly exaggerated by it.]
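To make the CacheKey suggestion concrete, here is a minimal sketch of the idea. It is not the actual HK2 code: the CacheKey fields, the ServiceCache class, and MAX_ENTRIES are hypothetical stand-ins. The hash is computed once in the constructor, and the key is built before the lock is taken, so only the map access happens while the lock is held.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type; the real CacheKey fields may differ.
final class CacheKey {
    private final String typeName;
    private final String name;   // may be null
    private final int hash;      // computed once, in the constructor

    CacheKey(String typeName, String name) {
        this.typeName = Objects.requireNonNull(typeName);
        this.name = name;
        this.hash = 31 * typeName.hashCode() + (name == null ? 0 : name.hashCode());
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on each map operation
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CacheKey)) return false;
        CacheKey other = (CacheKey) o;
        return hash == other.hash
                && typeName.equals(other.typeName)
                && Objects.equals(name, other.name);
    }
}

// Hypothetical stand-in for the service LRU cache.
final class ServiceCache {
    private static final int MAX_ENTRIES = 128; // assumed size, for illustration only

    private final Object lock = new Object();
    private final Map<CacheKey, Object> lru =
            new LinkedHashMap<CacheKey, Object>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<CacheKey, Object> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

    Object get(String typeName, String name) {
        CacheKey key = new CacheKey(typeName, name); // allocation and hashing outside the lock
        synchronized (lock) {
            return lru.get(key);                     // only the map access is serialized
        }
    }

    void put(String typeName, String name, Object service) {
        CacheKey key = new CacheKey(typeName, name);
        synchronized (lock) {
            lru.put(key, service);
        }
    }
}
```

The lock is still needed on get() because an access-ordered LinkedHashMap mutates its internal ordering on reads; the sketch only shortens the critical section, it does not remove it.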
I changed the GlassFish code for the four calls in question so that the first call saves the service in a static and subsequent calls use that; that eliminated the remaining regression for this test case. I'm still not quite clear on the semantics here; I guess that since these are not injected it is not similar to the InvocationManagerImpl case, and possibly in this case it isn't legal to cache the service retrieval like this. Still, the root of the iterator case seems to be this as well: the new, somewhat slower calls to the service locator. [Though if I'm wrong and it should be a different bug, let me know and I'll open another one.]
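For reference, the static caching experiment looks roughly like the sketch below. The Locator interface, CountingLocator, and CachedLookup are hypothetical stand-ins rather than the real HK2 or GlassFish types, and the sketch assumes the service is effectively a singleton for the life of the process, which is exactly the part I'm unsure is legal.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the service locator; not the HK2 interface.
interface Locator {
    <T> T getService(Class<T> type);
}

// Test double that counts how often the (slow) lookup path is taken.
final class CountingLocator implements Locator {
    final AtomicInteger lookups = new AtomicInteger();

    @Override
    public <T> T getService(Class<T> type) {
        lookups.incrementAndGet();
        if (type == String.class) {
            return type.cast("process-environment"); // placeholder service object
        }
        return null;
    }
}

final class CachedLookup {
    // Assumption: the service never changes once resolved.
    private static volatile String cached;

    static String get(Locator locator) {
        String local = cached;
        if (local == null) {
            // Benign race: under contention two threads may both do the lookup,
            // but both store the same singleton, so no lock is needed.
            local = locator.getService(String.class);
            cached = local;
        }
        return local;
    }
}
```

After the first call, every caller reads one volatile field and never touches the locator or its LRU cache, which is why the contention disappears in the micro-benchmark.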