Details
-
Type:
Bug
-
Status:
Reopened
-
Priority:
Critical
-
Resolution: Unresolved
-
Affects Version/s: 3.0.RC2
-
Fix Version/s: 3.0.RC3
-
Component/s: Core, Reverse Ajax
-
Documentation Required:No
-
- Description:
-
HideThis happens at random in our application and the CPU usage goes to 100% and have to kill the offending thread to release the cpu.
Below is the stack trace captured using JConsole. Environment details: JDK 1.5, Tomcat 6.0.35
---------------
Name: ajp-8087-4
State: RUNNABLE
Total blocked: 1,613 Total waited: 17,518
Stack trace:
java.util.HashMap.put(HashMap.java:420)
java.util.HashSet.add(HashSet.java:194)
org.directwebremoting.impl.DefaultScriptSessionManager.associateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:184)
org.directwebremoting.impl.DefaultScriptSessionManager.getScriptSession(DefaultScriptSessionManager.java:126)
org.directwebremoting.impl.DefaultWebContext.checkPageInformation(DefaultWebContext.java:87)
org.directwebremoting.dwrp.BaseCallHandler.handle(BaseCallHandler.java:97)
org.directwebremoting.servlet.UrlProcessor.handle(UrlProcessor.java:120)
org.directwebremoting.servlet.DwrServlet.doPost(DwrServlet.java:141)
javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:448)
org.apache.coyote.ajp.AjpAprProtocol$AjpConnectionHandler.process(AjpAprProtocol.java:399)
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1675)
java.lang.Thread.run(Thread.java:595)
----------------
Name: ajp-8087-5
State: RUNNABLE
Total blocked: 544 Total waited: 23,961
Stack trace:
java.util.HashMap.removeEntryForKey(HashMap.java:614)
java.util.HashMap.remove(HashMap.java:584)
java.util.HashSet.remove(HashSet.java:204)
org.directwebremoting.impl.DefaultScriptSessionManager.disassociateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:206)
org.directwebremoting.impl.DefaultScriptSessionManager.invalidate(DefaultScriptSessionManager.java:310)
org.directwebremoting.impl.DefaultScriptSession.invalidate(DefaultScriptSession.java:138)
org.directwebremoting.impl.DefaultScriptSessionManager.checkTimeouts(DefaultScriptSessionManager.java:359)
org.directwebremoting.impl.DefaultScriptSessionManager.maybeCheckTimeouts(DefaultScriptSessionManager.java:328)
org.directwebremoting.impl.DefaultScriptSessionManager.getScriptSession(DefaultScriptSessionManager.java:90)
org.directwebremoting.impl.DefaultWebContext.checkPageInformation(DefaultWebContext.java:87)
org.directwebremoting.dwrp.BaseCallHandler.handle(BaseCallHandler.java:97)
org.directwebremoting.servlet.UrlProcessor.handle(UrlProcessor.java:120)
org.directwebremoting.servlet.DwrServlet.doPost(DwrServlet.java:141)
javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:448)
org.apache.coyote.ajp.AjpAprProtocol$AjpConnectionHandler.process(AjpAprProtocol.java:399)
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1675)
java.lang.Thread.run(Thread.java:595)
ShowThis happens at random in our application and the CPU usage goes to 100% and have to kill the offending thread to release the cpu. Below is the stack trace captured using JConsole. Environment details: JDK 1.5, Tomcat 6.0.35 --------------- Name: ajp-8087-4 State: RUNNABLE Total blocked: 1,613 Total waited: 17,518 Stack trace: java.util.HashMap.put(HashMap.java:420) java.util.HashSet.add(HashSet.java:194) org.directwebremoting.impl.DefaultScriptSessionManager.associateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:184) org.directwebremoting.impl.DefaultScriptSessionManager.getScriptSession(DefaultScriptSessionManager.java:126) org.directwebremoting.impl.DefaultWebContext.checkPageInformation(DefaultWebContext.java:87) org.directwebremoting.dwrp.BaseCallHandler.handle(BaseCallHandler.java:97) org.directwebremoting.servlet.UrlProcessor.handle(UrlProcessor.java:120) org.directwebremoting.servlet.DwrServlet.doPost(DwrServlet.java:141) javax.servlet.http.HttpServlet.service(HttpServlet.java:637) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470) org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:448) org.apache.coyote.ajp.AjpAprProtocol$AjpConnectionHandler.process(AjpAprProtocol.java:399) org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1675) java.lang.Thread.run(Thread.java:595) ---------------- Name: ajp-8087-5 State: RUNNABLE Total blocked: 544 Total waited: 23,961 Stack trace: java.util.HashMap.removeEntryForKey(HashMap.java:614) java.util.HashMap.remove(HashMap.java:584) java.util.HashSet.remove(HashSet.java:204) org.directwebremoting.impl.DefaultScriptSessionManager.disassociateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:206) org.directwebremoting.impl.DefaultScriptSessionManager.invalidate(DefaultScriptSessionManager.java:310) org.directwebremoting.impl.DefaultScriptSession.invalidate(DefaultScriptSession.java:138) org.directwebremoting.impl.DefaultScriptSessionManager.checkTimeouts(DefaultScriptSessionManager.java:359) org.directwebremoting.impl.DefaultScriptSessionManager.maybeCheckTimeouts(DefaultScriptSessionManager.java:328) org.directwebremoting.impl.DefaultScriptSessionManager.getScriptSession(DefaultScriptSessionManager.java:90) org.directwebremoting.impl.DefaultWebContext.checkPageInformation(DefaultWebContext.java:87) org.directwebremoting.dwrp.BaseCallHandler.handle(BaseCallHandler.java:97) org.directwebremoting.servlet.UrlProcessor.handle(UrlProcessor.java:120) org.directwebremoting.servlet.DwrServlet.doPost(DwrServlet.java:141) javax.servlet.http.HttpServlet.service(HttpServlet.java:637) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470) org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:448) org.apache.coyote.ajp.AjpAprProtocol$AjpConnectionHandler.process(AjpAprProtocol.java:399) org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1675) java.lang.Thread.run(Thread.java:595)
Attachments
-
- tomcat-thread.JPG
- (178 kB)
Activity
I don't see a deadlock here.
protected final Map<String, Set<String>> sessionXRef = new ConcurrentHashMap<String, Set<String>>();
protected void associateScriptSessionAndHttpSession(DefaultScriptSession scriptSession, String httpSessionId)
{
if (httpSessionId == null)
{
return;
}
scriptSession.setAttribute(ATTRIBUTE_HTTPSESSIONID, httpSessionId);
Set<String> scriptSessionIds = sessionXRef.get(httpSessionId);
if (scriptSessionIds == null)
{
scriptSessionIds = new HashSet<String>();
sessionXRef.put(httpSessionId, scriptSessionIds);
}
scriptSessionIds.add(scriptSession.getId()); // line 184
}
protected void disassociateScriptSessionAndHttpSession(DefaultScriptSession scriptSession)
{
Object httpSessionId = scriptSession.getAttribute(ATTRIBUTE_HTTPSESSIONID);
if (httpSessionId == null)
{
return;
} }
Set<String> scriptSessionIds = sessionXRef.get(httpSessionId);
if (scriptSessionIds == null)
scriptSessionIds.remove(scriptSession.getId()); // line 204
if (scriptSessionIds.size() == 0)
scriptSession.setAttribute(ATTRIBUTE_HTTPSESSIONID, null);
}
Please look into this and report back or I will be closing the issue over the next week. Our Jira system is only for DWR issues and we ask that in the future you follow the directions on the support page or the Jira login page before creating issues.
In JConsole, I'm getting the stacktrace using the Thread tab> select the individual thread. Please see attached screenshot. The thread that has state as "RUNNABLE" for a long time (> 30 seconds) is what is causing the cpu to go 100%. It is always thread that ends in disassociateScriptSessionAndHttpSession() and associateScriptSessionAndHttpSession() methods is at runnable state for long time. If I kill those two threads the cpu goes down to almost 0. I'm not sure if I should call this a deadlock or something different. Does that help?
To answer your question about usage, there are an average of 100 active tomcat sessions at peak usage time.
Please follow the JConsole documentation in order to find deadlocks. The stack trace you attached does not in itself indicate a problem and surely does not indicate a deadlock. Here is the JConsole documentation: http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html (search for deadlocks).
Based on what I have looked at so far this does not appear to be an issue with DWR.
Also, are you using reverse AJAX? Our reverse AJAX does not support Tomcat's asynchronous API's so each request is blocking. I would check your Tomcat configuration and see how many threads you are configured to handle.
I will follow that JConsole documentation the next time this occurs. The application did not have this issue when it was using DWR 2.0 for past two years. 3 weeks ago we upgraded to 3.0.RC2 and noticing this issue at random. Like I said earlier, the top of the trace always ends with DefaultScriptSessionManager which leads me to believe it is something in DWR causing this. One thread is trying to add to the ConcurrentHashMap and other is trying to remove from it.
I will keep you posted but please do not close this ticket.
Thanks.
Also, I would recommend installing the Tomcat Manager (if it is not already installed) and monitoring Tomcat when this happens. I will look at the code in more details we may need to make some changes.
The application does not have any functionality that uses the Reverse AJAX feature. But I noticed in the web.xml, activeReverseAjaxEnabled is set to true.
The value in the ConcurrentMap is a HashSet (not synchronized). I will make some changes here but I am not sure it will resolve the problems you are having.
Ok, thanks. I will turn off the reverse ajax and redeploy the app tonight to see if that helps.
I just checked in a fix:
http://ci.directwebremoting.org/bamboo/browse/DWRTRUNK-ALL-422/artifact
Give it a try and let us know.
Is this the only change?
Line #181: scriptSessionIds = new HashSet<String>();
Changed to
Line #181: scriptSessionIds = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
We are using Java 1.5 and Collections.newSetFromMap() is available starting with Java 1.6
Ok, I was afraid of that I missed the 1.5 in the description.
I just made a fix that will be compatible with 1.5, try the next build.
Ok great. I'll try it out tomorrow. Thanks for the quick responses!
synchronized(scriptSessionIds) throws NullPointerException at the first attempt to invoke dwr.
I can confirm same behaviour with dwr 3.0rc2, java 6 and tomcat 7. It happend to us three times during last month. No deadlock just high cpu usage for threads with stacktraces like:
java.lang.Thread.State: RUNNABLE
at java.util.HashMap.put(HashMap.java:374)
at java.util.HashSet.add(HashSet.java:200)
at org.directwebremoting.impl.DefaultScriptSessionManager.associateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:184)
java.lang.Thread.State: RUNNABLE
at java.util.HashMap.removeEntryForKey(HashMap.java:569)
at java.util.HashMap.remove(HashMap.java:538)
at java.util.HashSet.remove(HashSet.java:216)
at org.directwebremoting.impl.DefaultScriptSessionManager.disassociateScriptSessionAndHttpSession(DefaultScriptSessionManager.java:206)
Corrupted hashmap maybe?
Jiri, you are using the latest 2.x build from bamboo per my latest comment? You can't possibly be using 3.0 RC2 as you claim as those methods don't even exist is 3.0 RC2.
Sorry, I mispoke. I was looking at the wrong branch. The fix that I put in for this is in the latest 3.x build on Bamboo. If you want to get a fix you need to try out that build. Please report back with your findings.
http://ci.directwebremoting.org/bamboo/browse/DWRTRUNK-ALL-424/artifact
T Tiger if you can write back that would be polite. It is frustrating when people create issues and then don't write back with their findings.
Hello David, I apologize for the delay in replying. The fix in build 424 seems to have worked because we deployed it in production four days ago and its been stable so far.
Thanks again.
Awesome. Thanks T. I will close this issue. Please re-open it if you determine there is still an issue.
I have met the bug again:
23 threads with following stacktrace taking 100 % cpu:
"ajp-bio-8009-exec-1086" daemon prio=10 tid=0x00002aaad120d800 nid=0x5dcc runnable [0x00000000563c7000]
java.lang.Thread.State: RUNNABLE
at java.util.HashMap.removeEntryForKey(HashMap.java:569)
at java.util.HashMap.remove(HashMap.java:538)
at java.util.HashSet.remove(HashSet.java:216)
at org.directwebremoting.impl.DefaultScriptSessionManager.disassociateScriptSessionAndPage(DefaultScriptSessionManager.java:261)
at org.directwebremoting.impl.DefaultScriptSessionManager.invalidate(DefaultScriptSessionManager.java:315)
at org.directwebremoting.impl.DefaultScriptSession.invalidate(DefaultScriptSession.java:138)
at org.directwebremoting.impl.DefaultScriptSessionManager.checkTimeouts(DefaultScriptSessionManager.java:365)
at org.directwebremoting.impl.DefaultScriptSessionManager.maybeCheckTimeouts(DefaultScriptSessionManager.java:334)
at org.directwebremoting.impl.DefaultScriptSessionManager.getScriptSession(DefaultScriptSessionManager.java:90)
at org.directwebremoting.impl.DefaultWebContext.checkPageInformation(DefaultWebContext.java:87)
at org.directwebremoting.dwrp.BaseCallHandler.handle(BaseCallHandler.java:97)
at org.directwebremoting.servlet.UrlProcessor.handle(UrlProcessor.java:118)
at org.directwebremoting.spring.DwrController.handleRequestInternal(DwrController.java:234)
dwr-version.properties:
#Fri Apr 13 12:26:55 PDT 2012
bamboo.build.number=424
major=3
revision=0
minor=0
title=RC3-dev
java 1.6.0_30 and tomcat 7
Looks like a slightly different error. I don't think going to 422 will help you. I will take a look.
I saw one or two things that might still be possible to improve in the code, so I'll have a go at doing that.
Mike, I am assigning this to your per your most recent comment.
How exactly are you capturing the trace in JConsole? Are you selecting the "findMonitorDeadlockedThreads" button? If so you should be able to provide us with more information. Please see the JConsole docs: http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html.
It looks like you have a heavy load on your Tomcat servers is that true?