DWR version 3.0 Release Candidate 1
DWR version 3.0
The much-awaited DWR version 3.0 has reached release candidate 1. What's new?
- RPC Enhancements
  - Varargs support
  - Method overloading (DWR tries to copy Java's method matching rules)
  - Typed parameters (so you can say new Apple() in JavaScript and pass it to the addFruit() method and DWR will instantiate the correct type on the server)
  - Lightweight typed parameters (as above, but by adding $dwrClassName:"Apple", for when you are getting the objects from something else)
  - More natural synchronous XHR (so you can call var reply = Remote.getData() when doing 'Sjax')
- Improved Marshalling
  - Binary file upload/download (byte[], java.awt.BufferedImage, InputStream etc and FileTransfer can be uploaded from an input type=file, offered for download, or sent to an img)
  - Functions (Store a reference to a JavaScript function on the server for later execution)
  - Objects by Reference (Store a reference to a JavaScript object, and then call methods on that)
  - Locale, Currency (DWR will marshal to and from java.util.Locale and java.util.Currency objects)
- Reverse Ajax
  - JavaScript can now implement a Java interface (For simple integration with Java Events/Listeners)
  - More scalable Reverse Ajax APIs (See org.directwebremoting.Browser)
  - DOM Manipulation Library (Window and Document can now be manipulated from the server)
  - The server now runs in 3 modes: stateless (New - save memory with no page tracking), passiveReverseAjax (the default) and activeReverseAjax (comet enabled)
- TIBCO GI Integration
  - Complete set of Reverse Ajax Proxy APIs (So you can manipulate your GI user interface from Java on the server)
- Dojo Integration
  - Data Store (Keep a server side data store in sync with data in a client browser with both sides able to send updates. The data store also supports paging, sorting and filtering)
  - Packaging Integration (dojo.require all your DWR scripts)
- Server Support
  - Asynchronous servlet support for Tomcat and Glassfish
  - Improved Spring and Guice support
- Over the wire
  - JSONP support
  - JSON-RPC support
- Tech Previews
  - JMS Integration (Publish to the browser directly from JMS)
  - Aptana Jaxer Integration (Zero configuration for trusted environments)
- Infrastructure
  - SVN (We've moved from CVS to SVN)
  - Related Projects (Our repository contains a set of related projects including a number of demos)
  - CLA (We've been through a legal review and have signed CLAs for dwr.jar)
  - Dojo Foundation (We joined the Dojo Foundation and are now hosted by their servers)
  - Better Documentation (DWR version 1.x had great docs. Version 2.x let things slide a bit, but we've dropped Drupal, and have our own system now)
There are also a bunch of things like better logging, error reporting and so on, but the full list would get quite dull. 2 things dropped out that we'd previously talked about: Bayeux Support and Gears Offline integration. We'll get to those, particularly Gears, soon.
I'm sure there will be lots of questions about how to use these features. Please don't ask in the comments; join the mailing list and ask there. As we roll out the new documentation system in the next week or so, all the details will be in there, and I'll then come back and link up this blog post.
You can download it now.
XSS Filtering
If you want to protect yourself from an XSS attack, what characters should you escape? I've seen 2 recommendations:
- ', ", <, > and & should be converted to &#39;, &quot;, &lt;, &gt; and &amp;
- Convert anything that isn't ASCII alphanumeric to &#xx;
I've seen the second recommended more and more recently. Which is best?
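To make the comparison concrete, here is a minimal sketch of the two recommendations as JavaScript functions. The function names are made up for illustration, and the choice of &#39; for the apostrophe and of decimal rather than hex references are assumptions of this sketch, not something either recommendation prescribes.

```javascript
// Recommendation 1: escape only the five characters ', ", <, > and &.
// Using &#39; for the apostrophe is an assumption of this sketch.
function escapeFive(text) {
  const entities = {
    "'": "&#39;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
  };
  return text.replace(/['"<>&]/g, (ch) => entities[ch]);
}

// Recommendation 2: convert everything that is not ASCII alphanumeric
// to a numeric character reference (decimal here).
function escapeNonAlphanumeric(text) {
  return Array.from(text, (ch) =>
    /^[A-Za-z0-9]$/.test(ch) ? ch : "&#" + ch.codePointAt(0) + ";"
  ).join("");
}

console.log(escapeFive("<a href=\"x\">T&J's</a>"));
console.log(escapeNonAlphanumeric("a<b c"));
```

The second function produces noticeably longer output for the same input, since every space and punctuation mark becomes a reference.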
The argument for escaping all non-ASCII alphanumeric
It's a known security tenet that whitelisting is safer than blacklisting. If you're just escaping ', ", <, > and & then you're blacklisting, which isn't as safe as whitelisting.
There are some practical examples of how this can play out -
<a href="$">
(I'm using $ to represent the injection point. This would probably crop up in a template something like this:
<a href="<%= escape(userInput) %>">)
If all the escape() function does is to escape ', ", <, > and &, then what if the user entered a data: URL? You could end up with the following output:
<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4K">test</a>
Which in case you can't do base64 in your head is equivalent to this:
<a href="data:text/html,<script>alert('XSS')</script>">test</a>
Clearly this is bad - we've let a user XSS us even though we are filtering for XSS. There are many more examples that are similar.
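A short sketch of why the five-character filter lets this through: the data: URL contains none of the five characters, so the filter returns it unchanged. The escapeFive function is an illustrative stand-in for the escape() in the template, and Buffer assumes the snippet runs under Node.js.

```javascript
// Illustrative stand-in for the template's escape() function.
function escapeFive(text) {
  const entities = {
    "'": "&#39;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
  };
  return text.replace(/['"<>&]/g, (ch) => entities[ch]);
}

const payload =
  "data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4K";

// None of the five characters occur in the payload, so nothing is escaped.
const escaped = escapeFive(payload);
console.log(escaped === payload);

// Decode the base64 part to see the markup hidden inside the URL.
// Buffer is Node.js specific; this is an assumption of the sketch.
const base64Part = payload.split(",")[1];
const decoded = Buffer.from(base64Part, "base64").toString("utf8");
console.log(JSON.stringify(decoded));
```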
The argument for escaping only ', ", <, > and &
The bad news is that more filtering does not help. If we enhance our escape function to encode every non-alphanumeric character, then we would get the following output:
<a href="data&#58;text&#47;html&#59;base64&#44;PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4K">test</a>
Here's the bad news - the above works. (Look: test (if this script gets into your RSS aggregator, then you need a new RSS aggregator.))
Adding the extra filtering has had the following effect:
- It's hidden the hole, so now we're less likely to notice it and more likely to fall in.
- It's wasted bandwidth
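Both effects can be seen in a small sketch. It assumes decimal character references and models the browser's attribute-value decoding with a deliberately simplified helper; both function names are hypothetical.

```javascript
// Encode everything that is not ASCII alphanumeric as a decimal reference.
function escapeNonAlphanumeric(text) {
  return Array.from(text, (ch) =>
    /^[A-Za-z0-9]$/.test(ch) ? ch : "&#" + ch.codePointAt(0) + ";"
  ).join("");
}

// Simplified model of what an HTML parser does to numeric character
// references inside an attribute value (decimal references only).
function decodeNumericReferences(text) {
  return text.replace(/&#(\d+);/g, (_, code) =>
    String.fromCodePoint(Number(code))
  );
}

const payload =
  "data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4K";

const escaped = escapeNonAlphanumeric(payload);
console.log(escaped);

// After the parser decodes the references, the original URL is back.
console.log(decodeNumericReferences(escaped) === payload);

// The extra characters sent over the wire bought nothing.
console.log(escaped.length - payload.length);
```

The escaped form looks harmless in the page source, yet the value the browser acts on is the same data: URL as before.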
So how do we keep ourselves clear of XSS attacks?
The solution is to understand insertion points.
The following insertion points are ones that I believe are safe if ', ", <, > and & are escaped:
- <div>$</div> (Where div could be p, h*, li, etc - things expecting textual content)
- <input value="$" ...> (i.e. somewhere else that expects textual content)
- <script>str = "$";</script> (needs different escaping rules)
I think virtually any other insertion point is likely to be dangerous. Some examples:
- <script>$</script> (no amount of escaping will protect you, prepare to die)
- <div $> (there are countless events we could latch into, including several non-standard, hard to find ones)
- <div style="$">... (JavaScript pops up in CSS in many places like width:expression(script_here))
- <a href="$">... (The example we used above)
- <img src="$"> (For similar reasons)
- etc.
The key is to understand the environment into which we are allowing injection. The trend for separating content, style and action into separate files is good because it more clearly defines the environment, but that doesn't stop HTML from being able to embed CSS.
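One way to put the insertion-point idea into code is an escaper that has to be told where the value is going, and that refuses any insertion point it has no rule for. This is only a sketch: the context names are hypothetical, and the \uXXXX rule for JavaScript strings is an assumption standing in for the "different escaping rules" mentioned above.

```javascript
// Escape the five characters; used for textual content and quoted attributes.
function escapeFive(text) {
  const entities = {
    "'": "&#39;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
  };
  return text.replace(/['"<>&]/g, (ch) => entities[ch]);
}

// Assumption of this sketch: inside a quoted JavaScript string, every
// UTF-16 code unit that is not ASCII alphanumeric becomes a \uXXXX escape.
function escapeJsString(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    out += /[A-Za-z0-9]/.test(ch)
      ? ch
      : "\\u" + text.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return out;
}

// Hypothetical context names, one per insertion point believed to be safe.
const escapers = new Map([
  ["elementText", escapeFive], // <div>$</div>
  ["quotedAttribute", escapeFive], // <input value="$" ...>
  ["jsString", escapeJsString], // <script>str = "$";</script>
]);

// Refuse to escape for any insertion point that is not on the list.
function escapeFor(context, text) {
  const escaper = escapers.get(context);
  if (!escaper) {
    throw new Error("No safe escaping known for insertion point: " + context);
  }
  return escaper(text);
}

console.log(escapeFor("elementText", "<b>"));
console.log(escapeFor("jsString", 'a"b'));
```

Calling escapeFor("href", ...) throws rather than returning something that merely looks escaped, which keeps the hole visible.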
I once saw some code that was JSP containing Java containing HTML containing CSS and JavaScript containing SQL all on one line. An environment so confused that it contained its very own security hole built right in.
Filtering in DWR
DWR version 3 is nearly cooked, and our escaping functions use the simpler escaping system of just escaping ', ", <, > and &. If anyone knows of any attack that a broader filtering system would protect people from, then please comment.
Why is the web the default development platform?
10 years ago the default was probably VB6/Windows; these days it’s just the web. Why?
If we don't know what's right about the web, then it's hard to know how to build on the success.
4 reasons why; and they've all got something vital in common.
Zero Install
This is the easy one. In retrospect it's amazing that it took so long for the concept of a 'player' to catch on. A player (like a web browser, flash runtime etc) allows the download component to be smaller, it can provide a sandbox to keep the user safe and it can provide many of the libraries that otherwise would have been part of the install.
Zero install saves time, builds trust, reduces clicks and confusion, saves space and is much easier to use.
UI Model
HTML makes it hard to create overlapping windows, complex dialogs, hidden options, deep menu structures - all the things that can make traditional applications harder to use. It's easy, when someone doesn't 'get' an application, to think that they've got a problem, and that they need a training course.
But it would be better if we saw it another way: the application author has a problem - their app is too complex. Developers are very good at coping with complexity, but too often assume that others are too.
The web makes it hard to pass the complexity on, so people create simple UIs instead.
Lazy Text
Lazy Text means that web pages are:
- hackable. Which means advanced users can scrape, mash and plot, and normal users can embed YouTube videos in their blogs.
- debuggable. Which makes them easier to fix, even outside of the development environment.
- learnable. Which means HTML can be taught in most schools.
- Postel’s Law compliant. Which means they work. Postel’s Law makes browsers harder to write, can make pages a mess and is a disaster for security, but there are no exceptions to Postel’s Law.
Openness
Creating the development platform for the world is quite a responsibility. It would be a mistake to give it to Dr. Evil. There are degrees of openness, and while the web is not in the ideal position, it does appear that there are forces currently taking it in that direction.
Probably, the ideal position would have:
- Open patent free specs
- Multiple implementations, at least one being open source
- No monopolies
Clearly the web does not have a monopoly-free position - it’s not as open as most people would like, but at the moment, each month we are getting closer to the ideal.
What do these 4 have in common?
Zero Install means that the web will scale to billions of pages.
Lazy Text means that the web will scale to millions of developers.
A Simple UI Model means that the web will scale to billions of users.
Openness means that the web will scale to thousands or millions of enablers (the creators of browsers, servers, development tools, etc)
The web is probably the most scalable system that anyone has ever designed. The Chinese Army, the Indian State Railways and the UK Health Service are all big, but they're nothing on the size of the web. It’s perhaps not surprising that what made it successful was its scalability potential.
(Note: This is taken from a talk I gave at JBoye08 on the Open Web. See the rest of the slides at SlideShare.)
Defining The Open Web
Brad asked what the 'Open Web' is. Twice. My mum was always cross if she had to ask 3 times, so here's my stab.
The Open Web is the user-remixable technologies that are shipped by the clear majority of major browsers
So, for example:
- XHTML 2.0 is not part of the open web because the browsers didn't go for it even though the W3C did.
- XMLHttpRequest is part of the open web even though the W3C haven't gone for it (yet) because it's in all the browsers.
- Canvas is part of the open web since it's in everything except IE these days.
- XUL is not part of the open web and wouldn't be even if Opera supported it, but if Opera and Safari did, then it would.
Some rationale might be needed:
'User-Remixable Technologies'
One of the key things about the web is how anyone can take part. It's easy for technologists to dismiss what 'normal people' want to do. But cutting and pasting the HTML to embed a YouTube video into your profile is within reach of many people. Maybe not your grandma, but certainly most school leavers.
Are there any significant parts of current browsers that are not user-remixable? Maybe this is built into the ecosystem enough that it doesn't need stating?
'Shipped'
I considered 'endorsed/implemented/supported', but I think the vote comes when it's included in a download.
'Clear Majority'
I don't think we need to be totally unanimous. Indeed requiring unanimity implies a simple lowest common denominator approach. Maybe you could argue for a 75% majority or something. I just went with something simple for now. The implication is that 3 out of the 4 must agree. But that requires me to justify the 4 ...
'Major browsers'
I'd venture to suggest that today there are 4 major rendering engines: Trident, Gecko, WebKit and Presto/Core 2. Sorry to Amaya, Lynx and WGet. This isn't a fixed list - if the Java HTMLEditorKit suddenly becomes useful as a browser and widely used, we should pick our jaws off the floor, and include it in the list.
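With the four engines named, the definition can be written down as a small check. This is only a sketch of the rule as stated so far: the object shape is hypothetical, "clear majority" is read as at least 75%, and the userRemixable flags and engine lists in the examples are illustrative assumptions.

```javascript
// The four major rendering engines named above (Presto/Core 2 shortened).
const MAJOR_ENGINES = ["Trident", "Gecko", "WebKit", "Presto"];

// "Clear majority" read as at least 75%, which is 3 of the 4 engines.
const CLEAR_MAJORITY = 0.75;

// A technology is a hypothetical object: { userRemixable, shippedIn }.
function isOpenWeb(technology) {
  const shipped = MAJOR_ENGINES.filter((engine) =>
    technology.shippedIn.includes(engine)
  );
  const share = shipped.length / MAJOR_ENGINES.length;
  return technology.userRemixable && share >= CLEAR_MAJORITY;
}

// Canvas: in everything except IE (Trident).
const canvas = { userRemixable: true, shippedIn: ["Gecko", "WebKit", "Presto"] };

// XUL today, then with Opera, then with Opera and Safari.
const xul = { userRemixable: true, shippedIn: ["Gecko"] };
const xulPlusOpera = { userRemixable: true, shippedIn: ["Gecko", "Presto"] };
const xulPlusOperaAndSafari = {
  userRemixable: true,
  shippedIn: ["Gecko", "Presto", "WebKit"],
};

console.log(isOpenWeb(canvas));
console.log(isOpenWeb(xul));
console.log(isOpenWeb(xulPlusOpera));
console.log(isOpenWeb(xulPlusOperaAndSafari));
```

Under these assumptions the check agrees with the examples given earlier: Canvas qualifies, XUL does not, and XUL would only qualify once a third engine shipped it.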
No W3C
Some people would like to make the W3C king and have all innovation be standards led. But I don't think that history has shown that to be a successful strategy. Some people think the W3C is totally irrelevant, but it is a place where the browser vendors 'sit down' together (by which I mean argue over email). I thought about saying 'major browsers and the W3C' but in the end left it simply with 'major browsers'. Feel free to disagree.
Implications
If this is right, then it turns out that Flash and Silverlight are not part of the Open Web. I think this is perhaps what many people intuitively feel. I'm not sure that either qualifies as user-remixable and they are not shipped with the majority of browsers either.
I wondered about defining the term based on browser usage, for example: "The open web is the technologies available to 80% of the web-using public". But this is bad because it's against innovation. The old and dying, but still highly used browsers get to hold the open web back, and it opens up the chance that one browser can get to define what the Open Web is. The whole idea behind 'Open' is lack of lock-in. Maybe the Web is defined in this way, but not the Open Web.
I think this definition builds in many of the concepts of Decentralization, Transparency, Openness that Brad originally argued for, and stuff that is not explicit is perhaps included by implication.
There are some interesting implications for Gears in this proposal. While the purpose behind Gears is to support the Open Web, it is itself not part of it for the same reason that Flash and Silverlight are not.
Lessons from Hosting a Website
You might have noticed that the getahead.org domain has been replaced by directwebremoting.org. There’s a story behind it ...

3 years ago when DWR started to be successful, I bought 2 identical servers, built them identically with the latest Debian, and latest Drupal, bought some hosting space for one of the servers and kept the other at home as a backup with a nightly rsync, and thought I was doing a good job of managing a website.
The first thing to go wrong was Drupal dropping support for several of the modules that we were using: the theme engine, and the wiki module. So we were stuck on an old version of Drupal unless we spent lots of time recreating the themes and updating all the content. Lesson: 3rd party modules are OK but don't depend on them in a big way.
We realized we needed to upgrade, but it was a big job and when I began talking to the Dojo Foundation about them hosting DWR and talking to SitePen about me working for them, it seemed sensible to do the upgrade when we moved to Dojo Foundation hardware.
Then the backup died. Is it just me or does everyone assume that the live server will die first? We hoped that we could get everything onto the new hardware before the live server died.
The bad news is that if you buy identical bits of hardware, then there is a good chance that they will have identical failings. That proved to be true in our case and we ended up with a huge rush to get everything moved to new servers, which included a fair bit of downtime. Sorry if it affected you.
I’m not sure if this is a reason to buy different hardware. Where do you stop, after all? Do you insist on different copper vendors for the power cable? The lesson is - if you discover a hardware fault in some part of a live site, the same thing is likely to happen to other identical hardware.
The good news is that the DWR website is now back on new hardware and a new URL: http://directwebremoting.org/dwr/. We’ve tried really hard to ensure we’ve got re-directs for all URLs, but please give us a shout if you notice that anything is 404ing.