Jimako’s Blog

Looking forward, aware of the past…

Archive for the 'Ruby' Category

What’s the logical next step in web development?

I’m a big fan of Ruby, the language.

I’m really impressed with the potential of the Rails framework (although to be honest there seems to be a lot of voodoo going on behind the scenes, which is probably just an indication that I have not yet done enough with this framework to be comfortable with it).

But there is no denying the reality that, for most of us, adopting Ruby and RoR is not a simple decision, because we already have code that has been developed atop a Java web stack.

If we are looking at a new project, and there is some ability to absorb the risk of going with a new technology stack, I would be the first to suggest that RoR be given serious consideration. On the other hand, if we are involved in the on-going development of a Java web application that has been going on for several years, does it make sense to try to integrate a new stack into the mix?

Let me set some parameters. You have a successful product that is continually being improved. Using “best practices” (I’ll come back to that term in another post), the application has been developed using an MVC framework — let’s say Struts, JSP and Hibernate as a concrete and all-too-common example.

Now, we want to make some changes as a result of customer feedback. Typically, these changes are either new features or modules, or else changes to the existing functionality.

Changes to existing functionality are hard to make in any technology other than the one it was built in. Sure, if the changes are broad enough, it might make sense to redevelop one or more modules from scratch; in that case, of course, we can treat it the same way we would treat the development of a new feature.

So, for a new feature, what alternatives do we have?

We can certainly continue on with our existing technology stack. In many ways, this is the low-risk option, because we know the technology. We know the tool-sets, we have knowledge within our development team, and we have historical data about how quickly we can do things using that technology. And let’s not forget that the stack itself is well-proven, because we (and countless others) have deployed applications on that stack and we know how the stack scales, how it deploys, how it responds to machine and network resource allocations. There is a lot of value to maturity.

But is it really a low-risk option? What if the development velocity is not fast enough? Sure, we have good predictability, but if our predictions are that we can’t do it as quickly as we need to in order to satisfy the customer, or prevent the work being assigned off-shore, then surely persisting with that technology is in reality the high risk option.

And here’s the real issue. Java web development using the classic development stack is just not fast enough. I’ve heard the arguments — you need to do all the fancy plumbing and documentation / annotation so that the app scales and is flexible and maintainable when it gets hit by millions of users per minute. The reality, however, is that most applications never need that scalability, especially if they are not finished because application development is too slow.

So what do we do? Is there some way to incorporate some of the really rapid web application development techniques into an existing Java web application?

I think that we are at a difficult time in Java web development. We have a lot of systems that have been developed over a long time (“long” being relative to the rate of change of software development, of course). While these systems are often still in active enhancement, they are also in a very real sense legacy systems, built using tool sets, frameworks and architectures that, well, seemed like a good idea at the time. Looking back at what we have done, and also looking at all the shining new toys all the new kids are playing with that show just how much faster web application development can be, we can’t help but feel frustrated and itchy to do something, well, different.

At the same time, we have some interesting things “just around the corner” in the Java space. But we need something we can use right now.

I see JRuby is coming along in leaps and bounds. This is an implementation of the Ruby language for the Java Virtual Machine. It is almost at the point that it can run Rails applications. However, I don’t see a clean way to do new-feature development for an existing web application using Rails, even if it is on the JVM. Another very nice dynamic language, Python, has a JVM implementation in Jython, but this is languishing and seems to have been largely orphaned when its initial developer switched focus to IronPython, which is an implementation for the .NET platform.

Groovy is coming along nicely, but slowly. It is likely to be an “official” scripting language as a result of having a JSR. Also, it has a Java-like syntax, which means that there is a shorter pick-up time required for Java developers.

If I thought that language is the limiting factor, then I would look at Groovy because it has a lot of the syntactical conveniences of the popular scripting languages with full access to existing objects that have been coded in Java (including the Java libraries).

However, while I think Java can be too wordy, requiring lots of boilerplate code in some circumstances, I am not at all convinced that this is the major reason that web development in Java is too slow.

In reality, I think that the real reason web development in Java is too slow is that we are making it too complicated. The real reasons that frameworks like RoR are so incredibly productive, in my opinion, have more to do with very simple ORM designs like ActiveRecord and the Convention over Configuration philosophy.

Sure, Hibernate is REALLY powerful. But it is not ideal for all sorts of database access, at least not when used naively. Sometimes, a simple SQL query, processed as JBOF (Just a Bunch Of Fields, and yes, I did just make that up) is totally appropriate.

Consider, for example, presenting a user with a filtered, paged list of widgets. In the prehistoric era of web development (that is, about 8 years ago, and using VB6 COM behind IIS/ASP) I designed a relatively simple, generic technique. I created an SQL statement by putting together the WHERE clause dynamically. I then ran a SELECT statement, retrieving only the IDs that matched the criteria. IDs were just 32 bits each, so even a million of the suckers was just 4 MB, and most lists were a few hundred to a (very) few thousand rows. I just stuck them in an array and stuffed them into the session. Then paging was simple: calculate the array indexes that correspond to the desired page, create an SQL statement that retrieves only the ID and the columns required for the list display (using an SQL WHERE ID IN … statement) and display the list. All this is totally generic, it scales REALLY well, and it has not let us down after years of very heavy use in the field.
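Here is a minimal sketch of that paging idea, written in Ruby rather than the original VB6. The helper names (ids_for_page, page_query), the widgets table and its columns are all made up for illustration, and the ID list is faked with a range instead of coming from a real query.

```ruby
# Hypothetical helpers sketching the "list of IDs in the session" technique.
# Table and column names are illustrative assumptions, not from a real schema.

# Return the slice of IDs that belongs to the given 1-based page.
def ids_for_page(all_ids, page, per_page)
  start = (page - 1) * per_page
  all_ids[start, per_page] || []
end

# Build the SQL that fetches only the display columns for one page of IDs.
# IDs are coerced with Integer() so nothing but numbers reaches the SQL text.
def page_query(table, columns, page_ids)
  return nil if page_ids.empty?

  id_list = page_ids.map { |id| Integer(id) }.join(", ")
  "SELECT id, #{columns.join(', ')} FROM #{table} WHERE id IN (#{id_list})"
end

# Pretend this came back from "SELECT id FROM widgets WHERE <dynamic filter>"
# and was then stored in the session.
matching_ids = (101..125).to_a

second_page = ids_for_page(matching_ids, 2, 10)
puts page_query("widgets", %w[name price], second_page)
```

One thing to watch: a WHERE ID IN query does not promise to return rows in the order of the ID list, so the rows would need to be put back into the order of the stored array before display.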

More recently, and in the Java world, we end up retrieving lists of objects. We rely on Hibernate or the ORM du jour to do magic, multi-level caching and lazy object instantiation and hope that it all works. And then we dump the list into some magic JSP taglib that does sorting in memory. And when the list gets to a few hundred items, the list takes MINUTES to display, and customers are unhappy, and developers say “you didn’t specify performance criteria”, and analysts say “but of course it has to handle more than a dozen items in a list”, and you need to divert resources to do major investigation and refactoring or redevelopment, and you start to think that things are not meant to be this hard.

In business application development, the needs of the application for data access are not complex. We need to get filtered lists of items, then we need to get complete individual items. That’s pretty much it, and that’s what DATABASE servers do — we should let them do their job and not try to replicate that in the application. Updating is only a little more complex.

The other lesson that we can learn from RoR is that we seriously need to tame the configuration frenzy that Struts brings. I need more time to think about this, but I think that a good way to begin simplifying this in an existing product is to add a single Struts action that further parses the request URI and uses some convention to identify the class and method that should handle it. That class could be written in Java, or any of the new, JVM-hosted scripting languages. Do it well, and write a suitable class loader, and you could even hot-deploy a URI request handler class or JAR file.
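To make the convention idea a bit more concrete, here is a rough sketch in Ruby rather than as a Struts action. The convention itself (/resource/action maps to ResourceHandler#action), the handler classes and the dispatch function are all assumptions of mine, not an existing API.

```ruby
# Hypothetical handler classes. In the Struts scenario these would be Java
# (or JVM scripting-language) classes located by the same naming convention.
class WidgetsHandler
  def list(params)
    "listing widgets, page #{params.fetch('page', '1')}"
  end
end

class OrdersHandler
  def show(params)
    "showing order #{params['id']}"
  end
end

# Assumed convention: /<resource>/<action> maps to <Resource>Handler#<action>.
def dispatch(path, params = {})
  resource, action = path.split("/").reject(&:empty?)

  # Accept only plain lowercase names so a URI cannot name odd constants.
  unless /\A[a-z]+\z/.match?(resource.to_s) && /\A[a-z_]+\z/.match?(action.to_s)
    return "404 bad path #{path}"
  end

  class_name = "#{resource.capitalize}Handler"
  return "404 no handler for #{path}" unless Object.const_defined?(class_name)

  handler_class = Object.const_get(class_name)

  # Only methods defined directly on the handler class count as actions.
  unless handler_class.public_instance_methods(false).include?(action.to_sym)
    return "404 no action for #{path}"
  end

  handler_class.new.public_send(action, params)
end

puts dispatch("/widgets/list", { "page" => "2" })
puts dispatch("/orders/show", { "id" => "42" })
puts dispatch("/orders/destroy")
```

The point is that adding a new URI means adding a class or a method, with no configuration file to touch.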

The reason that I am considering this is not because I don’t want to use an existing framework like Ruby on Rails (or for that matter Grails, Turbogears or Django). It’s that I need to be able to integrate whatever framework we use into the application as it exists so far, and everything I see (and my gut instinct) tells me that these frameworks are good for new projects but are likely to be a bitch to configure and integrate with a Java/Struts/JSP stack.

I have not yet clarified my own thinking about all this, but I wanted to post it to get some feedback. What do others think? Am I alone in thinking that we are making Java web development harder than it needs to be?


Ruby

I’m going to get off the topic of the Apple for today — not that nothing has happened, but because in reading over the blog I sound like some Mac fanatic. Today, Chris, a good friend of mine, showed me his new HP laptop. Huge, 17″ monster, very powerful, but battery life of about an hour, and he couldn’t get it set up to access the network. Sigh!

But I said, no Apple today.

Over the past couple of weeks, I have been working on a particular Java application, and I needed to extract a whole bunch of data into flat files for a particular client requirement. Cutting a long story short, I ended up writing a set of scripts to generate an XML specification of an extract that is going to be used to control the total extract process, and this gave me a chance to try my hand at Ruby.

Now, I have heard a lot of good things about Ruby, but had not really used it before. Everyone I knew, who I respected as a programmer, and who had tried Ruby, raved about it. So, even though I knew Python, I made a point of nutting my way round Ruby.

Obviously, it took me a little while to get moving — there is always a bit to learn when starting with a new language. But I bought a PDF copy of the Pickaxe book and zoomed through the highlights. I have to say, I like Ruby a LOT.

Ruby is OO to the core. Everything is an object, and it has a remarkably convenient set of built-in functionality. I am not going to put together a tutorial on Ruby, at least not here, but here are a few examples to whet your appetite.

In Java, to define a class with a set of accessor methods, you do something like this:

public class Dog
{
    private String name;

    public Dog(String name)
    {
        super();
        this.name = name;
    }

    public String getName()
    {
        return name;
    }

    public void setName(String value)
    {
        name = value;
    }
}

Here’s the same thing in Ruby:

class Dog
  attr_accessor :name

  def initialize(name)
    @name = name
  end
end

Creating an instance in Java:

Dog dog = new Dog("Rover");

and in Ruby:

dog = Dog.new("Rover")

So the classes are pretty much equivalent, except that the Ruby one is (a) much shorter and (b) eliminates the need to write a whole lot of plumbing, no-brain code. Now I know that any modern IDE generates this boilerplate code for you, but it is still there and needs to be navigated and mentally discounted while you work on the stuff that DOES matter. In Ruby, the only code you write is what you need for the application. Well, most of the time anyway :)
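To see what the single attr_accessor line is saving, here is roughly what it amounts to when written out by hand. The class name VerboseDog is just for illustration.

```ruby
# Hand-written equivalent of "attr_accessor :name" (illustrative class name).
class VerboseDog
  def initialize(name)
    @name = name
  end

  # Reader: what Java calls getName()
  def name
    @name
  end

  # Writer: what Java calls setName(value); invoked as "dog.name = value"
  def name=(value)
    @name = value
  end
end

dog = VerboseDog.new("Rover")
dog.name = "Rex"
puts dog.name
```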

Here’s a really cool thing you can do in Ruby. When you call a function, as well as passing a number of arguments to it, you can also, optionally, attach a code block to it. A code block is delimited by either the keywords do and end, or braces (the two forms behave the same, apart from a difference in precedence). Inside the called function, the code can determine whether a code block has been attached to it and, if so, essentially call that block any number of times. Here is an example:

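def send(message)
  # Report each stage to the attached block, if there is one.
  if block_given?
    yield "connecting"
  end
  connect(...)
  if block_given?
    yield "sending"
  end
  # Placeholder name for the real transmission step; calling send(message)
  # here would make the method call itself forever.
  transmit(message)
  if block_given?
    yield "sent"
  end
  disconnect(...)
  if block_given?
    yield "done"
  end
end

This is a dummy, skeletal procedure. We assume that it sends a message somewhere, and there are several steps — connecting, sending and disconnecting.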

If you call it like this:

send("Hello world")

it just does its thing. But you can optionally attach a code block like this:

send("Hello world") {|stage| puts "... now #{stage}" }

Let’s look at this line. The braces define a code block — the convention seems to be that short blocks like this use braces, while long, multi-line blocks use do/end. The two vertical bars delineate a parameter list; here, the parameter is called “stage”. The single line inside the code block uses puts to display a string. I’ll get to the string in a moment, but for now just accept that this results in the following printout:

... now connecting
... now sending
... now sent
... now done
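For anyone who wants to try this, here is a runnable variant of the skeleton with the network steps stubbed out. The name send_message and the log array are my own placeholders; using a different name also avoids shadowing Ruby's built-in send method.

```ruby
# Runnable stand-in for the skeleton: the real connect/send/disconnect steps
# are replaced by entries in a log array (an assumption for illustration).
def send_message(message)
  log = []
  yield "connecting" if block_given?
  log << "connect"
  yield "sending" if block_given?
  log << "transmit #{message}"
  yield "sent" if block_given?
  log << "disconnect"
  yield "done" if block_given?
  log
end

# Without a block it just does its thing.
send_message("Hello world")

# With a block, each stage is reported as it happens.
stages = []
send_message("Hello world") do |stage|
  stages << stage
  puts "... now #{stage}"
end
```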

The string that is displayed is delimited by double-quote characters, which means that the string is processed by Ruby. One of the effects of this is that the #{x} construct embedded in the string is replaced with the value of x, which can be a variable or any other expression. This works everywhere, not just in these attached code blocks.
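A quick sketch of the difference between the two kinds of quotes; the variable names here are made up for the example.

```ruby
stage = "sending"
count = 3

interpolated = "... now #{stage}"                      # double quotes: interpolated
computed     = "#{count} stages, #{count * 2} checks"  # any expression works
literal      = '... now #{stage}'                      # single quotes: left as typed

puts interpolated
puts computed
puts literal
```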

This mechanism is used to implement a really simple, generic and pervasive iterator-like mechanism. For example, to allow arrays to be iterated, the Array built-in class implements a method “each” which, you guessed it, takes a code block. So, to iterate over an array, you use this sort of code:

my_array.each {|element| puts element }

The beauty of this is that any object can exhibit this behaviour — just implement an “each” method that expects a code block, and “yield” once for each element your object contains. There is no need to be in any other way related to an array.
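As a sketch of that, here is a class that has nothing to do with Array but can still be iterated. Countdown is a made-up example, and including the standard Enumerable module goes a step beyond what I described above: once each exists, Enumerable supplies map, select, to_a and friends for free.

```ruby
# A made-up class that is not an Array but still supports "each".
class Countdown
  include Enumerable # standard library mix-in built on top of #each

  def initialize(from)
    @from = from
  end

  # Yield once per element, as described above.
  def each
    return enum_for(:each) unless block_given?

    @from.downto(1) { |n| yield n }
  end
end

Countdown.new(3).each { |n| puts n }
puts Countdown.new(5).select(&:odd?).inspect
```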

Which leads me to the topic of Duck Typing. This is the Ruby philosophy about object typing. While Ruby does implement a single-inheritance object hierarchy model, you can actually use unrelated objects polymorphically as long as they implement a common subset of methods. The idea is that if it walks like a duck, and looks like a duck, and quacks like a duck, then it can be treated like a duck. Yes, this is NOT as bullet-proof as a statically-typed language like Java, but in reality I don’t actually end up assigning a Debit object instance to an Animal object reference very often, and if I do, I will rely on my tests to pick that up. In return, I save myself a lot of unnecessary casting and fiddling in perfectly good code just to tell the compiler what I already know.
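A tiny illustration, using two made-up classes that share no ancestor other than Object:

```ruby
# Two unrelated, made-up classes that happen to share a method name.
class Duck
  def speak
    "Quack"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# No type declaration: anything that responds to #speak will do.
def announce(thing)
  "It says #{thing.speak}"
end

[Duck.new, Robot.new].each { |thing| puts announce(thing) }
```

Passing something that has no speak method (say, the number 42) fails at run time with a NoMethodError, which is exactly the kind of mistake I would expect my tests to catch.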

Ruby also has mix-ins, called modules. A module is a bit like an interface and a bit like an abstract class. Like an interface, a module aggregates a set of methods, which are included by any class that wants them, regardless of its position in the object inheritance hierarchy. But unlike interfaces, modules have code in them too: implemented methods. These methods become part of any class that includes the module, and have access to that class’s own methods and instance variables, exactly as if the code had been copied and pasted into that class. Also like interfaces, a single class can mix in, or include, any number of modules.

Like an abstract class, it implements some code, and through access to non-coded variables and methods, can set up an expectation on the classes that include it, but unlike an abstract class, the class that includes it does NOT need to descend from it (indeed, it can’t do so, because modules are not classes per se).
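Here is a small sketch of a mix-in at work. The module Describable and the Boat class are invented for the example; the module implements describe but expects whoever includes it to supply a name method.

```ruby
# Made-up mix-in: implements #describe, expects the including class to
# provide #name (the "expectation" mentioned above).
module Describable
  def describe
    "#{self.class.name.downcase} called #{name}"
  end
end

class Dog
  include Describable
  attr_accessor :name

  def initialize(name)
    @name = name
  end
end

# Unrelated to Dog in the class hierarchy, but mixes in the same module.
class Boat
  include Describable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

puts Dog.new("Rover").describe
puts Boat.new("Minnow").describe
```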

Very powerful indeed, and I don’t claim to fully appreciate all the implications of how these can be used, but just intuitively it seems to be really useful. And just plain cool.

Anyway, that’s more than enough for one post. Tomorrow is going to be a busy day. Toodles.
