Les Hazlewood Where Les is More

18 Feb 06

Haptic User Interfaces

I found a blog entry on Haptic User Interfaces while googling. Pretty cool stuff.

It's amazing that we still use a mouse and keyboard. I want to use a twiddle keyboard and that haptic interface, like, today!

Here's an amazing video of one haptic UI.

Filed under: General, Software No Comments
18 Feb 06

Full Circle to Smalltalk

While attempting to quench my thirst for the latest and greatest software that would make my life easier, I went 'a searchin' on the net to find frameworks and technologies.

Since I'm responsible for a lot of web 2.0 / enterprise software products, I wanted to find a framework and/or suite that I could leverage to cut my development time down. One of my biggest criteria in determining what is valuable is that of intuitiveness. I want things to make sense to me. Granted, this is a purely subjective statement, but I've found that what makes sense to me makes sense to most of the people in my field.

So, what makes sense to me? Object Oriented design.

I'm talking about pure OO - everything is an Object, objects communicate with other objects via messages, interface-driven design, abstraction, polymorphism, etc.

Well, since I'm a Java guy, I knew that I was at a dead end, since Java isn't purely OO (ints and booleans are not objects, they're "primitives"). But I love Java, and Java 1.5 helped a little bit with its auto boxing/unboxing feature, though that's not perfect. Yes, it's verbose, and yes, the compile/build/deploy cycle is cumbersome, but its strong type safety finds a heckuva lot of errors. When you're in a team environment, that's a huge safety net. And, when you throw in frameworks like Spring, Hibernate and JUnit, I do still find myself very happy and productive. With their support, I can churn out a lot of functionality very fast in Java. But Java has its quirks, and to be honest, the strict typing thing ties your hands in a lot of ways.
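To make the "that's not perfect" part concrete, here is a minimal sketch of where autoboxing leaks. The class name AutoboxingDemo and the values are made up for illustration; the behavior shown is standard Java, but treat it as a quick demo rather than a full account of the rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name and values are illustrative only.
class AutoboxingDemo {

    // Auto-unboxing a null Integer compiles fine but fails at runtime.
    static boolean unboxingNullThrows() {
        Integer missing = null;
        try {
            int value = missing; // the compiler inserts missing.intValue() here
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Autoboxing: the int literal is wrapped in an Integer automatically.
        List<Integer> numbers = new ArrayList<>();
        numbers.add(42);
        int first = numbers.get(0); // auto-unboxing back to a primitive int
        System.out.println(first);

        Integer a = 1000;
        Integer b = 1000;
        // For boxed values, == compares object references, so it is not a
        // reliable equality check; equals() compares the numbers themselves.
        System.out.println(a.equals(b));

        System.out.println(unboxingNullThrows());
    }
}
```

So the primitives are still there underneath: boxed values behave like objects only until you compare them with == or unbox a null.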

What I was looking for was the next revolution. Something huge. Something that would drop my development time literally in half. I'm not kidding...something that would realistically reduce my coding/testing/rollout time 50% or more. Concretely.

So, I started searching for "OO-like" scripting languages to complement my Java knowledge. I came across Ruby (thanks to the hype surrounding Rails - which is very cool), Jython, Python, BeanShell, Groovy, JavaScript, and some others. Being a CS guy and actively involved in the open-source/cutting-edge world of software, I'm no stranger to all of these languages at some level. I put in probably a good week or so researching what would be best for me. I did want something that could leverage the massive amount of libraries available to Java too. Spring, Hibernate, etc, are so far ahead of the framework game, I don't want to relinquish their power by leaving them behind.
Here's what I've decided:

  • I don't really like Ruby all that much. The syntax is a little too cryptic, which can introduce ugly code that's hard to read (Python anyone?). But more so, it's slow as hell. Rails is a great concept, but I'm not sure that Ruby is the best choice for such a framework. There is already a massive flame war about why Ruby is great or sucks, but I'm not going to get into that here. Suffice it to say it's not right for me. It doesn't leverage Java, so that means it's out, even if it's better than sliced bread.
  • I don't like Jython or Python - not 'pure' enough for me. Yes, they can adhere to the OO paradigm, but they allow a ton of functional programming. This makes code too sloppy for my taste. Jython is cool in that it can leverage Java, but it can't win out due to the code readability thing.
  • BeanShell is cool and is really just interpreted Java. But Java/BeanShell doesn't do everything I want. I want closures!
  • JavaScript - supports closures and is very powerful. I can't remember where I found a benchmark test, but JavaScript on Rhino beat out all the other scripting languages that I mention here in speed. This counts for a lot, actually. However, I don't really like programming in JavaScript. It's such a pain to debug, and consistent browser support doesn't exist (thanks IE!). I actually find myself making JS code look more like Java to make it more readable. Again, I say Java is a little on the verbose side, but dang, it's easy to read. This is a huge thing for me. Code readability is ultra important, and that's why I abandon a lot of scripting languages (Jython and Python and Perl are so flexible, there are like 3 to 5 ways to do anything - very frustrating when trying to read code written by someone else.)
  • Groovy supports everything I want. The syntax is very similar to Java, at least, similar enough to be able to learn the differences in a single day. It also supports everything in Java. That means I can leverage the amazing amount of functionality already written in Java (Spring, Hibernate, Ant, etc). It's not quite as fast as Rhino/JS, but it does have a decent support base over at CodeHaus. Then I found Grails.

I found Grails and it was like the heavens opening and angels singing "AAAAaaaaawh". It's already based on Spring/Hibernate (my pairing of choice on almost all projects), and does exactly what I want.

You see, I think the OO world will return to behavior AND state in the same Class within 5 years, as OO intends. AOP makes this ability possible. Because of distributed systems and the complexities of state management, users of OO languages like Java/C++/C# etc. have split their programming paradigm. We usually write business logic in classes as functions and pass in a POJO that contains state and nothing else. Why can't they be the same?

Distributed systems threw a wrench into that concept. If I passed an object from one VM to another, I couldn't guarantee that the object transferred would work the same way in VM2 as it did in VM1. That's why folks created "state-only" objects, and let the behavior reside elsewhere in "Manager" or "Service" stateless components. I think AOP allows us to move back to what OO is meant to be - state and behavior together.
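Here is a small sketch of the two styles side by side, in plain Java. All of the names (BookData, BookService, citation) are hypothetical and exist only for this illustration; it leaves out AOP, persistence and remoting entirely and only shows where the behavior lives.

```java
// Split style: a state-only object plus a stateless service.
class BookData {
    String title;
    String author;
}

class BookService {
    // The behavior lives outside the object it operates on.
    String citation(BookData book) {
        return book.author + ", \"" + book.title + "\"";
    }
}

// Combined style: state and behavior live in the same class.
class Book {
    private final String title;
    private final String author;

    Book(String title, String author) {
        this.title = title;
        this.author = author;
    }

    // The object answers the question itself; callers need no service.
    String citation() {
        return author + ", \"" + title + "\"";
    }
}

class StateAndBehaviorDemo {
    public static void main(String[] args) {
        BookData data = new BookData();
        data.title = "Hello World";
        data.author = "John Smith";
        System.out.println(new BookService().citation(data));

        System.out.println(new Book("Hello World", "John Smith").citation());
    }
}
```

Both calls produce the same string; the difference is only who owns the logic, which is the whole point of the argument.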

Grails helps move in that direction. Where I used to have a Book domain model object, and a BookManager service object which in turn used a BookDAO object to do persistence logic, I can have all 3 in one Class with Grails:

Book b = new Book( title: "Hello World", author: "John Smith" );
b.save();
//at this point the book is saved in the RDBMS.

def books = Book.findAll();
//books is now a collection of all books in the RDBMS

books = Book.findAll( "from Book b where b.author like '%mith%'" );
//now books contains only those books whose author's name is like "mith";

def book = Book.findByTitle( "Hello World" );
//now book is the first book whose title is "Hello World"
//(findAllByTitle would return every match);

When using Grails, you don't even have to write that last findByTitle call in your Groovy class. Grails automatically interprets that method call as a Hibernate call if it doesn't exist in your class, and essentially creates and executes the implementation dynamically, at runtime, on your behalf. Pretty sweet magic going on there.
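To demystify the magic a little, here is a rough sketch of the idea in plain Java: take a method name that nobody wrote, derive the property from it, and run the lookup. This is not how Grails is implemented. Grails relies on Groovy's dynamic method handling and hands the query to Hibernate, while this sketch swaps in a plain string-based dispatch over an in-memory list. The class DynamicFinderSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of name-based finder dispatch; not Grails code.
class DynamicFinderSketch {

    // Stand-in for the database: each row maps property names to values.
    private final List<Map<String, Object>> rows = new ArrayList<>();

    void save(Map<String, Object> row) {
        rows.add(row);
    }

    // Turns a finder name such as "findByTitle" into the property "title".
    static String propertyFor(String methodName) {
        String prefix = "findBy";
        if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
            throw new IllegalArgumentException("Not a finder: " + methodName);
        }
        String rest = methodName.substring(prefix.length());
        return Character.toLowerCase(rest.charAt(0)) + rest.substring(1);
    }

    // Resolves the finder name at runtime and returns every matching row.
    List<Map<String, Object>> invoke(String methodName, Object value) {
        String property = propertyFor(methodName);
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (value.equals(row.get(property))) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        DynamicFinderSketch books = new DynamicFinderSketch();
        Map<String, Object> book = new HashMap<>();
        book.put("title", "Hello World");
        book.put("author", "John Smith");
        books.save(book);

        System.out.println(books.invoke("findByTitle", "Hello World").size());
    }
}
```

In Java the caller has to pass the method name as a string; the appeal of the Groovy version is that the same resolution happens behind an ordinary-looking method call.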

(It still rubs me the wrong way a little propagating queries up to such a high level in the code - instead of using a DAO for example - but there is nothing that prevents me from using DAOs should I choose to do so, and that discussion is worth a new post entirely. Suffice it to say doing this gets you up and running _very_ fast, and you could always implement a DAO strategy very easily if time and costs allow).

Anyway, this is a huge step in the right direction - logic and state in the same entity. Brilliant! (mmm...Guinness)

So, I started thinking.

Groovy/Grails are doing things that have been around for years. Smalltalk has type safety (strong typing, though enforced at runtime rather than by a compiler). It has interpreted support (highlight a chunk of code and click "doit!", and it evaluates). It even has closures (called blocks). AND - here's the kicker - it is truly pure OO - everything is an object. Numbers, booleans, even the compiler itself - are all represented as instances of a class. Even null is an object (called 'nil', it is the only instance of the UndefinedObject class).

So, this leaves me a little frustrated. I've come full circle from C to Java to all these newfangled scripting languages, all the way back to Smalltalk. It's like we CS idiots just keep reinventing the wheel when Smalltalk has had all this stuff all along! I learned Smalltalk back in college at Georgia Tech, and I have to be honest, I didn't like it that much. We used an open-source variant called Squeak, and it was slower than molasses, it had a poorly documented API, and the IDE sucked (it looks much better now, 5 years later).

But the language itself was awesome. It made sense to me as a fledgling CS student (Java did too, so I guess that doesn't say much). But I've heard stories of 8 year olds learning Squeak in a couple days (I think Disney did some learning research with it). It took CS students in a university an entire semester to really learn Java. It takes even longer to fully understand C and C++, because you have to know the intricacies of pointers and memory management.

So, again, I'm like "what the hell!?!?". Squeak's VM, compiler, and IDE are written in its own language. I've even heard of folks modifying live code in production without restarting anything. Are you kidding me? That's just freaking amazing. Why the heck have we reinvented the wheel like 30 times?

Smalltalk was developed in the 1970s at Xerox PARC. It had a VM back then. How revolutionary! When CS was just a fledgling concept and departments were one or two rooms in a Mathematics school, they were already on to the concepts of virtual machines, garbage collection, OO abstraction, design by contract, etc, etc. Holy crap those guys were smart.

So, why isn't everyone using Smalltalk today instead of Java or C#?

The answer is Smalltalk's price in the early days. Unfortunately, IBM and other Smalltalk vendors charged an arm and a leg for a single developer license. I think the average around 1985 was about $3000 per developer. That was a lot of money back then, and is even too expensive for today's market when the best IDEs are free or inexpensive. That, and open-source was not even close to being on the radar in the late 1970s.

So, just as Smalltalk was starting to hit the mainstream among companies that could afford it, Sun Microsystems came in in 1995 and blew up the world with a FREE developer toolkit and with marketing techniques that piggybacked the hype of the newly popular internet. Smalltalk never had a chance. Hobbyists would never pay for something so expensive, so they adopted Java, and the rest was history.

Sucks for us. Don't get me wrong, I absolutely love Java. But I think I could love Smalltalk even more given proper community support and adoption (and a good IDE - I would much rather have an Idea or Eclipse environment than what Squeak has, even with Squeak's improvements).

Unfortunately, I'm not sure Smalltalk can recover. Java and .NET are really the only heavy hitters out there for enterprise development. Does Smalltalk/Squeak have an ORM tool as powerful and sophisticated as Hibernate? What about application frameworks like Spring, Tapestry, Struts, etc.? I would assume so since the language has been around a lot longer than Java, but I haven't heard of them. I would love for an experienced Smalltalker to point me to them.

So I conclude discouraged and disheveled, because the coolest language out there, the one that has been around for 30 years, and has had ALL of the features that the most modern languages are just now starting to support - not many of us use it. There is not enough industry adoption. There is not enough of a marketing machine out there. Even IBM, an early Smalltalk supporter (VisualAge Smalltalk), has ditched most of its offerings for Java.

I love Java. I really do. I just wish Smalltalk had a heavy hitter like Sun behind it. Until then, I look to Java, Groovy and Grails, which are much more likely to catch the eye of today's developers. Too bad we're just now able to do things that Smalltalk has been able to do for more than 20 years.

I end this post disgusted. I'm going to get a Guinness...

Filed under: General, Java, Software 3 Comments
11 Feb 06

Mind blowing magic / sleight of hand

While searching Google Video, I found this Japanese-American master magician named Cyril Takayama.
Then, I found a Cyril Takayama YouTube page.

In one video, he passes a large salt shaker right through a glass table - with people right next to him! Absolutely mind blowing.
Usually with sleight of hand, you can see the magician contorting his hands a little bit, which sorta gives away that he might be hiding or palming things. This guy doesn't do that - you swear it's not a trick and that he has discovered holes in the universe or something - a true master. Unbelievable!!!

Filed under: General, Japanese No Comments
4 Feb 06

Email Validation using Regular Expressions (the Right Way)

UPDATE: This article was updated on February 1st, 2008 to account for domain literals and quoted strings such as "John Smith" <john.smith@somewhere.com>. It is now effectively the only complete and semantically correct email validator for Java.

PETTY REQUEST: The update required considerably more effort than the original as it now accounts for all valid RFC parsing conditions. Because of this, and that this page is easily my most visited, I'd appreciate it if you could show your appreciation by hooking a brother up and clicking on some ads. It helps pay for my hosting. Thanks!

In Object-Oriented design, I'm a firm believer in modeling things in the way they truly exist (inasmuch as is possible given abstraction and time restrictions). So, whenever I design a system's domain model, I create Classes that represent entities as they exist in real life. That being said, I've accrued a nice library of Classes that I reuse in a lot of projects.

For example, I don't save or reference an email address as a String: strings as objects don't tell me anything about the email address itself, like if it's valid, if it's bouncing, if it has been verified by the user with which it is associated, etc, etc. As such, I have created an EmailAddress class to represent this information. Doing this is a small example of the beauty of OO over functional programming.
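As a rough idea of what such a class can carry beyond the raw text, here is a minimal sketch. It is not my actual EmailAddress class: the name EmailAddressSketch, the verified and bouncing flags, and isDeliverable() are assumptions made up for this illustration (only getText() mirrors an accessor used later in this post).

```java
// Hypothetical sketch of an email address as a domain object rather than
// a bare String. Field and method names are illustrative assumptions.
class EmailAddressSketch {
    private final String text;  // the raw address as entered
    private boolean verified;   // has the owner confirmed this address?
    private boolean bouncing;   // have recent deliveries bounced?

    EmailAddressSketch(String text) {
        if (text == null) {
            throw new IllegalArgumentException("text is required");
        }
        this.text = text;
    }

    String getText() {
        return text;
    }

    boolean isVerified() {
        return verified;
    }

    void markVerified() {
        this.verified = true;
    }

    boolean isBouncing() {
        return bouncing;
    }

    void markBouncing() {
        this.bouncing = true;
    }

    // A question a plain String could never answer about itself.
    boolean isDeliverable() {
        return verified && !bouncing;
    }

    public static void main(String[] args) {
        EmailAddressSketch address = new EmailAddressSketch("john.smith@somewhere.com");
        address.markVerified();
        System.out.println(address.getText() + " deliverable: " + address.isDeliverable());
    }
}
```

The point is only that the object can answer questions about itself; which facts you track will depend on your application.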

Anyway, I was a little lax in the past in my validation logic. This time on my last project, I was determined to get things right once and for all.

I googled quite a while for the Right Way to validate an email address. In my opinion, there is only one Right Way - the RFC 2822 way. This is the standard after all.

I never came across anything I was happy with. All the responses seemed to be Perl or PHP variant regular expressions or some horribly convoluted text string nearly impossible to decipher. I was disappointed to see so many interpretations of a standard. I mean, c'mon people, it's written in pure black and white!!!

I guess the old adage "If you want something done right, you've got to do it yourself" resonated in my head this time. I actually took the time out to read the RFC (something I hadn't done in a long while, probably since college).

After reading the RFC, I translated the grammar into usable, *readable* source code that now resides in my EmailAddress class, and I've included it below for the benefit of anyone that wishes to use it. It is written in Java, but the same code could be replicated in C# or PHP or whatever. Just keep it clean!

N.B.: Look at the first two constants, ALLOW_DOMAIN_LITERALS and ALLOW_QUOTED_IDENTIFIERS - enable or disable them as you see fit for your application.

/*
* Copyright 2008 Les Hazlewood
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import java.util.regex.Pattern;

/**
* This constant states that domain literals are allowed in the email address, e.g.:
*
* someone@[192.168.1.100] or
* john.doe@[23:33:A2:22:16:1F] or
* me@[my computer]
*
* The RFC says these are valid email addresses, but most people don't like allowing them.
* If you don't want to allow them, and only want to allow valid domain names
* (RFC 1035, x.y.z.com, etc),
* change this constant to false.
*
* Its default value is true to remain RFC 2822 compliant, but
* you should set it depending on what you need for your application.
*/
private static final boolean ALLOW_DOMAIN_LITERALS = true;

/**
* This constant states that quoted identifiers
* (using quotes and angle brackets around the raw address) are allowed, e.g.:
*
* "John Smith" <john.smith@somewhere.com>
*
* The RFC says this is a valid mailbox. If you don't want to
* allow this, because for example, you only want users to enter in
* a raw address (john.smith@somewhere.com - no quotes or angle
* brackets), then change this constant to false.
*
* Its default value is true to remain RFC 2822 compliant, but
* you should set it depending on what you need for your application.
*/
private static final boolean ALLOW_QUOTED_IDENTIFIERS = true;

// RFC 2822 2.2.2 Structured Header Field Bodies
private static final String wsp = "[ \\t]"; //space or tab
private static final String fwsp = wsp + "*";

//RFC 2822 3.2.1 Primitive tokens
private static final String dquote = "\\\"";
//ASCII Control characters excluding white space:
private static final String noWsCtl = "\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x7F";
//all ASCII characters except CR and LF:
private static final String asciiText = "[\\x01-\\x09\\x0B\\x0C\\x0E-\\x7F]";

// RFC 2822 3.2.2 Quoted characters:
//single backslash followed by a text char
private static final String quotedPair = "(\\\\" + asciiText + ")";

//RFC 2822 3.2.4 Atom:
private static final String atext = "[a-zA-Z0-9\\!\\#\\$\\%\\&\\'\\*\\+\\-\\/\\=\\?\\^\\_\\`\\{\\|\\}\\~]";
private static final String atom = fwsp + atext + "+" + fwsp;
private static final String dotAtomText = atext + "+" + "(" + "\\." + atext + "+)*";
private static final String dotAtom = fwsp + "(" + dotAtomText + ")" + fwsp;

//RFC 2822 3.2.5 Quoted strings:
//noWsCtl and the rest of ASCII except the doublequote and backslash characters:
private static final String qtext = "[" + noWsCtl + "\\x21\\x23-\\x5B\\x5D-\\x7E]";
private static final String qcontent = "(" + qtext + "|" + quotedPair + ")";
private static final String quotedString = dquote + "(" + fwsp + qcontent + ")*" + fwsp + dquote;

//RFC 2822 3.2.6 Miscellaneous tokens
private static final String word = "((" + atom + ")|(" + quotedString + "))";
private static final String phrase = word + "+"; //one or more words.

//RFC 1035 tokens for domain names:
private static final String letter = "[a-zA-Z]";
private static final String letDig = "[a-zA-Z0-9]";
private static final String letDigHyp = "[a-zA-Z0-9-]";
private static final String rfcLabel = letDig + "(" + letDigHyp + "{0,61}" + letDig + ")?";
private static final String rfc1035DomainName = rfcLabel + "(\\." + rfcLabel + ")*\\." + letter + "{2,6}";

//RFC 2822 3.4 Address specification
//domain text - non white space controls and the rest of ASCII chars not including [, ], or \:
private static final String dtext = "[" + noWsCtl + "\\x21-\\x5A\\x5E-\\x7E]";
//grouped so the alternation cannot leak into the surrounding expression:
private static final String dcontent = "(" + dtext + "|" + quotedPair + ")";
private static final String domainLiteral = "\\[" + "(" + fwsp + dcontent + ")*" + fwsp + "\\]";
private static final String rfc2822Domain = "(" + dotAtom + "|" + domainLiteral + ")";

private static final String domain = ALLOW_DOMAIN_LITERALS ? rfc2822Domain : rfc1035DomainName;

private static final String localPart = "((" + dotAtom + ")|(" + quotedString + "))";
private static final String addrSpec = localPart + "@" + domain;
private static final String angleAddr = "<" + addrSpec + ">";
private static final String nameAddr = "(" + phrase + ")?" + fwsp + angleAddr;
private static final String mailbox = nameAddr + "|" + addrSpec;

//now compile a pattern for efficient re-use:
//if we're allowing quoted identifiers or not:
private static final String patternString = ALLOW_QUOTED_IDENTIFIERS ? mailbox : addrSpec;
public static final Pattern VALID_PATTERN = Pattern.compile(patternString);

Anyway, the above Java code allows you to do things like the following.

In the EmailAddress class, you can have a method:

public static boolean isValid( String userEnteredEmailString ) {
    return VALID_PATTERN.matcher( userEnteredEmailString ).matches();
}

Then you can write validation logic wherever you want (hopefully in a dedicated Validator ;) ):

if ( !EmailAddress.isValid( userEnteredEmailString ) ) {
    throw new InvalidFormatException( "Invalid e-mail format!" );
}

Better yet, if you want to see if any email address instance is valid, the EmailAddress class has the following method that you can use for 'pure' OO 'messaging' (i.e. a method invoked on an object is a 'message' from the calling object to the target object):

public boolean isValid() {
    //use static method call as helper w/ class attribute 'text'
    return isValid( getText() );
}

which enables you to do checks this way (this is 'pure' OO):

if ( anEmailAddressInstance.isValid() ) {
    //do something
} else {
    //do something else
}
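If you want to try the build-it-from-small-pieces approach without pasting the whole grammar, here is a deliberately simplified, self-contained sketch. It is an assumption-laden subset, not the full validator above: it only accepts a dot-atom local part and an RFC 1035 style domain (no quoted strings, no domain literals, no folding whitespace), and the class name SimpleEmailPattern is made up for this example.

```java
import java.util.regex.Pattern;

// Simplified, hypothetical subset of the grammar above: dot-atom local part
// plus an RFC 1035 style domain. Quoted strings, domain literals and folding
// whitespace are intentionally left out.
class SimpleEmailPattern {

    // RFC 2822 atext characters (the hyphen is last so it stays literal).
    private static final String ATEXT = "[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]";
    // One or more atext runs separated by single dots.
    private static final String DOT_ATOM_TEXT = ATEXT + "+(\\." + ATEXT + "+)*";

    // RFC 1035 labels: start and end with a letter or digit, hyphens inside.
    private static final String LET_DIG = "[a-zA-Z0-9]";
    private static final String LET_DIG_HYP = "[a-zA-Z0-9-]";
    private static final String LABEL =
            LET_DIG + "(" + LET_DIG_HYP + "{0,61}" + LET_DIG + ")?";
    // Labels separated by dots, ending in a 2 to 6 letter top-level name.
    private static final String DOMAIN =
            LABEL + "(\\." + LABEL + ")*\\.[a-zA-Z]{2,6}";

    // Compiled once for efficient re-use.
    static final Pattern VALID_PATTERN = Pattern.compile(DOT_ATOM_TEXT + "@" + DOMAIN);

    static boolean isValid(String candidate) {
        return candidate != null && VALID_PATTERN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john.smith@somewhere.com"));
        System.out.println(isValid("john..smith@somewhere.com"));
    }
}
```

Each constant maps to one rule of the grammar, so when a test address surprises you, you can check the rule it should have hit instead of squinting at one giant string.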

Happy validating!

Filed under: General, Java, Software 50 Comments