Programming is hard by Stephan Schmidt

Hg versus Git - and why I chose Hg

After my unpleasant experience with setting up Git, I’ve had some time to play around with Git and use it in a project. Git is really nice for a DVCS. What I liked was the git status view, especially with colors turned on. Grouping added and modified files is much nicer than the Subversion-style “A”, “D” (…) list. I also like that Git is fast and not very difficult to use for day-to-day jobs, with tutorials like the Svn-Git Crashcourse. Killer features for Git are git rebase and especially branching. Git differs from other VCSs in that it treats revisions and branches as views onto a single directory. Other VCSs, like Hg and SVN, use different directories for different branches, which makes it much harder to jump between branches. Git seems to have the biggest community, mindshare and momentum to become the next SVN/CVS.

What made it unusable for me was the Windows port. It just does not work reliably. Cygwin is a problem of its own, and Msysgit isn’t up to par either. I use a MacBookPro so Git works for me, but others who needed to access the repo from Windows were out of luck. Windows support isn’t very high on the list of Git priorities - perhaps because it’s linked to Linux kernel development.

Now to Mercurial. It seems to have the second biggest mindshare (more than bzr) after Git. Some big Java projects use it and there are some good blog posts about Mercurial. So I gave it a try. The setup was easy for a central server over http; it just worked after following the instructions. Its command style is more like SVN than Git, so the learning curve is even shallower than with Git. It works with Windows, the main reason for me to switch to Hg from Git. The downsides: SVN-style hg status, no rebase and SVN-style branches. It is nice that .hgignore is just a file in the repo, accessible to all developers (for the Maven target directory and generated files).

Should Git become usable under Windows, I’ll probably move again.

(My personal take is that SVN will become a DVCS by adding local repos and moving to hash revisions, and will eat the DVCS market.)

Thanks for listening.

Update: I forgot about that post:

For Subversion 2.0, a few of us are imagining a centralized system, but with certain decentralized features. We’d like to allow working copies to store “offline commits” and manage “local branches”, which can then be pushed to the central repository when you’re online again.

I want to meet Cameron Purdy ;-) Who do you want to meet?

Partially because of the good discussions on TSS about Coherence and the knowledge he has, but mostly because of this recent presentation. It’s about “The Top 10 Ways to Botch Enterprise Java Application Scalability and Reliability”. I enjoyed the video very much and laughed so loudly several times that my colleague looked up. Cameron made a joke every 30 secs - no one in the audience laughed, though I found them all funny.

Meeting Cameron - well no chance - I know - and I wouldn’t know what to say.

Others I’d like to meet are above all Crazy Bob for Dynaop (and Guice), Cedric for his stand on dynamic languages and Rickard of course.

Whom do you want to meet and why?

“Yahoo will go down in flames”

Funny, I wrote that more than a year ago.

Update 10/24/08: Looks more and more likely

Using Google Guice Providers to Solve Law of Demeter Problems

A post on the Google testing blog made me think. Their post presents an example of a class

class Mechanic {
  Engine engine;

  Mechanic(Context context) {
    this.engine = context.getEngine();
  }
}

which depends on an object Context in the constructor, when in fact it only depends on Engine, a violation of the Law of Demeter. This often happens with Context objects which play the role of a central object repository to give access to objects in different parts of an application. The resulting code is hard to test and hard to reuse, and the Google testing team suggests refactoring the code.

Sometimes that’s not possible. With IoC this can be solved without (much) refactoring.

For example with Google Guice one can write a Provider that provides an object of a given class, in this case Engine.

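import com.google.inject.Inject;
import com.google.inject.Provider;

// Pulls the Engine out of the Context so that clients do not have to.
public class EngineProvider implements Provider<Engine> {
    private final Context context;

    @Inject
    public EngineProvider(Context context) {
        this.context = context;
    }

    @Override
    public Engine get() {
        return context.getEngine();
    }
}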

Binding the Provider to Engine,

  bind(Engine.class).toProvider(EngineProvider.class);

the application will use the provider (probably request-scoped via @RequestScoped) to extract the engine from the context. The Mechanic can be rewritten to use Engine directly, but no other code in the potentially large application needs to change.

class Mechanic {
  Engine engine;

  @Inject
  Mechanic(Engine engine) {
    this.engine = engine;
  }
}
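
To see why this helps testing, here is a minimal sketch that runs without any library. The Provider interface is a hand-written stand-in for Guice's, the shapes of Engine and Context are assumptions (the post does not show them), and the wire method does by hand what the injector would do with the binding above.

```java
// Stand-in for com.google.inject.Provider so the sketch runs without Guice.
interface Provider<T> {
    T get();
}

// Hypothetical domain types; the original post does not show them.
interface Engine {
    String name();
}

class Context {
    private final Engine engine;
    Context(Engine engine) { this.engine = engine; }
    Engine getEngine() { return engine; }
}

// The only class that still knows about Context.
class EngineProvider implements Provider<Engine> {
    private final Context context;
    EngineProvider(Context context) { this.context = context; }
    @Override
    public Engine get() { return context.getEngine(); }
}

// Mechanic depends on Engine alone, so a test needs no Context at all.
class Mechanic {
    final Engine engine;
    Mechanic(Engine engine) { this.engine = engine; }
    String describe() { return "working on " + engine.name(); }
}

public class MechanicSketch {
    // Production-style wiring, done by hand here instead of by an injector.
    static Mechanic wire(Context context) {
        Provider<Engine> provider = new EngineProvider(context);
        return new Mechanic(provider.get());
    }

    public static void main(String[] args) {
        Mechanic fromContext = wire(new Context(() -> "V8"));
        System.out.println(fromContext.describe());

        // Test-style construction: a fake Engine, no Context involved.
        Mechanic inTest = new Mechanic(() -> "fake");
        System.out.println(inTest.describe());
    }
}
```

The second construction in main is the point: once Mechanic asks only for an Engine, a unit test can hand it a fake without building a Context first.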

Thanks for listening.

Update: A more clever Provider could support the NullObject pattern.

public class EngineProvider implements Provider<Engine> {
    private final Context context;

    @Inject
    public EngineProvider(Context context) {
        this.context = context;
    }

    @Override
    public Engine get() {
        // Fall back to a do-nothing Engine when the context cannot supply one.
        if (context == null || context.getEngine() == null) {
            return new NullEngine(); // better Engine.NULLOBJECT
        }
        return context.getEngine();
    }
}
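
What the Engine.NULLOBJECT variant from the comment could look like is sketched below. Everything here is an assumption for illustration: the post does not show Engine's methods, so start() is hypothetical, and NullSafeEngineProvider is a plain class without Guice so that the sketch runs on its own.

```java
// Hypothetical Engine interface; the post does not show its methods.
interface Engine {
    // Shared do-nothing instance, so no new object is created per lookup.
    Engine NULLOBJECT = new Engine() {
        @Override public boolean start() { return false; }
        @Override public String toString() { return "NullEngine"; }
    };

    boolean start();
}

// Hypothetical Context that may or may not hold an Engine.
class Context {
    private final Engine engine;
    Context(Engine engine) { this.engine = engine; }
    Engine getEngine() { return engine; }
}

// Same null handling as the provider above, without the Guice types.
class NullSafeEngineProvider {
    private final Context context;
    NullSafeEngineProvider(Context context) { this.context = context; }

    Engine get() {
        // Look the engine up once, then substitute the null object if needed.
        Engine engine = (context == null) ? null : context.getEngine();
        return (engine == null) ? Engine.NULLOBJECT : engine;
    }
}

public class NullObjectSketch {
    public static void main(String[] args) {
        Engine missing = new NullSafeEngineProvider(new Context(null)).get();
        // Callers can use the result without a null check.
        System.out.println(missing + " started: " + missing.start());
    }
}
```

A shared constant also lets callers compare against Engine.NULLOBJECT by identity when they do need to know that no real engine was available.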

Unscientific Jetty versus Glassfish for REST

This post was too unscientific and was updated. Jetty is an excellent container and the container of choice whenever I do something with servlets. Ever since we developed SnipSnap some years ago I have loved Jetty. Glassfish has some very promising features like the admin console, and I’m eager to try Glassfish in a project sometime in the future.

Reading another story about Rails performance, I grabbed JMeter to benchmark one of my current projects. Not so much as a comparison with Ruby - which managed 320 requests per second - but more as a comparison of Jetty and Glassfish.

The application is a small REST server which reads data from a JDBM storage, transforms it to JSON with my own framework and delivers the result with Jersey.

Both servers were started with their default configuration through their maven plugins (mvn glassfish:run is wonderfully easy to use). Unscientific as it may be, the numbers are:

  • around 1000 requests/sec for both containers

Both runs used 200 threads and 50 requests per thread. Both numbers are great for my MacBookPro and good enough for me. They are also so close to each other that they are not a deciding factor for either Glassfish or Jetty. At the risk of comparing apples to oranges, I have no fear of deploying this to a production system and scaling cheaply (and even better with ETag caching), keeping in mind the requests per second with 25 servers in the Rails example.

Thanks for listening.

Update: Another Rails application which thinks it did scale - at least with Merb - to 650k page views per day. Well, as one reader put it: “650K hits per day is ‘only’ around 8 per second (assumed a 20 hour day to spike it a little). This doesnt actual seem all that much?”.

The JMeter result of 1000 req/sec (for an admittedly simple REST GET) works out to … 86.4M requests per day. Uh. On my MacBookPro.
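
The arithmetic behind both figures, as a small sketch (class and method names are made up for illustration). A constant 1000 requests per second over 24 hours gives 86.4 million requests, and 650k page views squeezed into a 20-hour day come to roughly 9 per second, the same order of magnitude as the quoted 8.

```java
public class ThroughputSketch {
    static final long SECONDS_PER_HOUR = 3_600;

    // Requests served in a day at a constant rate.
    static long requestsPerDay(long requestsPerSecond, long hoursPerDay) {
        return requestsPerSecond * hoursPerDay * SECONDS_PER_HOUR;
    }

    // Average rate needed to serve a daily total within the given hours.
    static double requestsPerSecond(long requestsPerDay, long hoursPerDay) {
        return (double) requestsPerDay / (hoursPerDay * SECONDS_PER_HOUR);
    }

    public static void main(String[] args) {
        // 1000 req/sec sustained for a full 24-hour day.
        System.out.println(requestsPerDay(1_000, 24)); // 86400000

        // 650k page views spread over a 20-hour day.
        System.out.println(requestsPerSecond(650_000, 20));
    }
}
```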

Current code coverage tools (for maven)?

What to use for code coverage? Clover seems to be the only option, but it costs money (I will probably buy it when my project makes some). Emma and Cobertura seem to be dead, and IDEA code coverage doesn’t work - obviously - with maven. Any ideas?

Unpleasant Git experience

I’m too stupid for git. I’ve run several SVN servers over the years but a Sunday afternoon isn’t enough for me to get git working.

  • Debian stable has git 1.4, for git init one needs 1.5. Some major reconfigurations later (think Debian backports) and updates and updates I had 1.5 working
  • Lots of “fatal” errors during git configuration on the server, with no helpful explanations
  • The client - Mac OS X - isn’t any better: git add . is followed by fatal: pathspec '' did not match any files. Perhaps git is only for Linux?

And more problems and more problems and more problems. My first SVN repo took 10min to set up and 10min to configure the client. Voila. Perhaps git is only for the geniuses out there and I’m stuck in the 90s of SCMs.

Update: No, this won’t stop me - yet. Though the thought comes to mind whether Mercurial would have been a better choice ;-)

Update 2: The server works and is accessible. Now I only need to fix the fatal: pathspec '' did not match any files for my local add and then push the contents to the server. Look here for help with the git backports on Debian.

Update 3: More fun: Updating remote server info PUT error: curl result=22, HTTP code=403

Update 4: I’m getting very very old, see: “I experienced these phenomena first-hand. Git was the first version control system I used, and I grew accustomed to it, taking many features for granted.”. I started with RCS, or the version file system on VMS, 18 years ago.

Update 5: fatal: no matching remote head when cloning a new repository.

Update 6: Everything seems to work now!

My interview

Carl did a short interview with some software engineering guys and me in February, the results are up now:

“Java Experts: Server Side is Where Java Shines”.

Compared to the others my answers are rather short. That’s what people sometimes complain about: my answers and mails are too short ;-)

The Erlang hype is grotesque

Take for example this recent post on Hacker News with the title “Amazon and Google Discover Erlang (IMDB is switching from Perl to Erlang)”. There in the comments someone gives a source for the IMDB claim: “IMDb on Java/Erlang (a job posting)”. Going to the job listing results in

We are currently working in Perl but have plans to use Java, Erlang and any other language that we think will suit our purposes.

Constructing an “IMDB is switching from Perl to Erlang” (though everything is better than Perl) out of this job listing is grotesque.

As a side note, I’ve been looking into Erlang for some time but I’m still undecided. With some functional style in Java, Kilim, Terracotta and a Supervisor module (see Scala), I guess most of the Erlang failover/distribution stuff can be done in Java too. I’m interested in CouchDB, but I have no reliability numbers and performance is said to be very low (for now?). The feature set of CouchDB is compelling though.

Ah, in the end I only wish for some more functions in Java, better separation of concerns, constructors which can return any type instance, syntactic sugar like in Groovy, structural interfaces, enforced nice-style non-null references, catch(E1,E2,E3) … well well well. I’d better stop before this turns into another How-to-improve-Java rant.