Programming is hard by Stephan Schmidt

Why the Toyota Product Development System is a thing of the past

This post tries to explain why website/app development is Production and should use the Toyota Production System (TPS) and why classic software application development is product development and can use the Toyota Product Development System (TPDS).


Recently there has been a conundrum in parts of the agile and lean community: apparently software development under the Toyota Product Development System (TPDS) uses a waterfall (or similar spiral) process:

What might surprise some, was that they were using a waterfall model (in Ishii-san’s own words - in reality I think it was more like the spiral model). In spite of that, I had a feeling afterwards that I had just talked to perhaps the most skilled software development managers I have ever met! Does that sound like a paradox? I do not think so.

Toyota today looks so much like a religion that people are willing to suddenly proclaim “waterfall” is a good thing if it’s done by Toyota:

I once said to myself that I did not want to waste my time as a developer on non-agile projects. In the Toyota case, I would certainly make an exception “Toyota is using waterfall!”

Or even, from Mary Poppendieck,

When you are dealing with embedded software in production hardware, a 3 month waterfall is really fast.

And as Sean tries to explain in his comment on another post:

If the projects that Toyota is working on span only a couple of weeks, then waterfall is probably going to work fine. They’ll only develop requirements for a few weeks worth of work, and their in-process code inventory will still be pretty small.

So why should we use TPS and lean in software development, when Toyota does Waterfall and TPDS? We’re confused. But remain calm. There is no conundrum. If you have clear requirements, a limited project, a fixed feature set and a fixed deadline, use waterfall. This is probably what Toyota experiences. Most people in the agile community will tell you it’s best to use agile and go lean only when you have unclear or changing requirements, changing feature priorities and work in a flow model instead of a project.

After reading the Toyota Way book, and twittering about TPS, I got the following reply:

@flowchainsensei: “Might like to take a look at TPDS too (more direct relevance to software development) Kennedy, Ward, etc.”

So TPS/lean isn’t for software development? After some more thinking about whether software development is more like Toyota product development (TPDS) or more like production (TPS), I had an insight. For your web app development there are two phases:

Phase 1: Version 1.0 done with product development (PD)
Phase 2: After that companies shift to a production (P) model

Note: You should keep the PD phase as short as possible.

Websites and web applications are different from your usual software application development. Website/app development has many more direct stakeholders: Reporting, Controlling, Marketing, Sales, Product Development, Customer Support, Backend Services and more. Contrary to classic software apps, they all have direct involvement in features. In website/app development, those stakeholders write their own (!) stories and directly contribute features. In software app development, on the other hand, features are mainly developed by a product development department with indirect input from other departments.

Therefore software app development is much more product development (PD) heavy, while website/app development is much more production (P) heavy. With more direct involvement, smaller stories and changing requirements, more and more development will move into production style in the future.

The same will happen inside Toyota. My prediction: over time TPDS and TPS will merge at Toyota, when every car is built-to-order, with its own design, with custom colors, each with a one-of-a-kind motor etc.

This is what Scrum tries to accomplish: merging production and product development. Web companies already work in such an environment. Each story to be developed is the same - consisting of web forms, services, database code, reporting etc., but at the same time every story is different. Think of stories as being all the same, e.g. web-service-db stories, but highly customized and built-to-order for each customer (marketing, product development, sales). Then it becomes clear that development in web companies is production, not product development.

This merging is also the area where Scrum struggles: how to fit design, architecture work, technical debt etc. into development. Scrum still hasn’t settled on a single best practice.

Hope this helps clear up the “Conundrum”, which isn’t one.

New Version of my Simple Kanban Board Application

Over the weekend I’ve worked on my Simple-Kanban application. Simple Kanban is a small Kanban board application in one HTML file. New features are a data mode that displays the data in raw format for easier cut & paste, and drag & drop support for moving stories around.


There is a website now! I’ve added a small website at http://www.simple-kanban.com, where you can find new versions. I’ve also created a GitHub repository for Simple Kanban, where I’m planning to post the code (funny, the code is already open source as part of the HTML file :-) ).

Have fun using Simple-Kanban in your company, and think lean!

Micro Book Review: Agile Retrospectives, making good teams great

Title: Agile Retrospectives, making good teams great
Author: Esther Derby / Diana Larsen
Pages: 165

What the book is about
The book is about leading retrospectives. Retrospectives came into fashion with agile software development; Scrum in particular has a retrospective every sprint. The beginning of the book motivates retrospectives, explains them, shows how to lead them and establishes the concept of phases: set the stage, gather data, generate insights, decide what to do, close the retrospective. The second part explains in detail several activities you can use in each phase. Each description is structured like the famous design pattern book: Purpose, Time needed, Description, Steps, Materials and Preparation, and Examples.

What I’ve learned from the book

  • Retrospectives have several phases: set the stage, gather data, generate insights, decide what to do, close the retrospective
  • There are many activities for retrospectives, and you should change activities from time to time
  • Use an activity like “Appreciations” for closing a retrospective

Should you buy this book?
Yes, highly recommended

Who should buy the book

  • Every ScrumMaster or Iteration Manager
  • Everyone who leads retrospectives

Notes
I bought the book myself after it was mentioned on Twitter.

I’ve chosen the micro review format because it lends itself to being used as a future microformat and I like short reviews myself. You can read the table of contents elsewhere; I don’t like it when reviews reiterate the content.

Kanban Board Application in One HTML File

For some years I’ve been interested in lean software development and how to reduce waste. While introducing lean practices, I’ve needed a small, simple Kanban Board application. Thought I’d write one.

You can download a very first alpha here or see it in action here.

Surprised? The application is just one HTML file, no installation needed, no Java, Ruby or PHP. One HTML file, no other dependencies.

How to use the application?

Edit the source of the HTML file with an editor of your choice, preferably one which knows HTML. You will find a list of stories. The example contains those:

T_Q,S18,Checkout optimize
DE,S2,Build old shop
DE,S4,Rebuild with SOAP
DE_Q,S10,Rebuild with REST
P,S17,Do something with OpenID
D,S3,Make application faster
D_Q,S7,Credit Card Payment
DE,S13,Build something astonishing
P_Q,S17,Fix YSlow
R,S39,Google Page Speed fix

There are three columns for stories. The first column contains the state the story is in, the second contains an identifier for your story and the last column the name of the story. You can edit the columns, change states, change names, remove and add stories. You can also export from Excel to CSV, then cut & paste into the application source.

The available states are described next in the file:

D,Design
DE,Development
T,Test
R,Release

The states need to be declared here so that they match the states used by the stories. You can add, remove or change states. Every state has a “sub-state” that acts as its ready queue. For example, the ready queue state for Design “D” is “D_Q”, for Test “T” it is “T_Q”. You do not need to describe the ready states; they are created automatically and a ready queue is shown in front of every state. For example, “Test Ready” is shown to the left of “Test” if there are stories in that particular state.
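
To make the format concrete, here is a minimal Java sketch of how such a story list could be read and sorted into columns. It is not the code inside Simple Kanban (that lives in the HTML file); the class `KanbanSketch` and its methods are hypothetical names made up for this illustration. It assumes that every story line has exactly three comma-separated columns and that a story in an undeclared state is an error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Simple Kanban: groups story lines by board column. */
class KanbanSketch {

    /** One story line: state, identifier, name. */
    static final class Story {
        final String state;
        final String id;
        final String name;

        Story(String state, String id, String name) {
            this.state = state;
            this.id = id;
            this.name = name;
        }
    }

    /** Parses lines such as "DE_Q,S10,Rebuild with REST"; blank lines are skipped. */
    static List<Story> parseStories(String text) {
        List<Story> stories = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) continue;
            // A limit of 3 keeps any commas inside the story name intact.
            String[] cols = line.trim().split(",", 3);
            if (cols.length != 3) throw new IllegalArgumentException("Bad story line: " + line);
            stories.add(new Story(cols[0], cols[1], cols[2]));
        }
        return stories;
    }

    /** For every declared state, creates its ready queue ("X_Q") followed by the state itself. */
    static Map<String, List<Story>> board(List<String> states, List<Story> stories) {
        Map<String, List<Story>> columns = new LinkedHashMap<>();
        for (String state : states) {
            columns.put(state + "_Q", new ArrayList<>());
            columns.put(state, new ArrayList<>());
        }
        for (Story story : stories) {
            List<Story> column = columns.get(story.state);
            if (column == null) throw new IllegalArgumentException("Undeclared state: " + story.state);
            column.add(story);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Story> stories = parseStories(
            "T_Q,S18,Checkout optimize\nDE,S2,Build old shop\nD_Q,S7,Credit Card Payment");
        board(Arrays.asList("D", "DE", "T", "R"), stories)
            .forEach((column, items) -> System.out.println(column + ": " + items.size()));
    }
}
```

Note that the example story list above uses a state “P” that does not appear in the example state list, so this sketch would reject those stories until “P” is declared as a state.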

Customize the colors

The colors of each state are defined in a CSS block.

.box_P { background-color: #FFFF00; color: #000000; }
.box_P_Q { background-color: #F0F0F0; color: #606060; }

Feel free to change them to your taste. “.box_P” is for the “P” state box, “.box_P_Q” for the corresponding ready queue.

The main use case for this application is to inform a company about the stories which are in development and in which state they are. It’s an ideal information radiator. I plan to use it on a huge screen.

Future
I’m thinking about a CouchDB storage implementation for storing data and application logic. Or storing the data in the file itself, with drag and drop, inlined jQuery, and editing and saving like TiddlyWiki does.

Future features? Add WIP limits, add “From here” signs to display cycle time until live.

Have fun with this One-File-HTML-application. Tip: you can easily mail it around, no install needed.

Top 5 Things to Know About Constructors in Scala

I’ve been toying with Scala for some months now. One thing I’ve struggled with, coming from Java, is constructors in Scala. They are comparable to Java constructors, but the syntax is different.


To help get you going faster in Scala, here are the top 5 things to know about constructors. Here we go:

  1. How to do constructors with a parameter
    public class Foo {
       public final Bar bar;
    
       public Foo(Bar bar) {
           this.bar = bar;
       }
    }
    

    Looks in Scala like this:

    class Foo(val bar:Bar)
    

    In this case val creates an immutable final public field, using var would create a mutable public field.

  2. How to have private fields
    public class Foo {
       private final Bar bar;
    
       public Foo(Bar bar) {
           this.bar = bar;
       }
    }
    

    Looks in Scala like this:

    class Foo(private val bar: Bar)
    

    Update: Changed due to comments. Thanks to the commenters for pointing this out.

    Private fields are not as necessary as in Java, you can have public fields for attributes and change them to a method (def) later - without changing your clients.

  3. How to use super() ?
    public class Foo extends SuperFoo {
       public Foo(Bar bar) {
          super(bar);
       }
    }
    

    Looks in Scala like this:

    class Foo(bar:Bar) extends SuperFoo(bar)
    
  4. How to have more than one constructor?
    public class Foo {
        public Bar bar;
    
        public Foo() {
           this(new Bar());
        }
    
        public Foo(Bar bar) {
           this.bar = bar;
        }
    }
    

    Looks in Scala like this:

    class Foo(val bar:Bar) {
      def this() = this(new Bar)
    }
    
    Note: secondary constructors like this() need to delegate to another constructor to work (thanks @Synesso).

  5. How to get bean style setters and getters?
    public class Foo {
       private Bar bar;
    
       public Foo(Bar bar) {
           this.bar = bar;
       }
    
       public Bar getBar() {
          return bar;
       }
       public void setBar(Bar bar) {
          this.bar = bar;
       }
    }
    

    Looks in Scala like this:

    import scala.beans.BeanProperty
    class Foo(@BeanProperty var bar:Bar)
    

    The attribute bar will still be public, which is not a big issue (see above) in Scala. But @BeanProperty helps when working with Java libraries and you need Bean conventions for the libraries to work.

    To add getBar and setBar but not a public field you need to:

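    // Newer Scala 2 compilers reject @BeanProperty on private fields,
    // so the accessors are written by hand here.
    class Foo(aBar:Bar) {
        private var bar = aBar
        def getBar: Bar = bar
        def setBar(newBar: Bar): Unit = { bar = newBar }
    }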
    

Update: Changed to var, thanks to @eivindw.

Hope this helps you. If you have something to add, leave a comment. Should you struggle with the limited this() syntax (only one expression), then perhaps your constructors are doing too much. Consider the factory or, better, the builder pattern instead.
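
As an illustration of the builder suggestion, here is a minimal sketch on the Java side of the comparison. The class names, the `retries` field and the default values are made up for this example and appear nowhere else in this post. The idea is that defaults and validation move out of a chain of constructors into one `build()` method.

```java
/** Hypothetical example: Foo and Bar stand in for the classes used in this post. */
class BuilderSketch {

    static final class Bar {
        final String name;

        Bar(String name) { this.name = name; }
    }

    static final class Foo {
        final Bar bar;
        final int retries;

        // Private: callers have to go through the builder.
        private Foo(Bar bar, int retries) {
            this.bar = bar;
            this.retries = retries;
        }

        static Builder builder() { return new Builder(); }
    }

    static final class Builder {
        // Defaults replace the chain of overloaded constructors.
        private Bar bar = new Bar("default");
        private int retries = 3;

        Builder bar(Bar bar) { this.bar = bar; return this; }

        Builder retries(int retries) { this.retries = retries; return this; }

        /** Validation that would not fit into a one-expression this() call lives here. */
        Foo build() {
            if (retries < 0) throw new IllegalArgumentException("retries must not be negative");
            return new Foo(bar, retries);
        }
    }

    public static void main(String[] args) {
        Foo foo = Foo.builder().retries(5).build();
        System.out.println(foo.bar.name + " " + foo.retries);
    }
}
```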

Additional tip: to make your Scala classes serializable, mark them with @serializable (older Scala versions) or extend Serializable (current Scala versions).

Nice coding in #Scala.

7 More Good Tips on Logging

Logging in web applications is important - to know what’s going on, for performance tuning and incident analysis. This is my second post about logging. The first post “7 Good Rules to Log Exceptions” was specific to logging exceptions, this one is about logging in general. What makes your logs more useful to you?


1. No debugging logs in production

I have seen time and again that debug logging is enabled in production. This can be intentional, or it can happen because a developer accidentally checked in a debug logging configuration. Enabled debug logging slows down your application markedly and makes it impossible to read production logs due to noise. Make sure during deployments - best with some scripts - that debug level logging is disabled in production.
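
Such a deployment check could look like the following minimal sketch. It assumes the application is configured through a `java.util.logging` style properties file with `.level` keys, which this post does not prescribe; the class name `LogLevelCheck` and the logger names are hypothetical. It reports every entry that is more verbose than INFO.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.logging.Level;

/** Hypothetical deployment check: finds logging configuration entries that enable debug output. */
class LogLevelCheck {

    /** Returns the sorted ".level" keys whose level is more verbose than INFO. */
    static List<String> debugLevels(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> offenders = new ArrayList<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.endsWith(".level")) continue;
            Level level = Level.parse(props.getProperty(key).trim());
            // A lower intValue means more verbose: CONFIG, FINE, FINER, FINEST and ALL are below INFO.
            if (level.intValue() < Level.INFO.intValue()) offenders.add(key);
        }
        Collections.sort(offenders);
        return offenders;
    }

    public static void main(String[] args) {
        // Made-up configuration for illustration.
        String config = ".level=INFO\ncom.example.shop.level=FINE\n";
        System.out.println("Debug logging enabled for: " + debugLevels(config));
    }
}
```

In a real deployment script you would read the actual configuration file and abort the deployment with a non-zero exit code when the list is not empty.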

2. Look through your logs

Some companies have good logging in their production system, but do not look into their logs. Look into your logs, discover issues (bugs, performance, memory) with your application and fix them. Essentially your logs should be without known errors.

3. Log to the correct log level

Developers who write logging code often don’t know which log level to use. Have a document ready that explains which log level to use when. For example, SEVERE should only be used for technical problems which need immediate action. ERROR should be used for errors that someone needs to look into and fix, like not getting a database connection, low resources or failing integration points. The details are specific to your company and application.

4. Do not log locally

If your server has major problems like resource troubles, it’s often impossible to log in. Therefore you can’t get to your logs to find the problem. Logs should be written to a network drive, copied over to another host or written to the network e.g. with Syslogd. A nice solution is to use the Spread Toolkit to write to a network group with multicasting. This also enables easy monitoring (see “Scalable Internet Architectures”).

5. Monitor your logs

Similar to “Look through your logs”, you should set up a monitoring solution which looks for SEVERE entries, ERROR entries, exceptions and other conditions in your logs. With Spread it’s easy to add monitors. It is also a good idea to classify and count exceptions, then do something about the severe and most frequent ones.

6. Use a human readable format

Developers often don’t think about the output they produce. This leads to hard to read log files. “Release It!” has an example for readable output:

[8/14/06 8:22:14:653 CDT] 0000a SSLComponent I CWPKI00001I: SSL service not available
[8/14/06 8:22:14:813 CDT] 0000a WSKeyStore   W CWPKI0041W: One or more key

This row-oriented format makes it easier to scan logs quickly. Compare this to your own logs.

7. Use error codes in logging

Each cause which leads to log output should have a unique error code. Without a unique error code it’s hard to find the cause in your source code. Error codes also make it much easier to count and classify log statements, and they enable communication between development and operations.
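
To show how error codes help with counting and classifying, here is a minimal sketch that tallies log rows by error code. It assumes every row follows the layout of the “Release It!” sample above (timestamp in brackets, thread, component, one-letter level, code, colon, message); the class name `ErrorCodeCounter` is hypothetical, and the third log row in `main` is made up for the example.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log monitor: counts entries per error code in a row-oriented log. */
class ErrorCodeCounter {

    // Layout assumed: [timestamp] thread component level code: message
    private static final Pattern ROW = Pattern.compile(
        "^\\[([^\\]]+)\\]\\s+(\\S+)\\s+(\\S+)\\s+([A-Z])\\s+(\\w+):\\s*(.*)$");

    /** Returns error code -> number of occurrences; rows that do not match the layout are ignored. */
    static Map<String, Integer> countByCode(String log) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : log.split("\n")) {
            Matcher m = ROW.matcher(line.trim());
            if (m.matches()) counts.merge(m.group(5), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String log =
            "[8/14/06 8:22:14:653 CDT] 0000a SSLComponent I CWPKI00001I: SSL service not available\n"
          + "[8/14/06 8:22:14:813 CDT] 0000a WSKeyStore   W CWPKI0041W: One or more key\n"
          // Made-up repeat of the first row, to have something to count.
          + "[8/14/06 8:22:15:102 CDT] 0000a SSLComponent I CWPKI00001I: SSL service not available\n";
        countByCode(log).forEach((code, n) -> System.out.println(code + " " + n));
    }
}
```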

Want to know more? Books with good sections on web site logging are “Release It!” by Michael T. Nygard (really excellent book!) and “Scalable Internet Architectures” by Theo Schlossnagle.

Better Null Handling Strategies for Java

Uploaded a presentation on “Better Null Handling Strategies for Java” to SlideShare. Enjoy.


Scrum is not about engineering practices

Scrum is not about engineering practices - which is a good thing. Martin Fowler writes:

They adopt the Scrum practices, and maybe even the principles
After a while progress is slow because the code base is a mess

and connects Scrum failure to missing engineering practices. This completely misses the point. Scrum is not about engineering practices, it’s about management.


Engineering practices are a responsibility of the team. Scrum creates self organized teams. They organize themselves, they organize their quality. They organize their tools. They organize their engineering practices (TDD, pair programming). Why? Because they are responsible for delivering. No one else is.

Craftsmanship and level of done

Often quality and craftsmanship are organized by a level of done agreement in the team, which describes what the team requires before it considers code done. This can include

  • Documentation / JavaDoc
  • Functional tests
  • Unit tests
  • Bug free
  • Refactored, maintainable code
  • Reviewed code

What about technical debt? Technical debt is not professional. Seeing an iceberg and staying on a collision course is not professional. With the right level of done there will be no technical debt.

Scrum helps with good engineering practices

Concerning Martin Fowler’s comments, Dean Wampler tweets:

His comments match my experience at client sites. Teams using Scrum w/out the XP prog. practices don’t work long-term.

This is not very insightful. It says: developers who do not use professional practices will fail. Of course they fail, but they fail independently of the process you use.

The main difference to classic project management is that developers have the freedom to define their level of done and the amount of work they do. Developers define what stories they work on in each sprint; management doesn’t set an (unrealistic) finishing date, as often happens in classical software projects.

Is there too much quality? Doesn’t the product owner care if the team takes “too much” time for quality? The product owner is entitled to three things:

  • Story estimates
  • Shippable products
  • Professional developers

The product owner is not entitled to speed. Scrum fixes resources and time in the Iron Triangle and lets developers decide about scope. Speed is scope / time and is an output variable of the team, not an input variable. If she thinks the velocity of the team is too low, she should talk to the ScrumMaster and remove impediments. But engineering practices should never be dropped. Other crafts will never drop theirs. Ask a doctor to drop sterilizing to gain speed. Ask a banker to drop double-entry bookkeeping. They won’t. Neither should developers drop theirs [1].

[1] This could mean developers need to be trained to be professionals. But a company needs to teach professionalism to developers independently of what process it uses. If you do not have skilled, professional developers you’re doomed and no process will help you.

Micro Book Review: The Definitive Guide to Terracotta

Title: The Definitive Guide to Terracotta: Cluster the JVM for Spring, Hibernate and POJO Scalability
Author: Ari Zilka (Terracotta CTO) and his team
Pages: 368

What the book is about
The book is an introduction to Terracotta, which helps you distribute the Java Virtual Machine memory transparently over several JVMs. The main part of “The Definitive Guide to Terracotta” focuses on use cases. Those are well motivated, explained and described with many examples and working code.

What I’ve learned from the book

  • What Terracotta and virtual heaps are
  • How to use TC with ehCache, Hibernate and for session clustering
  • Dropping in ready-to-use functionality with TC integration modules (TIM) is easy

What I didn’t like

  • There is a chapter about optimizations, but it is not extensive enough, and there is not enough information about deployments and deployment scenarios.

Should you buy this book?
Yes, highly recommended, it’s written by the Terracotta guys, you can’t get better and more accurate information.

Who should buy the book

  • Every developer or architect who wants to use or evaluate Terracotta

Notes
Book kindly supplied by the publisher. This is a short version of my former review.

I’ve chosen the micro review format because it lends itself to being used as a future microformat and I like short reviews myself. You can read the table of contents elsewhere; I don’t like it when reviews reiterate the content.

What do you think about this short review style?

ScrumMaster and ZenMaster: The joke of certification

Many people object to ScrumMaster certifications:

  1. It’s a money making machine
  2. Scrum Masters do not learn anything during classes
  3. The certification is worth nothing - because nothing is certified

I have been a Certified ScrumMaster (CSM) and a Scrum practitioner for some years. People who object to the certification see it from the wrong angle. You need to understand Zen to understand the goodness in CSMs.



Certification is a Zen joke, because the role of a ScrumMaster cannot be certified. It’s not about knowing some technical questions. What should a trainer certify in such a class? That you can lead an agile Scrum team as a ScrumMaster? No one can certify the fact that you’re a leader, catalyst and enabler. You either are or you aren’t. Zen masters (ha, another master without a master!) would laugh at the fun in the ScrumMaster certification. They laugh about the idea of certifying enlightenment.

Scrum without ScrumMasters

As another parallel, both in Scrum and in Zen, masters are only enablers. They are not needed after the act of enabling Zen/Scrum. My Scrum trainer told me, the goal of a ScrumMaster is to make himself obsolete. There is a Zen koan which goes like this:

If you meet the Buddha, kill him.
— Linji

If you see a ScrumMaster, kill him. Zen tells you:

If you are thinking about Buddha, this is thinking and delusion, not awakening. One must destroy preconceptions of the Buddha. Zen master Shunryu Suzuki wrote in Zen Mind, Beginner’s Mind during an introduction to Zazen, “Kill the Buddha if the Buddha exists somewhere else. Kill the Buddha, because you should resume your own Buddha nature.”

If you think the ScrumMaster is Scrum, you’re deluding yourself. In Scrum the product owner and the Scrum team can, and in my view should, act by themselves, without the need for a ScrumMaster. The ScrumMaster helps them achieve their Scrum. He helps them overcome initial obstacles to their productivity.

Kick your ScrumMaster
If the ScrumMaster is not good enough for the team, and certification and coaching inside the company haven’t helped, the Scrum team always has the right to kick their SM. And they should do so. If in Zen a master isn’t good, pupils will just leave him. This might lead to problems within the organization, especially if the ScrumMaster is their boss, but that should be the problem of the organization, not a team problem.

Practitioner certification

There are many more certifications from the Scrum Alliance. If you dig deeper, the real fun part is that CSM doesn’t mean anything, while practitioner means much more:

The practitioner level of certification (CSP) is only offered to those CSMs who have hands-on experience using Scrum. Applicants must complete an extensive questionnaire with probing questions that focus on applicants’ real-world experience using Scrum on software development projects. Their application is reviewed for answers demonstrating competence and comprehension of principles that can only result from hands-on work. The applicant may be questioned to determine eligibility. To maintain CSP status, you must submit a new application every two years.

Is the certification any use?

Yes. The Certified Scrum Master training has several merits:

  1. Calling the Scrum training “Certified” guarantees the quality of the trainer
  2. It motivates the ScrumMaster to think in Scrum
  3. If managers take part, it helps the organisation adopt a “we can do it” view about Scrum
  4. Certification (CSM) seems to be one of the main reasons for Scrum success in the enterprise. The certification makes Scrum acceptable to management.

The view about acceptance is shared by Peter Stevens:

It is also about branding, and has been quite successful. The acceptance of the CSM program is high (especially from corporate customers, and this is where the money is). I believe the CSM program is an important reason why Scrum is better accepted than say, XP, in corporate management circles.

Scrum is successful. I’ve seen it help development departments gain productivity. If you do not scrum yet, go for it.

Feedburner trouble: Lost 70% subscribers and feed not migrated - HELP!

Google is migrating Feedburner accounts to Google. During the migration I’ve lost 70% of the subscribers to my feed, and when logging into feedburner.google.com my feed doesn’t show up - it wasn’t migrated. Any ideas? Similar experiences?

It also looks different:

Book Micro-Review: Blog Blazers

Title: Blog Blazers: 40 Top Bloggers Share Their Secrets
Author: Stephane Grenier
Pages: 232

What the book is about
The book is a collection of interviews with 40 top bloggers. The author asks the same questions about how people define success, what websites they recommend to bloggers or how they marketed their blogs. The answers are very insightful, and with the same questions in every interview you can see patterns emerge for what successful bloggers do - like writing at a sustainable, predictable pace.

What I’ve learned from the book

  • Posts that make it onto Digg get 50,000 to 150,000 page views
  • Most of the successful bloggers recommend reading ProBlogger.net and CopyBlogger.com
  • Keep a steady pace, success will come

Should you buy this book?
Yes, highly recommended

Who should buy the book

  • Every blogger who wants to learn how blogging works
  • Every blogger who wants more readers for his blog

Notes
Book kindly supplied by the author.

I’ve chosen the micro review format because it lends itself to being used as a future microformat and I like short reviews myself. You can read the table of contents elsewhere; I don’t like it when reviews reiterate the content.

What do you think about this short review style?

Relevance of ‘Atlas Shrugged’



I’ve finished reading “Atlas Shrugged” again, mostly inspired by the references in the Bioshock game. I’m no objectivist, but again I found a lot to like in Atlas Shrugged. I was astonished at how relevant the book still is today. There are many good quotes in it, like this gem:

Did you really think that we want those laws to be observed?” said Dr. Ferris. “We *want* them broken. You’d better get it straight that it’s not a bunch of boy scouts you’re up against– then you’ll know that this is not the age for beautiful gestures. We’re after power and we mean it. You fellows were pikers, but we know the real trick, and you’d better get wise to it. There’s no way to rule innocent men. The only power any government has is the power to crack down on criminals. Well, when there aren’t enough criminals, one makes them. One declares so many things to be a crime that it becomes impossible for men to live without breaking laws. Who wants a nation of law-abiding citizens? What’s there in that for anyone? But just pass the kind of laws that can neither be observed nor enforced nor objectively interpreted – and you create a nation of law-breakers – and then you cash in on guilt. Now that’s the system, Mr. Rearden, that’s the game, and once you understand it, you’ll be much easier to deal with.

With all the rescue packages for banks and automobile companies, the present reads a lot like the dystopian US in Atlas Shrugged. The Wall Street Journal article “‘Atlas Shrugged’: From Fiction to Fact in 52 Years” writes:

We already have been served up the $700 billion “Emergency Economic Stabilization Act” and the “Auto Industry Financing and Restructuring Act.” Now that Barack Obama is in town, he will soon sign into law with great urgency the “American Recovery and Reinvestment Plan.” [...] The current economic strategy is right out of “Atlas Shrugged”: The more incompetent you are in business, the more handouts the politicians will bestow on you. That’s the justification for the $2 trillion of subsidies doled out already to keep afloat distressed insurance companies, banks, Wall Street investment houses, and auto companies — while standing next in line for their share of the booty are real-estate developers, the steel industry, chemical companies, airlines, ethanol producers, construction firms and even catfish farmers. With each successive bailout to “calm the markets,” another trillion of national wealth is subsequently lost. Yet, as “Atlas” grimly foretold, we now treat the incompetent who wreck their companies as victims, while those resourceful business owners who manage to make a profit are portrayed as recipients of illegitimate “windfalls.”

Ayn Rand got a lot of bad press for her books in the last decades - the charge being that only materialism counts in them. But all things considered, I don’t think the book is about materialism, but about doing things. There are people who do things, and people who don’t. It’s about taking responsibility. About entrepreneurship. Perhaps I hold this view because of my role as a programmer, as someone who produces something which people use for something all the time (though sometimes things no one really needs ;-) ). It’s Zen - don’t scream, I have the feeling Zen and Atlas Shrugged are the same thing - perhaps on opposing sides.

The WSJ article continues:

But as recently as 1991, a survey by the Library of Congress and the Book of the Month Club found that readers rated “Atlas” as the second-most influential book in their lives, behind only the Bible

I really wonder why then, people don’t do more. Why they don’t act. But it fits with the WSJ article. The author

Mr. Moore is senior economics writer for The Wall Street Journal editorial page.

and

Some years ago when I worked at the libertarian Cato Institute, [...]

Funny. He loves Atlas Shrugged. But what has he done? What did he invent? What did he produce at Cato? As a writer for the WSJ? He is the archetypal anti-hero of Rand. Someone who doesn’t produce something himself, but lives from the work of others. The wrong people like Atlas Shrugged!

If you haven’t read the book, go read the book. Then have your own opinion.

ActiveMQ vs. Jabber

If you have or plan an application with synchronous communications over an external API, it will sooner or later break. Why do we need asynchronous communications? Matt Tucker is clear about that:

Take, for example, Twitter. High Scalability recently covered the load stats on Twitter reporting that they average 200-300 connections per second with spikes that climb to 800 connections per second. Their MySQL server handles 2,400 requests per second! Recently, the [2008] Macworld keynote became the most recent culprit for causing Twitter to cut off its API, which has 10x the load of their website.

One of my web pet projects needed a messaging backbone which extends to the browser. Whenever a resource changed on the server, all users watching the resource should get a notification without needing to reload the page. I looked at two candidates. The first is Javascript for ActiveMQ, which uses Comet:

ActiveMQ supports Ajax which is an Asychronous Javascript And Xml mechanism for real time web applications. This means you can create highly real time web applications taking full advantage of the publish/subscribe nature of ActiveMQ.

ActiveMQ is a messaging bus, often used as an Enterprise Service bus as mentioned in my recent concurrency rant. Components can send messages to the bus and subscribe to topics.


smokin
Creative Commons License photo credit: mudpig

The other, unsuspected contender is Javascript for Jabber. Jabber with the XMPP protocol is usually used for sending chat messages. Comparing the two, here are my thoughts:

ActiveMQ

  • Standard solution, JMS based
  • Routing solutions like Camel available
  • Easy access for different languages via Stomp
  • Attach Jabber as a service
  • Notification easily over topics

Jabber

  • Free OpenFire server
  • Messaging with only one user with UUID for resource which did change
  • Messaging with many users, who join one chat room
  • Chat rooms as topics
  • Server side filtering? How to make it secure, that people only get their own messages?

In the end I decided to go with Jabber/XMPP. The main points for me have been:

  • Server does scale to connections
  • Chat client can be used for debugging
  • Very easy to use with different programming languages
  • Presence protocol to detect services
  • Easy to implement additional chat solution

This worked quite well as a spike. I followed a similar mode as Adrian Sutton, who had good experiences with Jabber/XMPP too when spiking a cache solution:

We grabbed the Smack API and started playing with it and quickly discovered that sending and receiving messages was ridiculously easy. It turns out that the absolute simplest way you can minimize stale data in your caches is to simply have all the servers join a preconfigured chat room. Whenever they save a change to a resource they send a message to the room with the unique ID of that resource and whenever they receive a message from the room they assume it’s a unique ID and remove any cached versions of that resource.
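The chat room approach from the quote can be sketched in a few lines. This is only an in-memory illustration: `InvalidationRoom` and `CachingServer` are hypothetical names, and the room is a plain Java object standing in for an XMPP multi-user chat that a real setup would reach through a client library.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory stand-in for the preconfigured chat room.
class InvalidationRoom {
    private final List<CachingServer> members = new CopyOnWriteArrayList<>();

    void join(CachingServer server) { members.add(server); }

    // Deliver the resource ID to every member of the room.
    void broadcast(String resourceId) {
        for (CachingServer member : members) {
            member.onMessage(resourceId);
        }
    }
}

class CachingServer {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final InvalidationRoom room;

    CachingServer(InvalidationRoom room) {
        this.room = room;
        room.join(this);
    }

    void cache(String resourceId, String value) { cache.put(resourceId, value); }

    String cached(String resourceId) { return cache.get(resourceId); }

    // Saving a resource announces its unique ID to the room.
    void save(String resourceId, String value) {
        // ... persisting the value to the backing store would happen here ...
        room.broadcast(resourceId);   // everyone (including us) drops the old entry
        cache.put(resourceId, value); // then we cache the fresh value locally
    }

    // Every incoming message is assumed to be a resource ID.
    void onMessage(String resourceId) { cache.remove(resourceId); }
}
```

After server A calls `save("42", ...)`, any other server in the room no longer holds a cached copy of resource 42 and has to reload it on the next access.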

I did have some major problems accessing Jabber consistently from Javascript, though. With more messaging in the backend, I would have gone with ActiveMQ as a message bus. And perhaps I will move to ActiveMQ in the backend; then I’m still free to attach Jabber on top of that and keep the frontend code. Best of both worlds.

Think innovative, use technologies in a way to help you. Jabber/XMPP is more than a chat protocol.

Want Erlang concurrency but are stuck with Java: 4 Alternatives (+1)

By now you’ve read a lot about Erlang style concurrency. In Erlang, actors send messages to the inboxes of other actors and react to messages in their own inbox. The advantage of this approach with immutable messages is that you can’t get into a deadlock as easily as with basic Java concurrency using synchronized and threads. A simple ping pong example looks like this:

ping(0, Pong_PID) ->
    Pong_PID ! finished,
    io:format("Ping finished~n", []);

ping(N, Pong_PID) ->
    Pong_PID ! {ping, self()},
    receive
        pong ->
            io:format("Ping received pong~n", [])
    end,
    ping(N - 1, Pong_PID).

pong() ->
    receive
        finished ->
            io:format("Pong finished~n", []);
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong()
    end.

Two actors sending each other ping and pong messages with the ! symbol and acting on those with receive. Although there is more to concurrency than Erlang style message passing, as I’ve written lately in a post, for some problems it’s best and easiest to use message passing.

Is Erlang the next Java, but you are stuck with Java and can’t use Erlang style concurrency? Wrong.

There are concurrency solutions for Java which mimic Erlang style concurrency:

Update: “One clarification: ActorFoundry just uses Kilim’s weaver and therefore is not built on top of Kilim.” (see comments, thanks)
Update 2: Another option suggested by Diego Martinelli is to use Functional Java

Sujit Pal has written a comprehensive post comparing the performance of those frameworks with lots of example code:

Over the past few weeks, I’ve been looking at various (Java and Scala based) Actor frameworks. In an attempt to understand the API of these frameworks, I’ve been porting the same toy example consisting of three pipelined Actors responding to a bunch of requests shoved down one end of the pipe. To get a feel for how they compare, performance-wise, to one another, I’ve also been computing wallclock times for various request batch sizes.

And as Kilim has been shown to be as fast as Erlang - I’ve written about the fact here - these Java solutions do indeed look comparable to Erlang in the concurrency space.

What does this mean to your Java Enterprise?

Don’t worry about the multi-core future, Java has plenty to give for your multi-core platforms. And if you’re only stuck to the Java VM but not the Java language, you could go for Scala and gain some functional programming capabilities on top.

Thanks for listening. Hope you’ve learned something. As ever, please do share your thoughts and additional tips in the comments below, or on your own blog (I have trackbacks enabled).

British Pound Down Against EUR, buy at Amazon.co.uk

The pound is rapidly going down and will go down even more next year (some say to 0.90 EUR). Comparing a Nikon D700 on Amazon.co.uk vs. Amazon.de results in

1571 EUR vs 2054 EUR

That’s a difference of roughly 24% for electronics (p&p etc. not taken into account).

Ioke 0 released

Congratulations to Ola Bini for releasing Ioke, with an astonishing amount of documentation for a 0 release. Great job.

Ioke is a dynamic language targeted at the Java Virtual Machine. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Concurrency Rant: Different Types of Concurrency and Why Lots of People Already use ‘Erlang’ Concurrency

People talk a lot about concurrency. With the rise of multi-core processors, concurrency becomes more important. It’s sad that developers don’t know much about concurrency - and most of them just parrot what they have read in other blogs. I have wanted to write this post for quite some time to shed more light on concurrency.

There are mainly two different types of concurrency

  • One Task, many workers: For example parallel Fibonacci Numbers
  • Many Tasks, many workers: For example web requests

They have two different characteristics:

  • First: Need to share data
  • Second: No need to share data (or “not to share in real time”)

Most people think of the first kind when they discuss concurrency, although most applications are of the second kind. I wish people would not confuse the two and try to fix the second problem with their solutions, because the second problem is solved sufficiently (think FaceBook). So this post will only discuss the first type of concurrency.

The breakthrough for concurrency of the first kind came with threads and the synchronized keyword in Java 15 years ago. Before that, concurrency was an esoteric topic for niches; with Java it became a topic for every developer. Today most people recognize that threads and synchronized are too low level, though, and create quite some problems if you don’t know what you are doing. One half of the blogosphere damns Java for this and favors Erlang style concurrency: Message passing between objects, each object has an inbox (a queue) of messages which it works through and objects send messages to the inbox of other objects, where messages are immutable. The benefits are clear: No shared state, no need to regulate the access to shared state and the freedom to implement the scheduler of the objects the way it works best (not being limited to thread libraries).

The half proposing Erlang over everything usually doesn’t know what the other half does and thinks they still use synchronized. But this is only the most basic technique (and not even the best basic one with the advent of atomics). Surprise, enterprise Java developers don’t use threads (directly) and synchronized. I’ll tell you what they do.

There are lots of better methods for concurrency nowadays like Futures, Executor Services and especially Doug Lea’s Fork/Join framework in Java JSR 166y. The FJ framework splits a task into subtasks, distributes them, solves them and joins the result (in a very clever way with queues which are written on one side and read on the other and tasks which can steal work from others). The algorithm for forking and joining looks like this:

Result solve(Problem problem) {
  if (problem is small)
    directly solve problem
  else {
    split problem into independent parts
    fork new subtasks to solve each part
    join all subtasks
    compose result from subresults
  }
}

As an example to calculate Fibonacci numbers:

class Fib extends FJTask {
  static final int threshold = 13;
  volatile int number; // arg/result

  Fib(int n) { number = n; }
  int getAnswer() {
    if (!isDone())
      throw new IllegalStateException();
    return number;
  }

  public void run() {
    int n = number;
    if (n <= threshold) // granularity ctl
      number = seqFib(n);
    else {
      Fib f1 = new Fib(n - 1);
      Fib f2 = new Fib(n - 2);
      coInvoke(f1, f2);
      number = f1.number + f2.number;
    }
  }
}

(you can use threads or a different scheduler for this to work)
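For comparison, here is a minimal sketch of the same idea against the fork/join classes that later shipped in java.util.concurrent (ForkJoinPool and RecursiveTask, part of the JDK since Java 7). The class name `FibTask` and the helper `fib` are my own choices for illustration; the threshold of 13 is taken from the example above.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class FibTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 13; // at or below this, solve sequentially
    private final int n;

    FibTask(int n) { this.n = n; }

    @Override
    protected Long compute() {
        if (n <= THRESHOLD) {
            return seqFib(n);             // problem is small: solve directly
        }
        FibTask f1 = new FibTask(n - 1);  // split into independent parts
        FibTask f2 = new FibTask(n - 2);
        f1.fork();                        // f1 may be picked up by another worker
        return f2.compute() + f1.join();  // compute f2 here, then join f1
    }

    // Plain iterative Fibonacci for the small cases.
    static long seqFib(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    // Convenience entry point that runs the task on the common pool.
    static long fib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }
}
```

Calling `FibTask.fib(25)` splits the work recursively until the subproblems fall under the threshold; the pool's worker threads steal forked subtasks from each other.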

A similar approach to concurrent work is MapReduce as implemented by Hadoop. It also splits work into sub tasks, does a mapping step for transforming data and then reduces the result. It works best for data crunching and reducing input data to output data.
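To make the map and reduce steps concrete, here is a small in-process sketch using Java streams. It is not Hadoop code and does not distribute anything across machines; the `WordCount` class is a hypothetical example that only shows the shape of the two steps.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class WordCount {
    static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                // map step: transform each line into its lower-cased words
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                // reduce step: group identical words and count each group
                .collect(Collectors.groupingBy(word -> word, Collectors.counting()));
    }
}
```

The parallel stream splits the input into chunks, maps them independently and merges the partial counts, which is the same split/transform/reduce pattern on a much smaller scale.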

Other techniques developers often use instead of concurrency primitives are communication abstractions, like communicating over concurrent access queues, for example java.util.concurrent.ArrayBlockingQueue (ah queues again! or call them inboxes).

Threads talk to each other by adding (hopefully immutable) messages to a queue. Sounds familiar? Those can even be distributed (also see Hazelcast for distributed queues). This is very similar to Erlang concurrency, just imagine input queues for all your workers, aka actors.
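As a rough sketch of how close this is to the Erlang model, here is a ping pong exchange with two threads and two ArrayBlockingQueue inboxes. The class name `PingPong`, the string messages and the returned event list are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class PingPong {
    // Runs the given number of ping/pong rounds and returns the events in order.
    static List<String> run(int rounds) {
        BlockingQueue<String> pingInbox = new ArrayBlockingQueue<>(1);
        BlockingQueue<String> pongInbox = new ArrayBlockingQueue<>(1);
        List<String> events = Collections.synchronizedList(new ArrayList<>());

        // The pong "actor": works through its inbox until it sees "finished".
        Thread pong = new Thread(() -> {
            try {
                while (true) {
                    String message = pongInbox.take();
                    if (message.equals("finished")) {
                        events.add("pong finished");
                        return;
                    }
                    events.add("pong received ping");
                    pingInbox.put("pong");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pong.start();

        // The calling thread plays the ping "actor".
        try {
            for (int i = 0; i < rounds; i++) {
                pongInbox.put("ping");
                pingInbox.take();
                events.add("ping received pong");
            }
            pongInbox.put("finished");
            pong.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while playing ping pong", e);
        }
        return events;
    }
}
```

The messages are immutable strings and the two threads share nothing but the queues, so no synchronized block is needed in the actor code itself.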

Scala has an Erlang like message passing actor concurrency implementation. Looking into the Scala code, Scala uses the Fork/Join Framework of Java, so the circle closes. And Scala uses different schedulers for its implementation, with one thread per actor being only one option.

We can abstract concurrency even more. Jonas writes about abstractions on top of actors in an excellent piece about fault tolerant, asynchronous concurrency but misses one crucial point:

Actors can simplify concurrent programming and reasoning immensely and I believe that Scala Actors is a key piece in the future Java concurrency puzzle. However, programming with actors and with explicit message passing and message dispatch loops can feel a bit unnatural and unnecessary verbose for Java developers that are used to regular OO method invocations and synchronous control flow.

As pointed out before, people use queues and are used to working with asynchronous flows. Java enterprise developers in particular, as they use Enterprise Service Buses (ESBs). There are many implementations like ServiceMix, OpenESB, Camel, OpenMQ, ActiveMQ, Mule and others. They range from pure message buses to routing and integration solutions. Because ESBs provide fault-tolerant, asynchronous concurrency, the developers who use them know about asynchronous flows. Companies like LinkedIn use ESBs to distribute tasks in a fault tolerant and parallel way. Nothing new there. Java developers think in asynchronous messaging already.

Stand back a thousand feet and ESBs look like the Erlang model. Workers cling to the bus and wait for work. Each listens for different messages; the waiting messages for each worker form a virtual input queue similar to Erlang. There isn’t as much difference between concurrency in different languages as people want you to believe there is!

As a final word: It’s interesting that the first and second kind of concurrency start to merge. With applications like Twitter or identi.ca, the second type is sometimes becoming the first type of concurrency because different requests need to share data with each other (hopefully you don’t use the database for this as Twitter did in the beginning). One can argue that more and more applications need to share data between sessions. You can use an actor model for this (Lift does this). You could use ESBs. Or you could go a very different way with distributed method calls and objects - Terracotta to the rescue.

Thanks for listening. Hope you’ve learned something. I did by writing this post. As ever, please do share your thoughts and additional tips in the comments below, or on your own blog (I have trackbacks enabled).

Update: Actors Guild for Java looks also interesting

Update 2: If you find this post offensive, go back and read it again without your favorite programming language in mind which you need to defend. There is nothing to defend. You’ve chosen the programming language to the best of your knowledge, as have others (and if they haven’t but just chose the language because of some hype or because others told them to, well, not a good idea). Spread your knowledge, don’t feel offended. Be happy if others find solutions to their problems. This is not some kind of football game which you win if you gain enough points against a different language. If the other one is better, use it. If not then don’t. Sounds awfully common sense, but we sometimes seem to forget this. Merry xmas.

1046 RSS subscribers! Major Milestone!

Thanks to all of you for following. I do my best to write interesting - sometimes provoking - blog posts to keep you interested and amazed at how wonderful software development is :-)

The Feedburner stats which describe your rising following:

(as people asked: The jump was on October 1st, from 481 to 674, there was no jump in the Google analytics visits for that time frame, the second jump was on October the 27, I’ve looked into it and the jump came from the time where the website was slashdotted by my 6 reasons why my startup failed story on Reddit)

Feedburner Stats

7 Good Rules to Log Exceptions

I’ve been helping to debug some nasty problems and bugs lately. It occurred to me that some best practices on how to log exceptions go a long way towards easier debugging. Some of the best practices I’ve learned to log exceptions are compiled in this post.

1. Only log technical exceptions not user exceptions
User exceptions are either fine and need not be logged, only shown to the user (“login name already exists”), or are no exceptions at all (“user has no credit left”). Technical exceptions are those you need to debug and react to (“no file storage left”, “could not book product”). If you log everything, you will probably get too many log entries to react meaningfully to the exceptions in your log. You should look into every exception in your log files and find its cause (“is it a bug?”). Too many exceptions will make you sloppy about the exceptions in your log files (“nah, just another exception”).

2. Store data in your exceptions to make them easier to log
Taking the exception “could not charge money from account”, you should store the context of the exception, just like JUnit does (“expected but got …”), to make debugging easier

CannotChargeMoneyAccountException(Money moneyInAccount, Money toCharge, Account account)

The message could be: “Tried to charge 20 EUR from account 1234567890 but 10 EUR available” compared to “Charge failed”. This makes it much easier later to log the exception in a meaningful way. Be careful to create no memory leaks though.
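A minimal sketch of such an exception could look like this. The field types are my assumptions: the post's Money and Account types are not shown, so the sketch uses BigDecimal amounts in EUR and a plain account number. Storing only these small values instead of a reference to a whole Account object is also one way to keep the exception from holding on to large object graphs.

```java
import java.math.BigDecimal;

class CannotChargeMoneyAccountException extends RuntimeException {
    private final BigDecimal moneyInAccount;
    private final BigDecimal toCharge;
    private final String accountNumber;

    CannotChargeMoneyAccountException(BigDecimal moneyInAccount,
                                      BigDecimal toCharge,
                                      String accountNumber) {
        // Build the descriptive message once, from the stored context.
        super("Tried to charge " + toCharge + " EUR from account " + accountNumber
                + " but " + moneyInAccount + " EUR available");
        this.moneyInAccount = moneyInAccount;
        this.toCharge = toCharge;
        this.accountNumber = accountNumber;
    }

    // Accessors so a logging wrapper can output the context in a structured way.
    BigDecimal getMoneyInAccount() { return moneyInAccount; }
    BigDecimal getToCharge() { return toCharge; }
    String getAccountNumber() { return accountNumber; }
}
```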

3. Log the description of your exception
Very bad example from Sun: For a long time, the ClassCastException didn’t tell you which class you wanted to cast an object to.

Now it even detects

final String s = "Hello";
final Integer x = (Integer) s;

and tells you

T.java:4: inconvertible types
found   : java.lang.String
required: java.lang.Integer
          final Integer x = (Integer) s;

During runtime the exception thrown by Java is now:

Exception in thread "main" java.lang.ClassCastException:
java.lang.String cannot be cast to java.lang.Integer

Much better than before.

4. Output all causes to your exception
If your exception has an exception wrapped as a cause, log all causes. Some logging frameworks do this for you, some don’t. Be sure to have all causes of your exception in the log file. Be sure the beginning of every relevant stack trace is in your log, not a scrambled one.
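If your framework does not print the causes, walking the chain yourself is short. The sketch below uses only Throwable.getCause(); the class name `CauseChain` and the output format are made up for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class CauseChain {
    // Returns one line that names the exception and every wrapped cause.
    static String describe(Throwable throwable) {
        StringBuilder description = new StringBuilder();
        // Guard against cyclic cause chains by remembering what we have seen.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable current = throwable;
             current != null && seen.add(current);
             current = current.getCause()) {
            if (description.length() > 0) {
                description.append(" <- caused by: ");
            }
            description.append(current.getClass().getSimpleName())
                       .append(": ")
                       .append(current.getMessage());
        }
        return description.toString();
    }
}
```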

5. Log to the right level
If your Exception is critical, log it as Level.CRITICAL. You need to decide what critical means for you (most often it means losing money). For example if a booking didn’t work, or a user could not register due to technical problems then you have a CRITICAL problem you need to solve.

Monitor your log files for critical exceptions. You’re losing money.

Have your own exceptions implement isCritical() or a CriticalException interface, and check for it in your logging wrapper so the exception is logged at the right level. If your logging framework hasn’t got an appropriate level, create one.
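Here is a sketch of the marker interface variant with java.util.logging, which has no CRITICAL level of its own. The custom level, its numeric value (100 above SEVERE), the `BookingFailedException` example and the `ExceptionLogger` wrapper are all assumptions of this sketch; other frameworks have their own ways to define levels.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Marker interface for exceptions that mean "we are losing money".
interface CriticalException { }

// Hypothetical example of a critical technical exception.
class BookingFailedException extends RuntimeException implements CriticalException {
    BookingFailedException(String message) { super(message); }
}

class ExceptionLogger {
    // Level's constructor is protected, so a custom level needs a subclass.
    static final Level CRITICAL =
            new Level("CRITICAL", Level.SEVERE.intValue() + 100) { };

    private static final Logger LOG = Logger.getLogger(ExceptionLogger.class.getName());

    // Decide the level in one place instead of at every call site.
    static Level levelFor(Throwable throwable) {
        return (throwable instanceof CriticalException) ? CRITICAL : Level.WARNING;
    }

    static void log(String message, Throwable throwable) {
        LOG.log(levelFor(throwable), message, throwable);
    }
}
```

Monitoring can then filter on the CRITICAL level alone instead of parsing messages.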

6. Don’t log and rethrow
Logging and rethrowing an exception is an exception anti-pattern.

catch (NoUserException e) {
  LOG.error("No user available", e);
  throw new UserServiceException("No user available", e);
}

Don’t do it. Your log files will then contain the same exceptions several times on several stack levels. Only log the exception once.
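One way to follow this rule, sketched with hypothetical classes: the lower layer only wraps and rethrows, and a single boundary (here a made-up `RequestHandler`) logs the exception once, with its cause attached. The logger is java.util.logging; any framework works the same way.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class NoUserException extends Exception {
    NoUserException(String message) { super(message); }
}

class UserServiceException extends RuntimeException {
    UserServiceException(String message, Throwable cause) { super(message, cause); }
}

class UserService {
    // Stand-in for a lookup that fails; a real one would query a store.
    static String loadUser(String id) throws NoUserException {
        throw new NoUserException("No user with id " + id);
    }

    // Wrap and rethrow, but do not log here.
    static String findUser(String id) {
        try {
            return loadUser(id);
        } catch (NoUserException e) {
            throw new UserServiceException("No user available", e);
        }
    }
}

class RequestHandler {
    private static final Logger LOG = Logger.getLogger(RequestHandler.class.getName());

    // The boundary is the only place that logs, so each failure appears once.
    static String handle(String id) {
        try {
            return UserService.findUser(id);
        } catch (UserServiceException e) {
            LOG.log(Level.SEVERE, "Request failed for user " + id, e);
            return "error";
        }
    }
}
```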

7. Do not log with System.out or System.err
Always use a log framework, it can handle logging better than you.

I hope these rules help you with your exception logging and make it easier to debug your problems.

Thanks for listening. As ever, please do share your thoughts and additional tips in the comments below, or on your own blog (I have trackbacks enabled).

Update: In reply to some Hackernews comments: Logging everything is fine but leads to ~50gig/day, 300gig/week, 1.2tb/month log files for a moderate site. Your grep won’t work very well on that. Splunk will obviously help, but it is only free for 0.5gig/day, 1/100 of the log files you will get.