Programming is hard by Stephan Schmidt

Problems with Jersey, REST, JSON and UTF-8 [Update]

UTF-8 is always a problem. Unbelievable. It’s 2008 and we still haven’t fixed this. One of my current projects is a JavaScript frontend with a REST backend. The backend stores its data in MySQL (a famous UTF-8 troublemaker) and answers REST calls with JSON. The problems start with non-ASCII characters. Somewhere in the call chain, as always, characters don’t get written correctly. MySQL and the JDBC driver should work, the JSP page is UTF-8 (@page directive and meta http-equiv), jQuery, which does the AJAX, and JS both know UTF-8, and Jersey should be UTF-8 too. But after some experiments I’m now quite sure that Jersey (the JSR 311 REST framework) is to blame. I’m not sure how to specify UTF-8; this

  @ProduceMime("text/plain;charset=UTF-8")

doesn’t help. Funny, every major project with several frameworks along the call chain and several languages (JS, C, Java) runs into UTF-8 problems somehow. I’m so fed up with this; it’s 2008.
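To make the symptom concrete, here is a minimal sketch in plain Java of what typically happens when one link in the call chain decodes UTF-8 bytes with a different charset. The class and method names are made up for illustration, and ISO-8859-1 is only an assumed example of a wrong charset; nothing here is taken from Jersey.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes text as UTF-8, then decodes the bytes with the wrong charset.
    public static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage; this only works while no bytes have been lost.
    public static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe"; // "Grüße", 5 characters
        String garbled = misdecode(original);
        // Each of the two non-ASCII characters became two characters.
        System.out.println(garbled.length());
        System.out.println(repair(garbled).equals(original));
    }
}
```

If the garbled text shows two odd characters for every umlaut, a mismatch of this kind somewhere between database, backend and browser is a likely suspect.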

Update: Jersey uses InputStreams for all encodings; StringProvider in particular is relevant to me (see above). Does this work with Unicode?
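Whether a stream-based provider handles Unicode depends on which charset is used when characters are turned into bytes. The following is a sketch of that step with the charset passed in explicitly. It is a hypothetical helper written against the JDK only, not Jersey’s StringProvider.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetWriter {
    // Hypothetical helper: writes text to a byte stream in the given charset
    // instead of relying on the platform default.
    public static void write(String text, OutputStream out, Charset charset) throws IOException {
        Writer writer = new OutputStreamWriter(out, charset);
        writer.write(text);
        writer.flush(); // flush, but leave closing the stream to the caller
    }

    // Convenience wrapper that returns the encoded bytes.
    public static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            write(text, buffer, charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e4"; // "ä"
        // The same character takes a different number of bytes per charset.
        System.out.println(encode(text, StandardCharsets.UTF_8).length);
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length);
    }
}
```

If the bytes are written with one charset while the Content-Type header announces another, the client will decode them wrongly no matter how carefully the rest of the chain is configured.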

Filed under: JSR 311, Javascript, Jersey, REST, UTF-8, Unicode

About the author: Stephan has worked as a head of development and CTO. He has 20 years of experience with different technologies, including Java, Rails and Python. Stephan’s main field of interest is maintainability and productivity in software development. All views are only his own.

Comments

Hi Stephan,

Yes, I think this is a problem with Jersey; thanks for reporting it.

The EG recently found this issue as well and we have updated the JSR-311 specification, version 0.7 [1], to state:

When writing responses, implementations SHOULD respect application-supplied character set
metadata and SHOULD use UTF-8 if a character set is not specified by the application or if the
application specifies a character set that is unsupported.

and this will be implemented in the 0.7 release of Jersey (scheduled for April 18th).

I should be able to provide you with a specific solution for StringProvider fairly quickly if you are happy working with the latest builds or the trunk.

Paul.

[1] https://jsr311.dev.java.net/
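
The fallback rule in the specification text quoted above can be sketched in plain Java. This is only an illustration of the rule: the class name and the simplified Content-Type parsing are assumptions, and it is not how Jersey implements it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ResponseCharset {
    private static final String PREFIX = "charset=";

    // Returns the charset named in a Content-Type value, or UTF-8 when the
    // charset is missing, malformed or not supported by this JVM.
    public static Charset resolve(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        // Simplified parsing: split on ';' and look for a charset parameter.
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith(PREFIX)) {
                String name = param.substring(PREFIX.length()).trim().replace("\"", "");
                try {
                    if (Charset.isSupported(name)) {
                        return Charset.forName(name);
                    }
                } catch (IllegalArgumentException e) {
                    // Illegal or empty charset name: fall back below.
                }
                return StandardCharsets.UTF_8;
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/plain;charset=ISO-8859-1"));
        System.out.println(resolve("text/plain"));
    }
}
```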
