[jruby] "invalid byte sequence in UTF-8" from shell command with binary output [JRuby-]

Lenny Marks lenny at aps.org
Wed Jul 29 11:54:21 JST 2015

We recently updated some deployments from Java 6 to Java 7 and suddenly started seeing "invalid byte sequence in UTF-8" errors from some code we have that shells out to pdftk. I did some digging and found that the behavior changed between Java 6 and 7 because ProcessManager is only used on Java > 1.6.


The process_manager.rb code runs a gsub on the process output, which can raise when the process outputs binary data (e.g. a PDF). This applies to JRuby 1.7.20 and master.
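To illustrate the failure mode, here is a minimal sketch of how a regexp-based gsub can raise on bytes that are tagged as UTF-8 but are not valid UTF-8. The `normalize` helper and its `/\r\n/` pattern are assumptions made for illustration; the actual gsub in process_manager.rb may look different.

```ruby
# Hypothetical stand-in for process output: one byte that is not valid
# UTF-8 (0x92), followed by CRLF, tagged as UTF-8 like text output.
output = [0x92, 0x0D, 0x0A].pack("C*").force_encoding(Encoding::UTF_8)

# Assumed newline normalization; not the real process_manager.rb code.
def normalize(str)
  str.gsub(/\r\n/, "\n")
end

# Regexp matching against a string with a broken encoding raises.
error = begin
  normalize(output)
  nil
rescue ArgumentError => e
  e
end
puts error.message if error

# Retagging a copy as binary leaves the bytes untouched, and the same
# gsub then runs without raising.
binary = output.dup.force_encoding(Encoding::BINARY)
cleaned = normalize(binary)
```

The point of the sketch is only that the bytes themselves are fine; it is the UTF-8 tag combined with a regexp operation that triggers the error.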


This seems incorrect to me and differs from MRI behavior (see below). I wouldn’t expect JRuby to munge process output. File a bug?


1.9.3-p547 :001 > File.open('foo', 'w') { |f| f.write("\x92") }
 => 1 
1.9.3-p547 :002 > `cat foo`
 => "\x92" 

jruby- :010 > File.open('foo', 'w') { |f| f.write("\x92") }
 => 1 
jruby- :011 > `cat foo`
ArgumentError: invalid byte sequence in UTF-8
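
As a possible stopgap for callers, the sketch below reads a command's output through `IO.popen` in binary mode instead of backticks. The `read_command_bytes` helper is hypothetical, and whether this avoids the ProcessManager code path on JRuby is an assumption, not something verified here.

```ruby
require "tempfile"

# Hypothetical helper: run a command (argv form, no shell) and return
# its stdout as a binary (ASCII-8BIT) string.
def read_command_bytes(*argv)
  IO.popen(argv, "rb") { |io| io.read }
end

# Recreate the 'foo' file from the transcript as a temp file holding
# the single byte 0x92.
file = Tempfile.new("foo")
file.binmode
file.write([0x92].pack("C"))
file.close

out = read_command_bytes("cat", file.path)
file.unlink
```

Because the result is tagged as binary, later string operations on it do not attempt UTF-8 validation.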

