Have you ever faced encoding issues in Ruby code that eventually ran inside a Docker container? It is painful, but easy to fix once you know what to do.
The task is pretty simple: some Ruby code parses JSON from the standard output (stdout) of an executable it shells out to (through
This code worked totally fine until we containerised the process for Docker. For some mysterious reason, JSON.parse complained about a wrongly formatted string.
Pair-debugging helped to track down the issue, and as @garethadams rightly pointed out, Ruby is supposed to use UTF-8 by default… yet the error claimed the opposite.
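To make the symptom concrete, here is a minimal sketch of what probably happens to the bytes. It assumes the executable writes UTF-8 and that Ruby, lacking a UTF-8 locale, tags what it reads as US-ASCII. The payload and variable names are made up for illustration.

```ruby
require "json"

# Hypothetical payload: the raw bytes an executable might write to stdout.
# "\xC3\xA9" is the UTF-8 encoding of "é".
raw_bytes = "{\"name\":\"caf\xC3\xA9\"}".b

# Assumption: without a UTF-8 locale, Ruby tags external input as US-ASCII.
as_ascii = raw_bytes.dup.force_encoding(Encoding::US_ASCII)
# Same bytes, tagged the way a UTF-8 locale would tag them.
as_utf8 = raw_bytes.dup.force_encoding(Encoding::UTF_8)

# Bytes above 0x7F are not valid US-ASCII, so only the UTF-8 tag is valid.
ascii_valid = as_ascii.valid_encoding?
utf8_valid  = as_utf8.valid_encoding?

# Parsing the correctly tagged string works as expected.
parsed_name = JSON.parse(as_utf8)["name"]

puts "US-ASCII valid? #{ascii_valid}"
puts "UTF-8 valid?    #{utf8_valid}"
puts "Parsed name:    #{parsed_name}"
```

The bytes are identical in both strings; only the encoding label differs, which is why the fix belongs in the environment and not in the data.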
OK Docker ruby, show me your encoding:
$ docker run -ti --rm ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
This is wrong. It might be inherited from the default system locale:
$ docker run -ti --rm ruby:2.1.5 locale
Which leads to the popular article Docker and locales by Jared Markell… but it did not solve the issue.
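Before changing anything, it can help to look at what Ruby itself derives from the locale. The following is a small sketch rather than a definitive diagnostic: encoding_report is a hypothetical helper name, and the environment variables it reads are the usual POSIX locale ones.

```ruby
# Collect the locale-related settings Ruby sees at startup.
# `env` defaults to the real environment; any Hash-like object works.
def encoding_report(env = ENV)
  {
    lang: env["LANG"],
    lc_all: env["LC_ALL"],
    lc_ctype: env["LC_CTYPE"],
    # Character map reported by the C library for the current locale.
    locale_charmap: Encoding.locale_charmap,
    # Encoding Ruby assumes for data read from outside the process.
    default_external: Encoding.default_external.name,
    stdin_external: STDIN.external_encoding.to_s
  }
end

encoding_report.each do |key, value|
  puts format("%-17s %s", key, value.inspect)
end
```

Running it inside and outside the container makes the difference between the two environments visible at a glance.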
The official ruby image is based on debian:jessie. Debian removed its dependency on the locales package in 2011. This explains the unavailability of
The quickest solution is to opt for C.UTF-8, the UTF-8 locale provided by libc-bin, on Debian at least:
$ docker run -ti --rm -e LANG=C.UTF-8 ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
Problem solved! It is a bit cumbersome to specify that manually, though. A more persistent approach is to set that environment variable in any image based on a Ruby one:
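A minimal sketch of such an image, assuming a hypothetical application Dockerfile that builds on the same ruby:2.1.5 tag used in the commands above:

```dockerfile
# Hypothetical Dockerfile for an application image built on the official Ruby image.
FROM ruby:2.1.5

# Use the UTF-8 locale that ships with libc-bin, so every container
# started from this image gets it without passing -e LANG=... by hand.
ENV LANG=C.UTF-8
```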
Et voilà! It is a lightweight and recommended solution as it does not require installing an
Knowing about the chain of base images is very helpful, as it tells you how to configure a container. Because the configuration can vary from one base image to another, you might want to ensure that all or most of your Docker images inherit from the same consistent base.
Note: the Node.js base image is not affected by the system locale, as Node deals with either Buffers or UTF-8 encoded strings anyway.