diffutils: Internationalization
18.1.1 Handling Multibyte and Varying-Width Characters
------------------------------------------------------
'diff', 'diff3' and 'sdiff' treat each line of input as a string of
unibyte characters, so they can mishandle multibyte characters. For
example, when asked to ignore white space (e.g., with '-b' or '-w'),
'diff' does not properly ignore a multibyte space character such as
U+3000 IDEOGRAPHIC SPACE, which UTF-8 encodes as three bytes.
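
The following Python sketch illustrates the kind of mismatch described
above. It is not how 'diff' is implemented; the helper names
'strip_space_bytes' and 'strip_space_chars' are hypothetical, and the
input is assumed to be UTF-8. Removing white space byte by byte leaves
the three bytes of U+3000 in place, whereas removing it character by
character after decoding does not.

```python
# Bytes that a unibyte-oriented comparison would treat as white space.
ASCII_SPACE_BYTES = b" \t\n\v\f\r"


def strip_space_bytes(line: bytes) -> bytes:
    """Drop white space one byte at a time (unibyte view of the line)."""
    return bytes(b for b in line if b not in ASCII_SPACE_BYTES)


def strip_space_chars(line: bytes, encoding: str = "utf-8") -> str:
    """Decode first, then drop every character that is white space."""
    return "".join(ch for ch in line.decode(encoding) if not ch.isspace())


# U+3000 IDEOGRAPHIC SPACE is encoded as the bytes E3 80 80 in UTF-8.
spaced = "foo\u3000bar".encode("utf-8")
plain = b"foobar"

# Byte-wise stripping still sees a difference between the two lines.
bytewise_equal = strip_space_bytes(spaced) == strip_space_bytes(plain)

# Character-wise stripping treats the two lines as equal.
charwise_equal = strip_space_chars(spaced) == strip_space_chars(plain)
```

Decoding every line has a cost, which is why any real fix would need to
keep a fast path for unibyte input.
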
Also, 'diff' currently assumes that each byte is one column wide. This
assumption is incorrect in some locales, e.g., locales that use UTF-8
encoding, where one character can occupy several bytes and a number of
columns other than one. As a result, the '-y' or '--side-by-side'
option of 'diff' can produce misaligned columns.
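
To make the difference between bytes and columns concrete, here is a
minimal Python sketch that estimates display width from Unicode
character properties. The function name 'display_width' is hypothetical
and the rule it applies (two columns for East Asian wide or fullwidth
characters, zero for combining marks, one otherwise) is only an
approximation of what a terminal does; it is not taken from 'diff'.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns needed to show text."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn over the preceding character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters usually take two columns.
            width += 2
        else:
            width += 1
    return width


sample = "日本語"
byte_count = len(sample.encode("utf-8"))  # bytes a unibyte tool counts
column_count = display_width(sample)      # columns a terminal would use
```

For this sample the byte count is 9 while the estimated width is 6
columns, so padding computed from the byte count would be off by three
columns.
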
These problems need to be fixed without unduly affecting the
performance of the utilities in unibyte environments.
The IBM GNU/Linux Technology Center Internationalization Team has
proposed patches to support internationalized 'diff'
(http://oss.software.ibm.com/developer/opensource/linux/patches/i18n/diffutils-2.7.2-i18n-0.1.patch.gz).
Unfortunately, these patches are incomplete and apply to an older
version of 'diff', so more work needs to be done in this area.