  1. #1

    Suggestion: a "default non-Unicode locale" feature for Linux/Ubuntu/etc.

    Dear Ubuntu Programmer,

    I would like to propose a new and very useful feature for Linux, called "default non-Unicode locale" (like the setting of the same name in Windows).

    The problem is that when Ubuntu accesses DOS/Windows file systems, or media files such as mp3/mp4, some of them contain a mix of Unicode and non-Unicode filenames, folder names and text strings. Multimedia applications like Rhythmbox, VLC and KMPlayer often fail to display artist and album names correctly for such files. This is because some of the text is in UTF-8 and some is in GBK (I'm from China). There are many other cases where text cannot be displayed properly and garbage characters are shown instead.

    For example,
    Screenshot from 2013-10-28 15:32:14.jpg
    (see attachment for higher resolution)

    In general, regardless of the application, the OS should NOT assume that all non-ASCII characters are in UTF-8. The user may have chosen a particular code page for efficient encoding of text in his or her own language (e.g. GBK/GB18030 for Chinese, TIS-620 for Thai, etc.). Therefore, I strongly suggest introducing a "default non-Unicode locale" into Ubuntu's system settings, like the one in Windows, so that if a string cannot be decoded as Unicode, the default non-Unicode locale is used instead.
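    The fallback described above can be sketched roughly as follows. This is only an illustration in Python, not an existing Ubuntu setting: the helper name `decode_with_fallback` and its `fallback` parameter are hypothetical, and GBK is assumed as the user's chosen legacy encoding.

```python
def decode_with_fallback(raw: bytes, fallback: str = "gbk") -> str:
    """Try UTF-8 first; if the bytes are not valid UTF-8, use the fallback encoding.

    `fallback` stands in for the proposed "default non-Unicode locale"
    (a hypothetical setting, assumed here to be GBK).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" avoids a second exception if the bytes are
        # not valid in the fallback encoding either.
        return raw.decode(fallback, errors="replace")


# The same artist name stored two different ways, as in a mixed music library.
utf8_tag = "周杰伦".encode("utf-8")
gbk_tag = "周杰伦".encode("gbk")

print(decode_with_fallback(utf8_tag))
print(decode_with_fallback(gbk_tag))
```

    Note that this is a heuristic: a short GBK byte sequence can occasionally also be valid UTF-8, in which case it would be decoded as UTF-8 and still show up wrong.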

    I believe this would be very useful. By the way, is there any operating system other than Microsoft Windows that has this capability?

    Wang Xuancong
    Attached Files

  2. #2
    Linux Engineer
    Join Date
    Jan 2005
    Saint Paul, MN
    The fact that Windows is not fully international while Linux is "i18n" is no reason to move Linux back 20-25 years. Better to work on moving Windows (or your usage) into the present.

    UTF-8 is fully backward compatible with the ASCII character set (ASCII characters take a single byte each), while international characters take two to four bytes (16 to 32 bits) as needed.
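    To illustrate the point about character widths, here is a small Python sketch (the helper name `utf8_length` and the sample characters are chosen just for this example). It shows that ASCII text has identical bytes in ASCII and UTF-8, and that other characters take two to four bytes.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample each: ASCII letter, accented Latin letter, CJK character, emoji.
samples = ["A", "é", "中", "😀"]
for ch in samples:
    print(repr(ch), utf8_length(ch), ch.encode("utf-8").hex())

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_matches = "hello".encode("ascii") == "hello".encode("utf-8")
print(ascii_matches)

# For comparison, the same CJK character in GBK.
gbk_length = len("中".encode("gbk"))
print(gbk_length)
```

    The last line relates to the efficiency argument in the first post: a common Chinese character takes three bytes in UTF-8 but two in GBK.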

    Linux, OS X, and Unix are all Unicode-based and have been for years (even decades). Code pages are very 1980s.
