Accessing Japanese Character Texts on the Web



[Note: The following information is now out of date and is retained here merely for historical reference. For up-to-date references, see our help page.]


Reading and writing Japanese characters on the Web is still an art, not a science. With the 4.0 versions of Netscape and Internet Explorer, reading kanji and kana characters has become much easier. But inputting characters to Web forms is still a tricky and frustrating business.


How Japanese is Coded for Computer Display

In order to display Japanese kanji and kana characters on your computer, you need programs that can accommodate one of the three basic Japanese encoding methods: JIS, shift-JIS, or EUC (Extended UNIX Code). (For the Japanese Text Initiative, we use EUC encoding.) We strongly recommend that anyone who wants to understand Japanese encoding methods read Ken Lunde's Understanding Japanese Information Processing (Sebastopol, CA: O'Reilly, 1993). This book is now out of print but will be replaced by a book on CJK (Chinese/Japanese/Korean) information processing, entitled Understanding CJK Information Processing, which O'Reilly is to publish this winter. An early version of Lunde's CJK book is at his home page, which also has a great deal of other valuable information on Japanese characters and links to other useful Web pages.
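To make the difference between the three encodings concrete, here is a minimal Python sketch that encodes a single kanji in each of them and prints the resulting bytes. The codec names are Python's own, and treating "JIS" as the 7-bit ISO-2022-JP encoding is an assumption on our part; the function name encode_all is purely illustrative.

```python
# Encode one kanji in the three encodings named above and show the bytes.
# Assumption: "JIS" here means the 7-bit ISO-2022-JP encoding.
CODECS = {"JIS": "iso2022_jp", "shift-JIS": "shift_jis", "EUC": "euc_jp"}


def encode_all(text):
    """Return {label: bytes} for the text in each of the three encodings."""
    return {label: text.encode(codec) for label, codec in CODECS.items()}


encoded = encode_all("山")  # "yama" (mountain)
for label, raw in encoded.items():
    # Print each byte as two hex digits so the encodings can be compared.
    print(label, " ".join(f"{byte:02x}" for byte in raw))
```

The same character comes out as a different byte sequence in each encoding, which is why a program must know (or guess) the encoding before it can display the text.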


Japanese Operating Systems; Apple Macintoshes

The easiest way to read and write (input) Japanese on the Web is to use a Japanese operating system, such as Japanese Windows 95 for IBM-compatibles. Equally easy, if you have a Macintosh, is to use the Japanese Language Kit or KanjiTalk. These software packages are available from vendors such as World Language Resources, Orbit Computer Supplies and Services, or Cheng & Tsui. The remainder of this note is addressed to people who have IBM-compatibles (PCs) with English-language operating systems such as Windows 95.


How to Read Japanese on an IBM-Compatible Computer

If you just want to read Japanese, you should be running Windows 95 or 98. (Windows 3.1 will make it nearly impossible for you to use current versions of Web browsers.) Next install Microsoft Internet Explorer (IE) 4.0 or 5.0. For 4.0 you will need to return to the Microsoft site and download the Japanese Language Support add-on. For 5.0 you can install Japanese as part of your regular installation.

Pages of the Japanese Text Initiative that require display or input of Japanese characters include EUC META tags. You should not need to set the language in your browser to EUC, but sometimes, depending on your browser configuration, you may need to tweak it, as described below.
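As a rough illustration of what an encoding META tag gives a program to work with, the following Python sketch looks for a charset declaration near the top of a page and uses it to decode the page bytes. The sample page, the exact form of the META tag, and the function name declared_charset are all assumptions made for the example; the real JTI pages may declare EUC differently.

```python
import re

# Matches "charset=NAME" as it appears inside a META CONTENT attribute.
META_CHARSET = re.compile(rb"charset\s*=\s*([\w-]+)", re.IGNORECASE)


def declared_charset(html_bytes, default="iso-8859-1"):
    """Return the charset named near the top of the page, or a default."""
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else default


# Hypothetical page: ASCII markup around EUC-JP encoded Japanese text.
page = (
    b'<HTML><HEAD><META HTTP-EQUIV="Content-Type" '
    b'CONTENT="text/html; charset=EUC-JP"></HEAD><BODY>'
    + "方丈記".encode("euc_jp")
    + b"</BODY></HTML>"
)

charset = declared_charset(page)
text = page.decode(charset)
print(charset, "方丈記" in text)
```

When the declaration is missing, the sketch falls back to a Western default, which is one way a page can end up displayed in the wrong encoding.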

Try clicking on a Japanese page, such as Hojoki. The page should display its Japanese characters correctly. If it doesn't, in the IE browser click on View, then Fonts, then Japanese (EUC). (You can also right-click on the browser screen to bring up a menu; click on Language; and then click on Japanese (EUC).) The IE Japanese add-on is currently the most robust way to read Japanese on the Web, but it is not free of problems, and we have observed behavior of the Japanese display that can only be described as erratic. Occasionally, where there are several texts in frames, such as the Japanese versions of Noh plays with a separate frame for furigana, you may need to click on each frame and set EUC for each. Particularly hard for IE to interpret are the frames of Noh plays with both Japanese and English translations. Click on the Japanese frame and then on Fonts/Auto Select or Fonts/EUC. If that doesn't work, right-click on the Japanese frame, then on Language, then on Auto Select or EUC. Sometimes, mysteriously, Language/Auto Select works when Fonts/Auto Select does not. In other words, even with IE and its Japanese add-on, be prepared to try different ways of accessing Japanese texts.

If you have Netscape 4.x (e.g., 4.0), you can also read Japanese, but installation is a bit more complicated than it is for IE. You will need a Japanese font that Windows 95 or 98 recognizes as a system font. Japanese fonts that are installed with word processing software or with Japanese utilities should work in principle, but in practice they seem to be problematic. One font that does work is Microsoft's Gothic font, which is the add-on font for IE. The catch is that you cannot download MS Gothic directly in Netscape. You must have IE installed and then use it to install the Japanese font add-on. Only then will MS Gothic be available for use in Netscape. Another font that you can try is Microsoft's Mincho font for Office 97, currently available in the famous Monash University Nihongo Archive. The Gothic and Mincho fonts, with instructions for installing them, are also available here.

To use a Japanese system font in Netscape, click on Edit/Preferences/Fonts. In the Fonts pull-down menu, click on Japanese. In the list of fonts on your system, click on a Japanese font (for example, MS Gothic) for both Variable Width Font and Fixed Width Font. Then in the Netscape browser menu, click on View/Encoding and Japanese (EUC-JP). You should also be able to click on Japanese (Auto-Detect), but that works only erratically. And getting the Japanese display to work at all is not always straightforward. In the University of Virginia E-Text Center we have two Pentiums side by side with the same configurations and with Netscape 4.0. With the settings described above, one of the machines displays Japanese without any problem, while the other refuses to display any Japanese.
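One plausible reason automatic detection can be unreliable is that the same bytes are often valid in more than one Japanese encoding. The Python sketch below is only an illustration of that ambiguity, not a description of how Netscape or IE actually detect encodings; the candidate list and the function name plausible_readings are assumptions for the example.

```python
CANDIDATES = ("euc_jp", "shift_jis", "iso2022_jp")


def plausible_readings(data):
    """Return {codec: text} for every candidate that decodes without error."""
    readings = {}
    for name in CANDIDATES:
        try:
            readings[name] = data.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
    return readings


euc_bytes = "山".encode("euc_jp")
readings = plausible_readings(euc_bytes)
for name, reading in readings.items():
    print(name, reading)
```

Here the EUC bytes for one kanji also decode cleanly as Shift-JIS, where they read as two half-width katakana. A detector that only checks validity cannot tell which reading was intended and has to fall back on guesswork.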

If you have Netscape 3.0 or IE 3.0, you can try activating fonts in ways similar to the instructions for 4.0. But you are likely to save trouble by just downloading IE 4.0 or 5.0 or Netscape 4.0.

Another way to read Japanese is to use a client like NJWIN. This is similar to the clients discussed in the following section, except that NJWIN can be used only for reading Japanese, not for inputting it.


How to Input Japanese on an IBM-Compatible Computer

If you do not have a Japanese operating system, you will need a Japanese client in order to input Japanese to Web forms like the Interactive Searching form in the Japanese Text Initiative. In the E-Text Center with English-language Windows we use Microsoft's Global IME (Input Method Editor -- see below) or KanjiKit. Some commercial clients for inputting Japanese are the Japanese modules of Unionway's AsianSuite 97 and its clones such as KanjiKit 97 and Dragon Writer; and Twinbridge's Japanese Partner.

As noted above, pages of the Japanese Text Initiative that require display or input of Japanese characters include EUC META tags. You should not need to set the language in your browser to EUC, but sometimes, depending on your browser configuration, you may need to tweak it, as described below.

Microsoft makes available without charge an input utility called Microsoft Global IME. Unlike KanjiKit or Twinbridge or the other commercial utilities, this works with Web forms only in Internet Explorer, or with the Microsoft mailers Outlook Express or Outlook 98, but then only when Outlook 98 is sending messages in HTML.

To use IME for searching our texts, go to our search page. Click on the IME icon in the taskbar at the bottom of your screen. Click on Japanese; you will get a bar with three icons. Click on the first one on the left (an "A" icon), which opens a pull-down menu of various input options. Click on the top option ("zenkaku hiragana"). Then click on the middle icon on the IME menu bar. This opens another pull-down menu. Click on the second choice ("kodo nyuryoku hoshiki (D)") and ascertain that shift-JIS is chosen as the input method. With IME configured, right-click with the mouse on the search page. Click on "Language" in the pull-down menu. If the language is not already set to Japanese (EUC), click on that. With these settings of IME and the Language option of Internet Explorer, you can now enter Japanese into the search forms.

Note: We find that on some computers the characters you enter into the search form display as garbage. Interestingly, this also happens in IE when you use KanjiKit and the Language is set to Western; see below. Entering Japanese in English Windows is still an art, not a science.

In the E-Text Center, in addition to MS Global IME, we also use KanjiKit 97 with Windows 95. Getting KanjiKit (or the other Unionway products or Twinbridge) to work correctly with Internet Explorer 4.0 or Netscape 4.0 requires a great deal of patient experimentation. We report here on what is currently working for us in KanjiKit (the options are the same in Unionway or Dragon Writer); Twinbridge offers similar options.

(1) Begin by activating KanjiKit and clicking on the "Select another input method" button in the KanjiKit menu bar (this is the 2nd button from the left). This calls the IME Configuration menu. First in Input Method click on Word mode and Show code. Next click on Options, and then in "Output code" click on ShiftJIS Japanese. Note that Japanese characters in the Japanese Text Initiative pages will not display correctly if JIS Japanese is activated in "Output code."

(2) Return to the main KanjiKit menu bar, and click on the 4th button from the left to "Turn Automatic Space mode on or off." (For more information on spaces between Japanese characters, see Tips on Interactive Searching.)

(3) Finally, click on the first button in the menu bar to "Change Japanese display code type" to S-JIS (Japanese). Again, the Japanese Text Initiative pages will not display correctly if the display code type is set to EUC or ANSI. This seems peculiar, since our pages are encoded in EUC; but that's the way KanjiKit and other Unionway clones work.
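For Shift-JIS input to be usable against EUC-encoded pages, the bytes have to be converted from one encoding to the other somewhere along the way. We do not know where or how KanjiKit or the browser performs this step; the Python sketch below only shows what such a conversion amounts to, and the function name sjis_to_euc is hypothetical.

```python
def sjis_to_euc(raw):
    """Convert Shift-JIS bytes to the equivalent EUC-JP bytes."""
    # Decode to Unicode text first, then re-encode in the target encoding.
    return raw.decode("shift_jis").encode("euc_jp")


typed = "山".encode("shift_jis")   # what a Shift-JIS input method emits
converted = sjis_to_euc(typed)     # what an EUC page or search expects
print(typed.hex(), converted.hex())
```

If this conversion is skipped, or applied twice, the characters on screen and the bytes sent to the server no longer agree, which is consistent with the garbage and failed searches described in this note.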

You are now ready to try KanjiKit in Internet Explorer or Netscape.

For Internet Explorer 4.0: Because of the META tag in the Japanese Text Initiative Japanese pages, View/Fonts should already be set to EUC after you click on a Japanese page. If not, right click on the browser screen, then on Language, then on Japanese (EUC). When you enter Japanese characters with this setting, the search should work without error, and the keyword results will display correctly. With some configurations, Japanese characters may display as garbage with these settings of KanjiKit and IE. You may also experience problems with wrap-around of long prose lines in some browser configurations. Try adjusting your IE Language settings to Auto Select or EUC or other settings, even Western, in order to display the Japanese correctly.

For Netscape 4.0: Even though the META tags are set to EUC, Netscape is likely to continue setting its View/Encoding to Western (ISO 8859-1). However, the same KanjiKit settings that you used for IE will work with Netscape with Encoding set to Western. Also, like IE, Netscape will sometimes fail to display Japanese in the search form. As above, try selecting different Encoding choices. For example, try switching Encoding from Western to Japanese (Auto-Detect) or Japanese (EUC-JP).
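The "garbage" that appears when the encoding setting does not match the page can be reproduced in a few lines. This Python sketch reads EUC-JP bytes as if they were Western (ISO 8859-1) text and then reverses the mistake; it is an illustration of the general effect, not of Netscape's internals, and the function name misread is made up for the example.

```python
def misread(text, actual="euc_jp", assumed="latin-1"):
    """Encode text in its actual encoding, then decode it as another one."""
    return text.encode(actual).decode(assumed)


garbage = misread("山")  # EUC-JP bytes interpreted as Western characters
repaired = garbage.encode("latin-1").decode("euc_jp")
print(repr(garbage), repr(repaired))
```

Because ISO 8859-1 assigns a character to every byte value, the misreading never fails outright; it just shows accented letters and symbols in place of kanji. The underlying bytes are intact, which is why changing the Encoding setting can bring the Japanese back.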

When you use a utility like KanjiKit or Twinbridge, you may have trouble with lines not wrapping correctly at the right margin. All of the JTI texts wrap correctly with current utilities and browsers. A likely cause of failure is that you are using an out-of-date version of the utility with an up-to-date browser, or vice versa. Sometimes the only solution is to update your software.

Inputting Japanese characters and then displaying Japanese texts on an English-language PC is likely to prove frustrating until your techniques are stabilized. You may need to try out different combinations of settings in Internet Explorer or Netscape, with KanjiKit or another Japanese client. You may experience garbage on your screen or null results in your search. Re-boot, and persevere.


Other Information on Displaying Japanese

Some other sites with helpful information on using Japanese on the Web are Ken Lunde's site mentioned above and Nihongo.org.

Up-to-date information is also available in Fabian van-de-l'Isle's excellent Japanisation FAQ for Computers Running Western Windows.


Testing Your Japanese Client

To test that your Japanese client is handling EUC input correctly, try a search of Hyakunin Isshu. Enter a character such as 山 ("yama" in Romaji), and restrict the search to the title Hyakunin. You should get 22 hits. If instead you get a no-hit message, check your EUC client set-up. Also, see "Tips on Interactive Searching."
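A no-hit result usually means that the bytes your client sent do not match the bytes stored in the EUC texts. The following Python sketch shows how the same search character looks when submitted in EUC-JP and in Shift-JIS, and how only the EUC form matches EUC-encoded text. The two sample lines are a tiny made-up corpus, not the JTI data, and the function names form_value and count_hits are hypothetical.

```python
from urllib.parse import quote


def form_value(term, encoding):
    """Percent-encode a search term as a form would submit it."""
    return quote(term, encoding=encoding)


def count_hits(term_bytes, lines):
    """Count the lines that contain the term's byte sequence."""
    return sum(1 for line in lines if term_bytes in line)


# Hypothetical two-line corpus stored as EUC-JP bytes.
corpus = [line.encode("euc_jp") for line in ("山川に", "秋の田の")]

euc_query = form_value("山", "euc_jp")
sjis_query = form_value("山", "shift_jis")
euc_hits = count_hits("山".encode("euc_jp"), corpus)
sjis_hits = count_hits("山".encode("shift_jis"), corpus)
print(euc_query, euc_hits)
print(sjis_query, sjis_hits)
```

The character is the same on screen in both cases, but only the EUC-encoded query finds it in the EUC text, so a search that returns nothing is a strong hint that the client is sending a different encoding.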

Japanese Text Initiative
Electronic Text Center | University of Virginia Library
PO Box 400148 | Charlottesville VA 22904-4148
434.243.8800 | fax: 434.924.1431
Last Modified: Tuesday, August 31, 2004
© 2004 The Rector and Visitors of the University of Virginia