Using Non-ASCII chars in Arduino and other micro-processors
by Matthew Ford 12th October 2013
(original 1st September 2013)
© Forward Computing and Control Pty. Ltd. NSW Australia
All rights reserved.
If you are using the Arduino IDE, then everything almost always just works. Open the IDE and paste in the characters you want displayed by your print( ) statement and then compile and upload to the processor. If you are connecting to the pfodApp, that's it finished.
If you are writing a web server you need to start every page you serve with
<html><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8">
If you are trying to test using the Arduino SerialMonitor, forget it. As of V1.5.3, the Arduino SerialMonitor only accepts ASCII chars. Hopefully this will be fixed in some future release.
If you don't get the output you expected OR you are not using the Arduino IDE as your editor OR you are programming some other micro-processor, then read on for solutions.
(Note: Notepad on Windows can edit non-ASCII text by saving and reading files as UTF-8; just choose UTF-8 as the encoding when saving.)
Outputting messages or adding code comments in your own (non-English) language is very convenient. Also, if you are writing a micro-processor driven web server that will serve web pages containing non-ASCII characters, or if you are writing a pfodDevice where users want to see the menus in their own (non-English) language, then you need to code these non-ASCII characters in your code and have them compiled and uploaded to your micro-processor.
Unicode has become the standard means of handling the multitude of characters: 110,000 characters covering 100 scripts. There are many ways of encoding Unicode characters. UTF-8 has the advantage that it is completely compatible with plain ASCII: every ASCII character is encoded as the same single byte in UTF-8. That means if you are only sending ASCII characters you cannot tell the difference between ASCII encoding and UTF-8 encoding. Because of this ASCII compatibility, UTF-8 has become the de facto encoding for storing Unicode in files and transmitting it.
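To make the ASCII compatibility concrete, here is a minimal sketch of a UTF-8 encoder in plain C++. The function name `encodeUtf8` is made up for this illustration and is not part of the Arduino libraries; it assumes the caller passes a valid Unicode code point.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 bytes (hypothetical helper).
// Code points below 0x80 (plain ASCII) come out as the same single byte.
std::string encodeUtf8(char32_t cp) {
    // Precondition: a valid code point that is not a surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {                        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this sketch, `encodeUtf8('A')` is the single byte `A`, while `encodeUtf8(0xE8)` (the letter è) is the two bytes `\303\250`.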
The Arduino IDE explicitly reads and writes its sketches in UTF-8 encoding. The avr-gcc compiler used by Arduino also treats source files as UTF-8 encoded by default. But, as noted above, the Arduino SerialMonitor does not handle UTF-8.
Sending UTF-8 encoded characters is only half the issue. The receiving display device, pfodApp or web browser needs to correctly display the characters. To do this the receiving device needs to a) process the bytes as UTF-8 encoded characters and b) have the required font installed to display the resulting characters.
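As an illustration of point a), the following is a rough sketch of what a receiver does when it processes bytes as UTF-8. It is not the pfodApp or browser implementation; the names `decodeUtf8` and `checkDecodeExamples` are invented here, and the sketch skips some checks a production decoder would make (for example it does not reject overlong encodings).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 bytes into Unicode code points (hypothetical helper).
// Invalid lead bytes and broken sequences are replaced by U+FFFD.
std::vector<char32_t> decodeUtf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        int extra = 0;                       // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }
        ++i;
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= bytes.size()) { ok = false; break; }
            unsigned char c = static_cast<unsigned char>(bytes[i]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : 0xFFFD);
    }
    return out;
}

// Usage example: six bytes on the wire, five characters on the screen.
void checkDecodeExamples() {
    std::vector<char32_t> cps = decodeUtf8("p\303\250sca");
    assert(cps.size() == 5);
    assert(cps[1] == 0xE8);                  // U+00E8, the letter è
}
```

Note that the number of bytes and the number of characters differ as soon as non-ASCII text is involved, which matters when sizing buffers on a small micro-processor.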
pfodApp works as expected, processing the received bytes using UTF-8 encoding.
As mentioned above, for the web browser to correctly interpret the characters you need to tell it that the page is encoded in UTF-8. You do that by starting every web page you serve with
<html><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8">
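A minimal sketch of serving such a page is shown here in plain C++. The function name `buildUtf8Page` is invented for this example, and sending the charset in the HTTP `Content-Type` header as well as in the meta tag is an addition of this sketch, not something the text above requires. On a small micro-processor you would more likely print the pieces one at a time to the client rather than build one large string.

```cpp
#include <cassert>
#include <string>

// Build a complete HTTP response whose page starts with the UTF-8 meta tag
// (hypothetical helper). bodyHtml is assumed to already be UTF-8 encoded.
std::string buildUtf8Page(const std::string& bodyHtml) {
    std::string page =
        "<html><head><meta http-equiv=\"Content-Type\" "
        "content=\"text/html;charset=utf-8\"></head><body>";
    page += bodyHtml;
    page += "</body></html>";

    // Content-Length counts bytes, not characters, so use the byte size.
    std::string response =
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "Content-Length: " + std::to_string(page.size()) + "\r\n"
        "Connection: close\r\n\r\n";
    response += page;
    return response;
}

// Usage example: the accented text is passed as its UTF-8 bytes in octal.
void checkUtf8Page() {
    std::string r = buildUtf8Page("p\303\250sca");
    assert(r.find("charset=utf-8") != std::string::npos);
    assert(r.find("p\303\250sca</body></html>") != std::string::npos);
}
```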
But getting the receiving encoding correct only satisfies point a) above. Once the receiving software has read and decoded the character, it needs to display it on the screen. Most computers and mobile devices do NOT have all 110,000 character shapes available to display every Unicode character.
For example, here is a sketch which displays 4 buttons (which do nothing) on a pfodApp. The buttons are “Hello World” translated into Chinese, Russian and Hindi.
Here is the display of the buttons by pfodApp running on my Nexus Android phone.
My Nexus mobile does not have the necessary Hindi font loaded, so it just displays missing characters, although you would expect an Android mobile in India to have the font installed by default.
If you are not using the Arduino IDE or some other editor that supports UTF-8 OR your compiler or assembler only supports ASCII then you can still code and send non-ASCII characters in UTF-8 format by first converting the characters to the equivalent UTF-8 bytes and then coding those bytes directly.
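The conversion itself is simple enough to sketch. The following plain C++ function, with the invented name `toOctalEscapes`, replaces every non-ASCII byte of a UTF-8 string with a three-digit octal escape. It is only an illustration of the idea and makes no claim to match how any particular converter tool is implemented; it also does not escape quotes or backslashes in the ASCII part of the text.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Turn UTF-8 bytes into text that can be pasted into a C string literal
// (hypothetical helper). ASCII bytes are copied as-is; every other byte
// becomes a backslash followed by exactly three octal digits.
std::string toOctalEscapes(const std::string& utf8) {
    std::string out;
    for (unsigned char byte : utf8) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);
        } else {
            char buf[5];                     // backslash + 3 digits + NUL
            std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(byte));
            out += buf;
        }
    }
    return out;
}

// Usage example: the two bytes of è become \303\250.
void checkOctalEscapes() {
    assert(toOctalEscapes("p\303\250sca") == "p\\303\\250sca");
}
```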
I have provided a UTF8converter, which can be downloaded from here, that does the conversion for you. This program allows you to paste in the characters you want to display to your user and then convert them to the correct UTF-8 sequence of bytes (as octal) for inserting into the code of your pfodDevice.
To run the application, download the jar file, UTF8converter1_0_1.jar. Save it in a directory which you can write to.
You should be able to double click on the jar file to run it. If it does not run, you do not have Java installed. To install Java, go to www.java.com and download and install the Java runtime.
Put the downloaded UTF8converter1_0_1.jar file in a directory. Then from a terminal window, change directory to where the UTF8converter1_0_1.jar file is and run the command:-
java -jar UTF8converter1_0_1.jar
If the UTF8converter window does not appear, go to www.java.com and download and install Java.
As well, on Mac OS you can assign "Jar Launcher" as the default app to use when you double-click a jar file, as follows (I don't believe you need the developer tools installed for this):
i) Click once on the .jar file in the Finder and then from the menubar in the Finder select "File -> Get Info".
ii) Click on "Open with" and from the popup menu select "Other". A file browser window will open.
iii) In this window, go to the /System/Library/CoreServices folder and select "Jar Launcher".
iv) Then make sure the "Always Open With" checkbox is checked and then click Add.
v) Then click the "Change all" button so that any jar file will be opened automatically.
vi) Finally, close the Info window. Now when you double-click any of your jar files they should run automatically.
(see http://macosx.com/tech-support/how-to-execute-a-jar-file-in-os-x/9549.html )
Run the UTF8converter, as described above, and then type or paste the text you want to convert.
For example, using Google Translate, “Geniuses eat a peach and engage in fishing.” in Italian, with the accents, becomes
Genî mangiano una pèsca e pratichino la pésca.
Pasting this into the UTF8converter and converting gives
These are the UTF-8 bytes representing the text. Most of them are just standard ASCII, except for the accented characters, which have been replaced with their UTF-8 equivalents (in octal). Hex \x.. is not used because a hex escape in C has no fixed length: the compiler keeps consuming hex digits, so if the next character after the two hex digits is '0' to '9' or 'a' to 'f' it is absorbed into the escape. An octal escape stops after three digits, which avoids this problem. The GCC compiler used by Arduino also does not accept all Unicode escape sequences, such as \u0020.
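As a sketch of how the converted text looks in code, the example below writes the Italian sentence with octal escapes. The octal values were worked out by hand from the UTF-8 encoding of î, è and é (0xC3 0xAE, 0xC3 0xA8 and 0xC3 0xA9), and the constant names are invented for this illustration. It also shows the hex pitfall with the word "bébé", where the accented letter is followed by a 'b'.

```cpp
#include <cassert>
#include <cstring>

// "Genî mangiano una pèsca e pratichino la pésca." with octal escapes.
// Each octal escape is exactly three digits, so the text after it is safe.
const char kSentence[] =
    "Gen\303\256 mangiano una p\303\250sca e pratichino la p\303\251sca.";

// "bébé" in octal: the 'b' after \251 is NOT absorbed into the escape.
const char kBebeOctal[] = "b\303\251b\303\251";

// The same word in hex only works if the literal is split after each escape.
// Writing "b\xc3\xa9b\xc3\xa9" in one piece would make the compiler read
// \xa9b as a single, out-of-range hex escape.
const char kBebeHexSplit[] = "b\xc3\xa9" "b\xc3\xa9";

// Usage example: both spellings produce the same six bytes.
void checkLiterals() {
    assert(std::strcmp(kBebeOctal, kBebeHexSplit) == 0);
    assert(sizeof kBebeOctal == 7);          // 6 bytes plus the terminating NUL
}
```

On an Arduino, a literal such as `kSentence` could then be passed to a print( ) statement like any other string.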
Right clicking the UTF-8 field and choosing “copy” from the popup menu copies the bytes to your clipboard to paste into your code.
This method will work for any language and will display that language on any pfodApp, provided the mobile has the font needed to display the characters.
Android™ is a trademark of Google Inc. For use of the Arduino name see http://arduino.cc/en/Main/FAQ
pfodDevice™ and pfodApp™ are trade marks of Forward Computing and Control Pty. Ltd.
©Copyright 1996-2020 Forward Computing and Control Pty. Ltd.
ACN 003 669 994