Persian Input Methods
For Emacs And More Broadly Speaking
Version 0.4
September 19, 2012
Copyright ©2012 Mohsen BANAN
Permission is granted to make and distribute complete (not partial)
verbatim copies of this document provided that the copyright notice
and this permission notice are preserved on all copies.
- 1 Introduction
- 2 About this Document
- 3 Scope and Context
- 4 Persian With Emacs
- 4.1 About Emacs
- 4.2 Obtaining Emacs
- 4.3 Obtaining Persian Blee
- 4.4 Selecting Persian Language
- 4.5 Selecting Persian Input Methods
- 4.6 A Sample Farsi Editing Session
- 4.7 Hints for Persian Characters (Unicode) Usage
- 4.8 Hints for bidi Emacs Usage
- 4.9 Multilingualization (M17n) of Spelling Dictionaries
- 5 Emacs Persian Input Methods
- 6 Relevant Standards/Specifications
- 7 The Broader Scope Of farsi-transliterate-banan
- 8 History and Previous Work
- 9 Colophon
1 Introduction
There are several things we want to accomplish with this document.
1.1 Goal: Widespread Usage of Persian in Emacs
Our first goal in providing this document is to facilitate writing in Persian in the Halaal/Convivial quadrant.
That begins with the promotion of use of Emacs for writing in Persian. Emacs is the ne plus ultra Halaal/Convivial multi-lingual user environment in existence today. It is quite simply the best, far surpassing any other currently available toolset. For complete information about Emacs see
The word “Halaal” is very strong and very loaded. For our usage and meaning of this word see our document titled, Introducing Halaal and Haraam into Globish – Based on Moral Philosophy of Abstract Halaal [3] – معرفیِ حلال و حرام به بقیهیِ دنیا
For a definition of Halaal Software see our document titled, Defining Halaal Software and Defining Halaal Internet Services [2]. This document is also available in Farsi as [8]. – تعريف نرم افزار حلال و تعريف خدمات اينترنتى حلال
Our use of the terms “convivial” and “conviviality” is based on Ivan Illich’s Tools For Conviviality. For our use of these terms see our document titled, Introducing Convivial into Globish [1]. In that document we also define the “Halaal/Convivial Quadrant.”
Emacs has had full Unicode support for many years. Starting with Emacs 24, full native bidi (bidirectional) support is now also available. Multiple Persian input methods are part of the Emacs 24 distribution. These input methods are documented here.
Emacs comes with a rich mail reader, a personal planner, an address book, a calendar, spell checkers for English and Persian, multi-lingual dictionary interfaces and many other tools and packages; all integrated together. Because Emacs supports Persian, all these tools and packages also support Persian.
Most Iranians today use Microsoft Windows products such as MS Word and MS PowerPoint in the Haraam/Industrial quadrant. Microsoft Windows is closed, proprietary software made by an American corporation.
Our goal is to enable and encourage the transition of Iranians from the proprietary Microsoft Windows products in the Haraam/Industrial quadrant, to the far superior Emacs in the Halaal/Convivial quadrant.
This document provides enough information to enable anyone to obtain Emacs and begin using it as her/his Persian user environment.
1.2 Goal: Widespread Adoption of Persian Blee
Our second goal is to promote the use of Blee (the ByStar Libre Emacs Environment [5]) among Persian speakers in general, and Iranians in particular.
Blee is a layer above Emacs that integrates GNU/Linux capabilities into Emacs, and provides close integration with the ByStar Services. The ByStar Federation of Autonomous Libre Services is a unified Halaal services model, unifying and making consistent a large number of services that currently exist in functional isolation. It is a coherent, integrated family of services, providing the user with a comprehensive, all-encompassing Internet experience. For information about Libre Services see our document titled, Libre Services: A non-proprietary model for delivery of Internet services [7]. For information about the ByStar Federation see our document titled, The ByStar Federation of Autonomous Libre Services [4].
The present document provides enough information to allow a ByStar Autonomous Libre Service owner to use Blee as her/his Persian Halaal Software-Service Continuum.
1.3 Goal: Evaluation of Applicability of farsi-transliterate-banan to other User Environments
Our third goal in producing this document is to encourage adoption of the Multi-Character Persian Reverse Transliteration in other Halaal Digital environments in general, and in Gnome in particular.
This document provides enough information so that in addition to Emacs, implementation of the “Banan Multi-Character Transliteration Persian Input Method” is possible in other user environments.
2 About this Document
The primary URL for this document is: The pdf format is authoritative.
Distribution of this document is unrestricted. We encourage you to forward it to others.
We can benefit from your feedback. Please let us know your thoughts. You can send us your comments, criticisms and corrections via the URL, or by email to feedback@ our base domain, which is
We thank you for your assistance.
3 Scope and Context
Use of the Latin character keyboard to input Persian text into machines, and more generally use of the Latin alphabet for writing in Persian, is an old topic with a lot of history. We reference some of this history and prior work later in this document. See Section 8 for more information.
The terminology in this area is often ambiguous or misused, causing confusion when addressing this topic. In this document we will be consistent in our own terminology, taking pains to define the more ambiguous or problematic terms carefully. We do this by providing our own definitions, or by referencing external definitions.
3.1 Persian vs Farsi
Our use of the terms “Persian” and “Farsi” is consistent with the
definitions of these terms established by the Society of Iranian
Linguists. Their definitions are available at:
In Section 3.1.1, “The Persian Language,” we reproduce the relevant parts of their text.
The current implementation of Persian input methods for Emacs is for Farsi only. Thus in the current implementation the terms Persian and Farsi may be considered equivalent, and in the present version of this document we use these terms interchangeably.
We plan to expand the implementation in the future to include other Persian language variations.
3.1.1 The Persian Language
Persian is an Iranian language within the Indo-Iranian branch of the Indo-European languages. It is spoken in Iran, Afghanistan, and Tajikistan and has official-language status in these three countries.
There are three modern varieties of standard Persian:
- The Persian variety spoken in Iran has also been called Iranian Persian or Farsi. The writing system is an extended version of the Arabic script.
- Dari Persian has been used to refer to the Persian language spoken in Afghanistan and Uzbekistan. It uses the same writing system as Iranian Persian.
- Tajik or Tajiki Persian is the variety used in Tajikistan, Uzbekistan and Russia. Unlike the Persian used in Iran and Afghanistan, it is written in an extended version of the Cyrillic script.
3.2 Terminology
Here we reference the external definitions of various words we will use. Note that our reference to a Wikipedia article in the list below does not necessarily mean that we endorse or conform to their definition; it means only that it exists as an external definition and that we have made the trade-off of mentioning it.
- Transliteration – نویسه گردانی:
- Romanization:
- In the context of Persian, this amounts to the same thing as transliteration.
- Latin vs Roman:
- In the context of alphabets, we use these terms interchangeably.
- Transcription:
- This term is not used in this document.
- Pinglish/Finglish:
- An informal and loose transliteration for human-to-human communication. Pinglish is word oriented. The Multi-Character Transliteration Input Method is character oriented.
- Pinglish Web Services:
- For example, behnevis.
- Persian Multi-Character/Composite Transliteration:
- Synonymous with Multi-Character Reverse Transliteration Input Method.
- Persian Multi-Character/Composite Reverse Transliteration:
- Transliteration was the process by which خ became “kh”. The route by which “kh” now becomes خ is reverse transliteration, but we continue to refer to it as transliteration. farsi-transliterate-banan, defined in this document, is an example of the Multi-Character Reverse Transliteration Input Method. See Section 5.2 for details.
- Input Method / Emacs Input Method:
- An "input method" is a kind of character conversion designed specifically for interactive input.
- Mapping Input Method:
- The simplest kind of input method works by mapping ASCII letters into another alphabet; this allows you to use one other alphabet instead of ASCII.
- Composite Input Method:
- A more powerful technique is composition: converting sequences of characters into one letter. For example “kh” becomes خ.
3.3 Overview of the Full Picture: The By* Halaal Digital Ecosystem
This document is part of a bigger picture.
We want the world to move towards Halaal Software, and Halaal Internet Services.
The totality of our work is directed towards creation of the ByStar Halaal Digital Ecosystem, as a moral alternative to the proprietary American digital ecosystem. An overview of this is provided in our document titled, The ByStar Halaal/Libre Digital Ecosystem: A Moral Alternative to the Proprietary American Digital Ecosystem [6], available on-line at: In that document we present a complete picture for establishing a model and process that can redirect the manner of existence of software and Internet services towards safeguarding humanity. We also describe the framework that is already in place for collaboration and we invite you to participate in this work.
4 Persian With Emacs
This information applies to emacs version or higher.
Enabling Persian in Emacs is very simple.
If you are already an Emacs user, you can skip to Section 4.3 and continue reading from there.
If you are completely new to Emacs, the information below is sufficient to permit you to install Emacs, enable Persian and start using Emacs as your Persian user environment.
4.1 About Emacs
Emacs is the world’s most potent multilingual editor-centered user experience platform. Emacs comes with a rich mail reader, a personal planner, an address book, a calendar, spell checkers for English and Persian, multi-lingual dictionary interfaces and many other tools and packages; all integrated together. Because Emacs supports Persian, all these tools and packages also support Persian.
Some useful links to emacs related resources are included below:
4.2 Obtaining Emacs
Emacs is halaal/libre/free software.
The primary access page for emacs is:
You can obtain the sources for emacs and build it yourself or you can obtain pre-built binaries.
Instructions for obtaining emacs in various forms and for various platforms are also included below.
4.2.1 Obtaining Emacs Sources
Once Emacs 24.3 is released, you can obtain the source for Emacs 24.3 with:
The latest version from the repository trunk can be obtained with:
Then you can build emacs from sources by following the instructions.
4.2.2 Binaries For Debian GNU/Linux and Ubuntu
Snapshots of the repository trunk are regularly built for Debian and Ubuntu. You can obtain these from:
Once Emacs 24 is included in distributions of Debian and Ubuntu, all you have to do is:
4.2.3 Binaries For MS Windows
We do not encourage use of any software on the proprietary/haraam Microsoft Windows platform. From the Halaal Software perspective, use of any software under Windows is at best makruh – مکروه . Use GNU/Linux instead.
Snapshots of the repository trunk are regularly built for MS Windows. You can obtain these from:
4.3 Obtaining Persian Blee
Blee (the ByStar Libre Emacs Environment [5]) is a layer above Emacs that integrates GNU/Linux capabilities into Emacs, and provides close integration with the ByStar Services. The ByStar Federation of Autonomous Libre Services [4] is a unified Halaal services model, unifying and making consistent a large number of services that currently exist in functional isolation.
Information about obtaining Blee can be found at:
4.4 Selecting Persian Language
Using Emacs menus, select:
“Options” - “Multilingual Environment” - “Set Language Environment” - “Persian”.
Or you can select the Persian language with Emacs commands.
The notation “M-:” in the following commands means you press the “Meta” key (often the Esc key) followed by “:”. The “M-:” is then followed by the elisp form. For some commands the “M-:” does not appear; in this case you just need to eval the elisp form.
To see language environment settings,
using Emacs menus, select:
“Options” - “Multilingual Environment” - “Describe Language Environment” - “Persian”.
or invoke the Emacs command:
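As a minimal sketch, assuming the standard Emacs functions set-language-environment and describe-language-environment are the intended commands, the two menu actions of this section correspond to the following forms. Each can be evaluated with M-: as described above.

```elisp
;; Select Persian as the language environment (the same effect as the
;; "Set Language Environment" menu entry).
(set-language-environment "Persian")

;; Show the settings of the Persian language environment in a help buffer
;; (the same effect as the "Describe Language Environment" menu entry).
(describe-language-environment "Persian")
```

Placing the first form in an Emacs init file should make the choice persistent across sessions.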
4.5 Selecting Persian Input Methods
Emacs comes with two built-in Persian input methods:
- farsi-isiri-9147:
- A Persian keyboard based on the Islamic Republic of Iran’s ISIRI-9147 specification. See Section 5.1 for details.
- farsi-transliterate-banan:
- An intuitive transliteration keyboard for Farsi. See Section 5.2 for details.
With Plain Emacs
With the language environment set to “Persian”,
using Emacs menus, select:
“Options” - “Multilingual Environment” - “Toggle Input Method”.
Now, your keyboard is configured for Persian as farsi-transliterate-banan.
To activate the ISIRI-9147 keyboard, enter the command:
To activate the transliterate keyboard, enter the command:
Alternatively you can select these options from the “Options-Multilingual Environment” menu.
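A minimal sketch of the two activation commands, assuming the standard Emacs function set-input-method is the intended command. Each form can be evaluated with M-: in the buffer you are editing. The last form is an optional addition that is not part of the steps above: it assumes you want C-\ to select the transliteration keyboard by default.

```elisp
;; Activate the ISIRI-9147 keyboard in the current buffer.
(set-input-method "farsi-isiri-9147")

;; Activate the transliteration keyboard in the current buffer.
(set-input-method "farsi-transliterate-banan")

;; Optional (assumption): make C-\ toggle the transliteration keyboard
;; by default, for example from an init file.
(setq default-input-method "farsi-transliterate-banan")
```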
To toggle back to the English keyboard, type C-\ (hold down the Ctrl key while typing the character \).
To see a description of either input method, use the commands:
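A minimal sketch, assuming the standard Emacs function describe-input-method is the intended command. Each form can be evaluated with M-: and opens a help buffer that lists the key mappings of the named input method.

```elisp
;; Describe the ISIRI-9147 keyboard, including its key mappings.
(describe-input-method "farsi-isiri-9147")

;; Describe the transliteration keyboard, including its key mappings.
(describe-input-method "farsi-transliterate-banan")
```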
With Persian Blee
Using Persian Blee, just press the F6 key twice. Your input method and language environment (spell checking, dictionaries, etc.) are then all set to Persian.
Press the F6 twice again to toggle back to the English keyboard.
4.6 A Sample Farsi Editing Session
Let’s start from scratch and walk through the steps involved in writing a simple sentence both in Farsi and in English.
- Install Emacs 24 on your system based on the information in Section 4.2.
- Open a file (for example “example.fa”):
  Menu: “File” - “Visit New File” - “example.fa”
- Select the Persian language (Section 4.4):
  Menu: “Options” - “Multilingual Environment” - “Set Language Environment” - “Persian”.
- Select the farsi-transliterate-banan Persian input method (Section 4.5):
  Menu: “Options” - “Multilingual Environment” - “Toggle Input Method”
- Consider that we want to write:
  حالا، با نرم افزار حلال میتوانیم به فارسی سالم و خوش بنویسیم.
  Note that we are not writing in Pinglish. Ignore the vowels and think of the Persian writing above letter by letter.
  Now type:
  Hala, ba nrm afzar Hlal mitvanim bh farsi salm v khush bnvisim.
- Toggle back to English with C-\ or:
  Menu: “Options” - “Multilingual Environment” - “Toggle Input Method”
- Now enter something in English, for example:
  Now, with Halaal software we can write well in Persian.
  Note that the empty line between the Farsi paragraph and the English paragraph properly took care of directionality.
- We are done, so let’s save the file and close this buffer.
  Menu: “File” - “Save”.
  Menu: “File” - “Close”.
Kool!
With Emacs, you are using the world’s most potent multilingual editor-centered user experience platform. And it is Halaal/Libre/Free. And it is Gratis/Free-of-Charge. And it has everything: a Persian spell checker, an email interface, calendar, address book, personal planner, ...
To learn more and explore more, you can try:
Menu: “Help” - “Read the Emacs Manual”.
Menu: “Help” - “Tutorial”.
Also, some Persian specific help is included below.
4.7 Hints for Persian Characters (Unicode) Usage
As you are writing in Persian, you may want to know exactly what Unicode character is at the cursor. To do that place point on the character, then enter the following commands:
    ctl+x =
    meta+x describe-char
    ctl+u ctl+x =
For example, to verify consistency between this document and code, place the cursor on the character and with “ctl+x =” verify that the Unicode hex numbers match.
To enter a Unicode character directly in decimal or hex:
    ctl+x 8 enter
    (ucs-insert #x0635)
    (ucs-insert (string-to-number "0635" 16))
    (ucs-insert 1589)
4.8 Hints for bidi Emacs Usage
Sometimes you may want to specify the directionality explicitly (i.e. left-to-right or right-to-left).
With Plain Emacs
Here are some of the basic Emacs bidi controls:
    (setq bidi-display-reordering t)
    (setq bidi-display-reordering nil)
    (setq bidi-paragraph-direction 'right-to-left)
    (setq bidi-paragraph-direction 'left-to-right)
See the Emacs documentation for more.
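The paragraph-direction setting above can be wrapped in a small command so that it is easy to flip while editing. The following is only a sketch: the function name my-toggle-paragraph-direction is hypothetical and is not part of Emacs or Blee. It relies on bidi-paragraph-direction being local to each buffer.

```elisp
;; Hypothetical helper (not part of Emacs or Blee): switch the paragraph
;; direction of the current buffer.  If the direction is right-to-left it
;; becomes left-to-right; otherwise (including the default nil, which
;; means "detect automatically") it becomes right-to-left.
(defun my-toggle-paragraph-direction ()
  "Toggle `bidi-paragraph-direction' in the current buffer."
  (interactive)
  (setq bidi-paragraph-direction
        (if (eq bidi-paragraph-direction 'right-to-left)
            'left-to-right
          'right-to-left))
  (message "Paragraph direction: %s" bidi-paragraph-direction))
```

After evaluating the definition, the command can be run with meta+x my-toggle-paragraph-direction.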
With Persian Blee
The keystroke combinations F6-1 and F6-2 are bound to toggle display-reordering.
4.9 Multilingualization (M17n) of Spelling Dictionaries
Debian/Ubuntu includes a Persian dictionary that can also be used with Emacs.
With Plain Emacs
First you need to obtain the spelling dictionary. Enter the following command:
    sudo apt-get -y install aspell-fa
Next you need to let Emacs know that you want to use the Persian spelling dictionary.
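One way to do this is sketched below, under two assumptions: that aspell is the spelling program in use, and that the dictionary installed by the aspell-fa package is named "fa". If the dictionary has a different name on your system, ispell-change-dictionary will report it as undefined.

```elisp
(require 'ispell)

;; Use aspell as the spelling back end (assumes aspell is installed).
(setq ispell-program-name "aspell")

;; Switch to the Persian dictionary.  "fa" is assumed to be the
;; dictionary name provided by the aspell-fa package.
(ispell-change-dictionary "fa")
```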
With Persian Blee
As already noted, pressing the F6 key twice toggles your input method. This also toggles your language environment. ispell/aspell is then configured to work with multiple dictionaries.
So there is nothing else you need to do.
5 Emacs Persian Input Methods
At this time there are two Persian input methods supported in Emacs:
- farsi-isiri-9147:
- A Persian keyboard based on the Islamic Republic of Iran’s ISIRI-9147 specification.
- farsi-transliterate-banan:
- An intuitive transliteration keyboard for Farsi.
These are described in the following sections.
5.1 farsi-isiri-9147 Persian Input Method
In Emacs this input method is labeled farsi-isiri-9147. It is based on the ISIRI 9147 – 1st edition. ISIRI-9147 defines the layout of Iran’s Persian keyboard. See section 6.3 and section 6.1 for more information.
Layers 1, 2 and 3 of ISIRI-9147 are fully implemented with the exception of the Backslash (’\’), Alt-Backslash, Shift-Space and Alt-Space keys.
The Backslash key is used to replace کلید با دگر ساز راست (the Alt or Meta key).
Layer 3 is then entered with the Backslash key, and Layer 3 is implemented as two-letter key combinations as specified in ISIRI-9147.
The character corresponding to Backslash is entered with Backslash-Backslash. Alt-Backslash has been moved to Backslash-r. Shift-Space has been moved to Backslash-y. Alt-Space has been moved to Backslash-t.
With these modifications farsi-isiri-9147 is a full implementation of ISIRI-9147. In addition, with these modifications this implementation is ascii input stream based, as well as being a keyboard layout.
If a key on Layer 1 were reserved to replace دگر ساز راست (the Alt or Meta key), then farsi-isiri-9147 would be fully compliant, without needing the above description/modifications.
Perhaps this can be considered a defect in the base ISIRI-9147 specification, to be addressed in the next revision.
All inputs for each Persian letter Unicode for farsi-isiri-9147 are shown in Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10.
5.2 farsi-transliterate-banan Persian Input Method
In Emacs this input method is labeled farsi-transliterate-banan.
The ISIRI-9147 Persian keyboard is not well suited to Iranian expatriates living in the West. Persian-speaking expatriates are usually already completely familiar and accustomed to the standard qwerty keyboard, and they don’t want to have to learn and adapt to ISIRI-9147. Rather, they expect software to adapt to them.
This is what the farsi-transliterate-banan – “Banan Multi-Character (Reverse) Transliteration Persian Input Method” – accomplishes. This input method addresses the needs of a user who:
- Can write in Farsi (not just speak it).
- Is familiar with and accustomed to the qwerty Latin keyboard.
- Is unfamiliar with ISIRI-9147 and does not wish to learn it.
- Writes and otherwise communicates in mixed Globish/Persian, not pure Persian.
- Is intuitively familiar with the transliteration of Farsi/Persian into Latin based on two-letter phonetic mapping to Persian characters. (For example: gh ق – kh خ – sh ش – ch چ – zh ژ.)
The transliteration keyboard is intuitive in design, so that the mappings are natural and easy to remember for a Persian writer. It provides equivalent capability to farsi-isiri-9147, allowing input of all characters enumerated in ISIRI-6219.
farsi-transliterate-banan is phonetically oriented. But it is very different from Pinglish. Pinglish is word-oriented, where you sound out the word using Latin letters, including the vowels. farsi-transliterate-banan is letter-oriented, where you type the Latin letter(s) closest to the Persian letter, and usually omit vowels.
For some Persian characters there are multiple ways of inputting the same character. For example both “i” and “y” produce ی. For یک “yk”, “y” is more natural, and for این “ain”, “i” is more natural.
The more commonly used letters are mapped to lower case; the less commonly used letters are mapped to upper case. For example “s” is س while “S” is ص. And “h” is ه while “H” is ح. Table 1 shows these mappings.
Postfix composition is based on “h”. The letter “h” is used as a postfix for the following two-character mappings: gh ق – kh خ – sh ش – ch چ – zh ژ – Th ة – Yh ى. Table 2 shows these mappings.
Prefix composition is based on the prefix characters “\”, “&” and “/”.
The prefix letter “\” is used for two-character inputs when an alternative form of a letter is desired. For example “\-” is “÷” while “-” is “−”.
The prefix letter “&” is used for multi-character inputs when special characters are desired based on their abbreviated name. For example you can enter “&lrm;” to enter the “LEFT-TO-RIGHT MARK” character.
The prefix letter “/” is used to provide two specific characters; for example, “//” enters “/”.
The letter “h” is used in a number of two-character postfix mappings; for example “sh” ش. So if you need the sequence “s” then “h” you have to repeat the “s”. For example: سهم = ’s’ ’s’ ’h’ ’m’.
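The interaction between single keys, doubled keys and the “h” postfix can be sketched with the Quail library that Emacs input methods are built on. This is an illustration only, not the actual definition of farsi-transliterate-banan: the package name demo-farsi-postfix and its five rules are hypothetical, and the arguments to quail-define-package are a reduced set chosen for the demo.

```elisp
(require 'quail)

;; Hypothetical demo package; the real method is farsi-transliterate-banan.
(quail-define-package
 "demo-farsi-postfix" "Persian" "د" t
 "Demo of single-key, doubled-key and \"h\" postfix rules."
 nil t t)

(quail-define-rules
 ("s"  ?س)   ; single key
 ("ss" ?س)   ; doubled key: same letter, ends the pending "s"
 ("sh" ?ش)   ; postfix composition with "h"
 ("h"  ?ه)
 ("m"  ?م))

;; After evaluating the forms above, activate the demo method with:
;;   (set-input-method "demo-farsi-postfix")
;; Typing "sh" is then expected to give ش, and "sshm" to give سهم.
```

Because “s” is a prefix of both “ss” and “sh”, Quail waits for the next key before committing a letter, which is why the doubled “s” is needed to type س followed by ه.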
Table 1 shows the single-character keyboard layout for farsi-transliterate-banan. It is based on the results of (describe-input-method 'farsi-transliterate-banan).
Table 2 shows the multi-character mappings for farsi-transliterate-banan. It is based on the results of (describe-input-method 'farsi-transliterate-banan).
All inputs for each Persian letter Unicode for farsi-transliterate-banan are shown in Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10.
| Key | Character | Shifted key | Character |
|---|---|---|---|
| 1 | ۱ | ! | ! |
| 2 | ۲ | @ | ْ |
| 3 | ۳ | # | ً |
| 4 | ۴ | $ | ٰ |
| 5 | ۵ | % | ٪ |
| 6 | ۶ | ^ | َ |
| 7 | ۷ | & | & |
| 8 | ۸ | * | * |
| 9 | ۹ | ( | ( |
| 0 | ۰ | ) | ) |
| - | - | _ | ـ |
| = | = | + | + |
| ` | ٔ | ~ | ّ |
| q | غ | Q | ق |
| w | ع | W | ء |
| e | ِ | E | ٍ |
| r | ر | R | R |
| t | ت | T | ط |
| y | ی | Y | ي |
| u | و | U | ٓ |
| i | ی | I | ئ |
| o | ُ | O | ٌ |
| p | پ | P | P |
| [ | [ | { | { |
| ] | ] | } | } |
| a | ا | A | آ |
| s | س | S | ص |
| d | د | D | ٱ |
| f | ف | F | إ |
| g | گ | G | غ |
| h | ه | H | ح |
| j | ج | J | (zero width joiner) |
| k | ک | K | ك |
| l | ل | L | L |
| ; | ؛ | : | : |
| ' | ’ | " | " |
| z | ز | Z | ذ |
| x | ض | X | ظ |
| c | ث | C | ٕ |
| v | و | V | ؤ |
| b | ب | B | B |
| n | ن | N | « |
| m | م | M | » |
| , | ، | < | < |
| . | . | > | > |
| / | / | ? | ؟ |

Table 1: Banan Transliteration Keyboard Layout for Single Keys

| Keys | Character |
|---|---|
| ch | چ |
| kh | خ |
| sh | ش |
| zh | ژ |
| gh | ق |
| Gh | غ |
| hh | ح |
| Yh | ى |
| Th | ة |

Table 2: Banan Transliteration of “h” Postfix Multi Keys Mappings

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| ء | W | M | 0621 | ARABIC LETTER HAMZA | حرف فارسی همزه |
| آ | A | H | 0622 | ARABIC LETTER ALEF WITH MADDA ABOVE | حرف فارسی الف با کلاه |
| ا | a | h | 0627 | ARABIC LETTER ALEF | حرف فارسی الف |
| أ | \a | G | 0623 | ARABIC LETTER ALEF WITH HAMZA ABOVE | حرف فارسی الف با همزه بالا |
| ب | b | f | 0628 | ARABIC LETTER BEH | حرف فارسی ب |
| پ | p | m | 067e | ARABIC LETTER PEH | حرف فارسی پ |
| ت | t tt | j | 062a | ARABIC LETTER TEH | حرف فارسی ت |
| ث | c cc | e | 062b | ARABIC LETTER THEH | حرف فارسی ث |
| ج | j | [ | 062c | ARABIC LETTER JEEM | حرف فارسی جیم |
| چ | ch | ] | 0686 | ARABIC LETTER TCHEH | حرف فارسی چ |
| ح | H hh | p | 062d | ARABIC LETTER HAH | حرف فارسی ح |
| خ | kh | o | 062e | ARABIC LETTER KHAH | حرف فارسی خ |
| د | d | n | 062f | ARABIC LETTER DAL | حرف فارسی دال |
| ذ | Z | b | 0630 | ARABIC LETTER THAL | حرف فارسی ذال |
| ر | r | v | 0631 | ARABIC LETTER REH | حرف فارسی ر |
| ز | z zz | c | 0632 | ARABIC LETTER ZAIN | حرف فارسی ز |
| ژ | zh | C | 0698 | ARABIC LETTER JEH | حرف فارسی ژ |
| س | s ss | s | 0633 | ARABIC LETTER SEEN | حرف فارسی سین |
| ش | sh | a | 0634 | ARABIC LETTER SHEEN | حرف فارسی شین |
| ص | S | w | 0635 | ARABIC LETTER SAD | حرف فارسی صاد |
| ض | x | q | 0636 | ARABIC LETTER DAD | حرف فارسی ضاد |
| ط | T TT | x | 0637 | ARABIC LETTER TAH | حرف فارسی طا |
| ظ | X | z | 0638 | ARABIC LETTER ZAH | حرف فارسی ظا |
| ع | w | u | 0639 | ARABIC LETTER AIN | حرف فارسی عین |
| غ | q Gh G GG | y | 063a | ARABIC LETTER GHAIN | حرف فارسی غین |
| ف | f | t | 0641 | ARABIC LETTER FEH | حرف فارسی ف |
| ق | gh Q | r | 0642 | ARABIC LETTER QAF | حرف فارسی قاف |
| ک | k kk | ; | 06a9 | ARABIC LETTER KEHEH | حرف فارسی کاف |
| گ | g gg | ’ | 06af | ARABIC LETTER GAF | حرف فارسی گاف |
| ل | l | g | 0644 | ARABIC LETTER LAM | حرف فارسی لام |
| م | m | l | 0645 | ARABIC LETTER MEEM | حرف فارسی میم |
| ن | n | k | 0646 | ARABIC LETTER NOON | حرف فارسی نون |
| و | u v | , | 0648 | ARABIC LETTER WAW | حرف فارسی واو |
| ؤ | V | A | 0624 | ARABIC LETTER WAW WITH HAMZA ABOVE | حرف فارسی واو با همزه بالا |
| ه | h Hh | i | 0647 | ARABIC LETTER HEH | حرف فارسی ه |
| ی | i y | d | 06cc | ARABIC LETTER FARSI YEH | حرف فارسی ی |
| ئ | I | S | 0626 | ARABIC LETTER YEH WITH HAMZA ABOVE | حرف فارسی ی با همزه بالا |

Table 3: Main Letters: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 5 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| إ | F | F | 0625 | ARABIC LETTER ALEF WITH HAMZA BELOW | حرف فارسی الف با همزه پایین |
| ٱ | D \h | | 0671 | ARABIC LETTER ALEF WASLA | حرفِ الفِ وصل |
| ك | K | Z | 0643 | ARABIC LETTER KAF | حرف کاف عربی |
| ة | Th | Z | 0629 | ARABIC LETTER TEH MARBUTA | حرف ت گرد |
| ي | Y YY | D | 064a | ARABIC LETTER YEH | حرف ی عربی نقطه دار |
| ى | Yh | V | 0649 | ARABIC LETTER ALEF MAKSURA | حرف ی عربی بی نقطه |

Table 4: Arabic Letters: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 6 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| ۰ | 0 | 0 | 06f0 | EXTENDED ARABIC-INDIC DIGIT ZERO | رقم فارسی صفر |
| ۱ | 1 | 1 | 06f1 | EXTENDED ARABIC-INDIC DIGIT ONE | رقم فارسی یک |
| ۲ | 2 | 2 | 06f2 | EXTENDED ARABIC-INDIC DIGIT TWO | رقم فارسی دو |
| ۳ | 3 | 3 | 06f3 | EXTENDED ARABIC-INDIC DIGIT THREE | رقم فارسی سه |
| ۴ | 4 | 4 | 06f4 | EXTENDED ARABIC-INDIC DIGIT FOUR | رقم فارسی چهار |
| ۵ | 5 | 5 | 06f5 | EXTENDED ARABIC-INDIC DIGIT FIVE | رقم فارسی پنج |
| ۶ | 6 | 6 | 06f6 | EXTENDED ARABIC-INDIC DIGIT SIX | رقم فارسی شش |
| ۷ | 7 | 7 | 06f7 | EXTENDED ARABIC-INDIC DIGIT SEVEN | رقم فارسی هفت |
| ۸ | 8 | 8 | 06f8 | EXTENDED ARABIC-INDIC DIGIT EIGHT | رقم فارسی هشت |
| ۹ | 9 | 9 | 06f9 | EXTENDED ARABIC-INDIC DIGIT NINE | رقم فارسی نه |
| ٫ | @ | / | 066b | ARABIC DECIMAL SEPARATOR | ممیز فارسی |
| ٬ | \, | U | 066c | ARABIC THOUSAND SEPARATOR | جدا کننده هزارهای فارسی |
| ٪ | % | | 066a | ARABIC PERCENT SIGN | درصد فارسی |
| + | + | | 002b | PLUS SIGN | علامت به اضافه |
| − | - | | 2212 | MINUS SIGN | علامت منها |
| × | \* | | 00d7 | MULTIPLICATION SIGN | علامت ضرب |
| ÷ | \- | | 00f7 | DIVISION SIGN | علامت تقسیم |
| < | < | > | 003c | ARABIC LESS THAN SIGN | علامت کوچکتر |
| = | = | = | 003d | EQUAL SIGN | علامت مساوی |
| > | > | < | 003e | ARABIC GREATER THAN SIGN | علامت بزرگتر |

Table 5: Digits and Math Signs: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 4 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| (space) | | | 0020 | SPACE | فاصله |
| . | . | . | 002e | FULL STOP | نقطه |
| : | : | : | 003a | COLON | دونقطه |
| ! | ! | ! | 0021 | EXCLAMATION POINT | علامت تعجب |
| … | \. | | 2026 | HORIZONTAL ELLIPSIS | سه نقطه فارسی |
| ‐ | - | - | 2010 | HYPHEN | خط تیره |
| - | - | - | 002d | MINUS OR HYPHEN | تیره منها |
| \| | \| | \| | 007c | VERTICAL BAR | خط عمودی |
| / | // | / | 002f | SLASH | خط اریب |
| \ | | | 005c | BACKSLASH | خط اریب وارو |
| * | * | * | 002a | ASTERISK | ستاره |
| ) | ) | O | 005b | ARABIC OPENING BRACKET | کروشه باز |
| [ | [ | P | 005d | ARABIC CLOSING BRACKET | کروشه بسته |
| } | } | | 007b | ARABIC OPENING BRACE | آکولاد باز |
| { | { | | 007d | ARABIC CLOSING BRACE | آکولاد بسته |
| » | \> | NL | 00ab | LEFT-POINTING DOUBLE ANGLE QUOTATION MARK | گیومه بسته |

Table 6: Common Punctuation Marks: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 2 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| ، | , | T | 060c | ARABIC COMMA | ویرگول فارسی |
| ؛ | ; | Y | 061b | ARABIC SEMICOLON | نقطه ویرگول فارسی |
| ؟ | ? | | 061f | ARABIC QUESTION MARK | علامت سوال فارسی |
| ـ | _ | J | 0640 | ARABIC TATWEEL | کشیدگی فارسی |

Table 7: Persian Punctuation Marks: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 3 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| (ZWJ) | J &zwj; | | 200d | ZERO WIDTH JOINER | اتصال مجازی |
| (LRM) | &lrm; | | 200e | LEFT-TO-RIGHT MARK | نشانه چپ به راست |
| (RLM) | &rlm; | | 200f | RIGHT-TO-LEFT MARK | نشانه راست به چپ |
| (LS) | &ls; | | 2028 | LINE SEPARATOR | جدا کننده سطرها |
| (PS) | &ps; | | 2029 | PARAGRAPH SEPARATOR | جدا کننده بندها |
| (LRE) | &lre; | | 202a | LEFT-TO-RIGHT EMBEDDING | زیر متن چپ به راست |
| (RLE) | &rle; | | 202b | RIGHT-TO-LEFT EMBEDDING | زیر متن راست به چپ |
| (PDF) | &pdf; | | 202c | POP DIRECTIONAL FORMATTING | پایان زیر متن |
| (LRO) | &lro; | | 202d | LEFT-TO-RIGHT OVERRIDE | زیر متن اکیداْ چپ به راست |
| (RLO) | &rlo; | | 202e | RIGHT-TO-LEFT OVERRIDE | زیر متن اکیداْ راست به چپ |
| (BOM) | &bom; | | feff | BYTE ORDER MARK | نشانه ترتیب بایتها |

Table 8: Control Mark Ups: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 1 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| َ | ^ | U | 064e | ARABIC FATHA | زبر |
| ِ | e | Y | 0650 | ARABIC KASRA | زير |
| ُ | o | T | 064f | ARABIC DAMMA | پيش – ضمه |
| ً | # | R | 064b | ARABIC FATHATAN | دو زبر |
| ٍ | E | E | 064d | ARABIC KASRATAN | دو زير |
| ٌ | O | W | 064c | ARABIC DAMMATAN | دو پیش |
| ّ | ~ | I | 0651 | ARABIC SHADDA | تشديد |
| ْ | @ | Q | 0652 | ARABIC SUKUN | ساکن |
| ٓ | U | X | 0653 | ARABIC MADA | مد |
| ٔ | ` | N | 0654 | ARABIC HAMZA ABOVE | همزه فارسی بالا |
| ٕ | C | | 0655 | ARABIC HAMZA BELOW | همزه فارسی پایین |
| ٰ | $ | V | 0670 | ARABIC LETTER SUPERSCRIPT ALEF | الف مقصوره |

Table 9: Persian Signs: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 7 of isiri-6219

| Character | farsi-transliterate-banan | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Persian Name (نامِ نویسه) |
|---|---|---|---|---|---|
| @ | \@ | | 0040 | COMMERCIAL AT | علامت در |
| 0 | \0 | | 0030 | DIGIT ZERO | رقم صفر لاتین |
| 1 | \1 | | 0031 | DIGIT ONE | رقم یک لاتین |
| 2 | \2 | | 0032 | DIGIT TWO | رقم دو لاتین |
| 3 | \3 | | 0033 | DIGIT THREE | رقم سه لاتین |
| 4 | \4 | | 0034 | DIGIT FOUR | رقم چهار لاتین |
| 5 | \5 | | 0035 | DIGIT FIVE | رقم پنج لاتین |
| 6 | \6 | | 0036 | DIGIT SIX | رقم شش لاتین |
| 7 | \7 | | 0037 | DIGIT SEVEN | رقم هفت لاتین |
| 8 | \8 | | 0038 | DIGIT EIGHT | رقم هشت لاتین |
| 9 | \9 | | 0039 | DIGIT NINE | رقم نه لاتین |

Table 10: Extensions: Mapping of Persian Unicode to farsi-transliterate-banan

| Character | Banan Reverse Translit | Emacs ISIRI 9147 | Unicode Hex | Unicode Name | Arabic Character Name (نامِ نویسه عربی) |
|---|---|---|---|---|---|
| ۀ | | | 06c0 | ARABIC LETTER HEH WITH YEH ABOVE | حرفِ هِ اردو با همزهی بالا |
| ٠ | | | 0660 | ARABIC-INDIC DIGIT ZERO | رقم صفر عربی |
| ١ | | | 0661 | ARABIC-INDIC DIGIT ONE | رقم یک عربی |
| ٢ | | | 0662 | ARABIC-INDIC DIGIT TWO | رقم دو عربی |
| ٣ | | | 0663 | ARABIC-INDIC DIGIT THREE | رقم سه عربی |
| ٤ | | | 0664 | ARABIC-INDIC DIGIT FOUR | رقم چهار عربی |
| ٥ | | | 0665 | ARABIC-INDIC DIGIT FIVE | رقم پنج عربی |
| ٦ | | | 0666 | ARABIC-INDIC DIGIT SIX | رقم شش عربی |
| ٧ | | | 0667 | ARABIC-INDIC DIGIT SEVEN | رقم هفت عربی |
| ٨ | | | 0668 | ARABIC-INDIC DIGIT EIGHT | رقم هشت عربی |
| ٩ | | | 0669 | ARABIC-INDIC DIGIT NINE | رقم نه عربی |

Table 11: Forbidden Characters: Mapping of Persian Unicode to farsi-transliterate-banan – Matching Table 8 of isiri-6219

6 Relevant Standards/Specifications
We have put together a repository of standards/specifications which are relevant to Persian input methods. That repository is at:
Legitimacy of any of these documents as standards is not our focus or concern. We have included them here because they are relevant and useful.
6.1 ISIRI-6219
Based on Unicode, ISIRI-6219 defines the Farsi Character Set. Its full title is:
فنّاوریِ اطلاعات – تبادل و شیوهی نمایش اطلاعاتِ فارسی بر اساس یونی کُد
استاندارد ملی ایران ۶۲۱۹ −− نسخهی نهایی
Institute of Standards and Industrial Research of Iran Information Technology – Persian Information Interchange and Display Mechanism, using Unicode ISIRI-6219 Final Version
Published at:
and republished at:
6.2 Suggested Enhancements For ISIRI-6219
During the process of developing farsi-transliterate-banan we studied ISIRI-6219. Here are some of our comments and some suggestions.
6.2.1 Clear labeling of ISIRI-6219 as the definition of Farsi Character Set
ISIRI-6219 does many things. It defines the Farsi Character Set and it also includes translation of various global specifications.
ISIRI-6219 does not clearly say that it primarily defines Iran’s Farsi Character Set.
On the title page and early in the specification it should explicitly make it clear that ISIRI-6219 defines the Farsi Character Set for Iran. Something along the lines of:
مجموعه نویسهٔ استاندارد ایران برای تبادل اطلاعات، استاندارد ملی ۶۲۱۹ مؤسسهٔ استاندارد و تحقیقات صنعتی ایران است که مبتنی بر یونی کد است.
(In English: Iran’s standard character set for information interchange is National Standard 6219 of the Institute of Standards and Industrial Research of Iran, which is based on Unicode.)
Being the definition of the Farsi Character Set, ISIRI-6219 should then require that all Farsi input methods make it clear that they provide full support for the Farsi Character Set. And if an input method provides anything beyond ISIRI-6219, those extensions should be explicitly marked as extensions. This is not happening between ISIRI-9147 and ISIRI-6219 today. The specification of the farsi-transliterate-banan input method in this document is based on the ISIRI-6219 Farsi Character Set tables. Conformance of farsi-transliterate-banan is made explicit, and its extensions are explicitly marked.
6.2.2 Missing At Sign – ’@’
ISIRI-6219 does not include ’@’ as part of the Farsi Character Set.
The move towards Internationalized Domain Names (IDN), and towards use of .ایران in particular, requires ’@’ for email addresses. This alone makes ’@’ important enough for inclusion in ISIRI-6219.
6.3 ISIRI-9147
ISIRI-9147 defines the layout of Iran’s Persian keyboard. Its full title is:
فنّاوریِ اطلاعات - چیدمان حروف و علائم فارسی بر صفحه کلید رایانه
استاندارد ملی ایران ۹۱۴۷ − چاپ اول
Institute of Standards and Industrial Research of Iran Information Technology – Layout of Persian Letters and Symbols on Computer Keyboards ISIRI 9147 -- 1st edition
Published at:
and republished at:
6.4 Suggested Enhancements For ISIRI-9147
Design and specification of ISIRI-9147 is overly tactical. While ISIRI-9147 specifies a keyboard layout, it should strategically leave the door open to more.
Today, a keyboard specification needs to be more than just a layout for a physical keyboard. It should not be viewed as the sole input method, and it should therefore address how it coexists with other input methods.
Difficulties of ISIRI-9147 in fitting well into a multilingual editor such as Emacs include:
6.4.1 Entry into Layer 3 with a Layer 1 Key instead of Alt
Specification of ISIRI-9147 provides access to layer 3 through the Alt key.
The Alt key may not be available in some environments, because multilingual editors such as Emacs often use it for their own commands. When the Alt key is not available and the input model supports two-letter compositions, layer 3 can instead be entered through a reserved layer 1 key.
We therefore suggest reserving the Backslash key to replace the Alt key in such environments, and moving Alt-Backslash to Backslash-r.
6.4.2 Alternates For Shift-Space and Alt-Space
We suggest providing equivalents for Shift-Space and Alt-Space. In our implementation we have placed them at layer 3 as Backslash-y and Backslash-t.
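The prefix-key idea in 6.4.1 and 6.4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The function name `translate`, the table name `LAYER3` and the symbolic output labels are hypothetical. The labels stand in for the actual layer 3 characters, which are not reproduced here. The pairing of Backslash-y with Shift-Space and Backslash-t with Alt-Space assumes the two lists above are given in the same order.

```python
# Hypothetical sketch: entering "layer 3" through a reserved layer 1 key
# (Backslash) instead of the Alt modifier. Output values are symbolic
# placeholders, not characters taken from ISIRI-9147.
PREFIX = "\\"

LAYER3 = {
    "r": "ALT_BACKSLASH",  # what Alt-Backslash used to produce
    "y": "SHIFT_SPACE",    # assumed pairing: Backslash-y <-> Shift-Space
    "t": "ALT_SPACE",      # assumed pairing: Backslash-t <-> Alt-Space
}


def translate(keys, layer1=None):
    """Turn a sequence of key presses into a list of output tokens."""
    layer1 = layer1 or {}
    out = []
    it = iter(keys)
    for key in it:
        if key == PREFIX:
            nxt = next(it, None)
            if nxt is None:
                break  # dangling prefix: nothing to emit yet
            # Unknown two-key sequences pass through unchanged.
            out.append(LAYER3.get(nxt, PREFIX + nxt))
        else:
            out.append(layer1.get(key, key))
    return out


print(translate("a\\yb"))
```

The point of the design is that no modifier key is involved: every layer 3 entry is an ordinary two-key sequence, so it stays reachable in environments that reserve Alt for their own use.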
6.4.3 Explicit Identification of Extensions Beyond ISIRI-6219
In its layer 3, ISIRI-9147 goes well beyond ISIRI-6219 without explicitly identifying the extensions. This undermines the purpose of ISIRI-6219 as the definition of the Farsi Character Set.
7 The Broader Scope Of farsi-transliterate-banan
Aside from farsi-transliterate-banan, all Persian input methods today are either keyboard-layout oriented or single-character transliteration mappings. Increasingly, the keyboard layouts conform to ISIRI-9147.
That convergence is welcome, but more powerful input method models are also available to us.
In this day and age it makes good sense to adopt the more powerful composition input method instead of the simple mapping method. We propose that farsi-transliterate-banan, as defined in Tables 3 through 10, be considered a convergence point for Persian composition input methods.
For example, in Gnome, where we currently only have file:///usr/share/X11/xkb/symbols/ir, it would be nice to also implement the equivalent of farsi-transliterate-banan.
We would very much like to collaborate towards that goal.
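To make the difference between the two models concrete, here is a minimal Python sketch. The rule tables are illustrative assumptions and are not copied from the farsi-transliterate-banan tables. `SINGLE` stands for a simple one-key-to-one-character mapping. `COMPOSED` adds hypothetical two-key rules in which a trailing `h` changes the preceding letter. The function name `transliterate` is also hypothetical.

```python
# Illustrative rules only; the real assignments are in the document's tables.
SINGLE = {"s": "س", "z": "ز", "b": "ب"}
COMPOSED = {**SINGLE, "sh": "ش", "zh": "ژ"}


def transliterate(text, rules):
    """Greedy longest-match transliteration of `text` using `rules`."""
    max_len = max(len(k) for k in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate key sequence first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule matched: pass the key through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("shb", COMPOSED))
print(transliterate("shb", SINGLE))
```

With the composition rules, the two keys `sh` yield one letter. With the single-character mapping, the `h` has no rule and falls through unchanged. Interactive input methods typically resolve such sequences as the keys arrive, whereas this sketch works on a complete string for brevity.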
8 History and Previous Work
Use of the Latin character keyboard to input Persian text into machines, and more generally use of the Latin alphabet for writing in Persian, is an old topic with a lot of history.
Jon Dehdari has assembled a table that summarizes previous work in this area. We have reproduced it here as Table 12.
| فارسى | Dehdari | Buckwal | ArabTeX | Uni-Dec | Uni-Hex | UTF-8 | Isi3342 | CP1256 | Uni-Name |
|---|---|---|---|---|---|---|---|---|---|
| ا | A | A | A | 1575 | 0627 | d8a7 | c1 | c7 | ARABIC LETTER ALEF |
| ب | b | b | b | 1576 | 0628 | d8a8 | c3 | c8 | ARABIC LETTER BEH |
| پ | p | P | p | 1662 | 067e | d9be | c4 | 81 | ARABIC LETTER PEH |
| ت | t | t | t | 1578 | 062a | d8aa | c5 | ca | ARABIC LETTER TEH |
| ث | V | v | _t | 1579 | 062b | d8ab | c6 | cb | ARABIC LETTER THEH |
| ج | j | j | j | 1580 | 062c | d8ac | c7 | cc | ARABIC LETTER JEEM |
| چ | c | J | ^c | 1670 | 0686 | da86 | c8 | 8d | ARABIC LETTER TCHEH |
| ح | H | H | .h | 1581 | 062d | d8ad | c9 | cd | ARABIC LETTER HAH |
| خ | x | x | x | 1582 | 062e | d8ae | ca | ce | ARABIC LETTER KHAH |
| د | d | d | d | 1583 | 062f | d8af | cb | cf | ARABIC LETTER DAL |
| ذ | L | * | _d | 1584 | 0630 | d8b0 | cc | d0 | ARABIC LETTER THAL |
| ر | r | r | r | 1585 | 0631 | d8b1 | cd | d1 | ARABIC LETTER REH |
| ز | z | z | z | 1586 | 0632 | d8b2 | ce | d2 | ARABIC LETTER ZAIN |
| ژ | J | | ^z | 1688 | 0698 | da98 | cf | 8e | ARABIC LETTER JEH |
| س | s | s | s | 1587 | 0633 | d8b3 | d0 | d3 | ARABIC LETTER SEEN |
| ش | C | $ | ^s | 1588 | 0634 | d8b4 | d1 | d4 | ARABIC LETTER SHEEN |
| ص | S | S | .s | 1589 | 0635 | d8b5 | d2 | d5 | ARABIC LETTER SAD |
| ض | D | D | .d | 1590 | 0636 | d8b6 | d3 | d6 | ARABIC LETTER DAD |
| ط | T | T | .t | 1591 | 0637 | d8b7 | d4 | d8 | ARABIC LETTER TAH |
| ظ | Z | Z | .z | 1592 | 0638 | d8b8 | d5 | d9 | ARABIC LETTER ZAH |
| ع | E | E | ‘ | 1593 | 0639 | d8b9 | d6 | da | ARABIC LETTER AIN |
| غ | G | g | .g | 1594 | 063a | d8ba | d7 | db | ARABIC LETTER GHAIN |
| ف | f | f | f | 1601 | 0641 | d981 | d8 | dd | ARABIC LETTER FEH |
| ق | q | q | q | 1602 | 0642 | d982 | d9 | de | ARABIC LETTER QAF |
| ك | K | k | k | 1603 | 0643 | d983 | fd | df | ARABIC LETTER KAF |
| گ | g | G | g | 1711 | 06af | daaf | db | 90 | ARABIC LETTER GAF |
| ل | l | l | l | 1604 | 0644 | d984 | dc | e1 | ARABIC LETTER LAM |
| م | m | m | m | 1605 | 0645 | d985 | dd | e3 | ARABIC LETTER MEEM |
| ن | n | n | n | 1606 | 0646 | d986 | de | e4 | ARABIC LETTER NOON |
| و | u | w | U | 1608 | 0648 | d988 | df | e6 | ARABIC LETTER WAW |
| ه | h | h | h | 1607 | 0647 | d987 | e0 | e5 | ARABIC LETTER HEH |
| ي | y | y | I | 1610 | 064a | d98a | fe | ed | ARABIC LETTER YEH |
| َ | a | a | a | 1614 | 064e | d98e | f0 | f3 | ARABIC FATHA |
| ُ | o | u | o | 1615 | 064f | d98f | f2 | f5 | ARABIC DAMMA |
| ِ | e | i | e | 1616 | 0650 | d990 | f1 | f6 | ARABIC KASRA |
| آ | ] | \| | ’A | 1570 | 0622 | d8a2 | c0 | c2 | ARABIC LETTER ALEF WITH MADDA ABOVE |
| ا | \| | A | a | 1575 | 0627 | d8a7 | c1 | c7 | ARABIC LETTER ALEF # Initial |
| ة | P | p | T | 1577 | 0629 | d8a9 | fc | c9 | ARABIC LETTER TEH MARBUTA |
| ک | k | k | k | 1705 | 06a9 | daa9 | da | 98 | ARABIC LETTER KEHEH |
| ی | i | i | I | 1740 | 06cc | db8c | e1 | | ARABIC LETTER FARSI YEH |
| ء | M | ’ | ’ \| | 1569 | 0621 | d8a1 | c2 | c1 | ARABIC LETTER HAMZA |
| ۀ | X | | H-i | 1728 | 06c0 | db80 | | c0 | ARABIC LETTER HEH WITH YEH ABOVE |
| ئ | I | } | ’y | 1574 | 0626 | d8a6 | fb | c6 | ARABIC LETTER YEH WITH HAMZA ABOVE |
| ؤ | U | & | U’ | 1572 | 0624 | d8a4 | fa | c4 | ARABIC LETTER WAW WITH HAMZA ABOVE |
| ً | N | F | aN | 1611 | 064b | d98b | f3 | f0 | ARABIC FATHATAN |
| ّ | ~ | ~ | xx | 1617 | 0651 | d991 | f6 | f8 | ARABIC SHADDA |
| ، | , | , | , | 1548 | 060c | d88c | ac | a1 | ARABIC COMMA |
| ؛ | ; | ; | ; | 1563 | 061b | d89b | bb | ba | ARABIC SEMICOLON |
| ؟ | ? | ? | ? | 1567 | 061f | d89f | bf | bf | ARABIC QUESTION MARK |
| ٪ | % | % | % | 1642 | 066a | d9aa | a5 | | ARABIC PERCENT SIGN |
| | | | | 0032 | 0020 | 20 | a0 | 20 | SPACE |
| . | . | . | . | 0046 | 002e | 2e | a6 | 2e | FULL STOP |
| | | | \\ | 0010 | 000a | 0a | 0a | 0a | LINEFEED |
| « | { | | \lq | 0171 | 00ab | ab | e7 | ab | LEFT-POINTING DOUBLE ANGLE … |
| » | } | | \rq | 0187 | 00bb | bb | e6 | bb | RIGHT-POINTING DOUBLE ANGLE … |
| | - | | \hspace{0ex} | 8204 | 200c | e2808c | a1 | 9d | ZERO WIDTH NON-JOINER |

Table 12: Jon Dehdari’s Pre-Unicode Character Set Mappings

Because of the widespread adoption of Unicode, this previous work is now largely obsolete.
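The Uni-Dec, Uni-Hex, UTF-8, CP1256 and Uni-Name columns of Table 12 are all functions of the character itself, so they can be recomputed rather than maintained by hand. The Python sketch below shows one way to do that with the standard library. The function name `table_row` is hypothetical. The Isi3342 column is left out because the sketch does not assume that a codec for that encoding is available.

```python
import unicodedata


def table_row(ch):
    """Recompute the Unicode-derived columns of Table 12 for one character."""
    cp = ord(ch)
    try:
        cp1256 = ch.encode("cp1256").hex()
    except UnicodeEncodeError:
        cp1256 = ""  # character has no Windows-1256 encoding
    return {
        "Uni-Dec": str(cp),
        "Uni-Hex": f"{cp:04x}",
        "UTF-8": ch.encode("utf-8").hex(),
        "CP1256": cp1256,
        "Uni-Name": unicodedata.name(ch, ""),
    }


# U+0627 is the first row of the table (ARABIC LETTER ALEF).
print(table_row("\u0627"))
```

Checking a few rows this way is a quick means of catching transcription errors in a hand-maintained mapping table.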
Most previous transliteration work (Lagally’s ArabTeX, Buckwalter, Dehdari, ...) consists of single-character mappings.
The farsi-transliterate-banan input method documented here is distinctly different from these past transliteration methods with respect to wide use of compositions in general and with regard to the “h” postfix composition in particular.
9 Colophon
This document was produced entirely with Libre-Halaal Software, and is published using Libre-Halaal Internet Services. All tools used to produce and distribute this document conform fully to the definition of Libre-Halaal Software and Libre-Halaal Internet Application Services as specified in [2] and [8].
9.1 Our Libre-Halaal Software Tools
This document has been created based exclusively on the use of Libre-Halaal software tools. We make use of a comprehensive and well-integrated set of tools, including:
- Debian GNU/Linux is our base platform
- Emacs is our editor-based user environment
- TeX, LaTeX, XeTeX and XeLaTeX are our document processors
- The Emacs bidi (bidirectional) capability is used to write in mixed Persian and Globish
- The xepersian LaTeX package is used to process Persian documents
- The LaTeX beamer package is used to prepare presentation slides
- The Emacs auctex mode is used to create documents in LaTeX
- Aspell via Emacs is used for spell checking in Persian/Farsi and Globish/English
- Dict via Emacs is used for dictionary and thesaurus lookup in multiple languages
- Conversion from LaTeX to HTML is accomplished through HeVeA and tex4ht
- Libre Office is used for creating figures and illustrations
- CVS via Emacs is used for version control
- The Emacs Gnus and qmail facilities are used for emailing out drafts and receiving feedback
- Integration with ByStar Services is through BLEE (the ByStar Libre Emacs Environment)
These Libre-Halaal software tools collectively represent a deeply integrated environment that is far superior in capability to any Haraam software. We question why so many people continue to use the clumsy and ineffective Microsoft Proprietary-Haraam software when such a vastly superior alternative is available.
9.2 Our Libre-Halaal Internet Services
The publication and distribution of this document has been accomplished exclusively by means of Libre-Halaal Internet Application Services. We make use of a comprehensive and well-integrated set of services, including:
- The ByName Autonomous Libre Service (part of the By* family) is used for autonomous web publication of this document by the author himself
- The ByContent Federated Libre Service (part of the By* family) is used for web re-publication/distribution of this document
- All By* Services are based on the Debian GNU/Linux platform
- Apache2 and Plone3 are used to provide By* Web Services
- All By* Services related to this document are hosted at, a physical data center built exclusively with Halaal software. All routers, servers and other hardware infrastructure at run Halaal Software exclusively.
- The By* Self Publication Facilities, fully integrated with BLEE, are used for publication of this document
- The By* Library Facilities are used for managing this document in the context of multiple other related documents
These Libre-Halaal Internet Services are comparable in capability to the most high-profile Haraam Internet Services presently available, such as Google or Facebook.
The deep integration between Libre-Halaal Software and Libre-Halaal Internet Services creates a Libre-Halaal Software-Service continuum, which is far superior in capability to any Proprietary-Haraam Software/Service combination.
- [1] Mohsen BANAN. "Introducing Convivial into Globish". Permanent Libre Published Content 120037, Autonomously Self-Published, July 2011.
- [2] Mohsen BANAN. "Defining Halaal Manner-of-Existence of Software and Defining Halaal Internet Application Services". Permanent Libre Published Content 120041, Autonomously Self-Published, September 2012.
- [3] Mohsen BANAN. "Introducing Halaal and Haraam into Globish Based on Moral Philosophy of Abstract Halaal – معرفیِ حلال و حرام به بقیهیِ دنیا". Permanent Libre Published Content 120039, Autonomously Self-Published, September 2012.
- [4] Banan et al. "Overview of ByStar Digital Ecosystem Concepts, Models and Offerings". Permanent Libre Published Content 180011, Autonomously Self-Published, February 2011.
- [5] Banan et al. "Blee and BxGnome: ByStar Software-Service Continuum Based Convivial User Environments". Permanent Libre Published Content 180004, Autonomously Self-Published, September 2012.
- [6] Banan et al. "The Libre/Halaal ByStar Digital Ecosystem: A Unified and Non-Proprietary Model for Autonomous Internet Services, a Moral Alternative to the Proprietary American Digital Ecosystem". Permanent Libre Published Content 180016, Autonomously Self-Published, September 2012.
- [7] Andrew Hammoude and Mohsen BANAN. "Libre Services: A Non-Proprietary Model for Delivery of Internet Services". Permanent Libre Published Content 100101, Autonomously Self-Published, March 2006.
- [8] محسن بنان. "تعريف نرم افزار حلال و تعريف خدمات اينترنتى حلال". Permanent Libre Published Content 120035, Autonomously Self-Published, May 2013.
- Note that we are not writing in pinglish. Ignore the vowels and think of the Persian writing above letter-by-letter.