Supported Scripts
The Unicode Standard encodes scripts rather than languages. When writing systems for more than one language share sets of graphical symbols that have historically related derivations, the union of all of those graphical symbols is treated as a single collection of characters for encoding and is identified as a single script. Each script then serves as an inventory of graphical symbols, which are drawn upon for the writing systems of particular languages. In many cases, a single script, such as the Latin script, may be used to write tens or even hundreds of languages. In other cases, only one language employs a particular script—for example, Hangul, which is typically used only to write the Korean language. The writing systems for some languages may also use more than one script; for example, Japanese traditionally makes use of the Han (Kanji), Hiragana, and Katakana scripts, and modern Japanese usage commonly mixes in the Latin script as well.
The scripts supported by the Unicode Standard include all of those listed in the following table. The listing in the table is ordered by the version of the Unicode Standard in which a particular script was first encoded. In many instances, supplemental characters for a given script have been encoded in subsequent versions of the standard, after the initial addition of the script. Details about most of these scripts can be looked up at the ScriptSource website.
| Version (Year) | Scripts Added | Totals |
| --- | --- | --- |
| 1.1 (1993) | Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Tamil, Telugu, Thai | 23 |
| 2.0 (1996) | Tibetan | +1, = 24 |
| 3.0 (1999) | Braille (patterns), Canadian Syllabics, Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Ogham, Runic, Sinhala, Syriac, Thaana, Yi | +13, = 37 |
| 3.1 (2001) | Deseret, Gothic, Old Italic | +3, = 40 |
| 3.2 (2002) | Buhid, Hanunóo, Tagalog, Tagbanwa | +4, = 44 |
| 4.0 (2003) | Cypriot, Limbu, Linear B, Osmanya, Shavian, Tai Le, Ugaritic | +7, = 51 |
| 4.1 (2005) | Buginese, Coptic, Glagolitic, Kharoshthi, New Tai Lue, Old Persian Cuneiform, Syloti Nagri, Tifinagh | +8, = 59 |
| 5.0 (2006) | Balinese, N'Ko, Phags-pa, Phoenician, Sumero-Akkadian Cuneiform | +5, = 64 |
| 5.1 (2008) | Carian, Cham, Kayah Li, Lepcha, Lycian, Lydian, Ol Chiki, Rejang, Saurashtra, Sundanese, Vai | +11, = 75 |
| 5.2 (2009) | Avestan, Bamum, Egyptian Hieroglyphs, Imperial Aramaic, Inscriptional Pahlavi, Inscriptional Parthian, Javanese, Kaithi, Lisu, Meetei Mayek, Old South Arabian, Old Turkic, Samaritan, Tai Tham, Tai Viet | +15, = 90 |
| 6.0 (2010) | Batak, Brahmi, Mandaic | +3, = 93 |
| 6.1 (2012) | Chakma, Meroitic Cursive, Meroitic Hieroglyphs, Miao, Sharada, Sora Sompeng, Takri | +7, = 100 |
| 7.0 (2014) | Bassa Vah, Caucasian Albanian, Duployan (shorthand), Elbasan, Grantha, Khojki, Khudawadi, Linear A, Mahajani, Manichaean, Mende Kikakui, Modi, Mro, Nabataean, Old North Arabian, Old Permic, Pahawh Hmong, Palmyrene, Pau Cin Hau, Psalter Pahlavi, Siddham, Tirhuta, Warang Citi | +23, = 123 |
| 8.0 (2015) | Ahom, Anatolian Hieroglyphs, Hatran, Multani, Old Hungarian, Sutton SignWriting | +6, = 129 |
| 9.0 (2016) | Adlam, Bhaiksuki, Marchen, Newa, Osage, Tangut | +6, = 135 |
| 10.0 (2017) | Masaram Gondi, Nushu, Soyombo, Zanabazar Square | +4, = 139 |
| 11.0 (2018) | Dogra, Gunjala Gondi, Hanifi Rohingya, Makasar, Medefaidrin, Old Sogdian, Sogdian | +7, = 146 |
| 12.0 (2019) | Elymaic, Nandinagari, Nyiakeng Puachue Hmong, Wancho | +4, = 150 |
| 13.0 (2020) | Chorasmian, Dives Akuru, Khitan Small Script, Yezidi | +4, = 154 |
| 14.0 (2021) | Cypro-Minoan, Old Uyghur, Tangsa, Toto, Vithkuqi | +5, = 159 |
| 15.0 (2022) | Kawi, Nag Mundari | +2, = 161 |

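The Totals column is a running sum of the per-version additions. As a minimal sketch, the arithmetic can be checked by accumulating the increments and comparing them with the totals shown in the table. The variable names below are illustrative, and the numbers are copied from the table rather than read from any Unicode data file.

```python
from itertools import accumulate

# (version, scripts added in that version), transcribed from the table.
added = [
    ("1.1", 23), ("2.0", 1), ("3.0", 13), ("3.1", 3), ("3.2", 4),
    ("4.0", 7), ("4.1", 8), ("5.0", 5), ("5.1", 11), ("5.2", 15),
    ("6.0", 3), ("6.1", 7), ("7.0", 23), ("8.0", 6), ("9.0", 6),
    ("10.0", 4), ("11.0", 7), ("12.0", 4), ("13.0", 4), ("14.0", 5),
    ("15.0", 2),
]

# Totals column, also transcribed from the table.
listed_totals = [
    23, 24, 37, 40, 44, 51, 59, 64, 75, 90, 93,
    100, 123, 129, 135, 139, 146, 150, 154, 159, 161,
]

# Running sum of the additions, one entry per version.
running = list(accumulate(count for _, count in added))

# Report any version whose listed total disagrees with the running sum.
mismatches = [
    (version, total, expected)
    for (version, _), total, expected in zip(added, listed_totals, running)
    if total != expected
]
print("mismatches:", mismatches)
```

An empty mismatch list means each listed total equals the sum of all additions up to and including that version.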
In addition to the scripts listed above, a large number of other collections of characters are also encoded by Unicode. These collections include the following:
- Numbers
- General Diacritics
- General Punctuation
- General Symbols
- Mathematical Symbols (Western and Arabic)
- Musical Symbols (Western, Byzantine, Ancient Greek, and other)
- Technical Symbols
- Emoji: For details, see Emoji Versions
- Dingbats
- Arrows, Blocks, Box Drawing Forms, and Geometric Shapes
- Game Symbols
- Miscellaneous Symbols
- Presentation Forms
- Kangxi and other CJK radicals