Detect CharSet-2.326.1130



Detect CharSet

Author: Jerry Chae

This plugin takes a text file and reports which character set (character code/encoding) the file uses.



Need help?

Technical contact: tech@argos-labs.com






Input

  1. Any text file (.txt and .csv are the most common)


Output

  1. A CSV file with 3 columns
    Headers are: Language, Encoding, Confidence.

  Note: The Language field is blank when the detected encoding covers all languages (for example, the UTF encodings).
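
As an illustration of how the output described above might be consumed, here is a minimal Python sketch that parses a CSV with the three headers and treats a blank Language value as "all languages". The function name `read_detect_result` and the sample rows are hypothetical; they are not taken from the plugin, and the exact values the plugin writes may differ.

```python
import csv
import io

def read_detect_result(csv_text):
    """Parse CSV text with the headers Language, Encoding, Confidence."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            # A blank Language means the encoding covers all languages.
            "language": row["Language"] or "all languages",
            "encoding": row["Encoding"],
            "confidence": float(row["Confidence"]),
        })
    return rows

# Hypothetical sample output, made up for this example.
sample = (
    "Language,Encoding,Confidence\n"
    ",utf_8,0.99\n"
    "Japanese,shift_jis,0.87\n"
)

result = read_detect_result(sample)
for entry in result:
    print(entry["encoding"], entry["language"], entry["confidence"])
```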


Character Code Table

| Codec | Languages | Codec | Languages | Codec | Languages | Codec | Languages |
|---|---|---|---|---|---|---|---|
| ascii | English | cp869 | Greek | gbk | Unified Chinese | johab | Korean |
| big5 | Traditional Chinese | cp874 | Thai | gb18030 | Unified Chinese | koi8_r | Russian |
| big5hkscs | Traditional Chinese | cp875 | Greek | hz | Simplified Chinese | koi8_t | Tajik |
| cp037 | English | cp932 | Japanese | iso2022_jp | Japanese | koi8_u | Ukrainian |
| cp273 | German | cp949 | Korean | iso2022_jp_1 | Japanese | kz1048 | Kazakh |
| cp424 | Hebrew | cp950 | Traditional Chinese | iso2022_jp_2 | Japanese, Korean, Simplified Chinese, Western Europe, Greek | mac_cyrillic | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| cp437 | English | cp1006 | Urdu | iso2022_jp_2004 | Japanese | mac_greek | Greek |
| cp500 | Western Europe | cp1026 | Turkish | iso2022_jp_3 | Japanese | mac_iceland | Icelandic |
| cp720 | Arabic | cp1125 | Ukrainian | iso2022_jp_ext | Japanese | mac_latin2 | Central and Eastern Europe |
| cp737 | Greek | cp1140 | Western Europe | iso2022_kr | Korean | mac_roman | Western Europe |
| cp775 | Baltic languages | cp1250 | Central and Eastern Europe | latin_1 | Western Europe | mac_turkish | Turkish |
| cp850 | Western Europe | cp1251 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian | iso8859_2 | Central and Eastern Europe | ptcp154 | Kazakh |
| cp852 | Central and Eastern Europe | cp1252 | Western Europe | iso8859_3 | Esperanto, Maltese | shift_jis | Japanese |
| cp855 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian | cp1253 | Greek | iso8859_4 | Baltic languages | shift_jis_2004 | Japanese |
| cp856 | Hebrew | cp1254 | Turkish | iso8859_5 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian | shift_jisx0213 | Japanese |
| cp857 | Turkish | cp1255 | Hebrew | iso8859_6 | Arabic | utf_32 | all languages |
| cp858 | Western Europe | cp1256 | Arabic | iso8859_7 | Greek | utf_32_be | all languages |
| cp860 | Portuguese | cp1257 | Baltic languages | iso8859_8 | Hebrew | utf_32_le | all languages |
| cp861 | Icelandic | cp1258 | Vietnamese | iso8859_9 | Turkish | utf_16 | all languages |
| cp862 | Hebrew | cp65001 | Windows only: Windows UTF-8 (CP_UTF8) | iso8859_10 | Nordic languages | utf_16_be | all languages |
| cp863 | Canadian | euc_jp | Japanese | iso8859_11 | Thai languages | utf_16_le | all languages |
| cp864 | Arabic | euc_jis_2004 | Japanese | iso8859_13 | Baltic languages | utf_7 | all languages |
| cp865 | Danish, Norwegian | euc_jisx0213 | Japanese | iso8859_14 | Celtic languages | utf_8 | all languages |
| cp866 | Russian | euc_kr | Korean | iso8859_15 | Western Europe | utf_8_sig | all languages |
| | | gb2312 | Simplified Chinese | iso8859_16 | South-Eastern Europe | | |
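
The codec names in the table above follow the same spelling as the names in Python's list of standard encodings. Assuming the plugin reports an encoding using one of these names, a detected value could be used to convert a file's bytes to UTF-8, roughly as in the sketch below. The helper `convert_to_utf8` is hypothetical and is not part of the plugin.

```python
import codecs

def convert_to_utf8(data, detected_encoding):
    """Decode bytes with the detected codec and re-encode them as UTF-8."""
    # codecs.lookup raises LookupError if the codec name is not known to Python.
    codecs.lookup(detected_encoding)
    return data.decode(detected_encoding).encode("utf_8")

# Example: bytes stored as Shift JIS, as a Japanese text file might be.
raw = "こんにちは".encode("shift_jis")
utf8_bytes = convert_to_utf8(raw, "shift_jis")
print(utf8_bytes.decode("utf_8"))
```

If the reported name is not one Python recognizes, `codecs.lookup` fails early with a `LookupError` instead of producing garbled text.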


How to set parameters