Romanian letters şţâă not recognized in external subtitles


Make sure your external SRT subtitle files are encoded with the UTF-8 text encoding.

Here is a great tool to convert your files: Subtitle Edit
It can even convert a whole drive’s worth of SRT files in “batch mode”.


Hello Otto. I'm attaching a sample in which these diacritics are not shown, even though the file is UTF-8.


srt sample.zip (10.3 KB)
In this case the server is a Synology.

There’s another older thread about this issue - you might want to jump on it: Plex Media Server doesn't recognize the language code for external subtitles

Looks like converting the file helps (there are some notes about it in that thread), but also that this might be a regression…

Could you tell me if your screen shots are showing the correct letters or wrong ones? I am in no way familiar with Romanian, to be able to tell.

Your sample file is not UTF-8 encoded. It uses Western European Windows codepage 1252.
It also appears to be already converted in a wrong way. For instance it contains the letter þ quite frequently, which to my knowledge is only used in Icelandic nowadays.

If I manually change the codepage to Windows 1250, the letters change to what appear to be Romanian characters (as mentioned, I don’t speak Romanian).
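
To illustrate the codepage switch described above: the bytes in the file stay the same, only the table used to read them changes. Here is a minimal Python sketch. It assumes the line was originally saved as Windows-1250; the sample string is the garbled text as it shows up under Windows-1252, not data read from the attachment itself.

```python
# Sketch: the same bytes read with two different Windows codepages.
# "garbled" is the text as a cp1252-based viewer displays it.
garbled = "a dezvolt\u00e3rii min\u00feii umane"  # contains "ã" and "þ"

# Recover the raw bytes that are actually stored in the file ...
raw = garbled.encode("cp1252")

# ... and read them with the Central European codepage instead.
repaired = raw.decode("cp1250")

print(repaired)  # "ã" and "þ" are now "ă" and "ţ"

# These are the bytes that should be written back out as UTF-8.
utf8_bytes = repaired.encode("utf-8")
```

Byte 0xE3 is "ã" in Windows-1252 but "ă" in Windows-1250, and byte 0xFE is "þ" versus "ţ", which would explain the Icelandic-looking letter.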

I converted your sample file to UTF-8.
srt sample.zip (10.1 KB)

Thanks, Otto, for looking into this.

This is the file read as Windows-1252 Western European (the wrong character is þ, which is not a Romanian letter):
52
01:13:40,015 → 01:13:44,848
a dezvoltãrii minþii umane
pentru a vedea posibilitãþi infinite.

354
01:13:40,015 → 01:13:44,848
a dezvoltãrii minþii umane
pentru a vedea posibilitãþi infinite.

This is the conversion to UTF-8 (characters are missing):

52
00:14:36,748 → 00:14:39,381
am reu t s control experimentul

354
01:13:40,015 → 01:13:44,848
a dezvolt ii min i umane
pentru a vedea posibilit i infinite.

This is the conversion to ISO 8859-16 (the Romanian ISO standard). This is the correct one: ă, ș and ț are Romanian characters.

52
00:14:36,748 → 00:14:39,381
am reușit să controlăm experimentul.

354
01:13:40,015 → 01:13:44,848
a dezvoltării minții umane
pentru a vedea posibilități infinite.

The correct one is ISO 8859-16
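
A possible explanation for why ISO 8859-16 also looks right: for the Romanian letters it uses the same byte positions as Windows-1250. The only difference is that ş/ţ come out with a comma below instead of a cedilla. A small Python sketch, using the standard byte values from those tables (not bytes taken from the attached file):

```python
# Sketch: five byte values that carry Romanian letters in legacy files.
raw = bytes([0xE3, 0xE2, 0xEE, 0xBA, 0xFE])

as_cp1250 = raw.decode("cp1250")      # ă â î ş ţ  (cedilla forms)
as_iso16 = raw.decode("iso8859_16")   # ă â î ș ț  (comma-below forms)
as_cp1252 = raw.decode("cp1252")      # ã â î º þ  (the "wrong" rendering)

for name, text in [("cp1250", as_cp1250),
                   ("iso8859_16", as_iso16),
                   ("cp1252", as_cp1252)]:
    print(name, text)
```

So an editor set to ISO 8859-16 would display a Windows-1250 file almost correctly, even if that is not the encoding the file was created with.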

Did you do the conversion or did you use my file?
Which software did you use to read the file?

I use the text editor that Synology provides, which also gives the option to change the encoding.

Have you tried my file which I attached above?

This looks correct, right?

I just tried your modified srt attachment and it is working correctly! And yes, your screenshot above shows the characters the right way.
Sorry for the confusion, but this is what the Synology editor showed me when I tried to change the encoding of the original file to UTF-8 (missing characters).
Now the question is: is there any way for Plex to do this conversion from whatever encoding to UTF-8 automatically when it loads an external subtitle? Or is manual conversion (file by file, or with a batch conversion tool for Synology/Linux that I would have to find) the only way?

showing correct after conversion

The problem I encountered is that regular software (like the SubtitleEdit app shown above) all interpret the file as using the Windows CP 1252, which is incorrect.

By using Notepad++, I was able to tell the software manually that it is in fact Windows CP 1250.
I then subsequently converted it to UTF-8.

I can only assume that computers which are set to use the Romanian language are employing some workaround to interpret these files differently.
And it is probably this workaround which apparently stopped working in Plex in recent versions.

If you can find a tool which converts text files from cp1250 to utf-8, you should be good.

Since the Synology uses Linux as its base, I wonder if the iconv command line app is available for it.

iconv -f CP1250 -t UTF-8 infile.txt -o outfile.txt

(source: iconv - Wikipedia)
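
If iconv turns out not to be available, the same conversion can be sketched in Python, including a batch mode over a whole folder. This is only an illustration: the function names are made up for this example, it assumes every non-UTF-8 file really is Windows CP 1250, and it overwrites files in place, so keep a backup.

```python
from pathlib import Path
from typing import List


def convert_srt_to_utf8(path: Path, source_encoding: str = "cp1250") -> bool:
    """Re-encode one subtitle file to UTF-8 in place.

    Returns True if the file was rewritten, False if it was already UTF-8.
    """
    data = path.read_bytes()
    try:
        data.decode("utf-8")
        return False  # already valid UTF-8 (or plain ASCII): leave it alone
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not UTF-8 uses source_encoding.
    # A byte that is undefined in that codepage raises UnicodeDecodeError.
    text = data.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True


def convert_tree(root: Path, source_encoding: str = "cp1250") -> List[Path]:
    """Convert every .srt file below root and return the ones that changed."""
    converted = []
    for srt in sorted(root.rglob("*.srt")):
        if convert_srt_to_utf8(srt, source_encoding):
            converted.append(srt)
    return converted
```

Calling `convert_tree(Path("/path/to/movies"))` (the path is a placeholder) would walk the folder once. Files that are already UTF-8 are skipped, so running it a second time should change nothing.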


I do not know what to say. The Synology text editor sees it as UTF-8 (with missing characters), which confused me in the beginning. I will also install Notepad++ in a Windows environment to check the difference.
Personally, I use DSM 7 in English as the OS.

Later edit:
root@Eagle:~#
-ash: iconv: command not found

But according to the Synology documentation, uconv is available, which is fairly similar:
https://linux.die.net/man/1/uconv

root@Eagle:~# uconv -l
UTF-8 ibm-1208 ibm-1209 ibm-5304 ibm-5305 ibm-13496 ibm-13497 ibm-17592 ibm-17593 windows-65001 cp1208 x-UTF_8J unicode-1-1-utf-8 unicode-2-0-utf-8
UTF-16 ISO-10646-UCS-2 ibm-1204 ibm-1205 unicode csUnicode ucs-2
UTF-16BE x-utf-16be UnicodeBigUnmarked ibm-1200 ibm-1201 ibm-13488 ibm-13489 ibm-17584 ibm-17585 ibm-21680 ibm-21681 ibm-25776 ibm-25777 ibm-29872 ibm-29873 ibm-61955 ibm-61956 windows-1201 cp1200 cp1201 UTF16_BigEndian
UTF-16LE x-utf-16le UnicodeLittleUnmarked ibm-1202 ibm-1203 ibm-13490 ibm-13491 ibm-17586 ibm-17587 ibm-21682 ibm-21683 ibm-25778 ibm-25779 ibm-29874 ibm-29875 UTF16_LittleEndian windows-1200
UTF-32 ISO-10646-UCS-4 ibm-1236 ibm-1237 csUCS4 ucs-4
UTF-32BE UTF32_BigEndian ibm-1232 ibm-1233 ibm-9424
UTF-32LE UTF32_LittleEndian ibm-1234 ibm-1235
UTF16_PlatformEndian
UTF16_OppositeEndian
UTF32_PlatformEndian
UTF32_OppositeEndian
UTF-16BE,version=1 UnicodeBig
UTF-16LE,version=1 UnicodeLittle x-UTF-16LE-BOM
UTF-16,version=1
UTF-16,version=2
UTF-7 windows-65000 unicode-1-1-utf-7 unicode-2-0-utf-7
IMAP-mailbox-name
SCSU ibm-1212 ibm-1213
BOCU-1 csBOCU-1 ibm-1214 ibm-1215
CESU-8 ibm-9400
ISO-8859-1 ibm-819 IBM819 cp819 latin1 8859_1 csISOLatin1 iso-ir-100 ISO_8859-1:1987 l1 819
US-ASCII ASCII ANSI_X3.4-1968 ANSI_X3.4-1986 ISO_646.irv:1991 iso_646.irv:1983 ISO646-US us csASCII iso-ir-6 cp367 ascii7 646 windows-20127 ibm-367 IBM367
gb18030 ibm-1392 windows-54936 GB18030
ibm-912_P100-1995 ibm-912 ISO-8859-2 ISO_8859-2:1987 latin2 csISOLatin2 iso-ir-101 l2 8859_2 cp912 912 windows-28592
ibm-913_P100-2000 ibm-913 ISO-8859-3 ISO_8859-3:1988 latin3 csISOLatin3 iso-ir-109 l3 8859_3 cp913 913 windows-28593
ibm-914_P100-1995 ibm-914 ISO-8859-4 latin4 csISOLatin4 iso-ir-110 ISO_8859-4:1988 l4 8859_4 cp914 914 windows-28594
ibm-915_P100-1995 ibm-915 ISO-8859-5 cyrillic csISOLatinCyrillic iso-ir-144 ISO_8859-5:1988 8859_5 cp915 915 windows-28595
ibm-1089_P100-1995 ibm-1089 ISO-8859-6 arabic csISOLatinArabic iso-ir-127 ISO_8859-6:1987 ECMA-114 ASMO-708 8859_6 cp1089 1089 windows-28596 ISO-8859-6-I ISO-8859-6-E x-ISO-8859-6S
ibm-9005_X110-2007 ibm-9005 ISO-8859-7 8859_7 greek greek8 ELOT_928 ECMA-118 csISOLatinGreek iso-ir-126 ISO_8859-7:1987 windows-28597 sun_eu_greek
ibm-813_P100-1995 ibm-813 cp813 813
ibm-5012_P100-1999 ibm-5012 ISO-8859-8 hebrew csISOLatinHebrew iso-ir-138 ISO_8859-8:1988 ISO-8859-8-I ISO-8859-8-E 8859_8 windows-28598 hebrew8
ibm-916_P100-1995 ibm-916 cp916 916
ibm-920_P100-1995 ibm-920 ISO-8859-9 latin5 csISOLatin5 iso-ir-148 ISO_8859-9:1989 l5 8859_9 cp920 920 windows-28599 ECMA-128 turkish8 turkish
iso-8859_10-1998 ISO-8859-10 iso-ir-157 l6 ISO_8859-10:1992 csISOLatin6 latin6
iso-8859_11-2001 ISO-8859-11 thai8 x-iso-8859-11
ibm-921_P100-1995 ibm-921 ISO-8859-13 8859_13 windows-28603 cp921 921 x-IBM921
iso-8859_14-1998 ISO-8859-14 iso-ir-199 ISO_8859-14:1998 latin8 iso-celtic l8
ibm-923_P100-1998 ibm-923 ISO-8859-15 Latin-9 l9 8859_15 latin0 csisolatin0 csisolatin9 iso8859_15_fdis cp923 923 windows-28605
ibm-942_P12A-1999 ibm-942 ibm-932 cp932 shift_jis78 sjis78 ibm-942_VSUB_VPUA ibm-932_VSUB_VPUA x-IBM942 x-IBM942C
ibm-943_P15A-2003 ibm-943 Shift_JIS MS_Kanji csShiftJIS windows-31j csWindows31J x-sjis x-ms-cp932 cp932 windows-932 cp943c IBM-943C ms932 pck sjis ibm-943_VSUB_VPUA x-MS932_0213 x-JISAutoDetect
ibm-943_P130-1999 ibm-943 Shift_JIS cp943 943 ibm-943_VASCII_VSUB_VPUA x-IBM943
ibm-33722_P12A_P12A-2009_U2 ibm-33722 ibm-5050 ibm-33722_VPUA IBM-eucJP
ibm-33722_P120-1999 ibm-33722 ibm-5050 cp33722 33722 ibm-33722_VASCII_VPUA x-IBM33722 x-IBM33722A x-IBM33722C
ibm-954_P101-2007 ibm-954 x-IBM954 x-IBM954C
euc-jp-2007 EUC-JP Extended_UNIX_Code_Packed_Format_for_Japanese csEUCPkdFmtJapanese X-EUC-JP eucjis ujis
ibm-1373_P100-2002 ibm-1373 windows-950
windows-950-2000 Big5 csBig5 windows-950 x-windows-950 x-big5 ms950
ibm-950_P110-1999 ibm-950 cp950 950 x-IBM950
ibm-1375_P100-2008 ibm-1375 Big5-HKSCS big5hk HKSCS-BIG5
ibm-5471_P100-2006 ibm-5471 Big5-HKSCS MS950_HKSCS hkbig5 big5-hkscs:unicode3.0 x-MS950-HKSCS
ibm-1386_P100-2001 ibm-1386 cp1386 windows-936 ibm-1386_VSUB_VPUA
windows-936-2000 GBK CP936 MS936 windows-936
ibm-1383_P110-1999 ibm-1383 GB2312 csGB2312 cp1383 1383 EUC-CN ibm-eucCN hp15CN ibm-1383_VPUA
ibm-5478_P100-1995 ibm-5478 GB_2312-80 chinese iso-ir-58 csISO58GB231280 gb2312-1980 GB2312.1980-0
euc-tw-2014 EUC-TW
ibm-964_P110-1999 ibm-964 ibm-eucTW cns11643 cp964 964 ibm-964_VPUA x-IBM964
ibm-949_P110-1999 ibm-949 cp949 949 ibm-949_VASCII_VSUB_VPUA x-IBM949
ibm-949_P11A-1999 ibm-949 cp949c ibm-949_VSUB_VPUA x-IBM949C IBM-949C
ibm-970_P110_P110-2006_U2 ibm-970 EUC-KR KS_C_5601-1987 windows-51949 csEUCKR ibm-eucKR KSC_5601 5601 cp970 970 ibm-970_VPUA x-IBM970
ibm-971_P100-1995 ibm-971 ibm-971_VPUA x-IBM971
ibm-1363_P11B-1998 ibm-1363 KS_C_5601-1987 KS_C_5601-1989 KSC_5601 csKSC56011987 korean iso-ir-149 cp1363 5601 ksc windows-949 ibm-1363_VSUB_VPUA x-IBM1363C
ibm-1363_P110-1997 ibm-1363 ibm-1363_VASCII_VSUB_VPUA x-IBM1363
windows-949-2000 windows-949 KS_C_5601-1987 KS_C_5601-1989 KSC_5601 csKSC56011987 korean iso-ir-149 ms949 x-KSC5601
windows-874-2000 TIS-620 windows-874 MS874 x-windows-874
ibm-874_P100-1995 ibm-874 ibm-9066 cp874 TIS-620 tis620.2533 eucTH x-IBM874
ibm-1162_P100-1999 ibm-1162
ibm-437_P100-1995 ibm-437 IBM437 cp437 437 csPC8CodePage437 windows-437
ibm-720_P100-1997 ibm-720 windows-720 DOS-720 x-IBM720
ibm-737_P100-1997 ibm-737 IBM737 cp737 windows-737 737 x-IBM737
ibm-775_P100-1996 ibm-775 IBM775 cp775 csPC775Baltic windows-775 775
ibm-850_P100-1995 ibm-850 IBM850 cp850 850 csPC850Multilingual windows-850
ibm-851_P100-1995 ibm-851 IBM851 cp851 851 csPC851
ibm-852_P100-1995 ibm-852 IBM852 cp852 852 csPCp852 windows-852
ibm-855_P100-1995 ibm-855 IBM855 cp855 855 csIBM855 csPCp855 windows-855
ibm-856_P100-1995 ibm-856 IBM856 cp856 856 x-IBM856
ibm-857_P100-1995 ibm-857 IBM857 cp857 857 csIBM857 windows-857
ibm-858_P100-1997 ibm-858 IBM00858 CCSID00858 CP00858 PC-Multilingual-850+euro cp858 windows-858
ibm-860_P100-1995 ibm-860 IBM860 cp860 860 csIBM860
ibm-861_P100-1995 ibm-861 IBM861 cp861 861 cp-is csIBM861 windows-861
ibm-862_P100-1995 ibm-862 IBM862 cp862 862 csPC862LatinHebrew DOS-862 windows-862
ibm-863_P100-1995 ibm-863 IBM863 cp863 863 csIBM863
ibm-864_X110-1999 ibm-864 IBM864 cp864 csIBM864
ibm-865_P100-1995 ibm-865 IBM865 cp865 865 csIBM865
ibm-866_P100-1995 ibm-866 IBM866 cp866 866 csIBM866 windows-866
ibm-867_P100-1998 ibm-867 x-IBM867
ibm-868_P100-1995 ibm-868 IBM868 CP868 868 csIBM868 cp-ar
ibm-869_P100-1995 ibm-869 IBM869 cp869 869 cp-gr csIBM869 windows-869
ibm-878_P100-1996 ibm-878 KOI8-R koi8 csKOI8R windows-20866 cp878
ibm-901_P100-1999 ibm-901
ibm-902_P100-1999 ibm-902
ibm-922_P100-1999 ibm-922 IBM922 cp922 922 x-IBM922
ibm-1168_P100-2002 ibm-1168 KOI8-U windows-21866
ibm-4909_P100-1999 ibm-4909
ibm-5346_P100-1998 ibm-5346 windows-1250 cp1250
ibm-5347_P100-1998 ibm-5347 windows-1251 cp1251 ANSI1251
ibm-5348_P100-1997 ibm-5348 windows-1252 cp1252
ibm-5349_P100-1998 ibm-5349 windows-1253 cp1253
ibm-5350_P100-1998 ibm-5350 windows-1254 cp1254
ibm-9447_P100-2002 ibm-9447 windows-1255 cp1255
ibm-9448_X100-2005 ibm-9448 windows-1256 cp1256 x-windows-1256S
ibm-9449_P100-2002 ibm-9449 windows-1257 cp1257
ibm-5354_P100-1998 ibm-5354 windows-1258 cp1258
ibm-1250_P100-1995 ibm-1250 windows-1250
ibm-1251_P100-1995 ibm-1251 windows-1251
ibm-1252_P100-2000 ibm-1252 windows-1252
ibm-1253_P100-1995 ibm-1253 windows-1253
ibm-1254_P100-1995 ibm-1254 windows-1254
ibm-1255_P100-1995 ibm-1255
ibm-5351_P100-1998 ibm-5351 windows-1255
ibm-1256_P110-1997 ibm-1256
ibm-5352_P100-1998 ibm-5352 windows-1256
ibm-1257_P100-1995 ibm-1257
ibm-5353_P100-1998 ibm-5353 windows-1257
ibm-1258_P100-1997 ibm-1258 windows-1258
macos-0_2-10.2 macintosh mac csMacintosh windows-10000 macroman x-macroman
macos-6_2-10.4 x-mac-greek windows-10006 macgr x-MacGreek
macos-7_3-10.2 x-mac-cyrillic windows-10007 mac-cyrillic maccy x-MacCyrillic x-MacUkraine
macos-29-10.2 x-mac-centraleurroman windows-10029 x-mac-ce macce maccentraleurope x-MacCentralEurope
macos-35-10.2 x-mac-turkish windows-10081 mactr x-MacTurkish
ibm-1051_P100-1995 ibm-1051 hp-roman8 roman8 r8 csHPRoman8 x-roman8
ibm-1276_P100-1995 ibm-1276 Adobe-Standard-Encoding csAdobeStandardEncoding
ibm-1006_P100-1995 ibm-1006 IBM1006 cp1006 1006 x-IBM1006
ibm-1098_P100-1995 ibm-1098 IBM1098 cp1098 1098 x-IBM1098
ibm-1124_P100-1996 ibm-1124 cp1124 1124 x-IBM1124
ibm-1125_P100-1997 ibm-1125 cp1125
ibm-1129_P100-1997 ibm-1129
ibm-1131_P100-1997 ibm-1131 cp1131
ibm-1133_P100-1997 ibm-1133
gsm-03.38-2009 GSM0338
ISO_2022,locale=ja,version=0 ISO-2022-JP csISO2022JP x-windows-iso2022jp x-windows-50220
ISO_2022,locale=ja,version=1 ISO-2022-JP-1 JIS_Encoding csJISEncoding ibm-5054 JIS x-windows-50221
ISO_2022,locale=ja,version=2 ISO-2022-JP-2 csISO2022JP2
ISO_2022,locale=ja,version=3 JIS7
ISO_2022,locale=ja,version=4 JIS8
ISO_2022,locale=ko,version=0 ISO-2022-KR csISO2022KR
ISO_2022,locale=ko,version=1 ibm-25546
ISO_2022,locale=zh,version=0 ISO-2022-CN csISO2022CN x-ISO-2022-CN-GB
ISO_2022,locale=zh,version=1 ISO-2022-CN-EXT
ISO_2022,locale=zh,version=2 ISO-2022-CN-CNS x-ISO-2022-CN-CNS
HZ HZ-GB-2312
x11-compound-text COMPOUND_TEXT x-compound-text
ISCII,version=0 x-ISCII91 x-iscii-de windows-57002 iscii-dev ibm-4902
ISCII,version=1 x-iscii-be windows-57003 iscii-bng windows-57006 x-iscii-as
ISCII,version=2 x-iscii-pa windows-57011 iscii-gur
ISCII,version=3 x-iscii-gu windows-57010 iscii-guj
ISCII,version=4 x-iscii-or windows-57007 iscii-ori
ISCII,version=5 x-iscii-ta windows-57004 iscii-tml
ISCII,version=6 x-iscii-te windows-57005 iscii-tlg
ISCII,version=7 x-iscii-ka windows-57008 iscii-knd
ISCII,version=8 x-iscii-ma windows-57009 iscii-mlm
LMBCS-1 lmbcs ibm-65025
ibm-37_P100-1995 ibm-37 IBM037 ibm-037 ebcdic-cp-us ebcdic-cp-ca ebcdic-cp-wt ebcdic-cp-nl csIBM037 cp037 037 cpibm37 cp37
ibm-273_P100-1995 ibm-273 IBM273 CP273 csIBM273 ebcdic-de 273
ibm-277_P100-1995 ibm-277 IBM277 cp277 EBCDIC-CP-DK EBCDIC-CP-NO csIBM277 ebcdic-dk 277
ibm-278_P100-1995 ibm-278 IBM278 cp278 ebcdic-cp-fi ebcdic-cp-se csIBM278 ebcdic-sv 278
ibm-280_P100-1995 ibm-280 IBM280 CP280 ebcdic-cp-it csIBM280 280
ibm-284_P100-1995 ibm-284 IBM284 CP284 ebcdic-cp-es csIBM284 cpibm284 284
ibm-285_P100-1995 ibm-285 IBM285 CP285 ebcdic-cp-gb csIBM285 cpibm285 ebcdic-gb 285
ibm-290_P100-1995 ibm-290 IBM290 cp290 EBCDIC-JP-kana csIBM290
ibm-297_P100-1995 ibm-297 IBM297 cp297 ebcdic-cp-fr csIBM297 cpibm297 297
ibm-420_X120-1999 ibm-420 IBM420 cp420 ebcdic-cp-ar1 csIBM420 420
ibm-424_P100-1995 ibm-424 IBM424 cp424 ebcdic-cp-he csIBM424 424
ibm-500_P100-1995 ibm-500 IBM500 CP500 ebcdic-cp-be csIBM500 ebcdic-cp-ch 500
ibm-803_P100-1999 ibm-803 cp803
ibm-838_P100-1995 ibm-838 IBM838 IBM-Thai csIBMThai cp838 838 ibm-9030
ibm-870_P100-1995 ibm-870 IBM870 CP870 ebcdic-cp-roece ebcdic-cp-yu csIBM870
ibm-871_P100-1995 ibm-871 IBM871 ebcdic-cp-is csIBM871 CP871 ebcdic-is 871
ibm-875_P100-1995 ibm-875 IBM875 cp875 875 x-IBM875
ibm-918_P100-1995 ibm-918 IBM918 CP918 ebcdic-cp-ar2 csIBM918
ibm-930_P120-1999 ibm-930 ibm-5026 IBM930 cp930 930 x-IBM930 x-IBM930A
ibm-933_P110-1995 ibm-933 cp933 933 x-IBM933
ibm-935_P110-1999 ibm-935 cp935 935 x-IBM935
ibm-937_P110-1999 ibm-937 cp937 937 x-IBM937
ibm-939_P120-1999 ibm-939 ibm-931 ibm-5035 IBM939 cp939 939 x-IBM939 x-IBM939A
ibm-1025_P100-1995 ibm-1025 cp1025 1025 x-IBM1025
ibm-1026_P100-1995 ibm-1026 IBM1026 CP1026 csIBM1026 1026
ibm-1047_P100-1995 ibm-1047 IBM1047 cp1047 1047
ibm-1097_P100-1995 ibm-1097 cp1097 1097 x-IBM1097
ibm-1112_P100-1995 ibm-1112 cp1112 1112 x-IBM1112
ibm-1122_P100-1999 ibm-1122 cp1122 1122 x-IBM1122
ibm-1123_P100-1995 ibm-1123 cp1123 1123 x-IBM1123
ibm-1130_P100-1997 ibm-1130
ibm-1132_P100-1998 ibm-1132
ibm-1137_P100-1999 ibm-1137
ibm-4517_P100-2005 ibm-4517
ibm-1140_P100-1997 ibm-1140 IBM01140 CCSID01140 CP01140 cp1140 ebcdic-us-37+euro
ibm-1141_P100-1997 ibm-1141 IBM01141 CCSID01141 CP01141 cp1141 ebcdic-de-273+euro
ibm-1142_P100-1997 ibm-1142 IBM01142 CCSID01142 CP01142 cp1142 ebcdic-dk-277+euro ebcdic-no-277+euro
ibm-1143_P100-1997 ibm-1143 IBM01143 CCSID01143 CP01143 cp1143 ebcdic-fi-278+euro ebcdic-se-278+euro
ibm-1144_P100-1997 ibm-1144 IBM01144 CCSID01144 CP01144 cp1144 ebcdic-it-280+euro
ibm-1145_P100-1997 ibm-1145 IBM01145 CCSID01145 CP01145 cp1145 ebcdic-es-284+euro
ibm-1146_P100-1997 ibm-1146 IBM01146 CCSID01146 CP01146 cp1146 ebcdic-gb-285+euro
ibm-1147_P100-1997 ibm-1147 IBM01147 CCSID01147 CP01147 cp1147 ebcdic-fr-297+euro
ibm-1148_P100-1997 ibm-1148 IBM01148 CCSID01148 CP01148 cp1148 ebcdic-international-500+euro
ibm-1149_P100-1997 ibm-1149 IBM01149 CCSID01149 CP01149 cp1149 ebcdic-is-871+euro
ibm-1153_P100-1999 ibm-1153 IBM1153 x-IBM1153
ibm-1154_P100-1999 ibm-1154
ibm-1155_P100-1999 ibm-1155
ibm-1156_P100-1999 ibm-1156
ibm-1157_P100-1999 ibm-1157
ibm-1158_P100-1999 ibm-1158
ibm-1160_P100-1999 ibm-1160
ibm-1164_P100-1999 ibm-1164
ibm-1364_P110-2007 ibm-1364 x-IBM1364
ibm-1371_P100-1999 ibm-1371 x-IBM1371
ibm-1388_P103-2001 ibm-1388 ibm-9580 x-IBM1388
ibm-1390_P110-2003 ibm-1390 x-IBM1390
ibm-1399_P110-2003 ibm-1399 x-IBM1399
ibm-5123_P100-1999 ibm-5123
ibm-8482_P100-1999 ibm-8482
ibm-16684_P110-2003 ibm-16684 ibm-20780
ibm-4899_P100-1998 ibm-4899
ibm-4971_P100-1999 ibm-4971
ibm-9067_X100-2005 ibm-9067
ibm-12712_P100-1998 ibm-12712 ebcdic-he
ibm-16804_X110-1999 ibm-16804 ebcdic-ar
ibm-37_P100-1995,swaplfnl ibm-37-s390
ibm-1047_P100-1995,swaplfnl ibm-1047-s390 IBM1047_LF
ibm-1140_P100-1997,swaplfnl ibm-1140-s390
ibm-1141_P100-1997,swaplfnl ibm-1141-s390 IBM1141_LF
ibm-1142_P100-1997,swaplfnl ibm-1142-s390
ibm-1143_P100-1997,swaplfnl ibm-1143-s390
ibm-1144_P100-1997,swaplfnl ibm-1144-s390
ibm-1145_P100-1997,swaplfnl ibm-1145-s390
ibm-1146_P100-1997,swaplfnl ibm-1146-s390
ibm-1147_P100-1997,swaplfnl ibm-1147-s390
ibm-1148_P100-1997,swaplfnl ibm-1148-s390
ibm-1149_P100-1997,swaplfnl ibm-1149-s390
ibm-1153_P100-1999,swaplfnl ibm-1153-s390
ibm-12712_P100-1998,swaplfnl ibm-12712-s390
ibm-16804_X110-1999,swaplfnl ibm-16804-s390
ebcdic-xml-us

I have to continue this topic. I played a bit with uconv.

I took the original file Infinitum.Subject.Unknown.2021.1080p.AMZN.WEB-DL.DDP5.1.H264-EVO.srt (which is iso-8859-1) and converted it to UTF-8 (new file called Infinitum.Subject.Unknown.2021.1080p.AMZN.WEB-DL.DDP5.1.H264-EVO.rou.srt):

uconv -f iso-8859-1 -t utf8 -o Infinitum.Subject.Unknown.2021.1080p.AMZN.WEB-DL.DDP5.1.H264-EVO.rou.srt Infinitum.Subject.Unknown.2021.1080p.AMZN.WEB-DL.DDP5.1.H264-EVO.srt

Now in Linux, the output of file -i looks like this:


In Plex, the new UTF-8 file shows the wrong characters.

In the end, my point is that this should be managed inside Plex. A novice or average user will not be able to deal with these conversions.

adding the files
Downloads.zip (20.9 KB)

The file is not using iso-8859-1. It uses Windows CP 1250.
There is a difference between these. They are not synonymous.

Could be, but this is the Linux output of file -i,

and the file reported as UTF-8 has the wrong characters in the subtitle.

You converted it from the other file, but by telling it that the source file is iso-8859-1, instead of Windows CP 1250.
Of course it shows wrong characters then.
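
A short Python sketch of what goes wrong here. It assumes the original bytes are Windows CP 1250, and the sample word is only an illustration. Both conversions produce perfectly valid UTF-8, which is why the result is reported as UTF-8 even though the letters are wrong.

```python
# Sketch: converting with the wrong source encoding still yields valid UTF-8.
# Assumption: the original file stores this word in Windows CP 1250.
raw = "posibilit\u0103\u0163i".encode("cp1250")

# Telling the converter the source is iso-8859-1 (as in the uconv call above).
wrong = raw.decode("iso-8859-1").encode("utf-8")

# Telling the converter the source is cp1250.
right = raw.decode("cp1250").encode("utf-8")

# Both results decode cleanly as UTF-8; only the characters differ.
wrong_text = wrong.decode("utf-8")
right_text = right.decode("utf-8")

print(wrong_text)  # shows "ã" and "þ"
print(right_text)  # shows "ă" and "ţ"
```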

Yes, you are right. I did the conversion again, this time from cp1250 to utf8, and it is working.
What is strange is that Linux sees the original file as iso-8859-1, which is why I initially did the conversion from iso-8859-1 to utf8.
Anyway, if you can add this mess to your development backlog, that would be great.
Thanks!

This is the problem with old text files which only use an 8-bit codepage. They often don’t contain a clear indication of which codepage was used to create them.
That is why detection tools such as file have to rely on heuristics to make an educated guess about a text file’s encoding, and why that guess can be wrong. iconv and uconv simply trust whatever source encoding you tell them.
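
To give an idea of what such an educated guess can look like, here is a deliberately naive Python sketch. It is not how file or any real detection tool works; the function name, the candidate list and the scoring rule are all assumptions made up for this example. It decodes the bytes with each candidate codepage and prefers the one that produces the most Romanian letters.

```python
# Romanian diacritics, in both cedilla and comma-below forms.
ROMANIAN_LETTERS = set(
    "\u0103\u00e2\u00ee\u015f\u0163\u0219\u021b"   # ă â î ş ţ ș ț
    "\u0102\u00c2\u00ce\u015e\u0162\u0218\u021a"   # Ă Â Î Ş Ţ Ș Ț
)


def guess_encoding(raw, candidates=("cp1250", "iso8859_16", "cp1252", "iso8859_1")):
    """Return the candidate codepage whose decoding contains the most
    Romanian letters. On a tie, the earlier candidate wins."""
    best, best_score = None, -1
    for encoding in candidates:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this codepage
        score = sum(ch in ROMANIAN_LETTERS for ch in text)
        if score > best_score:
            best, best_score = encoding, score
    return best


sample = "min\u0163ii, posibilit\u0103\u0163i".encode("cp1250")
print(guess_encoding(sample))
```

Note the limits: iso-8859-1 accepts every possible byte, so it can never be ruled out on validity alone, and this sketch cannot tell cp1250 and ISO 8859-16 apart when only ă, ş and ţ occur.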
