PLZ!!! fix library index flagging/grouping of CJK chars (or other non ascii titles)

ENiGMA · February 12, 2018, 6:19am

Hi Plex devs, thanks for the great work. Please make it even better by improving foreign-language support.

There have been problems with how the library index flags titles that start with foreign characters. I ignored the issue for a while, but a post from another user, @Spelopp, reminded me to report it.

I’m not sure whether this is CJK-only or whether all non-ASCII characters are affected, but I’d guess the latter.
Anyway, CJK scripts have their own proper ways of grouping index titles, because there are far too many characters to give each one its own flag.

For example, in Korean, characters from ‘가’ (0xAC00) to ‘깋’ (0xAE4B) should be indexed under the flag ‘ㄱ’ (0x3131) or ‘가’ (0xAC00), characters from ‘나’ (0xB098) to ‘닣’ (0xB2E3) under ‘ㄴ’ (0x3134) or ‘나’ (0xB098), and so on. That way we would have only 19 Korean index flags (one per initial consonant) instead of the current arbitrary subset of 11172 possible flags.

*Here’s how it works.
Korean characters are displayed as a combination of ‘jamo’ letters, in the order ‘cho-seong’ (first sound, FS), ‘joong-seong’ (mid sound, MS), ‘jong-seong’ (end sound, ES).

The Unicode value of each character is determined as follows.

*index of ‘cho-seong’ / FS

ㄱ	ㄲ	ㄴ	ㄷ	ㄸ	ㄹ	ㅁ	ㅂ	ㅃ	ㅅ	ㅆ	ㅇ	ㅈ	ㅉ	ㅊ	ㅋ	ㅌ	ㅍ	ㅎ
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18

*index of ‘joong-seong’ / MS

ㅏ	ㅐ	ㅑ	ㅒ	ㅓ	ㅔ	ㅕ	ㅖ	ㅗ	ㅘ	ㅙ	ㅚ	ㅛ	ㅜ	ㅝ	ㅞ	ㅟ	ㅠ	ㅡ	ㅢ	ㅣ
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20

*index of ‘jong-seong’ / ES

null	ㄱ	ㄲ	ㄳ	ㄴ	ㄵ	ㄶ	ㄷ	ㄹ	ㄺ	ㄻ	ㄼ	ㄽ	ㄾ	ㄿ	ㅀ	ㅁ	ㅂ	ㅄ	ㅅ	ㅆ	ㅇ	ㅈ	ㅊ	ㅋ	ㅌ	ㅍ	ㅎ
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27

unicode.value = 0xAC00 + [FS]*0x24C(588=21*28) + [MS]*0x1C(28) + [ES]
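
As a rough sketch of the formula above in Python (the names `CHOSEONG`, `JUNGSEONG`, `JONGSEONG` and `compose` are made up for this illustration and have nothing to do with Plex's actual code), the three tables plus the formula are enough to build any syllable:

```python
# Jamo tables in the same order as the index tables above.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"           # 19 first sounds (FS)
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 mid sounds (MS)
# 28 end sounds (ES); index 0 is "no end sound" (null).
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

HANGUL_BASE = 0xAC00  # code point of '가'


def compose(fs, ms, es=""):
    """Build one Hangul syllable from its jamo using the formula above."""
    code = (
        HANGUL_BASE
        + CHOSEONG.index(fs) * 0x24C   # 588 = 21 * 28 syllables per first sound
        + JUNGSEONG.index(ms) * 0x1C   # 28 syllables per mid sound
        + JONGSEONG.index(es)
    )
    return chr(code)


print(hex(ord(compose("ㄱ", "ㅡ", "ㄹ"))))
```

The last line composes ‘ㄱ’ + ‘ㅡ’ + ‘ㄹ’ and prints the resulting code point in hex.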

For example,
'글' = 'ㄱ' + 'ㅡ' + 'ㄹ' = 0xAC00 + [0]*0x24C + [18]*0x1C + [8] = 0xAE00
Every character that starts with FS ‘ㄱ’ falls in the range ‘가’ (= ‘ㄱ’ + ‘ㅏ’ + ‘null’ = 0xAC00) to ‘깋’ (= ‘ㄱ’ + ‘ㅣ’ + ‘ㅎ’ = 0xAE4B),
so these should all be grouped and indexed under ‘ㄱ’ (0x3131) or ‘가’ (0xAC00).
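
Going the other way only needs an integer division. Here is a minimal sketch, assuming the index flag is simply the first sound of a character; `initial_range` and `index_flag` are hypothetical helper names, not an existing API:

```python
HANGUL_BASE = 0xAC00            # '가'
SYLLABLES_PER_INITIAL = 21 * 28  # 588 syllables share each first sound
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
SYLLABLE_COUNT = len(CHOSEONG) * SYLLABLES_PER_INITIAL  # 11172


def initial_range(fs):
    """Return (first, last) code points of the syllables starting with fs."""
    start = HANGUL_BASE + CHOSEONG.index(fs) * SYLLABLES_PER_INITIAL
    return start, start + SYLLABLES_PER_INITIAL - 1


def index_flag(ch):
    """Map a Hangul syllable to its first-sound flag; leave anything else alone."""
    offset = ord(ch) - HANGUL_BASE
    if 0 <= offset < SYLLABLE_COUNT:
        return CHOSEONG[offset // SYLLABLES_PER_INITIAL]
    return ch


first, last = initial_range("ㄱ")
print(hex(first), hex(last))
print(index_flag(chr(0xAE00)))
```

If ‘가’-style flags are preferred over ‘ㄱ’-style ones, `chr(initial_range(flag)[0])` gives the first syllable of the group instead.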

Here’s why this has to be fixed.

In this small library there are already several Korean index flags.
If they were flagged properly, the items indexed under ‘박’ and ‘버’ in the picture above would be indexed under ‘ㅂ’ or ‘바’,
and the items under ‘아’, ‘악’, ‘에’, ‘원’, ‘윤’ and ‘이’ would be indexed under ‘ㅇ’ or ‘아’.

That is 8 index flags vs. 2 in this case, and 11172 vs. 19 in theory,
following the dictionary index scheme instead of arbitrary characters.
Isn’t it obvious that this has to be fixed for the index to be useful?
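
To make the 8 vs. 2 comparison concrete, here is a small sketch that groups the eight flags from my library by first sound. The function name `group_by_initial` is just something I made up, and the flag characters stand in for real titles:

```python
from collections import defaultdict

HANGUL_BASE = 0xAC00
SYLLABLES_PER_INITIAL = 588  # 21 mid sounds * 28 end sounds
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"


def group_by_initial(titles):
    """Group titles by the first sound of their first character."""
    groups = defaultdict(list)
    for title in titles:
        offset = ord(title[0]) - HANGUL_BASE
        if 0 <= offset < len(CHOSEONG) * SYLLABLES_PER_INITIAL:
            flag = CHOSEONG[offset // SYLLABLES_PER_INITIAL]
        else:
            flag = title[0]  # non-Hangul titles keep their own first character
        groups[flag].append(title)
    return dict(groups)


# The eight per-character flags currently shown for the library.
current_flags = ["박", "버", "아", "악", "에", "원", "윤", "이"]
for flag, members in group_by_initial(current_flags).items():
    print(flag, members)
```

Each printed line is one index flag followed by the entries that would sit under it.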

I don’t know how other Unicode characters are handled, but as with Korean,
the index for Japanese titles should be grouped into 10 (or maybe 11) characters rather than 50, as follows.

	a	i	u	e	o
1: あ / ∅	あ	い	う	え	お
2: か / K	か	き	く	け	こ
3: さ / S	さ	し	す	せ	そ
4: た / T	た	ち	つ	て	と
5: な / N	な	に	ぬ	ね	の
6: は / H	は	ひ	ふ	へ	ほ
7: ま / M	ま	み	む	め	も
8: や / Y	や		ゆ		よ
9: ら / R	ら	り	る	れ	ろ
10: わ / W	わ	ゐ		ゑ	を
11: ん / ng	ん
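
The table above can be turned into a lookup in the same way. This is only a sketch under a few assumptions of mine: voiced kana (が, ぱ, ...) go under their unvoiced row, katakana is folded to hiragana, and small kana (ぁ, っ, ゃ, ...) and kanji are not handled. `GOJUON_ROWS` and `kana_index_flag` are hypothetical names:

```python
import unicodedata

# Rows copied from the table above; the key is the flag for the whole row.
GOJUON_ROWS = {
    "あ": "あいうえお",
    "か": "かきくけこ",
    "さ": "さしすせそ",
    "た": "たちつてと",
    "な": "なにぬねの",
    "は": "はひふへほ",
    "ま": "まみむめも",
    "や": "やゆよ",
    "ら": "らりるれろ",
    "わ": "わゐゑを",
    "ん": "ん",
}
KANA_TO_ROW = {kana: row for row, kanas in GOJUON_ROWS.items() for kana in kanas}


def kana_index_flag(title):
    """Return the row flag for a title, or its first character if no row matches."""
    if not title:
        return ""
    first = title[0]
    # Katakana U+30A1..U+30F6 sits exactly 0x60 above the matching hiragana.
    if "\u30a1" <= first <= "\u30f6":
        first = chr(ord(first) - 0x60)
    # NFD splits off the (han)dakuten mark, e.g. が -> か + U+3099.
    base = unicodedata.normalize("NFD", first)[0]
    return KANA_TO_ROW.get(base, title[0])


for title in ["きつね", "ガンダム", "ぽけっと", "Alien"]:
    print(title, "->", kana_index_flag(title))
```

Titles starting with kanji would still need a reading (sort title) before they can be placed in a row, which this sketch does not attempt.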

thanks.
