mirror of
https://github.com/MidnightCommander/mc
synced 2025-01-05 11:04:42 +03:00
7f9b333861
When the contents of a large directory are sorted by file name, a significant amount of CPU time is spent in str_utf8_normalize(), which is called from str_utf8_create_key_gen(). For example, /usr/bin/ contains 5437 files on my Arch Linux box. Running mc /usr/bin/ /usr/bin/ takes approx. 75 000 000 CPU instructions to sort the file names, or 25% of total program run time. Of these 75 000 000 instructions, 42 500 000 are spent in str_utf8_normalize(). str_utf8_normalize() uses g_utf8_normalize() to do the work. g_utf8_normalize() is a heavyweight function that converts UTF-8 into UCS-4, does the normalization, and then converts UCS-4 back into UTF-8. Since file names are composed of ASCII characters in most cases, we can speed up str_utf8_normalize() by checking whether the heavyweight Unicode normalization is actually needed. Normalization of an ASCII string is a no-op, so such a string is effectively "normalized" by just strdup(). With this patch, running mc /usr/bin/ /usr/bin/ requires just 37 000 000 instructions to sort the file names (down from 75 000 000) and 4 500 000 instructions to do str_utf8_normalize() (down from 42 500 000). Signed-off-by: Andrew Borodin <aborodin@vmail.ru> |
||
---|---|---|
Makefile.am | ||
replace.c | ||
strescape.c | ||
strutil8bit.c | ||
strutil.c | ||
strutilascii.c | ||
strutilutf8.c | ||
strverscmp.c | ||
xstrtol.c |
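
The ASCII fast path described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the actual patch: the names `is_all_ascii`, `normalize_with_ascii_fast_path`, `slow_normalize_fn`, and `counting_slow_stub` are hypothetical. The heavyweight normalizer (g_utf8_normalize() in the commit message) is represented by a callback so the sketch builds without GLib. The stub only copies its input and counts calls; it performs no real Unicode normalization.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type standing in for the heavyweight normalizer
 * (the commit message names g_utf8_normalize() for that role). */
typedef char *(*slow_normalize_fn) (const char *text);

/* Allocate a copy of text; plays the role of strdup() in portable ISO C. */
static char *
copy_string (const char *text)
{
    size_t size = strlen (text) + 1;
    char *copy = malloc (size);

    if (copy != NULL)
        memcpy (copy, text, size);
    return copy;
}

/* Return 1 if every byte of text is 7-bit ASCII, 0 otherwise. */
static int
is_all_ascii (const char *text)
{
    const unsigned char *p;

    for (p = (const unsigned char *) text; *p != '\0'; p++)
        if (*p >= 0x80)
            return 0;
    return 1;
}

/* Hypothetical helper: skip the expensive normalization for pure-ASCII
 * input, since Unicode normalization leaves ASCII text unchanged. */
static char *
normalize_with_ascii_fast_path (const char *text, slow_normalize_fn slow)
{
    assert (text != NULL);
    assert (slow != NULL);

    if (is_all_ascii (text))
        return copy_string (text);      /* fast path: plain copy */

    return slow (text);                 /* slow path: full normalization */
}

/* Illustration-only stub for the slow path: counts calls and copies the
 * input unchanged. A real implementation would normalize here. */
static int slow_calls = 0;

static char *
counting_slow_stub (const char *text)
{
    slow_calls++;
    return copy_string (text);
}
```

Calling `normalize_with_ascii_fast_path ("ls", counting_slow_stub)` never reaches the stub, while a name containing a non-ASCII byte such as `"caf\xc3\xa9"` does. The check costs one linear scan of the string, which is cheap compared with the UTF-8 to UCS-4 round trip the commit message describes.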