From a weird Dolphin bug
From Dolphin to KIO
... in KFileItemActions. However, that does not seem to be the real source of the error. The KFileItemList there is already wrong, with all non-ASCII characters turned into question marks. I then looked into the logic of obtaining the contents of a directory: for the local filesystem, KIO invokes a plugin called file.so, which is actually a kdeinit program that runs as a separate process and does the real work. The data is then transferred back to Dolphin using a QDataStream. And in file.so, everything looks correct.

There is a LegacyCodec in file.so. That is not the cause of the problem, of course, as there is nothing wrong in there. But it reminded me of the QTextCodec problem. There is something Qt uses to convert its UTF-16 representation into other encodings, and that is QTextCodec::codecForLocale(). An observation is that in file.so, the codec name is UTF-8; while in Dolphin, it is US-ASCII, unless I set LC_ALL to a UTF-8 locale.
icu and Qt bugs
... ucnv_getDefaultName() to obtain the name of the encoding. That function is from icu. In their bug tracker, I found one that says:

    "You may need to call setlocale(LC_ALL, ""); from your own code before ICU."

On the Qt side, there is a bug report as well:

    "When using ICU, Qt may call ucnv_getDefaultName() without calling setlocale(.., "") first, which is required before calling ucnv_getDefaultName(). One such possible path is: QString::fromLocal8Bit() -> QTextCodec::codecForLocale() -> QIcuCodec::defaultCodecUnlocked() -> ucnv_getDefaultName(). setlocale() is called in QCoreApplicationPrivate::initLocale(), which may not have been called."

But unfortunately, it seems the Qt person does not want to fix it. OK, I guess; then what is calling ucnv_getDefaultName()? At least, if they do not fix it, I must fix it.
How to track down a function call
... fromLocal8Bit in the KIO source. That yields nothing useful. Then it occurred to me that I could use the LD_PRELOAD trick. From proxychains to valgrind, they all use LD_PRELOAD to override an existing function. If I override ucnv_getDefaultName() to have it call std::terminate(), I could get a backtrace using gdb. So I wrote a small override, ucnv-override.cpp.
Compile that file with g++ ucnv-override.cpp -shared -fPIC -o ucnv-override.so and set LD_PRELOAD=/path/to/ucnv-override.so. Use gdb to invoke dolphin. Got a backtrace, and there it is:

    #26 0x00007ffff582370b in QLoggingCategory::init(char const*, QtMsgType) () from /usr/lib64/libQt5Core.so.5
No main, no statement from Dolphin. It is in the initialization of a global variable. And that is:

    static QLoggingCategory category("kf.kio.widgets.kdirmodel", QtInfoMsg);
... QLoggingCategory. But Krita uses that for debugging too, and I do not have any issues dealing with Chinese file names there. It turns out that we have been using the Q_LOGGING_CATEGORY macro, and "The implicitly-defined QLoggingCategory object is created on first use, in a thread-safe manner." (from the Qt documentation). So in that case, the initialization always runs after entering main(), where setlocale() has already been called.
... Q_LOGGING_CATEGORY(category, "kf.kio.widgets.kdirmodel", QtInfoMsg), and everything went back to normal.