Fixup Epiphany Segmentation Fault Under Wayland
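Background
Ever since I switched to Wayland, epiphany has basically refused to start. I rarely use this browser, so I never looked into it. Today is the weekend, so I took some time for a quick look.
Investigation
First, a look with gdb: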
❯ gdb epiphany
GNU gdb (GDB) 12.1
This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.archlinux.org
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
(gdb) r
Starting program: /usr/bin/epiphany
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
...
Thread 1 "epiphany" received signal SIGSEGV, Segmentation fault.
wl_resource_get_destroy_listener (resource=0x0, notify=0x7ffff1003980 <(anonymous namespace)::ClientBundleEGL::bufferDestroyListenerCallback(wl_listener*, void*)>) at ../wayland-1.21.0/src/wayland-server.c:850
850 if (resource_is_deprecated(resource))
(gdb) bt
#0 wl_resource_get_destroy_listener
(resource=0x0, notify=0x7ffff1003980 <(anonymous namespace)::ClientBundleEGL::bufferDestroyListenerCallback(wl_listener*, void*)>) at ../wayland-1.21.0/src/wayland-server.c:850
#1 0x00007ffff1005b4b in (anonymous namespace)::ClientBundleEGL::findImage (bufferResource=0x0, this=0x55555686c9c0)
at ../WPEBackend-fdo/src/view-backend-exportable-fdo-egl.cpp:270
#2 (anonymous namespace)::ClientBundleEGL::exportBuffer(wl_resource*) (this=0x55555686c9c0, bufferResource=0x0)
at ../WPEBackend-fdo/src/view-backend-exportable-fdo-egl.cpp:181
#3 0x00007ffff18cb536 in ffi_call_unix64 () at ../src/x86/unix64.S:105
#4 0x00007ffff18c8037 in ffi_call_int
(cif=<optimized out>, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=<optimized out>)
at ../src/x86/ffi64.c:672
#5 0x00007ffff0403ada in wl_closure_invoke (closure=closure@entry=0x555556cd4880, target=<optimized out>,
target@entry=0x555556c7ba20, opcode=opcode@entry=6, data=<optimized out>, data@entry=0x555556a18b40, flags=2)
at ../wayland-1.21.0/src/connection.c:1025
#6 0x00007ffff0408010 in wl_client_connection_data (fd=<optimized out>, mask=<optimized out>, data=<optimized out>)
at ../wayland-1.21.0/src/wayland-server.c:437
#7 0x00007ffff04069e2 in wl_event_loop_dispatch (loop=0x5555559b30a0, timeout=<optimized out>)
at ../wayland-1.21.0/src/event-loop.c:1027
#8 0x00007ffff10068c5 in operator() (__closure=0x0, base=0x555555a5f670) at ../WPEBackend-fdo/src/ws.cpp:77
#9 _FUN(GSource*, GSourceFunc, gpointer) () at ../WPEBackend-fdo/src/ws.cpp:86
#10 0x00007ffff734cc6b in g_main_dispatch (context=0x5555559210e0) at ../glib/glib/gmain.c:3417
#11 g_main_context_dispatch (context=0x5555559210e0) at ../glib/glib/gmain.c:4135
#12 0x00007ffff73a3001 in g_main_context_iterate.constprop.0
(context=context@entry=0x5555559210e0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
at ../glib/glib/gmain.c:4211
--Type <RET> for more, q to quit, c to continue without paging--c
#13 0x00007ffff734a392 in g_main_context_iteration (context=context@entry=0x5555559210e0, may_block=may_block@entry=1) at ../glib/glib/gmain.c:4276
#14 0x00007ffff750730e in g_application_run (application=0x5555559601d0, argc=argc@entry=1, argv=argv@entry=0x7fffffffd3b8) at ../glib/gio/gapplication.c:2569
#15 0x0000555555558714 in main (argc=<optimized out>, argv=<optimized out>) at ../epiphany/src/ephy-main.c:428
(gdb)
In fact, the rest of the call stack is not much use here. The segfault message alone already tells us the cause:
Thread 1 "epiphany" received signal SIGSEGV, Segmentation fault.
wl_resource_get_destroy_listener (resource=0x0, notify=0x7ffff1003980 <(anonymous namespace)::ClientBundleEGL::bufferDestroyListenerCallback(wl_listener*, void*)>) at ../wayland-1.21.0/src/wayland-server.c:850
850 if (resource_is_deprecated(resource))
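wl_resource_get_destroy_listener was called with a null pointer as its first argument, resource, so the resource_is_deprecated(resource) call on line 850 crashed with SIGSEGV.
At this point, you might expect me to roll up my sleeves and start reading the code.
No. First, I planned to submit this backtrace as a GNOME issue. GitLab's issue management turned out to be fairly smart: as I typed "Epiphany crash Under Wayland", it suggested some existing issues that it thought might be the same problem, so I clicked through to https://gitlab.gnome.org/GNOME/epiphany/-/issues/1832
The creator of that issue went one step further than I did: he also tried starting the browser under X11 and found that it worked perfectly. I tried GDK_BACKEND=x11 epiphany myself, and it worked fine as well.
Someone on the GNOME side replied, Michael Catanzaro @mcatanzaro (strictly speaking, he works at Red Hat):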
Hi, you'll need to report this on the wpebackend-fdo issue tracker, here. Good luck....
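The issue was then closed immediately and given the Not GNOME label.
Well, GNOME folks are quick to close issues.
It is an upstream problem, so: closed.
It looked like a sad story.
However, 3 minutes later, the same guy replied again: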
Actually, looks like it is already fixed by https://github.com/Igalia/WPEBackend-fdo/pull/176/ which is awaiting review. (CC @aperezdc)
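The fix is essentially what we described above: before calling wl_resource_get_destroy_listener, check that its first argument is not a null pointer.
Besides that commit, though, the PR author also made a second one: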
Only delete images in releaseImage Deleting images in bufferDestroyListenerCallback is incorrect, and caused a double free.
https://github.com/Igalia/WPEBackend-fdo/pull/176/commits/fcf330cc3036808b6fb83ee7a6cef4f5ff9e00c8
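So specialized problems are best fixed by specialists. Had we done it ourselves, we would probably have ended up with only the first commit, the null check.
Doing It Yourself
Waiting for upstream's upstream to merge the fix, then for upstream to pick it up, then for the packager to update the package can take quite a long time. Fortunately, building a patched package yourself is very easy on Arch. That may be one of Arch's greatest charms.
Download the official https://github.com/archlinux/svntogit-packages/blob/packages/wpebackend-fdo/trunk/PKGBUILD and add a single patch command.
As for the patch file, any GitHub PR can be fetched as a patch by appending .patch to its URL: https://github.com/Igalia/WPEBackend-fdo/pull/176.patch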
From 3318283ffe62a536cfbff307c77505d848d7098f Mon Sep 17 00:00:00 2001
From: Jordy Vieira <[email protected]>
Date: Sat, 9 Jul 2022 17:17:14 -0300
Subject: [PATCH 1/2] Fix SIGSEGV
---
src/view-backend-exportable-fdo-egl.cpp | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/view-backend-exportable-fdo-egl.cpp b/src/view-backend-exportable-fdo-egl.cpp
index 09bb2bf..1a73269 100644
--- a/src/view-backend-exportable-fdo-egl.cpp
+++ b/src/view-backend-exportable-fdo-egl.cpp
@@ -267,9 +267,11 @@ class ClientBundleEGL final : public ClientBundle {
private:
struct wpe_fdo_egl_exported_image* findImage(struct wl_resource* bufferResource)
{
- if (auto* listener = wl_resource_get_destroy_listener(bufferResource, bufferDestroyListenerCallback)) {
- struct wpe_fdo_egl_exported_image* image;
- return wl_container_of(listener, image, bufferDestroyListener);
+ if (bufferResource) {
+ if (auto* listener = wl_resource_get_destroy_listener(bufferResource, bufferDestroyListenerCallback)) {
+ struct wpe_fdo_egl_exported_image* image;
+ return wl_container_of(listener, image, bufferDestroyListener);
+ }
}
return nullptr;
From fcf330cc3036808b6fb83ee7a6cef4f5ff9e00c8 Mon Sep 17 00:00:00 2001
From: Jordy Vieira <[email protected]>
Date: Sat, 9 Jul 2022 19:16:03 -0300
Subject: [PATCH 2/2] Only delete images in releaseImage
Deleting images in bufferDestroyListenerCallback is incorrect, and
caused a double free.
---
src/view-backend-exportable-fdo-egl.cpp | 5 -----
1 file changed, 5 deletions(-)
diff --git a/src/view-backend-exportable-fdo-egl.cpp b/src/view-backend-exportable-fdo-egl.cpp
index 1a73269..0031222 100644
--- a/src/view-backend-exportable-fdo-egl.cpp
+++ b/src/view-backend-exportable-fdo-egl.cpp
@@ -247,8 +247,6 @@ class ClientBundleEGL final : public ClientBundle {
void releaseImage(struct wpe_fdo_egl_exported_image* image)
{
- image->exported = false;
-
if (image->bufferResource)
viewBackend->releaseBuffer(image->bufferResource);
else
@@ -297,9 +295,6 @@ class ClientBundleEGL final : public ClientBundle {
image = wl_container_of(listener, image, bufferDestroyListener);
image->bufferResource = nullptr;
-
- if (!image->exported)
- deleteImage(image);
}
};
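For reference, the PKGBUILD change might look roughly like the sketch below. It is not copied from the official PKGBUILD: the local file name pr176.patch and the source directory name $pkgname-$pkgver are assumptions, the checksum array should be whichever one the PKGBUILD already uses, and all existing fields (release tarball, dependencies, build steps) stay as they are.

```shell
# Sketch of the additions to the official wpebackend-fdo PKGBUILD.
# Assumptions: the release tarball unpacks to $pkgname-$pkgver, and the
# patch is saved locally as pr176.patch (a name chosen for illustration).
pkgname=wpebackend-fdo
pkgver=1.12.0
pkgrel=2   # the patched package in this post is 1.12.0-2

# Append the PR patch to the existing sources.
# The "name::url" form tells makepkg which local file name to use.
source+=("pr176.patch::https://github.com/Igalia/WPEBackend-fdo/pull/176.patch")
sha256sums+=('SKIP')   # or pin the real checksum instead of skipping it

prepare() {
  cd "$pkgname-$pkgver"
  # The .patch file holds both commits of the PR; patch applies them in order
  patch -Np1 -i "$srcdir/pr176.patch"
}
```

With these additions in place, running makepkg -si in the PKGBUILD directory should build and install the patched package.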
I've put the built package here:
https://github.com/ttys3/my-archlinux-pkgbuild/releases/tag/wpebackend-fdo-1.12.0-2
Download it and install it with:
paru -U ./wpebackend-fdo-1.12.0-2-x86_64.pkg.tar.zst
After testing, epiphany no longer crashes.
Refs
https://wiki.archlinux.org/title/Debuginfod
https://github.com/archlinux/svntogit-packages/blob/packages/wpebackend-fdo/trunk/PKGBUILD