Do not use the alpine base image in production -- choosing a container base image

Authors
  • ttyS3

You will NOT get fucked up by alpine if you do not use alpine

alpine is not as wonderful as docker hypes it up to be

Does saving 20 MB really matter that much to you?

en: Do Not Use Alpine as Container Base Image in Production Environment

This article was written on March 26, 2020.

Choosing a server distribution

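Containers have little to do with the Linux distribution running on the host. But set containers aside for a moment and suppose we were running a Linux server on bare metal with no containers at all: which distribution would we pick? Right: however you choose, Alpine would never make the list.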

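For a server, the support lifetime matters most. Once the major version is settled, a later upgrade is very unlikely. Many projects that started on CentOS 6, for example, will probably never move to CentOS 7 or CentOS 8. That raises a problem: if the major version is never upgraded and security updates also stop, the server is left exposed. So the support lifetime is important.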

fedora's life cycle ships a new major release every 6 months, so it is of course not suitable for servers.

centos follows rhel, with updates lagging slightly behind rhel; support lasts roughly 10 years.

debian without LTS is generally supported for roughly one to two years; with LTS it is at least 5 years. https://wiki.debian.org/DebianReleases Also note: Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. In practice, Debian LTS is maintained mainly by Freexian. https://wiki.debian.org/LTS https://wiki.debian.org/LTS/Funding Freexian also offers paid ELTS: https://www.freexian.com/services/debian-lts.html

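centos has rhel behind it, and ubuntu server has Canonical (the company behind Ubuntu) behind it, so both server distributions have a commercial background. For that reason, companies that sell system-level software and services often choose debian stable to ship their products, some NAS vendors for example. Many open source NAS systems are also based on debian; Open Media Vault is a typical example.
Proxmox VE (usually abbreviated PVE), an operating system built specifically for kvm virtual machines and Linux containers, is also based on debian. One exception is unRAID, a paid NAS system, which is based on the Slackware Linux distribution.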

ubuntu lts support is also 5 years. A new LTS version is released every two years. In previous releases, a Long Term Support (LTS) version had three years support on Ubuntu (Desktop) and five years on Ubuntu Server. Starting with Ubuntu 12.04 LTS, both versions received five years support. There is no extra fee for the LTS version; we make our very best work available to everyone on the same free terms. Upgrades to new versions of Ubuntu are and always will be free of charge. https://wiki.ubuntu.com/LTS

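So by comparison, CentOS gives you 10 years of support for free, and naturally many people like it. That is not because rpm is such a great package manager. In fact I think pacman is the best, with deb packages (apt-get) second. If you have ever written a PKGBUILD file for pacman, you will know how simple and pleasant it is. Building deb packages is also fairly convenient. rpm is probably the most troublesome of the three.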

Choosing a container base image

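A base image contains the environment an application needs to run. Just as the choice of distribution matters when running Linux on bare metal, when running applications in containers it matters a great deal which image you use as the base for building the application and which image you use as the base for running it.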

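Containers have made application deployment very flexible. Applications no longer depend on the package versions of the host and are free to do as they please. You cannot talk about containers without mentioning docker, and once you talk about docker it is hard to avoid the word alpine. If it were not for docker, I might never have touched alpine linux in my life.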

Reportedly, docker's early official images were all based on debian, until one day something happened:

Hykes (Docker's founder and CTO) announced that docker's official images would migrate from heavyweight distributions such as fedora and ubuntu to the lightweight alpine linux.

Solomon Hykes not only announced that all Official Docker images will move to Alpine Linux but that @Nathanel Copa the creator of Alpine Linux joined the Docker team.


Sources: https://news.ycombinator.com/item?id=10998667 https://news.ycombinator.com/item?id=11044980

Clearly, this news was very bad for both Fedora and Ubuntu.

ubuntu, of course, promptly published a response: https://ubuntu.com/blog/docker-alpine-ubuntu-and-you

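The response says, in essence: which default image Docker uses for its own images is Docker's business and not for us to decide, but a few things need to be made clear.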

(1) Measured purely by pull count, Busybox (Alpine Linux) on DockerHub has beaten Ubuntu, 66M to 40M. Measured by user preference (star count), however, ubuntu leads alpine 3.2K to 499.

(2) The tar archive of the ubuntu minimal image is only 59 MB compressed, and that is the amount of data actually transferred on a pull, even though the root filesystem unpacks to 188 MB.

(3) Docker uses overlay fs, and the magic of this filesystem is that a base image only has to be pulled once and only has to be stored on disk once.

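Under these conditions, alpine's 2 MB tar archive (about 5 MB unpacked) offers no real advantage. The post then uses commands to demonstrate the time taken to pull the two base images and to run /bin/true in each, and the numbers are almost identical. (The implication: your image is a few dozen MB smaller, yet it does not seem to be any faster.)

In my view, running /bin/true proves very little. To really compare the efficiency of the two systems you would have to run benchmarks and also watch CPU and RAM usage and load. Only that would be objective, so ubuntu's run example should not be taken too seriously. I have not done such a comparison myself, so I will not draw a conclusion either.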

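(4) I will not translate the fourth point, since it is not technical. It brags about how impressive the company behind ubuntu is, then asks, in effect: I have never heard of this alpine creator Nathanel you hired, so what exactly can he do?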

(5) The fifth point is mostly a comparison of the number of packages in ubuntu and alpine, where ubuntu completely outclasses alpine. It then goes on promoting itself with a clockwork metaphor: Like clockwork. Choice. Velocity. Stability. That’s what Ubuntu brings.

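Although I hardly use ubuntu these days (I did early on), I basically agree with ubuntu's article. alpine's 5 MB selling point is really not that great. Because containers use overlay fs, the size of the base image does not matter much. A base image should of course be kept lean, but within a reasonable size range there is no need to chase the absolute minimum.

What should we care about? Is it convenient to build my application on this base image? Is it convenient to run my service on this base image? For building, the main considerations are the number and quality of the distribution's packages and how often they are updated and maintained (frankly, alpine has far fewer packages than the debian or rhel families). Stability is determined mainly by the rootfs and the low-level libraries, such as glibc. As for convenience, the main question is how many server daemon applications are available on this base image, such as the common nginx, mysql, mariadb, redis, mongodb and php-fpm.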

At only 5 MB, alpine is undoubtedly the smallest. But the problems it brings are sometimes far greater than the few dozen MB it saves.

Real-world alpine pitfalls

Pitfall 1. Differences between musl libc and glibc

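libc is a low-level library, and changes at that level can cause a lot of trouble that you would never run into with glibc. musl libc is mostly used in embedded operating systems. OpenWRT, for example, uses musl libc for a simple reason: such systems run on very limited hardware, perhaps only 8 MB of flash, or even just 4 MB. Using glibc there is not realistic. Probably only docker itself praises alpine and musl libc to the skies.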

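My qBittorrent image had always been built with alpine, and the runtime environment after the multi-stage build was alpine too. Then one day someone reported that qBittorrent crashed after a certain torrent was added and would not start again.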

So I wondered whether I could turn on debugging at compile time and enable stacktrace, so that when the program died we would at least know how it died.

However, when I tried to compile a debug build of qBittorrent in an alpine container, I ran into trouble.

./configure --disable-gui --enable-stacktrace --enable-debug
Project MESSAGE: Project is built in DEBUG mode.
# Error 1
compiling base/bittorrent/private/bandwidthscheduler.cpp
In file included from app/main.cpp:70:
app/stacktrace.h:9:10: fatal error: execinfo.h: No such file or directory
    9 | #include <execinfo.h>
      | ^~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:2119: main.o] Error 1
# Found the package name, so the first hurdle is cleared
https://pkgs.alpinelinux.org/contents?file=execinfo.h&path=&name=&branch=v3.11&arch=x86_64
libexecinfo-dev
# But it still does not help
linking qbittorrent-nox
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: main.o: in function `print_stacktrace':
/tmp/qbittorrent/src/app/stacktrace.h:23: undefined reference to `backtrace'
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: /tmp/qbittorrent/src/app/stacktrace.h:32: undefined reference to `backtrace_symbols'
collect2: error: ld returned 1 exit status

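In other words, on alpine, even after installing the libexecinfo-dev package, stacktrace still could not be enabled in this build, because the C library itself simply does not implement the backtrace function.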

As we know, what makes alpine special is that it does not use the standard linux components and the standard glibc, but busybox + musl libc instead.

Based on earlier discussions by others, I finally confirmed that musl libc has no backtrace support:

https://github.com/openalpr/openalpr/issues/566#issuecomment-34820554

Alpine comes with a different c lib - musl instead of the more common glibc. And as it turns out, musl does not support backtrace.

https://gitlab.alpinelinux.org/alpine/aports/issues/5079#note_23491

Przemysław Pawełczyk @przemoc · 4 years ago RocksDB uses backtrace() and backtrace_symbols() functions and their declarations are usually put in execinfo.h, which comes with libc headers. Gnulib provides only stubs (doing nothing) on platforms lacking it. musl libc, which is used in Alpine Linux, does not provide such functions, though. Please read http://thread.gmane.org/gmane.linux.lib.musl.general/7356/focus=7369 for more information and replies to Rich Felker mail for alternative workaround, which may be easier to go with, because libunwind is already in testing repository. (from redmine: written on 2016-02-28)

Yes, these issues are four or five years old, and there is still no fix.
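As a rough way to check this on a given image, the sketch below asks the running process whether the symbols from the linker errors can be resolved at all. It assumes a Linux system with Python and ctypes available; the helper name process_exports is hypothetical and not part of qBittorrent or any library mentioned here.

```python
import ctypes


def process_exports(symbol: str) -> bool:
    """Return True if `symbol` can be resolved in the current process."""
    try:
        # CDLL(None) opens the running process itself, which includes the
        # C library that the Python interpreter is linked against.
        handle = ctypes.CDLL(None)
    except (OSError, TypeError):
        # Assumption: platforms without this dlopen behavior count as "no".
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr works.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    for name in ("backtrace", "backtrace_symbols"):
        status = "available" if process_exports(name) else "missing"
        print(f"{name}: {status}")
```

On a glibc-based image both names would be expected to show up as available, and on a musl-based image as missing, which matches the undefined references above. Treat the result as a hint only, since it depends on how the interpreter itself was built.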

Pitfall 2. The trouble with being in the minority

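When people say libc they usually mean glibc. The problem is that many library authors may never have considered something like musl libc when writing their libraries.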

A typical example is the PHP image published officially by docker. https://github.com/docker-library/php/issues/240

What is wrong with this image? The very commonly used iconv function does not work properly. This problem cannot occur on any Linux distribution based on GNU libc.

I do not know how musl libc on alpine implements character set conversion; quite possibly it does not implement what is needed here at all.

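Either way, the final fix is still to install GNU libiconv from the edge repository. (Why go to the trouble? You pass over GNU libc, then cannot get things working under musl libc, and end up borrowing a lib from GNU anyway.)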

RUN apk add --no-cache --repository http://dl-3.alpinelinux.org/alpine/edge/testing gnu-libiconv
ENV LD_PRELOAD=/usr/lib/preloadable_libiconv.so

Yes, this solves the problem, but it is only a workaround, or a hack.

Besides the problems I ran into, other people have hit problems too:

  1. Python-related: building Python docker containers with Alpine was 50 times slower

https://pythonspeed.com/articles/base-image-python-docker-images/

Using Alpine can make Python Docker builds 50× slower by Itamar Turner-Trauring Last updated 10 Feb 2020, originally created 29 Jan 2020 https://pythonspeed.com/articles/alpine-docker-python/

  2. elastic: switched from the alpine base image to CentOS 7 https://github.com/elastic/elasticsearch-docker/issues/44 New Common Docker Base OS: CentOS 7 https://www.elastic.co/blog/docker-base-centos7

  3. small is not all that matters http://crunchtools.com/comparison-linux-container-images/ Which C library, package format and core utilities are used, may be more important than you think. Most distributions use the same tools, but Alpine Linux has special versions of all of these, for the express purpose of a making a small distribution. But, small is not all that matters.

Changing core libraries and utilities can have a profound effect on what software will compile and run. It can also affect performance, security and cause discreet failure with the large and complex software stacks that are common today. Distributions have tried moving to smaller C libraries, and eventually moved back to glibc. The Debian Project and Elastic are two examples. Glibc just works, and it works everywhere, and it has had a profound amount of testing and usage over the years. It’s a similar story with GCC – tons of testing and automation.

  4. Debian is switching (back) to GLIBC Five years ago Debian and most derivatives switched from the standard GNU C Library (GLIBC) to the Embedded GLIBC (EGLIBC). Debian is now about to take the reverse way switching back to GLIBC, as EGLIBC is now a dead project, the last release being the 2.19 one. At the time of writing the glibc package has been uploaded to experimental and sits in the NEW queue. https://blog.aurel32.net/175

  5. DNS and other incompatibility issues

https://github.com/gliderlabs/docker-alpine/blob/master/docs/caveats.md#dns

DNS

One common issue you may find is with DNS. musl libc does not use domain or search directives in the /etc/resolv.conf file. For example, if you started your Docker daemon with --dns-search=service.consul, and then tried to resolve consul from within an Alpine Linux container, it would fail as the name consul.service.consul would not be tried. You will need to work around this by using fully qualified names.
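To make the missing behavior concrete, here is a minimal sketch of how a resolver that does honor the search directive would expand a short name. It assumes the conventional ndots default of 1 and only loosely models glibc-style behavior; the function name candidate_names is hypothetical.

```python
def candidate_names(name, search_domains, ndots=1):
    """List the names a search-aware resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [name] + expanded
    # Too few dots: walk the search list first, then the bare name.
    return expanded + [name]


print(candidate_names("consul", ["service.consul"]))
# prints ['consul.service.consul', 'consul']
```

A resolver that ignores the search list only ever tries the bare name consul, which is why the quoted documentation recommends fully qualified names.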

Another difference is parallel querying of name servers. This can be problematic if your first name server has a different DNS view (such as service discovery through DNS). For example, if you started your Docker daemon with --dns=172.17.42.1 --dns=10.0.2.15 where 172.17.42.1 is a local DNS server to resolve name for service discovery and 10.0.2.15 is for external DNS resolving, you wouldn't be able to guarantee that 172.17.42.1 will always be queried first. There will be sporadic failures.

In both of these cases, it can help to run a local caching DNS server such as dnsmasq, that can be used for both caching and search path routing. Running dnsmasq with --server /consul/10.0.0.1 would forward queries for the .consul to 10.0.0.1.

Incompatible Binaries

While there are binaries that will run on musl libc without needing to be recompiled, you will likely encounter binaries and applications that rely on specific glibc functionality that will fail to start up. An example of this would be Oracle Java which relies on specific symbols only found in glibc. You can often use ldd to determine the exact symbol:

# ldd bin/java
    /lib64/ld-linux-x86-64.so.2 (0x7f542ebb5000)
    libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7f542ebb5000)
    libjli.so => bin/../lib/amd64/jli/libjli.so (0x7f542e9a0000)
    libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7f542ebb5000)
    libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7f542ebb5000)
Error relocating bin/../lib/amd64/jli/libjli.so: __rawmemchr: symbol not found

In this case, the upstream would need to remove the support for this offending symbol or have the ability to compile the software natively on musl libc. Be sure to check the Alpine Linux package index to see if a suitable replacement package already exists.

alpine's size advantage is negligible

What about debian stable, then? The slim variant (with man pages and other unnecessary things removed) is only 26 MB compressed. For example: https://hub.docker.com/layers/debian/library/debian/buster-slim/images/sha256-55d08af3a56d2adc81562f5b811b239e773df6037d4f7e41458ded660416b0cc?context=explore Size 26.46 MB Note: as of now (March 26, 2020) the stable release is Debian 10 ("buster")

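Now look at the current latest stable release of alpine, 3.11: https://hub.docker.com/layers/alpine/library/alpine/3.11/images/sha256-04d04970e33c492fa411b508455d02a85978492db0403b8a714f365432c04f1c?context=explore Size 2.69 MB Yes, only 2.69 MB compressed, which is certainly small. But is the extra 24 MB of debian buster slim really a lot? With mechanical hard drives routinely offering terabytes of capacity, it makes no real difference. Even SSDs are now mostly bought at 480 GB and up. On the other hand, because container engines (podman and docker, for example) use overlay fs, images are layered and the same base image can be shared among images. Multiple images do not take up multiples of the space. So for a base image, size really is not something you need to consider.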
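The sharing argument can be checked with some back-of-the-envelope arithmetic. The sketch below uses the compressed sizes quoted above as a stand-in for on-disk size and assumes ten images with 40 MB of application layers each; both the function name total_size_mb and those application figures are made up for illustration.

```python
def total_size_mb(base_mb, app_layers_mb, shared_base=True):
    """Disk usage for several images built on one base image."""
    # With layer sharing the base is stored once, otherwise once per image.
    base_copies = 1 if shared_base else len(app_layers_mb)
    return base_mb * base_copies + sum(app_layers_mb)


apps = [40.0] * 10  # hypothetical: ten images, 40 MB of app layers each
debian = total_size_mb(26.46, apps)
alpine = total_size_mb(2.69, apps)

print(f"debian:buster-slim base: {debian:.2f} MB")
print(f"alpine:3.11 base:        {alpine:.2f} MB")
print(f"difference:              {debian - alpine:.2f} MB")
```

Because the base layer is counted once, the difference stays at the gap between the two base images no matter how many images are built on top of them.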

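So what should you consider? Package updates and maintenance, as well as runtime efficiency and resource usage, especially on machines with weak CPUs (such as arm64 devices).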

Even docker's official images do not use alpine exclusively

Let's see which distributions the official images published by docker prefer (Linux only).

golang: debian 9 stretch and debian 10 buster, plus alpine 3.10 and 3.11. Note: the current default golang image is based on debian, not alpine! Running docker run --rm -ti golang:1.14 cat /etc/os-release will confirm it. https://hub.docker.com/_/golang

rust: debian 9 stretch and debian 10 buster and their slim variants, plus alpine 3.10 and 3.11 https://hub.docker.com/_/rust

python: https://hub.docker.com/_/python mostly debian 9 stretch and debian 10 buster, plus alpine 3.10 and 3.11, which are the current mainstream versions of these two distributions.

java: debian 9 stretch and debian 10 buster and their slim variants, plus alpine 3.10 and 3.11 https://hub.docker.com/_/openjdk https://github.com/docker-library/docs/blob/master/openjdk/README.md#supported-tags-and-respective-dockerfile-links

php: debian 9 stretch and debian 10 buster, plus alpine 3.10 and 3.11 https://github.com/docker-library/docs/blob/master/php/README.md#supported-tags-and-respective-dockerfile-links

node.js: debian 9 stretch and debian 10 buster and their slim variants, plus alpine 3.10 and 3.11 https://hub.docker.com/_/node

nginx: mostly alpine https://hub.docker.com/_/nginx

apache httpd: mostly alpine https://hub.docker.com/_/httpd

jenkins: alpine https://hub.docker.com/_/jenkins

phpmyadmin: alpine https://hub.docker.com/r/phpmyadmin/phpmyadmin/dockerfile

mysql database: debian:buster-slim https://github.com/docker-library/mysql/blob/d284e15821ac64b6eda1b146775bf4b6f4844077/8.0/Dockerfile

mariadb: ubuntu bionic https://hub.docker.com/_/mariadb

redis: debian buster and alpine 3.11 https://hub.docker.com/_/redis

mongo: ubuntu xenial and bionic https://hub.docker.com/_/mongo

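What does all this show? Although docker was very keen to migrate base images from ubuntu to alpine, it compromised out of consideration for user choice: if fedora and ubuntu are no longer wanted, then debian it is.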
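The /etc/os-release check mentioned for the golang image can also be scripted when you want to survey many images. The following is a minimal sketch of parsing that file's KEY=value format; the SAMPLE text is an illustrative Debian 10 style excerpt rather than output captured from a real image, and parse_os_release is a hypothetical helper name.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split strips the optional shell-style quoting.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


# Illustrative sample only, not captured from an actual container.
SAMPLE = '''\
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
ID=debian
'''

info = parse_os_release(SAMPLE)
print(info["ID"], info["VERSION_ID"])
```

In practice the text would come from the output of the docker run command shown earlier, and the ID field is the quickest way to tell a debian-based tag from an alpine-based one.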

My recommendations

Whether for building applications or for running them, I think the following images are all well suited to production:

debian-slim https://hub.docker.com/_/debian

FROM debian:buster-slim

ubuntu https://hub.docker.com/_/ubuntu

FROM ubuntu:18.04
FROM ubuntu:20.04

centos https://hub.docker.com/_/centos

FROM centos:8.2.2004

For everyday use, fedora is fine too:

fedora https://hub.docker.com/_/fedora

FROM fedora:32

Fedora's official registry also has a minimal variant https://registry.fedoraproject.org/

FROM registry.fedoraproject.org/fedora-minimal:latest

Never put alpine into production. If you really cannot spare those 20 MB, stop using docker, just compile the thing and run it as a service under systemd.