clips 3

Ops notes 1

Setting the time zone and enabling NTP, part 1

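systemd edition

Arch Linux ARM (alarm) doesn't seem to sync the time automatically, so we have to set it up by hand.
As usual, it's the systemd all-in-one suite.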

timedatectl set-timezone Asia/Taipei
timedatectl set-ntp true
timedatectl show-timesync
timedatectl status

Setting the time zone and enabling NTP, part 2

chrony edition

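Following on from the systemd edition: in some scenarios systemd-timesyncd isn't quite enough. For example, I want to run a local NTP server, which a simple NTP client like systemd's can't do.
So here we use chrony for a more in-depth setup.
Installation is up to you; apt / pacman / yum should all have it.
The config file is /etc/chrony.conf (Debian-based systems use /etc/chrony/chrony.conf), and its comments are quite detailed.
First, set the servers: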

root@cn-mahiru /etc# cat /etc/chrony.conf 
server 114.118.7.161
server 114.118.7.163
server cn.ntp.org.cn

server 106.55.184.199
server ntp.tuna.tsinghua.edu.cn
server 203.107.6.88
server ntp1.aliyun.com
server ntp2.aliyun.com
server ntp3.aliyun.com
server ntp4.aliyun.com
server ntp5.aliyun.com
server ntp1.tencent.com
server ntp2.tencent.com
server ntp3.tencent.com
server ntp4.tencent.com
server ntp5.tencent.com
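
Since the goal stated above is a local NTP server, the server list alone is not enough: chronyd only answers clients once it is told whom to serve. A minimal sketch of the extra lines, assuming a LAN of 192.168.1.0/24 (a placeholder subnet, not from the original setup):

```conf
# Illustrative additions to /etc/chrony.conf; the subnet is a placeholder.
# Allow NTP clients from the LAN to query this host.
allow 192.168.1.0/24
# Keep answering clients at stratum 10 even if every upstream server is unreachable.
local stratum 10
```

After restarting chronyd, `chronyc sources -v` should list the upstream servers, and `chronyc clients` (run as root) shows which machines have queried this host.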

Checking service status

Just a quick look around; this is a memo.

Startup jobs pending at boot

systemctl list-jobs

Currently running services

systemctl list-units --type service

Status

systemctl status systemd-networkd
journalctl -u systemd-networkd
journalctl -fu systemd-networkd

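Modifying a service

I don't know why the guides floating around online all edit the unit file directly and then have to run daemon-reload, which is a hassle.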

# Add an override (drop-in) on top of the existing service
systemctl edit systemd-networkd
# Edit the complete unit file
systemctl edit systemd-networkd --full
# With --force, create a service that does not exist yet
systemctl edit systemd-networkd --full --force
# Show the unit file
systemctl cat systemd-networkd
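
For reference, this is roughly what an override looks like once `systemctl edit` has saved it. The file path is where systemd places drop-ins for this unit; the setting itself (debug logging) is only an illustrative choice, not something from the original setup:

```ini
# /etc/systemd/system/systemd-networkd.service.d/override.conf
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
```

`systemctl edit` reloads the unit configuration when the editor exits, so only a `systemctl restart systemd-networkd` is needed for the change to apply. `systemctl revert systemd-networkd` removes the override again.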

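Converting a partition to btrfs

Reference: https://wiki.archlinuxcn.org/wiki/Btrfs. No need to explain at length what btrfs is, right?
btrfs-progs itself ships a conversion tool, btrfs-convert, so just run it.

First check whether the target disk is mounted, and unmount it if it is: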

df -h
umount /dev/xxx1

Then run it:

btrfs-convert /dev/xxx1

My 1 TB HDD took about an hour and a half, with no output during that time.

Then take a look at fstab:

root@cn ~ [1]# cat /etc/fstab 
# other entries omitted
/dev/sda1 /storage btrfs defaults 0 0

Change the filesystem type from ext4 to btrfs, as in the entry above.
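
The fstab edit can also be scripted. Below is a minimal sketch, assuming the entry is addressed by its mount point; `fstab_to_btrfs` is a hypothetical helper name, and it only prints the result so it can be reviewed before replacing the real file:

```shell
# Print FILE with the filesystem type of MOUNTPOINT changed from ext4 to btrfs.
# Nothing is modified in place; redirect the output once it looks right.
# Note: awk rewrites the changed line with single spaces between fields.
fstab_to_btrfs() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp && $3 == "ext4" { $3 = "btrfs" }
        { print }
    ' "$1"
}

# Example on a throwaway copy (the entry mirrors the /storage one shown above).
tmp=$(mktemp)
printf '%s\n' '# other entries omitted' '/dev/sda1 /storage ext4 defaults 0 0' > "$tmp"
fstab_to_btrfs "$tmp" /storage
rm -f "$tmp"
```

Keeping the last field (fsck pass) at 0 is appropriate for btrfs, which is what the original entry already does.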

Then have systemd reload its configuration:

systemctl daemon-reload
systemctl list-units --type mount # check what the mount point's unit is called
systemctl start storage.mount

Once you're sure the data is all fine, finally delete the ext4 snapshot that btrfs-convert kept:

btrfs subvolume delete /storage/ext2_saved
btrfs balance start /storage

The following step is optional.

Then convert the metadata to DUP:

btrfs balance start -mconvert=DUP /mountpoint

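SMTP send timeouts caused by MTU

I've been tinkering with Mastodon these past few days, and mail could hardly ever be sent, even though a packet capture showed traffic going out: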

root@router:~# tcpdump -i wan_cf port 587 -vvv
tcpdump: listening on wan_cf, link-type RAW (Raw IP), capture size 262144 bytes

20:10:37.864060 IP (tos 0x0, ttl 62, id 14200, offset 0, flags [DF], proto TCP (6), length 60)
    xxx.46928 > ti-in-f109.1e100.net.587: Flags [S], cksum 0x1dd9 (correct), seq 4268536941, win 42780, options [mss 1380,sackOK,TS val 4041302785 ecr 0,nop,wscale 11], length 0

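Meanwhile, sidekiq reported a timeout.

A port test suggested the port was reachable, and I didn't suspect the MTU, so I spent a long time on other fixes and only came back to it once I had run out of ideas.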

# time curl https://1.1.1.1/cdn-cgi/trace
(omitted)
* Connection #0 to host 1.1.1.1 left intact

________________________________________________________
Executed in    4.07 secs      fish           external
  usr time   67.06 millis    0.00 micros   67.06 millis
  sys time    5.48 millis  617.00 micros    4.86 millis

As you can see, for TLS, which needs a handshake, negotiation alone took upwards of 4 seconds. So I went on to lower WireGuard's MTU to 1350 (the wg default is 1420):

# time curl https://1.1.1.1/cdn-cgi/trace 

(omitted)
________________________________________________________
Executed in  174.96 millis    fish           external
  usr time   73.35 millis    0.00 micros   73.35 millis
  sys time    1.50 millis  493.00 micros    1.00 millis

The handshake is now very fast, so it looks like the MTU was indeed the cause.

And just like the HTTPS request, SMTP now sends successfully.
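
The numbers in the capture line up with the MTU arithmetic: the SYN advertises mss 1380, which is exactly the 1420-byte WireGuard default minus 40 bytes of IPv4 and TCP headers. A small sketch of that calculation, assuming IPv4 and no IP or TCP options (the function names are made up for illustration):

```shell
# Header sizes in bytes (IPv4 without options, TCP without options).
IPV4_HDR=20
TCP_HDR=20
ICMP_HDR=8

# Largest TCP payload per packet (MSS) that fits in a given MTU over IPv4.
mss_for_mtu() {
    echo $(( $1 - IPV4_HDR - TCP_HDR ))
}

# ICMP payload size to pass to `ping -s` so the whole packet equals the MTU.
ping_payload_for_mtu() {
    echo $(( $1 - IPV4_HDR - ICMP_HDR ))
}

echo "MTU 1420 -> MSS $(mss_for_mtu 1420)"   # same value as the mss 1380 in the capture
echo "MTU 1350 -> MSS $(mss_for_mtu 1350)"
# Suggested probe with the don't-fragment bit set (Linux iputils ping):
echo "ping -M do -s $(ping_payload_for_mtu 1350) 1.1.1.1"
```

If the printed ping command gets replies while a larger `-s` value does not, the path MTU sits somewhere in between. The MTU itself can be set with `MTU = 1350` in the `[Interface]` section of a wg-quick config, or at runtime with `ip link set dev wg0 mtu 1350`, where `wg0` is a placeholder interface name.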

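A small btrfs mishap

I've been upgrading my home NAS recently. It was already on btrfs, and I chose to dd it onto a backup disk. Perhaps because the two disks differ in size, checksum verification fails. The disk still mounts fine: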

# mount -o degraded,ro /dev/sdh2 /mnt/

But reading many files gives I/O errors:

# dmesg
[1952.440577] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650518, gen 0
[ 1952.441594] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270008320 csum 0x8941f998 expected csum 0xddadd78d mirror 2
[ 1952.441599] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650519, gen 0
[ 1952.450407] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270270464 csum 0x8941f998 expected csum 0xf8faf7ab mirror 2
[ 1952.450412] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650520, gen 0
[ 1952.451624] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270270464 csum 0x8941f998 expected csum 0xf8faf7ab mirror 2
[ 1952.451630] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650521, gen 0
[ 1952.460320] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270532608 csum 0x8941f998 expected csum 0x6cdba1a4 mirror 2
[ 1952.460325] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650522, gen 0
[ 1952.461816] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270532608 csum 0x8941f998 expected csum 0x6cdba1a4 mirror 2
[ 1952.461821] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650523, gen 0
[ 1952.470776] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270794752 csum 0x8941f998 expected csum 0x5f9e1467 mirror 2
[ 1952.470780] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650524, gen 0
[ 1952.471391] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 270794752 csum 0x8941f998 expected csum 0x5f9e1467 mirror 2
[ 1952.471398] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650525, gen 0
[ 1952.480865] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 271056896 csum 0x8941f998 expected csum 0x428d3da1 mirror 2
[ 1952.480872] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650526, gen 0
[ 1952.481625] BTRFS warning (device sdh2: state EA): csum failed root 5 ino 32264 off 271056896 csum 0x8941f998 expected csum 0x428d3da1 mirror 2
[ 1952.481628] BTRFS error (device sdh2: state EA): bdev /dev/sdh2 errs: wr 0, rd 0, flush 0, corrupt 650527, gen 0
[ 2065.475682] BTRFS info (device sdh2): using crc32c (crc32c-intel) checksum algorithm
[ 2065.475690] BTRFS info (device sdh2): allowing degraded mounts
[ 2065.475692] BTRFS info (device sdh2): using free space tree
[ 2065.487765] BTRFS warning (device sdh2): devid 2 uuid 0705f155-7d21-4da9-bedf-3167801464de is missing
[ 2132.878607] BTRFS error (device sdh2): parent transid verify failed on logical 372490240 mirror 1 wanted 15715 found 14986
[ 2132.878641] BTRFS error (device sdh2: state A): Transaction aborted (error -5)
[ 2132.878653] BTRFS: error (device sdh2: state A) in __btrfs_free_extent:3114: errno=-5 IO failure
[ 2132.878660] BTRFS info (device sdh2: state EA): forced readonly
[ 2132.878662] BTRFS error (device sdh2: state EA): failed to run delayed ref for logical 330629120 num_bytes 16384 type 176 action 2 ref_mod 1: -5
[ 2132.878669] BTRFS: error (device sdh2: state EA) in btrfs_run_delayed_refs:2182: errno=-5 IO failure
[ 2132.878675] BTRFS warning (device sdh2: state EA): Skipping commit of aborted transaction.
[ 2132.878677] BTRFS: error (device sdh2: state EA) in cleanup_transaction:1997: errno=-5 IO failure

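PS: for reference, I only had a RAID1 setup. In your real-world case, you might need to revive mdadm first?

It feels like the metadata needs rescuing, and some data at the tail of the partition may be blown. To be safe (念のため), dd another copy of the disk to local storage first, so that a botched recovery can be restarted at any time: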

# dd if=/dev/sdh of=/storage-hdd/archive-data.img bs=4M status=progress
150994944 bytes (151 MB, 144 MiB) copied, 1 s, 150 MB/s
3000525389824 bytes (3.0 TB, 2.7 TiB) copied, 18379 s, 163 MB/s
715397+1 records in
715397+1 records out
3000592982016 bytes (3.0 TB, 2.7 TiB) copied, 18380.9 s, 163 MB/s

Then run btrfs check:

# btrfs check /dev/mapper/loop1p2 
Opening filesystem to check...
warning, device 2 is missing
warning, device 2 is missing
Checking filesystem on /dev/mapper/loop1p2
UUID: 4ae7d27b-c892-4ed2-8b6b-b97a82837e6e
[1/7] checking root items
parent transid verify failed on 362463232 wanted 14999 found 14986
parent transid verify failed on 362463232 wanted 14999 found 14986
Ignoring transid failure
parent transid verify failed on 261079040 wanted 15642 found 999
parent transid verify failed on 261079040 wanted 15642 found 999
Ignoring transid failure
parent transid verify failed on 363921408 wanted 15657 found 14986
parent transid verify failed on 363921408 wanted 15657 found 14986
Ignoring transid failure
parent transid verify failed on 277495808 wanted 15689 found 13418
parent transid verify failed on 277495808 wanted 15689 found 13418
Ignoring transid failure
parent transid verify failed on 822509568 wanted 15700 found 14986
parent transid verify failed on 822509568 wanted 15700 found 14986
Ignoring transid failure
parent transid verify failed on 366002176 wanted 14999 found 14986
parent transid verify failed on 366002176 wanted 14999 found 14986
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=66 parent level=1 child bytenr=366002176 child level=1
ERROR: failed to repair root items: Input/output error
[2/7] checking extents
parent transid verify failed on 883408896 wanted 15716 found 14987
parent transid verify failed on 883408896 wanted 15716 found 14987
Ignoring transid failure
parent transid verify failed on 871890944 wanted 15762 found 14989
parent transid verify failed on 871890944 wanted 15762 found 14989
Ignoring transid failure
parent transid verify failed on 871890944 wanted 15762 found 14989
Ignoring transid failure
parent transid verify failed on 127188992 wanted 15024 found 14989
parent transid verify failed on 127188992 wanted 15024 found 14989
Ignoring transid failure
extent back ref already exists for 245366784 parent 0 root 7
(omitted)
extent back ref already exists for 249921536 parent 0 root 7
extent back ref already exists for 249937920 parent 0 root 7
parent transid verify failed on 251297792 wanted 14994 found 11739
parent transid verify failed on 251297792 wanted 14994 found 11739
Ignoring transid failure
parent transid verify failed on 362463232 wanted 14999 found 14986
Ignoring transid failure
(omitted)
parent transid verify failed on 366002176 wanted 14999 found 14986
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=66 parent level=1 child bytenr=366002176 child level=1
(omitted)
parent transid verify failed on 366002176 wanted 14999 found 14986
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=66 parent level=1 child bytenr=366002176 child level=1
parent transid verify failed on 871940096 wanted 15704 found 14989
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 277495808 wanted 15689 found 13418
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 871940096 wanted 15704 found 14989
Ignoring transid failure
parent transid verify failed on 372932608 wanted 15391 found 14986
parent transid verify failed on 372932608 wanted 15391 found 14986
Ignoring transid failure
parent transid verify failed on 372932608 wanted 15391 found 14986
Ignoring transid failure
parent transid verify failed on 372932608 wanted 15391 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 373440512 wanted 15372 found 14986
Ignoring transid failure
parent transid verify failed on 883392512 wanted 15716 found 14987
parent transid verify failed on 883392512 wanted 15716 found 14987
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=72 parent level=1 child bytenr=883392512 child level=1
parent transid verify failed on 883392512 wanted 15716 found 14987
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=72 parent level=1 child bytenr=883392512 child level=1
parent transid verify failed on 883392512 wanted 15716 found 14987
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=72 parent level=1 child bytenr=883392512 child level=1
parent transid verify failed on 883392512 wanted 15716 found 14987
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=72 parent level=1 child bytenr=883392512 child level=1
parent transid verify failed on 883392512 wanted 15716 found 14987
Ignoring transid failure
fish: Job 1, 'btrfs check /dev/mapper/loop1p2' terminated by signal SIGSEGV (Address boundary error)

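Take a snapshot of the dd image yourself, then map it to /dev/loopxxx with kpartx -a (see Tutorial [19], "Mounting virtual disks in PVE & skipping the account-creation part of Windows OOBE").

While we're at it, here is the data-crematorium move (good kids, never try this at home):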

# btrfs check --repair /dev/mapper/loop1p2 
enabling repair mode
WARNING:

  Do not use --repair unless you are advised to do so by a developer
  or an experienced user, and then only after having accepted that no
  fsck can successfully repair all types of filesystem corruption. Eg.
  some software or hardware bugs can fatally damage a volume.
  The operation will start in 10 seconds.
  Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1

Starting repair.
Opening filesystem to check...
warning, device 2 is missing
warning, device 2 is missing
Checking filesystem on /dev/mapper/loop1p2
UUID: 4ae7d27b-c892-4ed2-8b6b-b97a82837e6e
[1/7] checking root items
parent transid verify failed on 362463232 wanted 14999 found 14986
parent transid verify failed on 362463232 wanted 14999 found 14986
Ignoring transid failure
parent transid verify failed on 261079040 wanted 15642 found 999
parent transid verify failed on 261079040 wanted 15642 found 999
Ignoring transid failure
parent transid verify failed on 363921408 wanted 15657 found 14986
parent transid verify failed on 363921408 wanted 15657 found 14986
Ignoring transid failure
parent transid verify failed on 277495808 wanted 15689 found 13418
parent transid verify failed on 277495808 wanted 15689 found 13418
Ignoring transid failure
parent transid verify failed on 822509568 wanted 15700 found 14986
parent transid verify failed on 822509568 wanted 15700 found 14986
Ignoring transid failure
parent transid verify failed on 366002176 wanted 14999 found 14986
parent transid verify failed on 366002176 wanted 14999 found 14986
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=134348800 item=66 parent level=1 child bytenr=366002176 child level=1
ERROR: failed to repair root items: Input/output error

If you feel the operation made things even worse, restore the snapshot and run sync.

Then bring in btrfs restore:

# btrfs restore /dev/mapper/loop1p2 /storage-hdd/rescue
warning, device 2 is missing
warning, device 2 is missing
(omitted)
ERROR: zstd decompress failed Unknown frame descriptor

ERROR: copying data for /storage-hdd/rescue/h-anime/xxxx.mkv failed
ERROR: searching directory /storage-hdd/rescue/h-anime/xxxx.mkv failed: -1
ERROR: searching directory /storage-hdd/rescue/h-anime/xxxx.mkv failed: -1

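How much you get back is anyone's guess.

The lesson: next time, do the migration with rsync / rclone / btrfs send, and don't cut corners with dd and end up at the data crematorium.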
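
As a rough illustration of the btrfs send route, the helper below only prints the two commands involved so they can be reviewed before being run by hand. It is a sketch under assumptions: `plan_btrfs_migration` is a made-up name, the snapshot name `.migrate-snap` and both paths are placeholders, and the destination must already be a mounted btrfs filesystem.

```shell
# Print (not run) the commands for a send/receive migration.
# $1: mounted source subvolume, $2: mounted destination btrfs filesystem.
plan_btrfs_migration() {
    src="${1%/}"
    dst="$2"
    snap="${src}/.migrate-snap"
    # btrfs send needs a read-only snapshot as its source.
    echo "btrfs subvolume snapshot -r ${src} ${snap}"
    # Stream the snapshot into the destination filesystem.
    echo "btrfs send ${snap} | btrfs receive ${dst}"
}

plan_btrfs_migration /storage /mnt/new-disk
```

Unlike dd, this copies files and metadata through the filesystem layer, so the destination gets its own device layout and checksums regardless of disk size.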

Dependency problem installing libguestfs-tools on PVE

Installing libguestfs-tools on PVE (Proxmox VE) ran into a dependency conflict:
PVE LTT edition

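What kind of LTT move is this?

Checking the dependencies showed that the supermin package, for whatever reason, hard-depends on Debian's kernel package, while the pve-kernel packaging doesn't declare that it is also a linux-kernel.

Temporary workaround: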

apt download supermin
dpkg -i supermin*.deb
apt install libguestfs-tools

In use, this has basically no ill effect.

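Arch Linux: keep using the old kernel after an update without rebooting

After a kernel update on Arch Linux, the old linux-*-headers get cleaned up along with it, so some modules stop working until you reboot.
Installing the kernel-modules-hook package solves this.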
afterinstall hook

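In practice it just installs a hook that copies the deleted files back.

Also, unless you have special needs, I'd recommend just settling down with linux-lts (although it too gets updated quite often).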