A Detailed Look at the Backoff Retry Mechanism in Kubernetes

2025/02/17 20:16, posted in Kubernetes

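Kubernetes retries in many places, for example when restarting a Pod or when retrying a PVC mount through CSI. After a failure, the wait before each retry usually grows exponentially. This article walks through how that wait-and-retry mechanism works.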

The CrashLoopBackOff state of a Pod

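Anyone who uses Kubernetes regularly has probably run into CrashLoopBackOff, one of the more common abnormal Pod states. It typically appears when a process inside the Pod fails to start or exits unexpectedly (non-zero exit code), the Pod's restart policy is OnFailure or Always, and kubelet has therefore restarted the failed container.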

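The state means the Pod is stuck in a loop of failing and being restarted, and kubelet waits an exponentially growing amount of time before each restart. That wait is implemented with backoff. Here is the relevant code: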

// If a container is still in backoff, the function will return a brief backoff error and
// a detailed error message.
func (m *kubeGenericRuntimeManager) doBackOff(pod *v1.Pod, container *v1.Container, podStatus *kubecontainer.PodStatus, backOff *flowcontrol.Backoff) (bool, string, error) {
	var cStatus *kubecontainer.Status
	for _, c := range podStatus.ContainerStatuses {
		if c.Name == container.Name && c.State == kubecontainer.ContainerStateExited {
			cStatus = c
			break
		}
	}

	if cStatus == nil {
		return false, "", nil
	}

	klog.V(3).InfoS("Checking backoff for container in pod", "containerName", container.Name, "pod", klog.KObj(pod))
	// Use the finished time of the latest exited container as the start point to calculate whether to do back-off.
	ts := cStatus.FinishedAt
	// backOff requires a unique key to identify the container.
	key := getStableKey(pod, container)
	if backOff.IsInBackOffSince(key, ts) {
		if containerRef, err := kubecontainer.GenerateContainerRef(pod, container); err == nil {
			m.recorder.Eventf(containerRef, v1.EventTypeWarning, events.BackOffStartContainer,
				fmt.Sprintf("Back-off restarting failed container %s in pod %s", container.Name, format.Pod(pod)))
		}
		err := fmt.Errorf("back-off %s restarting failed container=%s pod=%s", backOff.Get(key), container.Name, format.Pod(pod))
		klog.V(3).InfoS("Back-off restarting failed container", "err", err.Error())
		return true, err.Error(), kubecontainer.ErrCrashLoopBackOff
	}

	backOff.Next(key, ts)
	return false, "", nil
}

How to use backoff

Using backoff is simple. You only need two methods, .IsInBackOffSince and .Next:

package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func startBackoff() {
	// Initial wait of 5s, capped at 60s.
	backOff := flowcontrol.NewBackOff(5*time.Second, 60*time.Second)
	backOffID := "test"

	lastDo := time.Now()
	t := time.NewTicker(1 * time.Second)
	defer t.Stop()
	for range t.C {
		if backOff.IsInBackOffSince(backOffID, lastDo) { // still waiting, skip this tick
			continue
		}
		fmt.Printf("doing work after %s\n", time.Since(lastDo))
		backOff.Next(backOffID, time.Now()) // record this run and grow the wait
		lastDo = time.Now()
	}
}

func main() {
	startBackoff()
}

The code above prints:

doing work after 1.001035775s
doing work after 5.999162394s
doing work after 10.9999193s
doing work after 21.000754631s
doing work after 40.999154124s
...

You can also reset the backoff for a specific id:

backOff.Reset(backOffID)

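To clean up stale ids (entries that have not been updated for more than twice the maximum duration), call GC. It does not remove entries that are still active: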

backOff.GC()

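How backoff is implemented

The backoff implementation is only about a hundred lines of code, short and to the point. The main struct records, for each id, when the task was last updated and how long it currently has to wait.

When a run is recorded, Next sets the wait to twice the previous wait, adds optional jitter, and caps the result at the maximum duration. That doubling is what makes the wait grow exponentially: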

func (p *Backoff) Next(id string, eventTime time.Time) {
	p.Lock()
	defer p.Unlock()
	entry, ok := p.perItemBackoff[id]
	if !ok || hasExpired(eventTime, entry.lastUpdate, p.maxDuration) {
		entry = p.initEntryUnsafe(id)
		entry.backoff += p.jitter(entry.backoff)
	} else {
		delay := entry.backoff * 2       // exponential
		delay += p.jitter(entry.backoff) // add some jitter to the delay
		entry.backoff = min(delay, p.maxDuration)
	}
	entry.lastUpdate = p.Clock.Now()
}

To decide whether the work may run now, it only needs to check whether the wait has elapsed since the given event time:

func (p *Backoff) IsInBackOffSince(id string, eventTime time.Time) bool {
	p.RLock()
	defer p.RUnlock()
	entry, ok := p.perItemBackoff[id]
	if !ok {
		return false
	}
	if hasExpired(eventTime, entry.lastUpdate, p.maxDuration) {
		return false
	}
	return p.Clock.Since(eventTime) < entry.backoff
}