Automatically renewing SSL certificates on Kubernetes with cert-manager
Installing cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.3/cert-manager.yaml
kubectl get pods --namespace cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-5c6866597-zw7kh 1/1 Running 0 2m
cert-manager-cainjector-577f6d9fd7-tr77l 1/1 Running 0 2m
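cert-manager-webhook-787858fcdb-nlzsq 1/1 Running 0 2m
Configuring a Let's Encrypt issuer
The Let's Encrypt production issuer has very strict rate limits, and they are easy to hit while experimenting and learning. To avoid that, start with the Let's Encrypt staging issuer and switch to the production issuer once everything works.
Note that you will see warnings about untrusted certificates issued by the staging issuer; this is expected.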
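The switch between staging and production comes down to which ACME directory URL the issuer points at. A minimal sketch, assuming a hypothetical helper named acme_directory_url (it is not part of cert-manager); the two URLs are the ones used in the issuer manifests in this post:

```shell
# Map an environment name to the Let's Encrypt ACME directory URL.
# "acme_directory_url" is a hypothetical helper name used for illustration.
acme_directory_url() {
  case "$1" in
    staging)    echo "https://acme-staging-v02.api.letsencrypt.org/directory" ;;
    production) echo "https://acme-v02.api.letsencrypt.org/directory" ;;
    *)          echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Print the value to put in the issuer's spec.acme.server field.
acme_directory_url staging
```

Keeping the choice in one place like this makes it harder to leave a test issuer pointed at the production endpoint by accident.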
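There are two kinds of issuer: `Issuer` and `ClusterIssuer`. An `Issuer` only works within a single namespace, while a `ClusterIssuer` can be used from every namespace in the cluster. The YAML is the same for both; only the `kind` field says whether it is an `Issuer` or a `ClusterIssuer`.
Challenge types
Requesting a certificate from Let's Encrypt requires proving that you control the domain. There are two validation methods: http01 and dns01.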
http01
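With http01, cert-manager creates a temporary HTTP service in your Kubernetes cluster while the certificate is being requested, and that service answers Let's Encrypt's validation request. Let's Encrypt's servers try to reach the domain you specified over the public internet and fetch a specific validation file from that service. If the file can be fetched and verified, issuance continues.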
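To make the validation request concrete, here is a small sketch of the URL the ACME server fetches. The path is the one defined by the ACME standard (RFC 8555); the helper name http01_challenge_url and the token value are placeholders for illustration only:

```shell
# Build the URL that the ACME server requests during an HTTP-01 challenge.
# RFC 8555 fixes the path to /.well-known/acme-challenge/<token>, and the
# first request is always made over plain HTTP on port 80.
http01_challenge_url() {
  domain="$1"
  token="$2"
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$domain" "$token"
}

# EXAMPLE_TOKEN is a placeholder; the real token is issued by the ACME server.
http01_challenge_url abc.kaside365.com EXAMPLE_TOKEN
```

While a challenge is pending, requesting that URL with curl from outside the cluster is a quick way to check that port 80 and the ingress route are reachable.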
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # The ACME server URL
    # The staging endpoint is meant for testing and is far less likely to hit rate limits
    # server: https://acme-v02.api.letsencrypt.org/directory
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: <your-email>
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx # ingress controller class; nginx-ingress-controller is used here
dns01
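If your environment cannot be reached from the public internet, consider the dns01 validation method instead. It proves domain ownership through a DNS record, so no direct inbound access from the internet is needed.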
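The DNS record in question is a TXT record under the _acme-challenge label, which is also the name that appears in the cert-manager logs at the end of this post. A minimal sketch of how that name is derived, using a hypothetical helper called dns01_record_name:

```shell
# Derive the TXT record name checked during a DNS-01 challenge.
# A leading "*." is stripped because a wildcard name is validated through
# the record of its base domain.
dns01_record_name() {
  domain="${1#\*.}"
  printf '_acme-challenge.%s\n' "$domain"
}

dns01_record_name abc.kaside365.com
dns01_record_name '*.kaside365.com'
```

While a challenge is pending, something like `dig +short TXT "$(dns01_record_name abc.kaside365.com)"` should return the value cert-manager published, assuming dig is installed.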
cert-manager has built-in dns01 support for the following providers:
ACMEDNS
Akamai
AzureDNS
CloudFlare
Google
Route53
DigitalOcean
RFC2136
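The example below uses CloudFlare.
The domain must either be registered with CloudFlare or have its DNS managed by CloudFlare.
Create an API token:
My Profile -- API Tokens
When assigning permissions, grant only the items listed below.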
Permissions
Zone - DNS - Edit
Zone - Zone - Read
Zone Resources:
Include - All Zones
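Example YAML for a ClusterIssuer or Issuer
The Secret holding the CloudFlare API credential must live where cert-manager looks for it: for an Issuer, the Issuer's own namespace; for a ClusterIssuer, the namespace the cert-manager pods run in (the default cluster resource namespace). Otherwise you get: error getting cloudflare secret: secrets "cloudflare-api-key-secret" not found
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-key-secret
  namespace: kube-system
type: Opaque
stringData:
  api-key: <token-str>
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
  namespace: kube-system # for an Issuer, use the namespace of the service being deployed; a ClusterIssuer is recommended
spec:
  acme:
    # server: https://acme-staging-v02.api.letsencrypt.org/directory
    server: https://acme-v02.api.letsencrypt.org/directory # production endpoint; use the staging URL above while testing
    email: <your-email> # address that receives Let's Encrypt mail
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - dns01:
          cloudflare:
            email: <your-email> # CloudFlare account email
            # An API token is referenced with apiTokenSecretRef;
            # apiKeySecretRef is only for the legacy Global API Key
            apiTokenSecretRef:
              name: cloudflare-api-key-secret
              key: api-key
Example configuration for providers without built-in DNS01 support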
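Some DNS providers, such as Alibaba Cloud DNS (AliDNS), are not covered by cert-manager's built-in DNS01 solvers, so a webhook service has to be deployed to relay the challenge.
The setup below uses a cert-manager webhook implementation for AliDNS.
Request an API key/secret:
Create it on the AccessKey page of your Alibaba Cloud account.
When assigning permissions, grant all DNS-related permissions.
Deploying the alidns webhook
Helm 3 is required and must be installed beforehand.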
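One way to confirm the Helm requirement before continuing is to look at the major version. This is only a sketch: helm_major_version is a hypothetical helper, and it assumes the `helm version --short` output looks like "v3.13.2+g2a2fb3b", as it does for Helm 3:

```shell
# Extract the major version from `helm version --short` style output.
# Helm 2 prefixes its output with "Client: ", so nothing is printed for it.
helm_major_version() {
  printf '%s\n' "$1" | sed -n 's/^v\([0-9][0-9]*\)\..*/\1/p'
}

# Usage against a real install (requires helm on PATH):
#   [ "$(helm_major_version "$(helm version --short)")" = "3" ] || echo "Helm 3 is required" >&2
helm_major_version "v3.13.2+g2a2fb3b"
```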
helm repo add cert-manager-alidns-webhook https://devmachine-fr.github.io/cert-manager-alidns-webhook
helm repo update
helm pull cert-manager-alidns-webhook/alidns-webhook
tar -zxf alidns-webhook-0.7.0.tgz
Adjust a few parameters in values.yaml to fit your environment
# groupName: example.com
# certManager:
#   namespace: cert-manager
#   serviceAccountName: cert-manager
# image:
#   repository: ghcr.io/devmachine-fr/cert-manager-alidns-webhook/cert-manager-alidns-webhook
#   tag: 0.2.0
#   pullPolicy: IfNotPresent
#   privateRegistry:
#     enabled: false
#     dockerRegistrySecret: alibaba-container-registry
# The original settings are above; the modified ones are below
groupName: mydomain.com # referenced later in the ClusterIssuer
certManager:
  namespace: cert-manager
  serviceAccountName: cert-manager
# Change the image repository. Ideally pull the image and push it to a private registry,
# and remember to update the privateRegistry credentials as well.
image:
  repository: hub.mydomain.com/cert-manager-alidns-webhook
  tag: 0.2.0
  pullPolicy: IfNotPresent
  privateRegistry:
    enabled: true
    dockerRegistrySecret: default-secret
Deploying the alidns webhook service with helm
helm install -f alidns-webhook/values.yaml alidns-webhook alidns-webhook/
helm list | grep alidns-webhook
ClusterIssuer YAML for dns01 with alidns-webhook
The Secret holding the DNS API credentials must be in the same namespace as the cert-manager pods. Otherwise you get an error like: error getting secret: secrets "..." not found
apiVersion: v1
kind: Secret
metadata:
  name: alidns-secrets
  namespace: cert-manager
type: Opaque
stringData:
  # aliyun DNS user key/secret
  access-token: xxx
  secret-key: xxxxx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-dns01
spec:
  acme:
    email: <your-email>
    # The staging server is meant for testing and is far less likely to hit rate limits
    # server: https://acme-staging-v02.api.letsencrypt.org/directory
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-dns01
    solvers:
      - dns01:
          webhook:
            config:
              accessTokenSecretRef:
                key: access-token
                name: alidns-secrets
              regionId: cn-beijing
              secretKeySecretRef:
                key: secret-key
                name: alidns-secrets
            groupName: mydomain.com # must match the groupName in the alidns-webhook values.yaml
            solverName: alidns-solver
Configuring Ingress TLS
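Before configuring Ingress TLS, create a Certificate resource for the Ingress. It manages the issued certificate automatically and stores it in the referenced Secret. (When the Ingress carries the cert-manager.io/issuer annotation, cert-manager can also create this Certificate for you.)
Certificate example
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-ingress-ssl
  namespace: kube-system
spec:
  secretName: test-ingress-ssl # Secret that is managed automatically; referenced by the Ingress
  issuerRef:
    name: letsencrypt-staging # must match the Issuer's metadata.name
    kind: Issuer
  dnsNames:
    - abc.kaside365.com # domain to request the SSL certificate for
Ingress example
Single-domain configuration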
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress-ssl
  namespace: kube-system
  annotations:
    # Name of the Issuer to use
    cert-manager.io/issuer: "letsencrypt-staging"
spec:
  ingressClassName: nginx
  tls:
    - secretName: test-ingress-ssl # the secretName set in the Certificate
      hosts:
        - abc.kaside365.com
  rules:
    - host: abc.kaside365.com
      http:
        paths:
          - backend:
              service:
                name: web-1
                port:
                  number: 80
            path: /
            pathType: Prefix
Multi-domain configuration
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress-ssl
  namespace: kube-system
  annotations:
    # Name of the Issuer to use
    cert-manager.io/issuer: "letsencrypt-staging"
spec:
  ingressClassName: nginx
  tls:
    # The secretName set in the Certificate. For the same service,
    # several domains can share one Certificate and one Secret.
    - secretName: test-ingress-ssl
      hosts:
        - abc.kaside365.com
        - cdb.kaside365.com
  rules:
    - host: abc.kaside365.com
      http:
        paths:
          - backend:
              service:
                name: web-1
                port:
                  number: 80
            path: /
            pathType: Prefix
    - host: cdb.kaside365.com
      http:
        paths:
          - backend:
              service:
                name: web-2
                port:
                  number: 80
            path: /
            pathType: Prefix
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-ingress-ssl
  namespace: kube-system
spec:
  secretName: test-ingress-ssl # Secret that is managed automatically; referenced by the Ingress
  issuerRef:
    name: letsencrypt-staging # must match the Issuer's metadata.name
    kind: Issuer
  dnsNames:
    - abc.kaside365.com # domain to request the SSL certificate for
    - cdb.kaside365.com # domain to request the SSL certificate for
Verification and troubleshooting
# The certificate's READY status is False because issuance has not finished yet. If it stays that way for too long, use the steps below to find out why
kubectl get certificate -A
NAMESPACE NAME READY SECRET AGE
kube-system test-ingress-ssl False test-ingress-ssl 1m
# Show the detailed description of the certificate
kubectl describe -n kube-system certificate test-ingress-ssl
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Issuing 1m cert-manager-certificates-trigger Issuing certificate as Existing issued Secret is not up to date for spec: [spec.commonName spec.dnsNames]
Normal Reused 1m cert-manager-certificates-key-manager Reusing private key stored in existing Secret resource "test-ingress-ssl"
Normal Requested 1m cert-manager-certificates-request-manager Created new CertificateRequest resource "test-ingress-ssl-1"
# Show the detailed description of the certificaterequest
kubectl describe -n kube-system certificaterequest test-ingress-ssl-1
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitingForApproval 15s cert-manager-certificaterequests-issuer-acme Not signing CertificateRequest until it is Approved
Normal WaitingForApproval 15s cert-manager-certificaterequests-issuer-venafi Not signing CertificateRequest until it is Approved
Normal WaitingForApproval 15s cert-manager-certificaterequests-issuer-selfsigned Not signing CertificateRequest until it is Approved
Normal WaitingForApproval 15s cert-manager-certificaterequests-issuer-vault Not signing CertificateRequest until it is Approved
Normal WaitingForApproval 15s cert-manager-certificaterequests-issuer-ca Not signing CertificateRequest until it is Approved
Normal cert-manager.io 14s cert-manager-certificaterequests-approver Certificate request has been approved by cert-manager.io
Normal IssuerNotReady 14s cert-manager-certificaterequests-issuer-acme Referenced issuer does not have a Ready status condition
# Check the cert-manager pod logs; a message like condition "Ready": "False" -> "True" means issuance has basically succeeded
kubectl logs -f --tail=20 -n cert-manager cert-manager-7d75f47cc5-dgbtb
I1221 06:20:10.269145 1 conditions.go:192] Found status change for Certificate "test-ingress-ssl" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2023-12-21 06:20:10.269136695 +0000 UTC m=+1399633.444241060
# Finally, the certificate's status changes from False to True
kubectl get certificate -A
NAMESPACE NAME READY SECRET AGE
kube-system test-ingress-ssl True test-ingress-ssl 3m
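With more than a handful of certificates, the first two steps above can be scripted. A rough sketch, assuming the column layout shown in the `kubectl get certificate -A` output above; describe_not_ready is a hypothetical helper that only prints the follow-up commands rather than running them:

```shell
# Read `kubectl get certificate -A` output on stdin and print a describe
# command for every Certificate whose READY column is not "True".
describe_not_ready() {
  awk 'NR > 1 && $3 != "True" { printf "kubectl describe -n %s certificate %s\n", $1, $2 }'
}

# Usage against a live cluster:
#   kubectl get certificate -A | describe_not_ready
# Demonstration with the sample output from this post:
printf '%s\n' \
  'NAMESPACE NAME READY SECRET AGE' \
  'kube-system test-ingress-ssl False test-ingress-ssl 1m' | describe_not_ready
```

Printing the commands instead of executing them keeps the helper safe to run anywhere; pipe its output to sh once the list looks right.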
Error handling
I1221 06:11:40.216899 1 conditions.go:96] Setting lastTransitionTime for Issuer "letsencrypt-staging" condition "Ready" to 2023-12-21 06:11:40.216888974 +0000 UTC m=+1399123.391993341
I1221 06:11:40.233005 1 setup.go:208] "cert-manager/issuers: skipping re-verifying ACME account as cached registration details look sufficient" resource_name="letsencrypt-staging" resource_namespace="kube-system" resource_kind="Issuer" resource_version="v1" related_resource_name="letsencrypt-staging" related_resource_namespace="kube-system" related_resource_kind="Secret"
I1221 06:11:40.259094 1 controller.go:162] "cert-manager/certificaterequests-issuer-acme: re-queuing item due to optimistic locking on resource" key="kube-system/test-ingress-ssl-1" error="Operation cannot be fulfilled on certificaterequests.cert-manager.io \"test-ingress-ssl-1\": the object has been modified; please apply your changes to the latest version and try again"
E1221 06:11:53.767531 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err=<
while attempting to find Zones for domain _acme-challenge.kaside365.com.
while querying the Cloudflare API for GET "/zones?name=_acme-challenge.kaside365.com"
Error: 6003: Invalid request headers<- 6103: Invalid format for X-Auth-Key header
> key="kube-system/test-ingress-ssl-1-1799247307-1606112995"
E1221 06:11:55.947838 1 controller.go:167] "cert-manager/challenges: re-queuing item due to error processing" err=<
while attempting to find Zones for domain _acme-challenge.kaside365.com.
while querying the Cloudflare API for GET "/zones?name=_acme-challenge.kaside365.com"
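Error: 6003: Invalid request headers<- 6103: Invalid format for X-Auth-Key header
> key="kube-system/test-ingress-ssl-1-1799247307-1606112995"
This is an API authentication error. The X-Auth-Key header is only sent for the legacy Global API Key, so the message usually means an API token was referenced through apiKeySecretRef. Reference the token with apiTokenSecretRef instead, and check that the token has the correct permissions.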
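The X-Auth-Key message makes more sense once the two CloudFlare credential types are side by side. A small sketch of the headers each one uses; cf_auth_headers is a hypothetical helper for illustration and the values are placeholders:

```shell
# Print the HTTP auth headers CloudFlare expects for each credential type.
# An API token is sent as a Bearer token; the legacy Global API Key is sent
# as X-Auth-Key together with the account email in X-Auth-Email.
cf_auth_headers() {
  kind="$1"
  secret="$2"
  email="$3"
  case "$kind" in
    token) printf 'Authorization: Bearer %s\n' "$secret" ;;
    key)   printf 'X-Auth-Email: %s\nX-Auth-Key: %s\n' "$email" "$secret" ;;
    *)     echo "unknown credential kind: $kind" >&2; return 1 ;;
  esac
}

cf_auth_headers token EXAMPLE_TOKEN
```

In cert-manager terms, apiTokenSecretRef corresponds to the first form and apiKeySecretRef to the second. A token can also be checked outside the cluster by sending the Bearer header to CloudFlare's token verification endpoint, https://api.cloudflare.com/client/v4/user/tokens/verify, with curl.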