CKS Notes - Image Security

Some of these posts were migrated from my previous blogs. I reviewed and updated them before the migration, but some parts may still be out of date. (Previous blogs: https://blog.sina.com.cn/u/1784323047 or https://blog.csdn.net/li_6698230?type=blog, if they are still accessible.)

ImagePolicyWebhook

Like NodeRestriction, ImagePolicyWebhook is also an “Admission Controller” plugin.

As we said before, the “Admission Controller” works after authentication and authorization (RBAC).

Authentication → Authorization → Admission (NodeRestriction / ImagePolicyWebhook)

  • NodeRestriction: limits which Node and Pod objects a kubelet is allowed to modify.

  • ImagePolicyWebhook: validates the container images used in a Pod before the Pod is admitted.

In short, before Kubernetes allows a Pod to run, the plugin calls an external image security service and asks: is this image allowed?

1. External service

The external service works as follows:

The Kubernetes API server sends an HTTPS POST request to an external webhook server to ask:
“Is this image allowed?”

That server might be:

  • A company’s internal image scanner

  • A vulnerability scanner (Trivy server)

  • A signature validator (Notary v1/2, Cosign webhook)

  • A custom Python/Go program

  • A local service on the same node (like https://localhost:1234)

  • A remote service outside the cluster

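The API server only sends image information (the container images, plus the Pod's namespace and any annotations matching *.image-policy.k8s.io/*).
The external service decides what to check.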
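To make the exchange concrete, here is a minimal sketch of the decision logic such a service could implement. The request and response follow the `ImageReview` object (`imagepolicy.k8s.io/v1alpha1`) described in the official documentation. The `ALLOWED_REGISTRIES` list and the "tag must be pinned" rule are hypothetical policies chosen only for illustration, and the HTTPS server around the function is left out.

```python
import json

# Hypothetical policy: only these registry prefixes are trusted.
ALLOWED_REGISTRIES = ("registry.example.com/", "docker.io/library/")


def review(image_review: dict) -> dict:
    """Evaluate an ImageReview request and return an ImageReview with status set."""
    containers = image_review.get("spec", {}).get("containers", [])
    allowed, reason = True, ""
    for container in containers:
        image = container.get("image", "")
        if not image.startswith(ALLOWED_REGISTRIES):
            allowed, reason = False, f"registry not trusted: {image}"
            break
        # Treat a missing tag or ":latest" as unpinned (illustrative rule).
        last_part = image.rsplit("/", 1)[-1]
        if ":" not in last_part or last_part.endswith(":latest"):
            allowed, reason = False, f"image tag not pinned: {image}"
            break
    status = {"allowed": allowed}
    if not allowed:
        status["reason"] = reason
    return {
        "apiVersion": "imagepolicy.k8s.io/v1alpha1",
        "kind": "ImageReview",
        "status": status,
    }


# Example request in the shape the API server sends.
request = {
    "apiVersion": "imagepolicy.k8s.io/v1alpha1",
    "kind": "ImageReview",
    "spec": {
        "containers": [{"image": "registry.example.com/app:1.4.2"}],
        "namespace": "default",
    },
}
print(json.dumps(review(request)["status"]))
```

The API server only looks at `status.allowed` (and shows `status.reason` when the Pod is rejected), so scanners or signature checks can sit behind the same function without changing the contract.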

2. Admission controller configuration file

The admission controller configuration file can be in either YAML or JSON format:

{
  "apiVersion": "apiserver.config.k8s.io/v1",
  "kind": "AdmissionConfiguration",
  "plugins": [
    {
      "name": "ImagePolicyWebhook",
      "configuration": {
        "imagePolicy": {
          "kubeConfigFile": "/path/to/<kubeconfig_file>",
          "allowTTL": 100,
          "denyTTL": 50,
          "retryBackoff": 500,
          "defaultAllow": false
        }
      }
    }
  ]
}
  • kubeConfigFile: the kubeconfig file the API server uses to call the external webhook

  • allowTTL: cache “allowed” decisions for 100 seconds

  • denyTTL: cache “denied” decisions for 50 seconds

  • retryBackoff: if the webhook call fails, wait 500 ms before retrying

  • defaultAllow: false means that if the webhook cannot be reached, Pods are denied by default (strict mode)
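A small sanity check can catch typos in this file before the API server is restarted with it. The helper below is only a sketch and not part of Kubernetes: `check_image_policy` and its messages are hypothetical, it handles only the inline JSON form shown above, and it treats all five fields as required, which may be stricter than the API server itself.

```python
import json

# Expected Python type for each imagePolicy field described above.
EXPECTED_TYPES = {
    "kubeConfigFile": str,
    "allowTTL": int,
    "denyTTL": int,
    "retryBackoff": int,
    "defaultAllow": bool,
}


def check_image_policy(config_text: str) -> list:
    """Return a list of problems found in an AdmissionConfiguration JSON document."""
    doc = json.loads(config_text)
    problems = []
    if doc.get("kind") != "AdmissionConfiguration":
        problems.append("kind is not AdmissionConfiguration")
    plugins = [p for p in doc.get("plugins", []) if p.get("name") == "ImagePolicyWebhook"]
    if not plugins:
        return problems + ["no ImagePolicyWebhook plugin entry"]
    policy = plugins[0].get("configuration", {}).get("imagePolicy", {})
    for field, expected in EXPECTED_TYPES.items():
        value = policy.get(field)
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            problems.append(f"{field} missing or not {expected.__name__}")
    if policy.get("defaultAllow") is True:
        problems.append("defaultAllow is true: pods are admitted when the webhook is down")
    return problems


sample = json.dumps({
    "apiVersion": "apiserver.config.k8s.io/v1",
    "kind": "AdmissionConfiguration",
    "plugins": [{
        "name": "ImagePolicyWebhook",
        "configuration": {"imagePolicy": {
            "kubeConfigFile": "/path/to/<kubeconfig_file>",
            "allowTTL": 100, "denyTTL": 50,
            "retryBackoff": 500, "defaultAllow": False,
        }},
    }],
})
print(check_image_policy(sample))  # an empty list means no problems were found
```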

3. KubeConfigFile

The kubeconfig tells the API server:

  • Which webhook server to call

  • Which TLS certs to use for mutual TLS

  • Which CA to trust for server verification

Example:

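apiVersion: v1
kind: Config
# clusters: the external webhook service
clusters:
- cluster:
    certificate-authority: /etc/kubernetes/policywebhook/external-cert.pem
    server: https://localhost:1234
  name: image-checker
# contexts: ties the webhook cluster to the API server's client identity
contexts:
- context:
    cluster: image-checker
    user: api-server
  name: image-checker
current-context: image-checker
# users: the client certificate the API server presents to the webhook
users:
- name: api-server
  user:
    client-certificate: /etc/kubernetes/policywebhook/apiserver-client-cert.pem
    client-key: /etc/kubernetes/policywebhook/apiserver-client-key.pem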

Meaning:

  • API server must contact a webhook running on https://localhost:1234

  • API server will verify the webhook using the provided CA cert.

  • API server authenticates itself to the webhook using mutual TLS.
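To check the webhook by hand before wiring it into the API server, the same call can be imitated from the control plane node. This is a sketch under assumptions: it reuses the certificate paths and URL from the kubeconfig example above, `build_image_review` and `ask_webhook` are hypothetical helper names, and the real API server may send additional fields.

```python
import json
import ssl
import urllib.request

POLICY_DIR = "/etc/kubernetes/policywebhook"  # directory used in the kubeconfig example


def build_image_review(images, namespace="default"):
    """Build an ImageReview request body for the given image names."""
    return {
        "apiVersion": "imagepolicy.k8s.io/v1alpha1",
        "kind": "ImageReview",
        "spec": {
            "containers": [{"image": image} for image in images],
            "namespace": namespace,
        },
    }


def ask_webhook(images, url="https://localhost:1234"):
    """POST an ImageReview using the same CA and client certificate as the kubeconfig."""
    # Trust only the webhook's CA (server verification).
    context = ssl.create_default_context(cafile=f"{POLICY_DIR}/external-cert.pem")
    # Present the API server's client certificate (mutual TLS).
    context.load_cert_chain(
        certfile=f"{POLICY_DIR}/apiserver-client-cert.pem",
        keyfile=f"{POLICY_DIR}/apiserver-client-key.pem",
    )
    body = json.dumps(build_image_review(images)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, context=context, timeout=5) as resp:
        return json.load(resp).get("status", {})


# Print the request body only; the network call needs the certificates and a running webhook.
print(json.dumps(build_image_review(["nginx:1.19.1-alpine-perl"]), indent=2))
```

On a node where the certificates exist and the webhook is listening, `ask_webhook(["nginx:1.19.1-alpine-perl"])` should return the `status` object of the reply. A TLS error here usually points at the CA or client certificate paths rather than at the policy itself.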

4. Manifest kube-apiserver.yaml

/etc/kubernetes/manifests/kube-apiserver.yaml

apiVersion: v1
kind: Pod
metadata:
  ...
spec:
  containers:
  - command:
    - kube-apiserver
    - --enable-admission-plugins=NodeRestriction,ImagePolicyWebhook
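    - --admission-control-config-file=/PATH/TO/<ADMISSION_CONTROLLER_CONFIG_FILE>
    # e.g. /etc/kubernetes/policywebhook/admission_config.json
    # this path is resolved inside the kube-apiserver container, so the directory
    # must also be mounted into the Pod (hostPath volume + volumeMount)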
    ...

Reference

Official doc: ImagePolicyWebhook

More examples can be found here

Trivy Image scan

Before looking at how Trivy can act as the external service that checks images, let us see how to use Trivy to scan images.

# list the images used by the pods
k get pod -oyaml | grep image:
k describe pod | grep -iE '^Name:|Image:'

# report only HIGH and CRITICAL vulnerabilities of an image
trivy image -s HIGH,CRITICAL <IMAGE_NAME>:<VERSION>
# filter the report for a specific vulnerability
trivy image -s HIGH,CRITICAL <IMAGE_NAME>:<VERSION> | grep <Vulnerability>
# e.g.
trivy image nginx:1.19.1-alpine-perl | grep CVE-2021
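When many images need checking, the table output is awkward to filter with grep. A sketch of an alternative, assuming Trivy is run with `--format json` and that the report has the `Results[].Vulnerabilities[]` layout with `VulnerabilityID`, `PkgName` and `Severity` keys. `summarize` is a hypothetical helper name, and the sample report is made up for the example.

```python
import json
from collections import Counter


def summarize(report: dict, severities=("HIGH", "CRITICAL")):
    """Count findings per severity and collect (vulnerability ID, package) pairs."""
    counts = Counter()
    findings = []
    for result in report.get("Results", []):
        # "Vulnerabilities" may be absent or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities:
                counts[vuln["Severity"]] += 1
                findings.append((vuln.get("VulnerabilityID"), vuln.get("PkgName")))
    return counts, findings


# Made-up sample in the assumed shape of `trivy image --format json <IMAGE>` output.
sample = json.loads("""
{"Results": [
  {"Target": "demo (alpine)", "Vulnerabilities": [
    {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libfoo", "Severity": "HIGH"},
    {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libbar", "Severity": "LOW"}
  ]},
  {"Target": "clean layer"}
]}
""")
counts, findings = summarize(sample)
print(dict(counts), findings)
```

For a real image, write the report with `trivy image -f json -o report.json <IMAGE_NAME>:<VERSION>` and pass the loaded file to `summarize`.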

Once we have found an image with vulnerabilities, we can either

  • remove the pod or scale down the deployment

  • patch the image at its source (the Dockerfile)

# remove the pod or scale down the deployment
k -n <NAMESPACE> delete pod <NAME>
k -n <NAMESPACE> scale deploy <DEPLOY> --replicas 0

Removing the pod or scaling down the deployment is only a temporary measure. In practice we still need to find a base image without these vulnerabilities and patch the source.

# 1. find a base image
Report Summary

┌────────────────────────────────┬────────┬─────────────────┬─────────┐
│             Target             │  Type  │ Vulnerabilities │ Secrets │
├────────────────────────────────┼────────┼─────────────────┼─────────┤
│ morc-api:0.1-5 (alpine 3.22.1) │ alpine │       25        │    -    │
└────────────────────────────────┴────────┴─────────────────┴─────────┘

From the Report Summary we can see that the image is based on Alpine, so we can look for a newer base image version to patch with. Scan the candidate first:

$ trivy image nginx:alpine

Then build the image:

docker build -t <IMAGE>:<PATCHED_VERSION> -f <Dockerfile> .

Kubernetes no longer runs containers through Docker: the dockershim was removed in Kubernetes 1.24.

The kubelet now talks to a CRI runtime such as:

  • containerd

  • or CRI-O

So even if we have built the image using docker build, that image exists only in Docker’s local storage, not in the cluster runtime.

This means:

docker build → Docker local image store

  • NOT visible to Kubernetes / containerd / CRI-O

Therefore we need to perform the following steps:

1. Save the Docker image

docker save <IMAGE_NAME>:<PATCHED_VERSION> -o <IMAGE_NAME>_<PATCHED_VERSION>.tar

2. Import into containerd

ctr -n k8s.io images import <IMAGE_NAME>_<PATCHED_VERSION>.tar

This loads the image into the runtime used by kubelet → containerd.

3. Check image in containerd

crictl images | grep <IMAGE_NAME>
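The three steps can be wrapped in a small helper so they are not forgotten after each rebuild. This is a sketch, not a required tool: `import_commands` and `run_all` are hypothetical names, the image `morc-api` with version `0.1-6` is a placeholder for the patched tag, and slashes in the image name are replaced so the tarball name is a valid file name. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def import_commands(image: str, version: str):
    """Return the save, import and verify commands for one image."""
    ref = f"{image}:{version}"
    tarball = f"{image.replace('/', '_')}_{version}.tar"
    return [
        ["docker", "save", ref, "-o", tarball],                 # 1. save from Docker
        ["ctr", "-n", "k8s.io", "images", "import", tarball],   # 2. import into containerd
        ["crictl", "images"],                                   # 3. list images seen by the runtime
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# Placeholder image name and patched version; dry run only prints the commands.
run_all(import_commands("morc-api", "0.1-6"))
```

Calling `run_all(..., dry_run=False)` on the node executes the commands in order. The output of the last one still has to be checked for the image name, as with `grep` above.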

If the image is not imported:

  • Kubelet won’t see the patched image

  • containerd will instead try to pull the image from the registry, where only the old, unpatched image (or no matching tag at all) exists

  • A Trivy scan against containerd won’t see the patched image

Reference:

An example can be found here