Kubernetes: Reviewing logs and events
One of the most important parts of checking a Kubernetes cluster for issues, problems or warnings is reviewing its events and logs. The following command examples demonstrate these activities.
Listing events for a namespace
Listing events is useful for troubleshooting problems within a namespace. It provides a listing of the events currently recorded for the namespace, which may indicate underlying issues that need to be addressed.
kubectl -n tenant1 get events
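On a busy namespace the default listing can be long, and its ordering is not always chronological. As a minimal sketch, the helper functions below wrap two standard kubectl options: --sort-by to order events by the time they were last seen, and --field-selector to keep only events of type Warning. The function names recent_events and warning_events are hypothetical and are not part of kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of kubectl); $1 is the namespace.

# List events ordered by the time they were last seen (oldest first).
recent_events() {
  kubectl -n "$1" get events --sort-by=.lastTimestamp
}

# List only Warning events, which are usually the ones worth investigating.
warning_events() {
  kubectl -n "$1" get events --field-selector type=Warning
}
```

For the namespace used in this documentation, these would be called as recent_events tenant1 and warning_events tenant1.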
Retrieving events from Kubernetes resources
To review issues with a resource, you will often want a summary of that specific resource together with the latest events captured for it by the Kubernetes cluster. The way to do this is with the “describe” instruction. Describe can be used with a variety of resource types, so feel free to experiment with it. As with other examples used throughout the documentation, please assume that tenant1 is the namespace being used for the examples.
Describing a pod
kubectl -n tenant1 describe pod tenant1-wazuh-0
Example output:
Name: tenant1-wazuh-0
Namespace: tenant1
Priority: 0
Node: ip-xxx-xxx-xxx-xxx.ec2.internal/10.230.14.201
Start Time: Fri, 20 Jan 2023 13:04:37 +0000
Labels: app=tenant1
chart=wazuh-0.2.0
component=tenant1
controller-revision-hash=tenant1-wazuh-57447c558c
heritage=Helm
release=wazuh
role=wazuh
statefulset.kubernetes.io/pod-name=tenant1-wazuh-0
Annotations: backup.velero.io/backup-volumes:
wazuh-data,wazuh-configuration,wazuh-etc,wazuh-logs,wazuh-queue,wazuh-multigroups,wazuh-integrations,wazuh-bin,wazuh-agentless,wazuh-wodle...
checksum/secret: db736540636e3bb9ab5e8a4019f3e26a7c11656a6eebf2666ac36d6f483ab8e7
kubernetes.io/psp: eks.privileged
Status: Running
IP: 10.230.27.154
IPs:
IP: 10.230.27.154
Controlled By: StatefulSet/tenant1-wazuh
Containers:
wazuh:
Container ID: docker://54cea6c4fa3cb94648c63a55b4de11479332d43e8000849f0c0375c30561f1a7
Image: docker.io/siemonster/wazuh:prod-v4.6.1
Image ID: docker-pullable://siemonster/wazuh@sha256:aeef2bb6c87db1f491367eb79d88ec789e1060f3f7515bd992d89dd9a61a6489
Ports: 55000/TCP, 514/UDP, 514/TCP, 1514/UDP, 1514/TCP, 1515/TCP, 1516/TCP, 1516/TCP, 4000/TCP
Host Ports: 0/TCP, 0/UDP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Fri, 20 Jan 2023 13:05:41 +0000
Ready: True
Restart Count: 0
Limits:
memory: 4Gi
Requests:
memory: 3Gi
Liveness: tcp-socket :api delay=600s timeout=30s period=30s #success=1 #failure=10
Readiness: tcp-socket :api delay=10s timeout=30s period=30s #success=1 #failure=10
Environment Variables from:
tenant1-wazuh Secret Optional: false
Environment:
KAFKA_HOST: tenant1-kafka-broker.tenant1:9092
NIFI_HOST: tenant1-nifi.tenant1:8080
KIBANA_HOST: tenant1-wazuh-dashboards.tenant1:5601
WAZUH_API_USER: siemonster
SIEM_OFFLINE_MODE: true
TENANT_NAME: tenant1
SIEMONSTER_URL: http://soc-siemonster.soc.svc.cluster.local:3000
TENANT_CLUSTER_DOMAIN: tenant1.test50.siemonster.io
Mounts:
/etc/filebeat from wazuh-data (rw,path="tenant1-wazuh-wazuh-filebeat-etc")
/templates from wazuh-data (rw,path="tenant1-wazuh-wazuh-templates")
/var/lib/filebeat from wazuh-data (rw,path="tenant1-wazuh-wazuh-filebeat")
/var/ossec/active-response/bin from wazuh-data (rw,path="tenant1-wazuh-wazuh-bin")
/var/ossec/agentless from wazuh-data (rw,path="tenant1-wazuh-wazuh-agentless")
/var/ossec/api/configuration from wazuh-data (rw,path="tenant1-wazuh-wazuh-configuration")
/var/ossec/data from wazuh-data (rw,path="tenant1-wazuh-wazuh-data")
/var/ossec/etc from wazuh-etc (rw,path="tenant1-wazuh-wazuh-etc")
/var/ossec/integrations from wazuh-data (rw,path="tenant1-wazuh-wazuh-integrations")
/var/ossec/logs from wazuh-data (rw,path="tenant1-wazuh-wazuh-logs")
/var/ossec/queue from wazuh-data (rw,path="tenant1-wazuh-wazuh-queue")
/var/ossec/var/multigroups from wazuh-data (rw,path="tenant1-wazuh-wazuh-multigroups")
/var/ossec/wodles from wazuh-data (rw,path="tenant1-wazuh-wazuh-wodles")
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wgq2j (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
wazuh-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: wazuh-data-tenant1-wazuh-0
ReadOnly: false
wazuh-etc:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: wazuh-etc-tenant1-wazuh-0
ReadOnly: false
default-token-wgq2j:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-wgq2j
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
You will notice that the last line of this describe example shows no events (Events: <none>). If any events had been recorded for the pod, they would be listed there.
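Events are kept only for a limited time (typically one hour unless the cluster is configured otherwise), so an empty Events section does not prove that nothing happened earlier. To list the events for a single object without the rest of the describe output, the event listing can be narrowed with a field selector. The sketch below assumes the standard involvedObject.name field; the function name events_for is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kubectl).
# $1 = namespace, $2 = object name (for example a pod name).
# Lists only the events recorded against the named object.
events_for() {
  kubectl -n "$1" get events --field-selector "involvedObject.name=$2"
}
```

For the pod described above, this would be called as events_for tenant1 tenant1-wazuh-0.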
Describing a StatefulSet (STS) or deployment
You can describe StatefulSets (STS) and deployments using the same command structure as describing a pod.
NOTE: Compared with the pod command, the -0 pod ordinal has been removed from the end of the name and the resource type has been changed to sts.
kubectl -n tenant1 describe sts tenant1-wazuh
Example output:
Name: tenant1-wazuh
Namespace: tenant1
CreationTimestamp: Tue, 08 Mar 2022 16:27:05 +0000
Selector: component=tenant1,release=wazuh
Labels: app=tenant1
app.kubernetes.io/managed-by=Helm
chart=wazuh-0.2.0
component=tenant1
heritage=Helm
release=wazuh
role=wazuh
Annotations: meta.helm.sh/release-name: wazuh
meta.helm.sh/release-namespace: tenant1
Replicas: 1 desired | 1 total
Update Strategy: RollingUpdate
Partition: 0
Pods Status: 1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=tenant1
chart=wazuh-0.2.0
component=tenant1
heritage=Helm
release=wazuh
role=wazuh
Annotations: backup.velero.io/backup-volumes:
wazuh-data,wazuh-configuration,wazuh-etc,wazuh-logs,wazuh-queue,wazuh-multigroups,wazuh-integrations,wazuh-bin,wazuh-agentless,wazuh-wodle...
checksum/secret: db736540636e3bb9ab5e8a4019f3e26a7c11656a6eebf2666ac36d6f483ab8e7
Containers:
wazuh:
Image: docker.io/siemonster/wazuh:prod-v4.6.1
Ports: 55000/TCP, 514/UDP, 514/TCP, 1514/UDP, 1514/TCP, 1515/TCP, 1516/TCP, 1516/TCP, 4000/TCP
Host Ports: 0/TCP, 0/UDP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Limits:
memory: 4Gi
Requests:
memory: 3Gi
Liveness: tcp-socket :api delay=600s timeout=30s period=30s #success=1 #failure=10
Readiness: tcp-socket :api delay=10s timeout=30s period=30s #success=1 #failure=10
Environment Variables from:
tenant1-wazuh Secret Optional: false
Environment:
KAFKA_HOST: tenant1-kafka-broker.tenant1:9092
NIFI_HOST: tenant1-nifi.tenant1:8080
KIBANA_HOST: tenant1-wazuh-dashboards.tenant1:5601
WAZUH_API_USER: siemonster
SIEM_OFFLINE_MODE: true
TENANT_NAME: tenant1
SIEMONSTER_URL: http://soc-siemonster.soc.svc.cluster.local:3000
TENANT_CLUSTER_DOMAIN: tenant1.test50.siemonster.io
Mounts:
/etc/filebeat from wazuh-data (rw,path="tenant1-wazuh-wazuh-filebeat-etc")
/templates from wazuh-data (rw,path="tenant1-wazuh-wazuh-templates")
/var/lib/filebeat from wazuh-data (rw,path="tenant1-wazuh-wazuh-filebeat")
/var/ossec/active-response/bin from wazuh-data (rw,path="tenant1-wazuh-wazuh-bin")
/var/ossec/agentless from wazuh-data (rw,path="tenant1-wazuh-wazuh-agentless")
/var/ossec/api/configuration from wazuh-data (rw,path="tenant1-wazuh-wazuh-configuration")
/var/ossec/data from wazuh-data (rw,path="tenant1-wazuh-wazuh-data")
/var/ossec/etc from wazuh-etc (rw,path="tenant1-wazuh-wazuh-etc")
/var/ossec/integrations from wazuh-data (rw,path="tenant1-wazuh-wazuh-integrations")
/var/ossec/logs from wazuh-data (rw,path="tenant1-wazuh-wazuh-logs")
/var/ossec/queue from wazuh-data (rw,path="tenant1-wazuh-wazuh-queue")
/var/ossec/var/multigroups from wazuh-data (rw,path="tenant1-wazuh-wazuh-multigroups")
/var/ossec/wodles from wazuh-data (rw,path="tenant1-wazuh-wazuh-wodles")
Volumes: <none>
Volume Claims:
Name: wazuh-data
StorageClass: standard
Labels: <none>
Annotations: <none>
Capacity: 15Gi
Access Modes: [ReadWriteOnce]
Name: wazuh-etc
StorageClass: standard
Labels: <none>
Annotations: <none>
Capacity: 100Mi
Access Modes: [ReadWriteOnce]
Events: <none>
You will note that the describe output for the pod and for the STS are very similar. The reason for this is that the STS is higher in the hierarchy: it holds the pod template from which the pod is created. When troubleshooting, it is often best to review the lowest object first and then the parent resource.
🔖 NOTE: You can use the pipe ( | ) in the shell to send the output to grep and other CLI-based tools, trimming the information down to what you need.
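As an illustration of piping describe output through standard tools, the two filters below read describe output on standard input. They are a sketch only: the function names events_section and headline_fields are hypothetical, and the patterns assume the field names shown in the example outputs above.

```shell
#!/usr/bin/env bash
# Hypothetical filters; both read "kubectl describe" output on stdin.

# Print only the Events section, which is the last section of the output.
events_section() {
  sed -n '/^Events:/,$p'
}

# Print a few headline fields (Status, Image, Restart Count), indented or not.
headline_fields() {
  grep -E '^ *(Status|Image|Restart Count):'
}
```

For example: kubectl -n tenant1 describe pod tenant1-wazuh-0 | events_section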
Retrieving logs from pods
Unlike the describe instruction, the logs instruction reads the output of the containers running in a pod. In its basic form it therefore takes a pod name directly, without a resource specifier such as pod, sts or deployment. Below is an example of retrieving logs.
kubectl -n tenant1 logs -f --tail 10 tenant1-wazuh-0
In the command above, the logs instruction tells the system to retrieve the logs. The -f option follows the logs as new lines arrive, the same way you would tail a log file with -f, e.g. tail -f /var/log/example.log. The “--tail” option limits how many of the most recent existing lines are returned before following begins; here it is 10. Take care to include it: if a pod has been running for a long period and/or is very busy, issuing the logs instruction without limiting the output prints the entire log first, which can tie up your console and force you to wait for the output to reach the latest entries or to close the shell terminal and connect again.
⚠️ NOTE: This experience can be made worse if you are administering the cluster over a slow or high-latency connection.
Example output:
2023-01-25T12:13:07.612Z INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":21430,"time":{"ms":2}},"total":{"ticks":114310,"time":{"ms":5},"value":114310},"user":{"ticks":92880,"time":{"ms":3}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":12},"info":{"ephemeral_id":"cdd45fb7-0bd7-4b12-8eaf-c000ad130aa3","uptime":{"ms":428790021}},"memstats":{"gc_next":8313552,"memory_alloc":4908736,"memory_total":9388993240},"runtime":{"goroutines":35}},"filebeat":{"harvester":{"open_files":1,"running":1}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":1}},"system":{"load":{"1":0.57,"15":0.65,"5":0.59,"norm":{"1":0.0712,"15":0.0813,"5":0.0738}}}}}}
2023/01/25 12:13:09 wazuh-remoted: WARNING: (1408): Invalid ID 123 for the source ip: '10.230.56.129' (name 'unknown').
2023/01/25 12:13:14 wazuh-remoted: WARNING: (1408): Invalid ID 120 for the source ip: '10.230.68.73' (name 'unknown').
2023/01/25 12:13:19 wazuh-remoted: WARNING: (1408): Invalid ID 123 for the source ip: '10.230.49.118' (name 'unknown').
2023/01/25 12:13:24 wazuh-remoted: WARNING: (1408): Invalid ID 120 for the source ip: '10.230.82.70' (name 'unknown').
2023/01/25 12:13:29 wazuh-remoted: WARNING: (1408): Invalid ID 123 for the source ip: '10.230.29.181' (name 'unknown').
2023/01/25 12:13:30 wazuh-authd: INFO: New connection from 10.230.35.188
2023/01/25 12:13:30 wazuh-authd: INFO: Received request for a new agent (test-agent) from: 10.230.35.188
2023/01/25 12:13:30 wazuh-authd: ERROR: Invalid group: Unit26
2023/01/25 12:13:34 wazuh-remoted: WARNING: (1408): Invalid ID 120 for the source ip: '10.230.98.154' (name 'unknown').
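Output like the above mixes routine INFO lines with warnings and errors. As a minimal sketch, the helpers below bound the amount of log data retrieved with the standard --since and --tail options, fetch the logs of a container's previous instance with --previous (useful after a restart), and filter for problem lines with grep. The function names and the chosen limits (10 minutes, 200 lines) are assumptions for illustration, and the grep pattern assumes the WARNING and ERROR markers seen in the example output.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of kubectl); $1 = namespace, $2 = pod name.

# Logs from the last 10 minutes, capped at 200 lines, without following.
recent_logs() {
  kubectl -n "$1" logs --since=10m --tail=200 "$2"
}

# Last 200 lines from the previous instance of the container, if it restarted.
previous_logs() {
  kubectl -n "$1" logs --previous --tail=200 "$2"
}

# Filter: keep only WARNING and ERROR lines from log output on stdin.
problems_only() {
  grep -E 'WARNING|ERROR'
}
```

For example: recent_logs tenant1 tenant1-wazuh-0 | problems_only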