Server Log Analysis

Purpose

Use this Skill to investigate service issues when logs are stored on remote servers.

This Skill assumes:

- The agent can connect to servers via SSH or equivalent remote execution tooling.
- config.yaml in this Skill directory defines service metadata, log paths, and business context.
- Before deep analysis, relevant log snippets should be copied to local temp/ first.

Required Reading

- Read config.yaml first.
- Read reference.md when field details or command patterns are needed.

Core Workflow

1. Read config.yaml.
2. Map the user issue to one or more configured services.
3. Define the smallest necessary investigation scope:
   - target service
   - target host
   - relevant time window
   - candidate log files
4. Connect to the target server via SSH or available remote tools.
5. Perform remote checks before downloading:
   - file existence and file size
   - last modified time
   - whether keyword filtering or tail output is sufficient
6. Download only minimal required log snippets to configured local temp/.
7. Analyze local copies for errors, timing correlation, repeated failures, and likely root cause.
8. Output concise diagnosis with conclusions, evidence, uncertainty, and follow-up actions.
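
Steps 3 and 5 can be sketched in code. The following is a minimal illustration, not part of this Skill's tooling: the InvestigationScope class, the stat_command helper, and the service, host, and path values are all hypothetical, and the command assumes GNU stat is available on the remote host.

```python
import shlex
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class InvestigationScope:
    """Smallest necessary scope for one investigation (hypothetical structure)."""
    service: str
    host: str
    start: datetime
    end: datetime
    log_files: list = field(default_factory=list)


def stat_command(scope: InvestigationScope) -> list:
    """Build an ssh command that reports mtime, size, and name per candidate file.

    Assumes GNU stat on the remote host: %Y = last modification time in
    epoch seconds, %s = size in bytes, %n = file name.
    """
    paths = " ".join(shlex.quote(p) for p in scope.log_files)
    return ["ssh", scope.host, f"stat -c '%Y %s %n' -- {paths}"]


scope = InvestigationScope(
    service="order-api",                # hypothetical service name
    host="app01.example.internal",      # hypothetical host
    start=datetime(2024, 5, 1, 10, 0),
    end=datetime(2024, 5, 1, 10, 30),
    log_files=["/var/log/order-api/app.log"],  # hypothetical path
)
print(stat_command(scope))
```

The returned list can be passed to subprocess.run(cmd, capture_output=True, text=True); nothing is executed when the command is built.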

Investigation Rules

- Prioritize service definitions and business context in config.yaml; do not guess.
- Prefer remote filtering before full download:
  - narrow time window first
  - then filter by keywords
  - use tail first for recent incidents
- Download full logs only when snippets are insufficient.
- Local filenames should clearly include service, host, and time range.
- Unless explicitly requested, do not fetch sensitive files, binaries, or unrelated large archives.
- For cross-service issues, analyze primary service first, then expand to dependencies.
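
As an illustration of remote filtering before download, the sketch below builds a tail-then-grep pipeline to run over SSH. The function name, host, path, and keywords are made up for the example; it assumes tail and grep are present on the remote host and only constructs the command without running it.

```python
import shlex


def remote_filter_command(host, path, keywords, tail_lines=2000):
    """Build an ssh command that tails a log and keeps lines matching any keyword."""
    # grep -F treats each -e pattern as a fixed string; -i ignores case.
    patterns = " ".join(f"-e {shlex.quote(k)}" for k in keywords)
    remote = (
        f"tail -n {int(tail_lines)} -- {shlex.quote(path)}"
        f" | grep -i -F {patterns}"
    )
    return ["ssh", host, remote]


# Hypothetical host, path, and keywords.
cmd = remote_filter_command("app01", "/var/log/app.log", ["ERROR", "timed out"], 500)
print(cmd[2])
```

Quoting every path and keyword with shlex.quote keeps user-supplied text from being interpreted by the remote shell.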

Service Selection

When user intent is ambiguous:

1. Use service aliases, keywords, and description in config.yaml.
2. Pick the service with the highest semantic match.
3. If still unclear, ask the user which service to inspect before remote connection.
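
One rough way to approximate this selection is sketched below. It substitutes simple substring counting for real semantic matching, and the layout of the services dictionary (aliases and keywords per service name) is an assumption about config.yaml, not its documented schema.

```python
from typing import Optional


def score(issue: str, meta: dict) -> int:
    """Count how many aliases or keywords appear in the issue text."""
    text = issue.lower()
    terms = list(meta.get("aliases", [])) + list(meta.get("keywords", []))
    return sum(1 for term in terms if term.lower() in text)


def pick_service(issue: str, services: dict) -> Optional[str]:
    """Return the best-matching service name, or None when the user must be asked."""
    ranked = sorted(((score(issue, m), name) for name, m in services.items()), reverse=True)
    if not ranked or ranked[0][0] == 0:
        return None  # nothing matched
    if len(ranked) > 1 and ranked[1][0] == ranked[0][0]:
        return None  # tie between services: still unclear
    return ranked[0][1]


# Hypothetical service definitions.
services = {
    "order-api": {"aliases": ["orders"], "keywords": ["checkout", "payment"]},
    "auth": {"aliases": ["login"], "keywords": ["token"]},
}
print(pick_service("Checkout fails with payment error", services))
```

A None result corresponds to step 3: stop and ask the user before connecting to any host.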

Remote Pre-Check Checklist

Before downloading logs, confirm:

- host configuration matches target service
- configured log files exist
- which log file was updated most recently
- whether rolling logs must be included
- whether issue is recent or historical

Common remote checks include:

- file metadata checks
- recent log tail checks
- quick keyword search
- time-window extraction
- process/service status when needed
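
The file metadata check can be turned into a decision about which log was updated most recently. The sketch below assumes the metadata arrives as lines of "mtime size path", for example from GNU stat -c '%Y %s %n'; the FileInfo type and the sample values are invented for illustration.

```python
from typing import NamedTuple


class FileInfo(NamedTuple):
    mtime: int   # last modification time, epoch seconds
    size: int    # size in bytes
    path: str


def parse_stat_output(text: str) -> list:
    """Parse lines of '<mtime> <size> <path>' into FileInfo records."""
    files = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so paths containing spaces stay intact.
        mtime, size, path = line.split(" ", 2)
        files.append(FileInfo(int(mtime), int(size), path))
    return files


def most_recent(files: list) -> FileInfo:
    """Return the file with the latest modification time."""
    return max(files, key=lambda f: f.mtime)


# Made-up sample output for a current log and one rolled log.
sample = (
    "1714557600 1048576 /var/log/app/app.log\n"
    "1714471200 52428800 /var/log/app/app.log.1\n"
)
print(most_recent(parse_stat_output(sample)).path)
```

The size field also helps decide whether a tail or keyword filter is enough, or whether a rolled file is too large to fetch whole.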

Local Download Rules

Store downloaded logs under configured local_temp_dir.

Recommended filename format:

<service>_<host>_<log_name>_<time_hint>.log
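
A helper along these lines could produce names that carry service, host, log name, and a time hint. The function name, the underscore separator, and the character replacement rule are assumptions made for this sketch.

```python
import re


def local_log_name(service: str, host: str, log_name: str, time_hint: str) -> str:
    """Join the parts with underscores after replacing filesystem-unfriendly characters."""
    parts = [service, host, log_name, time_hint]
    # Keep letters, digits, dots, and hyphens; collapse everything else to "-".
    safe = [re.sub(r"[^A-Za-z0-9.-]+", "-", p).strip("-") for p in parts]
    return "_".join(safe) + ".log"


# Hypothetical values.
print(local_log_name("order-api", "app01", "app", "2024-05-01T10:00"))
```

Replacing characters such as ":" keeps the name valid on filesystems that reject them.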

Priority order:

1. recent tail logs
2. keyword-filtered snippets
3. explicit time-window snippets
4. full file as last resort

Analysis Focus

Focus on:

- startup failures
- repeated exceptions
- timeout and connection issues
- resource pressure signals
- failures in DB/cache/message queue/DNS/HTTP upstream dependencies
- config errors exposed by stack traces or startup logs
- timestamp alignment across related services
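
A first pass over a local snippet can count a few of these signals. The sketch below is only a starting point: the regular expressions and the sample lines are illustrative assumptions and would need tuning for each service's actual log format.

```python
import re
from collections import Counter

# Illustrative patterns only; adjust them to the real log format.
SIGNALS = {
    "timeout": re.compile(r"timed? ?out", re.I),
    "connection": re.compile(r"connection (refused|reset|closed)", re.I),
    "resource": re.compile(r"out of memory|no space left|too many open files", re.I),
    "exception": re.compile(r"\b[A-Za-z_][\w.]*(Exception|Error)\b"),
}


def scan(lines):
    """Return (lines per signal category, occurrences per exception name)."""
    counts = Counter()
    exceptions = Counter()
    for line in lines:
        for name, pattern in SIGNALS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "exception":
                    exceptions[match.group(0)] += 1
    return counts, exceptions


# Made-up sample lines.
lines = [
    "2024-05-01 10:00:01 ERROR java.net.SocketTimeoutException: Read timed out",
    "2024-05-01 10:00:02 ERROR java.net.SocketTimeoutException: Read timed out",
    "2024-05-01 10:00:03 WARN  Connection refused: db01:5432",
    "2024-05-01 10:00:04 INFO  request served in 12 ms",
]
counts, exceptions = scan(lines)
print(counts)
print(exceptions.most_common(1))
```

Exception names that dominate most_common are candidates for the repeated failures worth quoting as evidence.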

The response should include:

- issue summary
- key evidence
- preliminary cause
- confidence level
- next verification steps
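
If the response is assembled programmatically, a small structure can keep the five parts together. The Diagnosis class and the example values below are hypothetical and only show one possible shape.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnosis:
    summary: str
    evidence: list
    preliminary_cause: str
    confidence: str                      # for example "low", "medium", "high"
    next_steps: list = field(default_factory=list)

    def render(self) -> str:
        """Format the diagnosis as a short plain-text report."""
        out = [f"Issue summary: {self.summary}", "Key evidence:"]
        out += [f"  - {item}" for item in self.evidence]
        out.append(f"Preliminary cause: {self.preliminary_cause}")
        out.append(f"Confidence: {self.confidence}")
        out.append("Next verification steps:")
        out += [f"  - {step}" for step in self.next_steps]
        return "\n".join(out)


# Made-up example values.
report = Diagnosis(
    summary="Checkout requests fail intermittently",
    evidence=["2 SocketTimeoutException lines within one second"],
    preliminary_cause="Upstream payment service is slow to respond",
    confidence="medium",
    next_steps=["Check payment service logs for the same time window"],
)
print(report.render())
```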

Security Constraints

- Treat config.yaml as operations metadata; do not store plaintext secrets.
- Prefer environment variables, key files, or external secret managers for SSH credentials.
- Unless explicitly requested, do not modify remote files or restart services.
- Unless requested, do not auto-delete downloaded logs.
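
As one example of keeping credentials out of config.yaml, the sketch below reads a key file path from an environment variable. The variable name LOG_SSH_KEY_FILE and the helper name are invented for this example; -i and -o BatchMode=yes are standard OpenSSH client options.

```python
import os


def ssh_base_command(host: str) -> list:
    """Build an ssh invocation whose key file path comes from the environment."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    cmd = ["ssh", "-o", "BatchMode=yes"]
    key_path = os.environ.get("LOG_SSH_KEY_FILE")  # hypothetical variable name
    if key_path:
        cmd += ["-i", key_path]
    return cmd + [host]


print(ssh_base_command("app01"))  # hypothetical host
```

When the variable is unset, ssh falls back to its own defaults such as the agent or ~/.ssh/config.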

Exception Handling

If remote access fails:

1. Clearly state which step failed.
2. State target host and service.
3. Ask user for correct SSH access method, network path, or credentials.

If configured log path does not exist:

1. Clearly identify missing path.
2. Check whether alternate paths are configured for the same service.
3. Ask user whether deployment paths changed.
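
The missing-path handling can be sketched as a fallback over the configured paths. The function below is hypothetical; the exists argument stands in for whatever remote existence check is available, so the logic can be exercised without a server.

```python
from typing import Callable, Iterable


def resolve_log_path(paths: Iterable, exists: Callable):
    """Return (first existing path or None, paths found missing before it)."""
    missing = []
    for path in paths:
        if exists(path):
            return path, missing
        missing.append(path)
    return None, missing


# Hypothetical configured paths; a set simulates what exists on the remote host.
present = {"/data/logs/app/app.log"}
found, missing = resolve_log_path(
    ["/var/log/app/app.log", "/data/logs/app/app.log"],
    exists=lambda p: p in present,
)
print(found, missing)
```

A result of None means every configured path is missing, which is the point to ask the user whether deployment paths changed.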

Quick Execution Order

Always follow this order:

1. Read config.yaml.
2. Identify service and host.
3. Perform remote log pre-checks.
4. Copy minimal required logs to temp/.
5. Analyze locally.
6. Summarize conclusions with evidence.

