Alibaba Cloud Flink Instance Manage
Operate Alibaba Cloud Flink VVP resources with a strict create/query scope through one wrapper script.
Scope and Entrypoint
- Always run operations through: `python scripts/instance_ops.py <command> [options]`
- Allowed commands: `create`, `create_namespace`, `describe`, `describe_regions`, `describe_zones`, `describe_namespaces`, `list_tags`
- Out of scope: update/delete, Flink SQL/job runtime operations, and non-Flink services
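The scope rule can be enforced before anything is executed. A minimal sketch, assuming a hypothetical helper `build_argv` (not part of the wrapper script) that only assembles the argument list and rejects any command outside the allowed set:

```python
# Commands this document lists as allowed; everything else is out of scope.
ALLOWED_COMMANDS = frozenset({
    "create", "create_namespace", "describe", "describe_regions",
    "describe_zones", "describe_namespaces", "list_tags",
})


def build_argv(command, options=()):
    """Return the argument list for the wrapper script (hypothetical helper).

    Raises ValueError for commands outside the allowed create/query scope.
    """
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"out-of-scope command: {command!r}")
    return ["python", "scripts/instance_ops.py", command, *options]
```

For example, `build_argv("describe", ["--region_id", "cn-beijing"])` yields a list that can be handed to a process launcher, while `build_argv("delete")` raises instead of reaching the script.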
Trigger Rules
Use this skill when prompts are about Flink instance/namespace lifecycle operations.
- Positive intent examples:
  - "Create a Flink instance in cn-beijing"
  - "List Flink instances and status"
  - "Describe namespaces for instance f-cn-xxx"
  - "查询 Flink 实例标签" (query Flink instance tags)
  - "Flink 可用区有哪些" (which availability zones does Flink support)
- Negative intent examples:
  - ECS/Kafka/OSS/DataWorks operations
  - Generic questions (weather, translation, etc.)
  - Flink SQL / Flink job authoring or runtime tuning
- If the intent is ambiguous, ask one clarification question: instance/namespace management vs. SQL/job operations.
Intent to Command Mapping
| User intent | Command |
|---|---|
| Query all instances in a region | `describe --region_id <REGION>` |
| Create instance | `create ... --confirm` |
| Query namespaces under an instance | `describe_namespaces --region_id <REGION> --instance_id <ID>` |
| Create namespace | `create_namespace ... --confirm` |
| Query supported regions/zones | `describe_regions` / `describe_zones --region_id <REGION>` |
| Query tags | `list_tags --region_id <REGION> --resource_type <TYPE> [--resource_ids ...]` |
Operating Rules
1. Confirmation is mandatory for create commands
   - `create` and `create_namespace` must include `--confirm`.
2. Verify create results with read-back
   - Do not conclude success from the create response alone.
3. Retry policy is strict
   - Maximum 2 attempts for the same command (initial + one corrected retry).
4. No automatic operation switching
   - If an operation fails, do not switch to a different operation without user approval.
5. Lifecycle target lock
   - In a `create -> create_namespace` flow, the namespace must target the same newly created `InstanceId` unless the user approves a fallback.
6. Namespace pre-check is required
   - Before `create_namespace`, check instance status/resources and existing namespace allocation.
7. No secret exposure
   - Do not output or request plaintext AK/SK. Use default credential chain guidance.
8. Do not invent parameters
   - Never fabricate VPC/VSwitch/instance IDs.
9. Keep auditable confirmation evidence
   - Lifecycle outputs must contain `SafetyCheckRequired` or explicit `--confirm` evidence.
10. No partial-completion claims for lifecycle flows
    - For flows requiring both `create` and `create_namespace`, the overall status can be `completed` only when both create operations succeed.
11. No automatic capacity scaling
    - If `create_namespace` fails due to insufficient resources, report it clearly and ask the user to scale resources manually, outside this skill's scope.
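Rule 3 can be made mechanical. The sketch below assumes two hypothetical callables that are not part of this skill: `run(argv)`, which executes the wrapper and returns its parsed JSON result, and `correct(argv, result)`, which returns a corrected argument list or `None` when no correction is possible.

```python
MAX_ATTEMPTS = 2  # initial attempt + one corrected retry (rule 3)


def run_with_retry(run, argv, correct):
    """Run a command at most MAX_ATTEMPTS times.

    run(argv) -> dict and correct(argv, result) -> list | None are
    hypothetical callables supplied by the caller.
    Returns (last_result, attempts_used).
    """
    result = run(argv)
    attempts = 1
    while not result.get("success") and attempts < MAX_ATTEMPTS:
        corrected = correct(argv, result)
        if corrected is None:
            break  # nothing to fix: stop rather than repeat the same call
        argv = corrected
        result = run(argv)
        attempts += 1
    return result, attempts
```

Because the cap is on attempts for the same command, the sketch never falls through to a different operation, which keeps it consistent with rule 4.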
Execution Protocol
Step 1: Classify request
- In-scope create/query for Flink instance/namespace/tag/region/zone -> continue.
- Out-of-scope or non-Flink -> reject or route with explanation.
Step 2: Validate parameters
- Apply `references/parameter-validation.md`.
- If required parameters are missing, ask the user or return clear remediation.
Step 3: Execute command
- Query commands: run once unless a transient query error occurs.
- Create commands: construct the final command string and verify `--confirm` is present before execution.
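The create-command check in Step 3 might look like the following. `require_confirm` is a hypothetical helper written for illustration, not a function exposed by the wrapper script.

```python
CREATE_COMMANDS = frozenset({"create", "create_namespace"})


def require_confirm(command, options):
    """Return the final option list, refusing create commands without --confirm."""
    options = list(options)
    if command in CREATE_COMMANDS and "--confirm" not in options:
        raise ValueError(f"{command} requires --confirm before execution")
    return options
```

Query commands pass through unchanged. A create command missing the flag fails here, before any API call is made, and the flag is never added silently on the user's behalf.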
Step 4: Verify create outcomes
- For `create`: verify with `describe --region_id <REGION>`.
- For `create_namespace`: verify with `describe_namespaces --region_id <REGION> --instance_id <ID>`.
- Use up to 3 read checks with short backoff before concluding the create is not reflected yet.
- For chained `create -> create_namespace`:
  - poll `describe --region_id <REGION>` on the same `InstanceId` every 30 seconds
  - max wait: 10 minutes
  - if still not `RUNNING`, stop and provide the next action (wait/retry later)
  - do not switch to another instance without explicit user approval
  - if namespace create fails, mark the lifecycle chain as `failed`/`not_ready`, not `completed`
  - for `InsufficientResources`, ask the user to manually scale the instance and retry later
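The two waiting patterns in Step 4 can be sketched as follows. Both functions take hypothetical callables: `check()` returns whether the created resource is visible in a read-back, and `get_status()` returns the status string of the target `InstanceId`. How those values are extracted from the `describe` output is not specified here and is left to the caller. The 2-second linear backoff is also an assumption, since the text only says "short".

```python
import time

POLL_INTERVAL_S = 30      # poll every 30 seconds
MAX_WAIT_S = 10 * 60      # give up after 10 minutes


def read_back(check, attempts=3, backoff_s=2.0, sleep=time.sleep):
    """Call check() up to `attempts` times; True as soon as it succeeds.

    The backoff value is an assumption, not taken from the skill.
    """
    for i in range(attempts):
        if check():
            return True
        if i < attempts - 1:
            sleep(backoff_s * (i + 1))  # wait 2 s, then 4 s
    return False


def wait_until_running(get_status, sleep=time.sleep,
                       interval=POLL_INTERVAL_S, max_wait=MAX_WAIT_S):
    """Poll get_status() until it returns "RUNNING" or max_wait elapses.

    Returns True when RUNNING is observed, False on timeout.
    """
    waited = 0
    while True:
        if get_status() == "RUNNING":
            return True
        if waited >= max_wait:
            return False  # caller should stop and suggest waiting or retrying later
        sleep(interval)
        waited += interval
```

The wait is counted in sleep intervals rather than wall-clock time, which keeps the loop easy to test by passing a fake `sleep`. A `False` return should end the chain with a next-action message, not trigger a switch to another instance.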
Key References
- `references/README.md`
- `references/quick-start.md`
- `references/trigger-recognition-guide.md`
- `references/core-execution-flow.md`
- `references/command-templates.md`

| Document | Purpose |
|---|---|
| `references/parameter-validation.md` | Pre-execution validation checklist |
| `references/e2e-playbooks.md` | Complete execution sequences |
| `references/common-failures.md` | Typical mistakes and fixes |
| `references/required-confirmation-model.md` | Confirmation gate rules |
| `references/instance-state-management.md` | Instance state and readiness checks |
| `references/output-handling.md` | Output parsing and retry policy |
| `references/verification-method.md` | Verification patterns after create/query |
| `references/acceptance-criteria.md` | Completion checklist for normal operations |
| `references/python-environment-setup.md` | Python dependency and auth setup |
| `references/cli-installation-guide.md` | Aliyun CLI diagnostics setup |
| `references/ram-policies.md` | Required RAM permissions |
| `references/related-apis.md` | API and command mapping |
Output Format
All commands return JSON:
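```json
{
  "success": true,
  "operation": "<command>",
  "confirmation_check": {
    "required_flag": "--confirm",
    "provided": true,
    "status": "passed"
  },
  "data": {},
  "request_id": "..."
}
```
`confirmation_check` appears on create operations and is used for auditable safety evidence.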
Exit codes: 0 = success, 1 = error.
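A caller can combine the exit code with the `success` and `confirmation_check` fields of the JSON output. The helpers below are a hypothetical sketch; they assume the script writes a single JSON object to standard output and make no assumption about what `data` contains.

```python
import json


def parse_result(stdout, exit_code):
    """Return (ok, payload) from the wrapper's stdout and exit code."""
    payload = json.loads(stdout)
    ok = exit_code == 0 and payload.get("success") is True
    return ok, payload


def has_confirmation_evidence(payload):
    """True if the payload carries a passed confirmation_check."""
    check = payload.get("confirmation_check") or {}
    return check.get("provided") is True and check.get("status") == "passed"
```

Requiring both a zero exit code and `success: true` is a deliberately conservative choice: if the two ever disagree, the result is treated as a failure and verified by read-back.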