ClickHouse Sync Plan (cksync-plan)
A skill for planning ClickHouse cluster data migration solutions, including migration plans, risks, and considerations.
When to Use
- - Data migration between different ClickHouse clusters
- Horizontal scaling (adding/removing nodes) for ClickHouse clusters
- Disk downgrade operations
- Cross-availability zone migrations
- Upgrading to multi-replica, multi-AZ deployments
Workflow
Step 1: Gather Source Cluster Information
Ask user for source cluster type:
- - Self-built ClickHouse or non-Alibaba Cloud ClickHouse
- Alibaba Cloud ClickHouse Community Edition
- Alibaba Cloud ClickHouse Enterprise Edition
Ask user for source cluster version (e.g., 20.8, 22.8, 23.8, 24.3):
- - Version affects migration method compatibility
- BACKUP/RESTORE requires ≥22.8
- Incremental cksync migration requires target ≥20.8
Step 2: Gather Target Cluster Information
Ask user for target cluster type:
- - Alibaba Cloud ClickHouse Community Edition
- Alibaba Cloud ClickHouse Enterprise Edition
- To be determined
Step 3: Collect Cluster Details (REQUIRED)
This step is mandatory. You MUST collect database and table information before proceeding to migration plan selection.
Required Information
- 1. Database list with engines
- Table list with engines, partition counts, data sizes, and write speeds
Option A: User Executes SQL
Provide SQL queries from
references/sql.md section 1 for user to execute:
- 1. Database Information - Query
system.databases for database names and engines - Table Information - Comprehensive query including table names, engines, engine_full (for TTL), partition counts, data sizes, and write speeds
Key fields to collect:
- -
engine_full: Contains TTL clause (e.g., TTL event_time + INTERVAL 7 DAY) - INLINECODE3 : Partition count per table
- INLINECODE4 : Data size per shard
- INLINECODE5 : Write speed calculated from part_log
For complete SQL queries, see references/sql.md section 1.
Option B: Direct Query via HTTP
Request connection details from user:
- -
HOST_NAME: Cluster endpoint (e.g., cc-xxx.clickhouse.rds.aliyuncs.com) - INLINECODE8 : HTTP port (default:
8123) - INLINECODE10 : Database username
- INLINECODE11 : Database password
Use secure credential handling and HTTP query examples from references/sql.md section 5.
Analysis Checklist
After collecting data, verify:
- - [ ] Required metadata is complete (database engine, table engine,
engine_full, partitions, data size, write speed) - [ ] Migration compatibility checks are completed using references/plans.md (method-specific conditions)
- [ ] Version and read-only window constraints are mapped to candidate methods
- [ ] Risks and mitigations are identified and recorded in the plan
Step 4: Business Requirements
Ask for allowed read-only time:
- - 0 minutes
- Within 30 minutes
- Within 1 day
- Not sure yet
Step 5: Select and Present Migration Plan
Based on gathered information, analyze and recommend from these migration methods:
| Method | Best For | Min Read-Only Time |
|---|
| Console (cksync) | Most migrations to Alibaba Cloud | ~10 min |
| BACKUP/RESTORE |
Large data, same edition type, version ≥22.8 | Varies by data size |
| INSERT FROM REMOTE | Flexible control, small-medium data | ~10 min per batch |
| Business Double-Write | Zero downtime required | 0 |
| Kafka Double-Write | Existing Kafka pipelines or business writes switched to Kafka | 0 |
| Big Cluster Federation | Large scale, complex scenarios | 0 |
Hard requirement: MUST output a plan, never output empty content.
Even when information is incomplete, you MUST output a provisional migration plan.
The provisional plan must include:
- - assumptions used,
- missing-information checklist,
- confidence level and key uncertainties,
- next steps to finalize recommendation after user provides missing inputs.
Migration Methods Overview
1. Console (cksync) Migration
Default choice for most Alibaba Cloud migration scenarios, especially in-place operations.
For support boundaries, engine constraints, TTL/write-speed checks, merge risk, and resource prerequisites, see references/plans.md section 1.
2. BACKUP/RESTORE Migration
Suitable for same-edition migrations where full backup/restore workflow is acceptable.
For version/edition constraints, supported engines, command patterns, and progress monitoring, see references/plans.md section 2.
3. INSERT FROM REMOTE Migration
Best when fine-grained table/partition/time-range control is needed.
For applicability boundaries and operational constraints, see references/plans.md section 3.
For SQL templates and detailed steps, see references/sql.md section 2.
4. Business Double-Write
Use when zero downtime is required and application-side dual-write is feasible.
For detailed conditions, see references/plans.md section 4.
5. Kafka Double-Write
Use when dual-consumer switchover via Kafka is feasible, including both existing Kafka pipelines and cases where business writes can be switched to Kafka first.
For detailed conditions, see references/plans.md section 5.
6. Big Cluster Federation
Advanced option for large/complex migrations with strong business and technical collaboration.
Output Format
Default deliverable: Produce one migration plan only. Structure it using assets/migration-plan-template.md and include the key sections below (cluster facts and commands may appear inline in the plan; that counts as the single deliverable).
Additional files only on request: Do not create separate files for cluster-information documentation, scripts, or SQL unless the customer explicitly asks for them. When they do, use assets/cluster-info-template.md for cluster documentation and place scripts/SQL in clearly named files as requested.
Key sections in the migration plan:
- 1. Executive Summary - Method, data size, duration, downtime
- Source Cluster Analysis - Databases, tables, compatibility check
- Migration Method Selection - Rationale and alternatives
- Migration Steps - Pre/execution/post with commands
- Risks & Mitigations - With probability and impact
- Rollback Plan - Trigger conditions and steps
- Timeline - Phase schedule with owners
- Reference Links - Documentation URLs
Method Selection Reference
For quick scenario-to-method mapping and method-specific constraints (including in-place migration priority and Enterprise → Enterprise options), see references/plans.md section "Method Selection Priority" and related method sections.
Additional Resources
ClickHouse 同步计划 (cksync-plan)
用于规划ClickHouse集群数据迁移方案的技能,包括迁移计划、风险和注意事项。
使用场景
- - 不同ClickHouse集群间的数据迁移
- ClickHouse集群水平扩缩容
- 磁盘降配操作
- 跨可用区迁移
- 升级为多副本、多可用区部署
工作流程
步骤1:收集源集群信息
询问用户源集群类型:
- - 自建ClickHouse或非阿里云ClickHouse
- 阿里云ClickHouse社区版
- 阿里云ClickHouse企业版
询问用户源集群版本(例如:20.8、22.8、23.8、24.3):
- - 版本影响迁移方法的兼容性
- BACKUP/RESTORE需要≥22.8
- 增量cksync迁移需要目标端≥20.8
步骤2:收集目标集群信息
询问用户目标集群类型:
- - 阿里云ClickHouse社区版
- 阿里云ClickHouse企业版
- 待确定
步骤3:收集集群详细信息(必选)
此步骤为强制步骤。 在选择迁移方案前,您必须收集数据库和表信息。
必需信息
- 1. 数据库列表(含引擎)
- 表列表(含引擎、分区数、数据量、写入速度)
选项A:用户执行SQL
提供
references/sql.md第1节中的SQL查询语句供用户执行:
- 1. 数据库信息 - 查询system.databases获取数据库名称和引擎
- 表信息 - 综合查询,包括表名、引擎、engine_full(含TTL)、分区数、数据量和写入速度
需收集的关键字段:
- - enginefull:包含TTL子句(例如TTL eventtime + INTERVAL 7 DAY)
- partcount:每张表的分区数
- databytes:每个分片的数据量
- writespeedbytespersec:基于part_log计算的写入速度
完整SQL查询请参见references/sql.md第1节。
选项B:通过HTTP直接查询
向用户请求连接信息:
- - HOSTNAME:集群端点(例如cc-xxx.clickhouse.rds.aliyuncs.com)
- HTTPPORT:HTTP端口(默认:8123)
- USER_NAME:数据库用户名
- PASSWORD:数据库密码
使用references/sql.md第5节中的安全凭据处理和HTTP查询示例。
分析检查清单
收集数据后,验证:
- - [ ] 必需的元数据是否完整(数据库引擎、表引擎、enginefull、分区、数据量、写入速度)
- [ ] 使用references/plans.md完成迁移兼容性检查(方法特定条件)
- [ ] 版本和只读窗口限制是否映射到候选方法
- [ ] 风险及缓解措施是否已识别并记录在计划中
步骤4:业务需求
询问允许的只读时间:
步骤5:选择并呈现迁移方案
基于收集的信息,从以下迁移方法中分析并推荐:
| 方法 | 最佳适用场景 | 最小只读时间 |
|---|
| 控制台(cksync) | 大多数迁移至阿里云的场景 | ~10分钟 |
| BACKUP/RESTORE |
大数据量、同版本类型、版本≥22.8 | 因数据量而异 |
| INSERT FROM REMOTE | 灵活控制、中小数据量 | 每批次~10分钟 |
| 业务双写 | 需要零停机 | 0 |
| Kafka双写 | 已有Kafka管道或业务写入可切换至Kafka | 0 |
| 大集群联邦 | 大规模、复杂场景 | 0 |
硬性要求:必须输出一个计划,绝不能输出空内容。
即使信息不完整,您也必须输出一个临时迁移方案。
临时方案必须包括:
- - 使用的假设条件,
- 缺失信息检查清单,
- 置信度和关键不确定性,
- 用户提供缺失输入后最终确定建议的后续步骤。
迁移方法概述
1. 控制台(cksync)迁移
大多数阿里云迁移场景的默认选择,特别是原地操作。
有关支持边界、引擎限制、TTL/写入速度检查、合并风险和资源前提条件,请参见references/plans.md第1节。
2. BACKUP/RESTORE迁移
适用于可接受完整备份/恢复工作流的同版本迁移。
有关版本/版本类型限制、支持的引擎、命令模式和进度监控,请参见references/plans.md第2节。
3. INSERT FROM REMOTE迁移
最适合需要细粒度表/分区/时间范围控制的场景。
有关适用边界和操作限制,请参见references/plans.md第3节。
有关SQL模板和详细步骤,请参见references/sql.md第2节。
4. 业务双写
当需要零停机且应用端双写可行时使用。
有关详细条件,请参见references/plans.md第4节。
5. Kafka双写
当通过Kafka进行双消费者切换可行时使用,包括已有Kafka管道以及业务写入可先切换至Kafka的场景。
有关详细条件,请参见references/plans.md第5节。
6. 大集群联邦
适用于需要强业务和技术协作的大型/复杂迁移的高级选项。
输出格式
默认交付物: 仅生成一个迁移方案。使用assets/migration-plan-template.md构建结构,并包含以下关键部分(集群事实和命令可内联在方案中,这算作单个交付物)。
仅在请求时提供附加文件: 除非客户明确要求,否则不要为集群信息文档、脚本或SQL创建单独文件。当客户要求时,使用assets/cluster-info-template.md作为集群文档,并按请求将脚本/SQL放在命名清晰的文件中。
迁移方案中的关键部分:
- 1. 执行摘要 - 方法、数据量、持续时间、停机时间
- 源集群分析 - 数据库、表、兼容性检查
- 迁移方法选择 - 理由和备选方案
- 迁移步骤 - 前置/执行/后置步骤及命令
- 风险及缓解措施 - 含概率和影响
- 回滚计划 - 触发条件和步骤
- 时间线 - 各阶段时间表及负责人
- 参考链接 - 文档URL
方法选择参考
有关快速场景到方法的映射以及方法特定限制(包括原地迁移优先级和企业版到企业版选项),请参见references/plans.md中的方法选择优先级及相关方法章节。
附加资源