Data Cog - Your Data Has Answers, CellCog Finds Them
Data analysis and visualization from uploaded files.
Most AI tools return code when you ask about data. CellCog returns answers — actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSVs with a minimal prompt, and CellCog's coding agent explores your data, finds the patterns, and presents them beautifully. Full Python access for everything from data cleaning to ML model evaluation.
How to Use
For your first CellCog task in a session, read the cellcog skill for the full SDK reference — file handling, chat modes, timeouts, and more.
OpenClaw (fire-and-forget):
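    from cellcog import CellCogClient

    client = CellCogClient(agent_provider="openclaw")
    # Returns immediately; the result is delivered to the notified session.
    result = client.create_chat(
        prompt="[your task prompt]",
        notify_session_key="agent:main:main",
        task_label="my-task",
        chat_mode="agent",
    )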
All agents except OpenClaw (blocks until done):
    from cellcog import CellCogClient

    # Pass the name of your own agent, e.g. "cursor", "claude-code", or "codex".
    client = CellCogClient(agent_provider="openclaw|cursor|claude-code|codex|...")
    result = client.create_chat(
        prompt="[your task prompt]",
        task_label="my-task",
        chat_mode="agent",
    )
    print(result["message"])
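Because prompts reference files inline, it can help to assemble them programmatically. The sketch below is only an illustration: `build_prompt` is a hypothetical helper, not part of the CellCog SDK, and it assumes files are marked with `<SHOW_FILE>` tags, the mechanism named in the Supported Data Formats table.

```python
def build_prompt(task: str, file_paths: list[str], request: str = "") -> str:
    """Join a task line, SHOW_FILE-tagged paths (assumed syntax), and a request."""
    parts = [task.strip()]
    # One tagged line per file, in the order given.
    parts += [f"<SHOW_FILE>{path}</SHOW_FILE>" for path in file_paths]
    if request.strip():
        parts.append(request.strip())
    return "\n".join(parts)


prompt = build_prompt(
    "Analyze this:",
    ["/path/to/data.csv"],
    "Tell me everything interesting.",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument of `create_chat`.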
What Makes Data-Cog Different
Code as Tool, Not as Output
Other AI tools give you Python code and say "run this." CellCog runs the code for you and delivers the results:
| Other AI Tools | Data-Cog |
|---|---|
| "Here's a pandas script to analyze your data" | Here are your actual insights with charts |
| "Run this matplotlib code to see the chart" | Here's the chart, annotated with findings |
| "This SQL query will find outliers" | Found 23 outliers, here's what they mean |
| "You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance |
You upload data. You get answers. The code runs behind the scenes.
What Data Work You Can Do
Exploratory Data Analysis
Understand your data fast:
- Dataset Profiling: "Analyze this CSV: distributions, missing values, outliers, correlations, and data quality summary"
- Pattern Discovery: "What patterns and trends exist in this sales data? Surprise me."
- Anomaly Detection: "Find unusual patterns in this server log data — what looks abnormal?"
- Relationship Analysis: "What factors most strongly correlate with customer churn in this dataset?"
Example prompt:
"Analyze this dataset:
<SHOW_FILE>/path/to/customerdata.csv</SHOW_FILE>
I don't know much about this data yet. Give me:
- Overview: rows, columns, data types, missing values
- Key distributions and summary statistics
- Most interesting correlations
- Any outliers or data quality issues
- 3-5 insights that jump out
Present findings as an interactive HTML report with charts."
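For orientation, the overview items in this prompt correspond to a few simple computations. The following is a minimal standard-library sketch of such a profile, not CellCog's implementation; `profile_rows` and the sample data are made up for illustration.

```python
import csv
import io
from collections import Counter


def profile_rows(rows: list[dict]) -> dict:
    """Summarize row count, column names, and missing-value counts per column."""
    columns = list(rows[0].keys()) if rows else []
    missing = Counter()
    for row in rows:
        for col in columns:
            value = row.get(col)
            # Treat absent fields and blank strings as missing.
            if value is None or value.strip() == "":
                missing[col] += 1
    return {
        "rows": len(rows),
        "columns": columns,
        "missing": {col: missing[col] for col in columns},
    }


# Tiny made-up sample standing in for a real CSV file.
sample = io.StringIO("customer_id,region,spend\n1,West,120.5\n2,,80\n3,East,\n")
report = profile_rows(list(csv.DictReader(sample)))
print(report)
```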
Data Cleaning & Transformation
Wrangle messy data into shape:
- Clean Messy Data: "Clean this CSV: fix inconsistent date formats, handle missing values, remove duplicates, standardize column names"
- Data Transformation: "Pivot this transaction data into a monthly summary by product category"
- Data Merging: "Join these three CSV files on customer_id and create a unified dataset"
- Feature Engineering: "Create useful features from this raw data for predicting house prices"
Example prompt:
"Clean and transform this dataset:
<SHOW_FILE>/path/to/messydata.csv</SHOW_FILE>
Issues I know about:
- Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
- 'Revenue' column has some values with $ signs and commas
- Duplicate rows exist
- Missing values in 'Region' column
Clean it up and give me back a clean CSV plus a summary of what you changed."
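The issues listed in this prompt map onto a handful of standard cleaning steps. Here is a rough standard-library sketch of them, assuming only the two date formats named above; the function names and sample rows are illustrative and say nothing about how CellCog performs the cleanup.

```python
import re
from datetime import datetime

DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def normalize_date(text: str) -> str:
    """Return the date as YYYY-MM-DD, trying each known input format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def parse_revenue(text: str) -> float:
    """Strip dollar signs, commas, and whitespace, then convert to float."""
    return float(re.sub(r"[$,\s]", "", text))


def clean(rows):
    """Normalize each (date, revenue) pair, then drop duplicates, keeping the first."""
    seen, cleaned = set(), []
    for date_text, revenue_text in rows:
        record = (normalize_date(date_text), parse_revenue(revenue_text))
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


# Made-up rows: the first two become identical once normalized.
raw = [("03/15/2024", "$1,200.50"), ("2024-03-15", "1200.50"), ("2024-04-01", "$99")]
print(clean(raw))
```

Deduplicating after normalization matters here: rows that differ only in formatting are otherwise counted as distinct.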
Statistical Analysis
Rigorous analysis with real numbers:
- Hypothesis Testing: "Is there a statistically significant difference in conversion rates between our A and B variants?"
- Regression Analysis: "What factors predict employee salary in this HR dataset? Build a regression model."
- Time Series Analysis: "Analyze this monthly revenue data — trend, seasonality, and forecast next 6 months"
- Cohort Analysis: "Create a cohort analysis showing user retention by signup month"
Example prompt:
"I ran an A/B test on our checkout page:
<SHOW_FILE>/path/to/abtestresults.csv</SHOW_FILE>
Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp
Tell me:
- Is variant B statistically better? (p-value, confidence interval)
- Conversion rate difference
- Revenue per user difference
- Sample size adequacy check
- Your recommendation: ship B or keep testing?
Present with clear charts and a plain-English conclusion."
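One common way to answer the first question in this prompt is a two-proportion z-test. The sketch below shows that technique using only the standard library; it is one possible approach under a normal approximation, not necessarily the test CellCog would choose, and the counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Compare conversion rates of variants A and B with a two-sided z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled rate: the shared conversion rate assumed under the null hypothesis.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se_pooled == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The confidence interval for the difference uses the unpooled standard error.
    se_diff = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)
    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}


# Made-up counts: 100 of 1000 users converted on A, 130 of 1000 on B.
result = two_proportion_z_test(100, 1000, 130, 1000)
print(result)
```

The p-value addresses significance, while the interval shows the plausible size of the lift; a sample size adequacy check would need a separate power calculation.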
Visualization & Reporting
Turn data into visual stories:
- Chart Generation: "Create a set of charts showing our quarterly performance from this data"
- Dashboard Reports: "Build an interactive dashboard from this sales dataset with filters by region and product"
- Presentation-Ready Visuals: "Create publication-quality charts from this research data"
- Comparison Visuals: "Visualize how our metrics compare to industry benchmarks"
Machine Learning
Applied ML without the setup:
- Classification: "Predict which customers will churn based on this dataset: train a model, show feature importance"
- Clustering: "Segment these customers into groups based on behavior — how many natural clusters exist?"
- Forecasting: "Forecast next quarter's sales using this historical data"
- Model Evaluation: "I trained a model — here are the predictions. Evaluate: accuracy, precision, recall, confusion matrix, ROC curve"
Example prompt:
"Predict customer churn from this dataset:
<SHOW_FILE>/path/to/customerfeatures.csv</SHOW_FILE>
Target column: 'churned'
- Train a model, try at least 2 algorithms
- Show feature importance — what drives churn?
- Confusion matrix and ROC curve
- Plain-English summary: 'The top 3 reasons customers churn are...'
- Actionable recommendations based on findings
I want insights, not just metrics."
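The evaluation metrics requested above (confusion matrix, accuracy, precision, recall) follow from four counts. A minimal pure-Python sketch for the binary case, with made-up labels and a hypothetical `binary_metrics` helper:

```python
def binary_metrics(y_true, y_pred):
    """Return confusion-matrix counts and derived metrics for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-churners cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels: 1 means the customer churned.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```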
Supported Data Formats
| Format | How to Send |
|---|---|
| CSV | Upload via SHOW_FILE |
| Excel (XLSX) | Upload via SHOW_FILE |
| JSON | Upload via SHOW_FILE |
| Parquet | Upload via SHOW_FILE |
| SQL exports | Upload the dump via SHOW_FILE |
| Inline data | Describe small datasets directly in the prompt |
Output Formats
| Format | Best For |
|---|---|
| Interactive HTML Dashboard | Explorable charts, filters, drill-downs |
| PDF Report | Shareable analysis reports with charts and findings |
| Clean CSV/XLSX | Cleaned or transformed data files for downstream use |
| Markdown | Quick insights for integration into docs |
Chat Mode for Data
| Scenario | Recommended Mode |
|---|---|
| Quick data cleaning, simple charts, basic statistics | "agent" |
| Deep analysis with multiple techniques, ML modeling, comprehensive reports | "agent team" |
Use "agent" for most data work. Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.
Use "agent team" for complex analytical projects — multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.
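The guidance above can be encoded as a small rule if an agent needs to choose the mode automatically. This is a sketch under assumptions: the task labels are hypothetical and not CellCog identifiers; only the mode strings "agent" and "agent team" come from this document.

```python
# Hypothetical task labels; they are not CellCog identifiers.
COMPLEX_TASKS = {"ml_model_comparison", "multi_technique_analysis", "comprehensive_report"}


def pick_chat_mode(task_kinds: set[str]) -> str:
    """Return "agent team" if any requested task is complex, else "agent"."""
    return "agent team" if task_kinds & COMPLEX_TASKS else "agent"


print(pick_chat_mode({"data_cleaning", "simple_charts"}))
print(pick_chat_mode({"eda", "ml_model_comparison"}))
```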
Example Prompts
Minimal prompt, maximum insight:
"Analyze this:
<SHOW_FILE>/path/to/data.csv</SHOW_FILE>
Tell me everything interesting."
That's it. CellCog's coding agent will profile the data, run exploratory analysis, find patterns, and present findings with charts. You don't need to know what to ask — the agent figures it out.
Business analysis:
"Analyze our e-commerce data:
<SHOW_FILE>/path/to/orders.csv</SHOW_FILE>
I need:
- Revenue trends (daily, weekly, monthly)
- Best and worst performing products
- Customer purchase frequency distribution
- Average order value trends
- Seasonal patterns
- Top 5 actionable insights for growing revenue
Interactive HTML dashboard with all charts."
Research data analysis:
"Analyze this survey data from 500 respondents:
<SHOW_FILE>/path/to/survey.csv</SHOW_FILE>
Research questions:
1. Is there a significant relationship between age group and product preference?
2. Do satisfaction scores differ by region? (ANOVA)
3. What factors best predict likelihood to recommend? (regression)
Include: statistical tests, p-values, effect sizes, and publication-ready charts.
PDF report format."
Tips for Better Data Analysis
1. Just upload and ask: You don't need to describe every column. CellCog reads the data and figures out what's there.
2. State your question: "What drives churn?" is more focused than "Analyze this data." Both work, but the first gets faster results.
3. Mention the audience: "For my CEO" means executive summary. "For the data team" means show the methodology.
4. Specify what you'll do with it: "I need to present this to the board" and "I need clean data for my ML pipeline" call for different outputs, so context shapes the result.
5. Don't over-specify methods: Let CellCog choose the right statistical approach. Say what you want to learn, not which algorithm to use.
6. Iterate: Upload data → get initial analysis → ask follow-up questions → go deeper. CellCog maintains context across messages.
If CellCog is not installed
Run /cellcog-setup (or /cellcog:cellcog-setup depending on your tool) to install and authenticate.
OpenClaw users: Run clawhub install cellcog instead.
Manual setup: pip install -U cellcog and set CELLCOG_API_KEY. See the cellcog skill for SDK reference.
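For the manual setup path, a quick check that the key is visible to Python can save a failed first call. A minimal sketch; `cellcog_key_configured` is a hypothetical helper, and it only checks that `CELLCOG_API_KEY` is set, not that the key is valid.

```python
import os


def cellcog_key_configured(env=None) -> bool:
    """Return True if CELLCOG_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("CELLCOG_API_KEY", "").strip())


if not cellcog_key_configured():
    print("CELLCOG_API_KEY is not set; export it before creating a client.")
```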
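Data Cog - Your Data Has Answers, CellCog Finds Them
Data analysis and visualization for uploaded files.
Most AI tools return only code when you ask about your data. CellCog returns answers: actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSV files with a simple prompt, and CellCog's coding agent explores your data, finds the patterns in it, and presents them in polished form. Full Python access is available throughout, from data cleaning to machine learning model evaluation.
How to Use
For your first CellCog task in a session, read the cellcog skill for the full SDK reference, including file handling, chat modes, timeout settings, and more.
OpenClaw (fire-and-forget mode):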
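    from cellcog import CellCogClient

    client = CellCogClient(agent_provider="openclaw")
    # Returns immediately; the result is delivered to the notified session.
    result = client.create_chat(
        prompt="[your task prompt]",
        notify_session_key="agent:main:main",
        task_label="my-task",
        chat_mode="agent",
    )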
All agents except OpenClaw (blocks until done):
    from cellcog import CellCogClient

    # Pass the name of your own agent, e.g. "cursor", "claude-code", or "codex".
    client = CellCogClient(agent_provider="openclaw|cursor|claude-code|codex|...")
    result = client.create_chat(
        prompt="[your task prompt]",
        task_label="my-task",
        chat_mode="agent",
    )
    print(result["message"])
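What Makes Data-Cog Different
Code as Tool, Not as Output
Other AI tools give you Python code and say "run this." CellCog runs the code for you and delivers the results:
| Other AI Tools | Data-Cog |
|---|---|
| "Here's a pandas script to analyze your data" | Here are your actual insights, with charts |
| "Run this matplotlib code to see the chart" | Here's the chart, annotated with findings |
| "This SQL query will find outliers" | Found 23 outliers, here's what they mean |
| "You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance |
You upload data. You get answers. The code runs behind the scenes.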
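What Data Work You Can Do
Exploratory Data Analysis
Understand your data fast:
- Dataset Profiling: "Analyze this CSV file: distributions, missing values, outliers, correlations, and a data quality summary"
- Pattern Discovery: "What patterns and trends exist in this sales data? Surprise me."
- Anomaly Detection: "Find unusual patterns in this server log data. What looks abnormal?"
- Relationship Analysis: "Which factors in this dataset correlate most strongly with customer churn?"
Example prompt:
"Analyze this dataset:
<SHOW_FILE>/path/to/customerdata.csv</SHOW_FILE>
I don't know much about this data yet. Give me:
- Overview: rows, columns, data types, missing values
- Key distributions and summary statistics
- Most interesting correlations
- Any outliers or data quality issues
- 3-5 insights that stand out
Present the findings as an interactive HTML report with charts."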
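Data Cleaning & Transformation
Wrangle messy data into shape:
- Clean Messy Data: "Clean this CSV file: fix inconsistent date formats, handle missing values, remove duplicates, standardize column names"
- Data Transformation: "Pivot this transaction data into a monthly summary by product category"
- Data Merging: "Join these three CSV files on customer_id and create a unified dataset"
- Feature Engineering: "Create useful features from this raw data for predicting house prices"
Example prompt:
"Clean and transform this dataset:
<SHOW_FILE>/path/to/messydata.csv</SHOW_FILE>
Issues I know about:
- Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
- The Revenue column has some values with $ signs and commas
- Duplicate rows exist
- Missing values in the Region column
Clean it up and give me back a clean CSV file plus a summary of what you changed."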
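Statistical Analysis
Rigorous analysis based on real data:
- Hypothesis Testing: "Is there a statistically significant difference in conversion rates between variants A and B?"
- Regression Analysis: "What factors predict employee salary in this HR dataset? Build a regression model."
- Time Series Analysis: "Analyze this monthly revenue data: trend, seasonality, and a forecast for the next 6 months"
- Cohort Analysis: "Create a cohort analysis of user retention by signup month"
Example prompt:
"I ran an A/B test on our checkout page:
<SHOW_FILE>/path/to/abtestresults.csv</SHOW_FILE>
Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp
Tell me:
- Is variant B statistically better? (p-value, confidence interval)
- Conversion rate difference
- Revenue per user difference
- Sample size adequacy check
- Your recommendation: ship B or keep testing?
Present with clear charts and a plain-language conclusion."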
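Visualization & Reporting
Turn data into visual stories:
- Chart Generation: "Create a set of charts showing our quarterly performance from this data"
- Dashboard Reports: "Build an interactive dashboard from this sales dataset with filters by region and product"
- Presentation-Ready Visuals: "Create publication-quality charts from this research data"
- Comparison Visuals: "Visualize how our metrics compare to industry benchmarks"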
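Machine Learning
Applied machine learning without the setup:
- Classification: "Predict which customers will churn based on this dataset: train a model and show feature importance"
- Clustering: "Segment these customers into groups based on behavior. How many natural clusters exist?"
- Forecasting: "Forecast next quarter's sales using historical data"
- Model Evaluation: "I trained a model, and here are the predictions. Evaluate: accuracy, precision, recall, confusion matrix, ROC curve"
Example prompt:
"Predict customer churn from this dataset:
<SHOW_FILE>/path/to/customerfeatures.csv</SHOW_FILE>
Target column: churned
- Train a model, try at least 2 algorithms
- Show feature importance: what drives churn?
- Confusion matrix and ROC curve
- Plain-language summary: 'The top 3 reasons customers churn are...'
- Actionable recommendations based on the findings
I want insights, not just metrics."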
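Supported Data Formats
| Format | How to Send |
|---|---|
| CSV | Upload via SHOW_FILE |
| Excel (XLSX) | Upload via SHOW_FILE |
| JSON | Upload via SHOW_FILE |
| Parquet | Upload via SHOW_FILE |
| SQL exports | Upload the dump via SHOW_FILE |
| Inline data | Describe small datasets directly in the prompt |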
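Output Formats
| Format | Best For |
|---|---|
| Interactive HTML Dashboard | Explorable charts, filters, drill-downs |
| PDF Report | Shareable analysis reports with charts and findings |
| Clean CSV/XLSX | Cleaned or transformed data files for downstream use |
| Markdown | Quick insights for integration into docs |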
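Chat Mode for Data
| Scenario | Recommended Mode |
|---|---|
| Quick data cleaning, simple charts, basic statistics | "agent" |
| Deep analysis with multiple techniques, ML modeling, comprehensive reports | "agent team" |
Use "agent" for most data work. Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.
Use "agent team" for complex analytical projects: multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.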
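Example Prompts
Minimal prompt, maximum insight:
"Analyze this:
<SHOW_FILE>/path/to/data.csv</SHOW_FILE>
Tell me everything interesting."
That's it. CellCog's coding agent will profile the data, run exploratory analysis, find patterns, and present the findings with charts. You don't need to know what to ask; the agent figures it out.
Business analysis:
"Analyze our e-commerce data:
<SHOW_FILE>/path/to/orders.csv</SHOW_FILE>
I need:
- Revenue trends (daily, weekly, monthly)
- Best and worst performing products
- Customer purchase frequency distribution
- Average order value trends
- Seasonal patterns
- 5 actionable insights for growing revenue
Interactive HTML dashboard with all charts."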
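Research data analysis:
"Analyze this survey data from 500 respondents:
<SHOW_FILE>/path/to/survey.csv</SHOW_FILE>
Research questions:
1. Is there a significant relationship between age group and product preference?
2. Do satisfaction scores differ by region? (variance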