一键在 Manus 中运行任何 Skill

data-analyst

数据分析专业指南。当用户提到"分析"、"看看"、"查看"、"统计"、"计算"、"图表"、"报表"、"数据"、"趋势"、"对比"、"汇总"、"平均"、"占比"、"增长"等任何数据相关词汇时必须加载。包含统计方法选择、加权平均计算、数据质量检查、业务洞察提取等最佳实践。适用于 Excel、CSV 等所有数据源的专业分析。

在 Manus 中运行

概览

安装命令

npx skills add https://github.com/hlbbbbbbb/No-worries --skill data-analyst

复制此命令并粘贴到 Claude Code 中以安装该技能

来源

hlbbbbbbb/No-worries

星标5

分支0

更新时间2026年2月3日 08:10

文件资源管理器

2 个文件

SKILL.md

readonly

同仓库更多 Skills

同仓库

ask-user-question

hlbbbbbbb/No-worries

Ask users questions via the UI. Use when you need clarification, user preferences, or confirmation before proceeding. The user CANNOT see CLI output - this tool is the ONLY way to communicate with them.

2026-02-035

dev-browser

hlbbbbbbb/No-worries

Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

2026-02-035

word-python

hlbbbbbbb/No-worries

Word automation using Python + python-docx. Generate, read, and modify Word documents (.docx) with proper formatting.

2026-02-035

first-awakening

hlbbbbbbb/No-worries

首次激活时的苏醒仪式。当检测到 IDENTITY.md 为空或默认值时触发，引导用户为 agent 取名并建立交流基调。这是最高优先级的流程，必须在任何任务之前完成。

2026-02-025

pdf-operations

hlbbbbbbb/No-worries

PDF 文件操作技能。使用内置 Python 库读取、提取、生成 PDF。支持表格提取、文本读取、PDF 合并/拆分等。

2026-02-025

excel-python

hlbbbbbbb/No-worries

Excel 修改操作使用 Python + openpyxl

2026-02-015

来源

hlbbbbbbb

hlbbbbbbb/No-worries

打开 GitHub 仓库查看创作者相关仓库

安装命令

下载

在 Manus 中运行

适用职业SOC

数据科学家计算机与数学类职业15-2051L4

name

data-analyst

description

# Professional Data Analyst Guidelines

##############################################################################

🚨🚨🚨 MANDATORY RULES - 必须严格遵循 🚨🚨🚨

##############################################################################

你已加载此 skill，现在必须像法律一样严格遵循以下所有规则！

⛔ 忽略这些规则将导致统计错误（如用错平均方法、忽略数据权重等） ✅ 严格遵循这些规则才能得出正确的分析结论

##############################################################################

⚠️ CRITICAL: 必须加载此 Skill 的场景

当用户提到以下任何词汇时，你必须先加载此 skill：

分析、查看、看看、统计、计算、汇总
平均、均值、占比、增长、趋势、变化
对比、比较、相关、影响因素
图表、报表、可视化
任何涉及数据处理、数值计算的任务

IMPORTANT: This skill provides data analysis best practices and guidelines. Use it alongside excel-python for actual data processing.

##############################################################################

🔴 STOP! 在执行任何计算之前，先检查以下关键点：

##############################################################################

分母是否相同？ 不同分母的百分比不能简单平均！
权重是否相等？ 不同规模的数据需要加权平均！
数据类型是什么？ 百分比、绝对值、时间序列各有不同处理方法！

##############################################################################

Core Principles

When users ask to "analyze" (分析) data or request data insights, follow these professional standards:

1. Data Type Recognition

Identify Data Types Before Analysis

Data Type	Description	Analysis Approach
Percentage/Rate	Ratios like completion rate, pass rate	Check if denominators are the same
Absolute Value	Amounts, quantities, revenue	Check for scale differences
Time Series	Annual, quarterly, monthly data	Use trend analysis
Categorical	Regions, product lines, departments	Use grouping/segmentation

Chinese Unit Conversions

万 (wan) = 10,000
亿 (yi) = 100,000,000
Always convert to consistent units for analysis

2. Statistical Method Selection

When to Use Mean (Average)

✅ Use Simple Average:

Same denominator percentages (e.g., consecutive quarter growth rates)
Same population scores (e.g., same employee's quarterly performance)
Directly comparable values with same scale

❌ DO NOT Use Simple Average:

Different denominator percentages (e.g., different class pass rates)
Different scale absolute values (e.g., different department sizes)

Weighted Average Formula

def weighted_average(values, weights):
    """Calculate weighted average"""
    if sum(weights) == 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Example: Calculate weighted average pass rate
pass_rates = [0.80, 0.90, 0.75]  # Pass rates
student_counts = [50, 20, 40]     # Student counts (weights)

weighted_rate = weighted_average(pass_rates, student_counts)
print(f"全校及格率: {weighted_rate*100:.2f}%")

Statistical Measure Selection

Measure	Best For	Not Suitable For
Mean	Normal distribution, no outliers	Skewed data, extreme values
Median	Outliers present, skewed	Calculating totals
Mode	Categorical data, most common	Numerical calculations
Std Dev	Measuring dispersion	Different scale comparison

3. Business Analysis Dimensions

Trend Analysis

YoY Growth (同比):

增长率 = (本期 - 同期) / 同期 × 100%

Compare current period to same period last year
Used for year-over-year performance tracking

MoM Growth (环比):

增长率 = (本期 - 上期) / 上期 × 100%

Compare current month/quarter to previous month/quarter
Used for short-term trend tracking

Comparative Analysis

By Product Line: Compare performance across products
By Region: Compare regional performance
By Time Period: Compare Q1 vs Q2 vs Q3 vs Q4
By Customer Segment: Compare different customer groups

4. Data Quality Checks

Before analysis, ALWAYS check for:

import pandas as pd

df = pd.read_excel('/path/to/file.xlsx')

# 1. Data overview
print("数据形状:", df.shape)
print("数据类型:\n", df.dtypes)
print("前5行:\n", df.head())

# 2. Missing values
print("\n空值统计:\n", df.isnull().sum())

# 3. Basic statistics
print("\n数值列统计:\n", df.describe())

# 4. Check for anomalies
for col in df.select_dtypes(include=[np.number]).columns:
    if (df[col] < 0).any():
        print(f"警告: {col} 包含负数")

Data Quality Checklist

Check for null/missing values
Check for unreasonable values (e.g., negative numbers where not expected)
Check for duplicate records
Check for unit consistency
Check for date format consistency

5. Percentage/Rate Handling Rules

CRITICAL: The Percentage Trap

WRONG Example:

班级A: 80% 及格率 (40/50 = 80%)
班级B: 90% 及格率 (18/20 = 90%)
简单平均: (80% + 90%) / 2 = 85%  ← WRONG!

CORRECT Example:

实际及格率 = (40 + 18) / (50 + 20) = 58/70 ≈ 82.86%  ← CORRECT!

When to Use Which Method

Same Denominator → Simple Average OK:

Q1增长率: 10%
Q2增长率: 15%
Q3增长率: 12%
平均增长率: (10% + 15% + 12%) / 3 = 12.33%  ← OK

Different Denominators → Must Use Weighted Average:

班级A: 80% 及格率 (50人)
班级B: 90% 及格率 (20人)
班级C: 75% 及格率 (40人)
全校及格率: (50×0.8 + 20×0.9 + 40×0.75) / (50+20+40) ≈ 80%  ← CORRECT

6. Business Insight Framework

Four-Layer Output Structure

Layer 1 - Data Presentation:

Display tables/charts with raw data
Show key metrics clearly

Layer 2 - Basic Conclusions:

Rising/Falling/Stable trend
High/Low performance
Anomalies detected

Layer 3 - Business Insights:

Root cause analysis
Impact assessment
Pattern recognition

Layer 4 - Actionable Recommendations:

Specific next steps
Risk mitigation strategies
Opportunity identification

Example Output Format

📊 销售数据分析报告

【数据概览】
2024年总营收: 5000万元
同比增长: +15.2%
各季度分布: Q1=1200万, Q2=1100万, Q3=1300万, Q4=1400万

【基本结论】
✅ 整体呈上升趋势
✅ Q4达到全年峰值
✅ 同比增长显著

【业务洞察】
🔍 Q4增长主要受节假日促销带动
🔍 Q2受市场影响略有下滑
🔍 线上渠道占比从40%提升至55%

【行动建议】
1. 加强Q1-Q3淡季营销力度
2. 继续扩大线上渠道投入
3. 提前备货应对Q4旺季

7. Common Analysis Templates

Excel Data Analysis Template

import pandas as pd

# Load data
df = pd.read_excel('/path/to/file.xlsx')

# Data quality check
print("=== 数据概览 ===")
print(df.info())

# Basic statistics
print("\n=== 基本统计 ===")
print(df.describe())

# Time series analysis
if '日期' in df.columns:
    df['日期'] = pd.to_datetime(df['日期'])
    df.sort_values('日期', inplace=True)
    df['环比增速'] = df['金额'].pct_change() * 100
    print("\n=== 环比增速 ===")
    print(df[['日期', '金额', '环比增速']])

# Group analysis
if '产品线' in df.columns:
    grouped = df.groupby('产品线')['金额'].agg(['sum', 'count', 'mean'])
    print("\n=== 按产品线汇总 ===")
    print(grouped)

Percentage Analysis Template

import pandas as pd

df = pd.read_excel('/path/to/file.xlsx')

# Calculate actual rate from numerator and denominator
if '及格人数' in df.columns and '总人数' in df.columns:
    df['实际及格率'] = df['及格人数'] / df['总人数'] * 100
    print("=== 各班及格率 ===")
    print(df[['班级', '及格人数', '总人数', '实际及格率']])

    # Overall rate (weighted average)
    total_pass = df['及格人数'].sum()
    total_students = df['总人数'].sum()
    overall_rate = total_pass / total_students * 100
    print(f"\n全校及格率: {overall_rate:.2f}%")

Trend Analysis Template

import pandas as pd

df = pd.read_excel('/path/to/file.xlsx')

# Ensure date sorting
df['日期'] = pd.to_datetime(df['日期'])
df.sort_values('日期', inplace=True)

# Calculate growth
df['同比增速'] = df.groupby(df['日期'].dt.month)['金额'].pct_change(periods=12) * 100
df['环比增速'] = df['金额'].pct_change() * 100

# Display trend
print("=== 趋势分析 ===")
print(df[['日期', '金额', '环比增速', '同比增速']])

# Identify patterns
latest_growth = df['环比增速'].iloc[-1]
if latest_growth > 10:
    print("\n📈 近期增长强劲")
elif latest_growth > 0:
    print("\n📊 稳步增长")
elif latest_growth > -10:
    print("\n📉 小幅下滑")
else:
    print("\n⚠️ 需关注下滑趋势")

8. Self-Check Before Output

Before presenting analysis results, ask yourself:

Did I use the correct statistical method for this data type?
Do percentages have the same denominator? If not, did I use weighted average?
Do my conclusions have data support?
Can the user understand my presentation?
Did I provide actionable insights?
Are Chinese terms used correctly (同比, 环比, 营收, 毛利)?
Did I check for data quality issues?

9. Common Pitfalls to Avoid

Wrong Analysis Examples

❌ WRONG: Averaging different denominator percentages

"各班及格率平均为 85%"

Why wrong: Different classes have different student counts.

✅ CORRECT:

"全校及格率为 82.86%（计算方式：总及格人数/总人数）"

❌ WRONG: Not explaining methodology

"平均增长率为 15%"

Why wrong: User doesn't know what was averaged.

✅ CORRECT:

"Q1-Q4平均环比增长率为 15%（基于4个季度的环比增速计算）"

❌ WRONG: No context

"销售额下降"

Why wrong: No timeframe or magnitude.

✅ CORRECT:

"2024年Q2销售额下降8.5%，环比Q1的5000万降至4575万"

10. Output Style Guidelines

Professional Communication

Data-Driven: State facts first, then analysis
Layered: Present → Analyze → Insight → Action
Clear: Use tables, bold text, and structure
Actionable: Provide concrete next steps
Contextual: Explain methodology when needed

Chinese Terminology

Term	Usage
同比	Compare to same period last year
环比	Compare to previous period
营收	Revenue
毛利	Gross profit
净利	Net profit
及格率	Pass rate
完成率	Completion rate

Formatting for Readability

# 📊 销售数据分析报告

## 数据概览
| 指标 | 数值 | 同比 |
|------|------|------|
| 总营收 | 5000万 | +15.2% |

## 主要结论
✅ 整体呈上升趋势
⚠️ Q2略有下滑

## 业务洞察
1. Q4受节假日带动增长显著
2. 线上渠道持续扩张

## 行动建议
1. 加强淡季营销
2. 继续扩大线上投入

11. Analysis Workflow

When analyzing data, follow this sequence:

Read Data - Load and inspect structure
Quality Check - Identify nulls, outliers, issues
Select Methods - Choose appropriate statistical approach
Calculate - Perform analysis with Python
Verify - Check results for reasonableness
Interpret - Explain business meaning
Present - Structure output with insights and recommendations

文件操作安全规则

分析时的文件处理：

只读分析：直接读取原文件（data_only=True），不做修改
需要修改时：先复制原文件 → 在副本上修改 → 原文件保持不变
保存分析结果：保存到新文件（如 原文件名_分析结果.xlsx），不覆盖原文件

# 只读分析（推荐）
wb = openpyxl.load_workbook('/path/to/原文件.xlsx', data_only=True)

# 需要修改/保存结果时
import shutil
shutil.copy('/path/to/原文件.xlsx', '/path/to/原文件_分析结果.xlsx')
wb = openpyxl.load_workbook('/path/to/原文件_分析结果.xlsx')
# ... 进行修改 ...
wb.save('/path/to/原文件_分析结果.xlsx')

汇报时说明文件位置：

分析结果保存在哪里
原文件是否保持不变

12. Advanced Analysis Topics

Outlier Detection

import pandas as pd
import numpy as np

df = pd.read_excel('/path/to/file.xlsx')

# IQR method for outliers
Q1 = df['金额'].quantile(0.25)
Q3 = df['金额'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

outliers = df[(df['金额'] < lower_bound) | (df['金额'] > upper_bound)]
print("异常值:", outliers)

Correlation Analysis

# Correlation matrix
correlation_matrix = df.corr()
print("相关性分析:\n", correlation_matrix)

# Focus on high correlations
high_corr = correlation_matrix[correlation_matrix.abs() > 0.7]
print("高相关性 (>0.7):\n", high_corr)

Summary Checklist

Before finalizing any data analysis:

✅ Identified data types correctly ✅ Used appropriate statistical methods ✅ Handled percentages with correct weighting ✅ Checked data quality ✅ Provided clear methodology ✅ Structured output (Data → Conclusion → Insight → Action) ✅ Used professional Chinese terminology ✅ Included actionable recommendations ✅ Verified results are reasonable ✅ 原文件保持不变（如需修改，先复制再改） ✅ 告诉用户结果文件在哪里

Runtime

No additional dependencies needed. Use with:

pandas for data analysis
numpy for statistical calculations
Excel files via openpyxl (excel-python skill)