with one click
从Excel提取多类型数据,并生成包含可视化图表与下载链接的综合分析报告。
npx skills add https://github.com/OpenSenseNova/SenseNova-Skills --skill excel-data-analysis-and-report-generationCopy and paste this command into Claude Code to install the skill
从Excel提取多类型数据,并生成包含可视化图表与下载链接的综合分析报告。
npx skills add https://github.com/OpenSenseNova/SenseNova-Skills --skill excel-data-analysis-and-report-generationCopy and paste this command into Claude Code to install the skill
| name | excel-data-analysis-and-report-generation |
| description | 从Excel提取多类型数据,并生成包含可视化图表与下载链接的综合分析报告。 |
This sub-skill covers one capability of the Excel workflow. For reading/counting/Parquet optimization, see the parent workflow SKILL.md.
Step1 基于指定列提取有效代码或进行分类映射,生成包含占比与总计行的统计表,并支持交叉分析。
# 分类映射函数骨架
def categorize_item(item_name):
category_a_keywords = ['keyword1', 'keyword2'] # 占位示例
if pd.isna(item_name):
return '未知'
if any(kw in str(item_name) for kw in category_a_keywords):
return '类别A'
return '其他'
target_col = '项目名称' # 替换为实际列名
group_col = '所属区域' # 替换为实际分组列名
if target_col in combined_df.columns:
combined_df['分类'] = combined_df[target_col].apply(categorize_item)
# value_counts + 占比 + 总计行
category_counts = combined_df['分类'].value_counts().reset_index()
category_counts.columns = ['类别', '数量']
total = category_counts['数量'].sum()
category_counts['占比'] = (category_counts['数量'] / total).apply(lambda x: f'{x:.2%}')
total_row = pd.DataFrame([{'类别': '总计', '数量': total, '占比': '100.00%'}])
category_counts = pd.concat([category_counts, total_row], ignore_index=True)
# 交叉分析 crosstab
if group_col in combined_df.columns:
cross_tb = pd.crosstab(combined_df[group_col], combined_df['分类'], margins=True, margins_name='总计')
print("交叉分析结果:\n", cross_tb)
Step2 识别目标值超过限值的行,基于关键字定位并反向搜索限值以确保数据关联。
import re
exceed_rows = []
df_target = combined_df.copy()
for i, row in df_target.iterrows():
if '共计' in str(row.iloc[0]):
try:
target_val = float(row.iloc[8]) # 目标值所在列索引
except (ValueError, TypeError):
continue
limit_val = None
structure_name = "未知结构"
# 反向搜索限值
for j in range(i-1, max(0, i-15), -1):
check_row = df_target.iloc[j, :]
check_str = ' '.join([str(x) for x in check_row.values if pd.notna(x)])
if '限值' in check_str:
# 数据清洗正则表达式
match = re.search(r'限值([\d.]+)', check_str)
if match:
limit_val = float(match.group(1))
for k in range(j-1, max(0, j-5), -1):
name_row = df_target.iloc[k, 0]
if pd.notna(name_row) and '关键字' in str(name_row):
structure_name = str(name_row)
break
break
# 多维度评分/分级算法结构
if limit_val is not None and target_val > limit_val:
severity = '高' if (target_val - limit_val) > 10 else '中'
exceed_rows.append({
'row_index': i,
'structure_name': structure_name,
'target_val': target_val,
'limit': limit_val,
'exceed_value': target_val - limit_val,
'severity': severity
})
Step3 生成高分辨率可视化图表展示分类占比,保存统计结果并生成沙箱下载链接。
import matplotlib.pyplot as plt
import matplotlib
# 中英文字体配置
matplotlib.rcParams['font.sans-serif'] = ['SimHei', 'DejaVu Sans']
matplotlib.rcParams['axes.unicode_minus'] = False
# 准备图表数据 (排除总计行)
plot_data = category_counts[category_counts['类别'] != '总计']
categories = plot_data['类别'].tolist()
counts = plot_data['数量'].tolist()
# 图表美化(dpi、颜色方案、标签位置)
fig, ax = plt.subplots(figsize=(10, 8))
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#96CEB4', '#F9A826']
explode = [0.05] * len(categories)
wedges, texts, autotexts = ax.pie(
counts,
labels=categories,
autopct='%1.1f%%',
startangle=90,
colors=colors[:len(categories)],
explode=explode,
shadow=True,
textprops={'fontsize': 12}
)
ax.set_title('各类别数量占比分析', fontsize=16, fontweight='bold', pad=20)
ax.legend(wedges, categories, title="类别", loc="center left", bbox_to_anchor=(1, 0, 0.5, 1))
# 保存图表
chart_path = os.path.join(output_dir, 'category_analysis.png')
plt.savefig(chart_path, dpi=150, bbox_inches='tight')
# 保存统计结果并生成下载链接
output_path = os.path.join(output_dir, 'analysis_result.xlsx')
category_counts.to_excel(output_path, index=False)
print(f"统计结果已保存至: {output_path}")
print(f"下载链接: [下载统计结果](sandbox:{output_path})")
print(f"图表下载链接: [下载图表](sandbox:{chart_path})")
Creative-mode PPT pipeline. One full-page 16:9 PNG per slide. LLM / VLM calls go through sn-ppt-standard/lib/model_client.py (shared thin client). Text-to-image (the actual png rendering) goes through sn-image-base/scripts/sn_agent_runner.py. Expects task_pack.json + info_pack.json already written by sn-ppt-entry.
Standard-mode PPT pipeline. All LLM / VLM / T2I calls are wrapped in a single CLI entry (scripts/run_stage.py). The main agent's job is simple: emit ONE shell command per stage, never write loops, never write prompts.
USE FOR Google-backed image discovery via Serper.dev. Returns image URLs, page URLs, titles, and source domains.
Base-layer skill for the SenseNova-Skills project, providing low-level APIs for image generation, recognition (VLM), and text optimization (LLM). This skill does not preprocess inputs; it only calls backend services and returns results. This skill is not user-facing and is intended for upper-layer skills only.
Generates professional infographics with various layout types and visual styles. Analyzes content, recommends layout and style, and generates publication-ready infographics. Use when user asks to create "infographic", "信息图", "visual summary", or "可视化".
Environment diagnostic for the PPT family. Validates sn-image-base, API keys, Node runtime, and optional deps; interactively writes .env for required vars. Runs before sn-ppt-entry; does not modify sn-image-* skills.