wrds-data

Search and download financial data from WRDS (Wharton Research Data Services). Use when asked to "download from WRDS", "search WRDS", "get CRSP data", "download Compustat", "find WRDS table", "get stock returns", "download IBES", or any WRDS/financial database task involving CRSP, Compustat, IBES, TAQ, OptionMetrics, Fama-French, BoardEx, DealScan, or other WRDS datasets.

Install command

npx skills add https://github.com/FuZhiyu/AgentContract --skill wrds-data

Copy and paste this command into Claude Code to install the skill.

Source

FuZhiyu/AgentContract

Last updated: 8 April 2026, 19:25

name	wrds-data
description	Search and download financial data from WRDS (Wharton Research Data Services). Use when asked to "download from WRDS", "search WRDS", "get CRSP data", "download Compustat", "find WRDS table", "get stock returns", "download IBES", or any WRDS/financial database task involving CRSP, Compustat, IBES, TAQ, OptionMetrics, Fama-French, BoardEx, DealScan, or other WRDS datasets.
user-invocable	true

WRDS Data

Download and query data from WRDS (Wharton Research Data Services) using the wrds Python package.

Step 0: Check Credentials

Before any WRDS operation, verify the environment is set up:

uv run --with wrds python -c "import wrds; print('wrds package: OK')"

Also verify that WRDS credentials exist in the expected PostgreSQL password file:

macOS/Linux: ~/.pgpass
Windows: %APPDATA%\postgresql\pgpass.conf

If any check fails, guide the user through setup:

No WRDS account: Direct to https://wrds-www.wharton.upenn.edu/register/ — requires institutional affiliation.
wrds package missing: Run uv pip install wrds.
No .pgpass credentials: Create the credentials file manually with:
```
wrds-pgdata.wharton.upenn.edu:9737:wrds:USERNAME:PASSWORD
```
Then chmod 600 ~/.pgpass on Unix/macOS.
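
The credential check above can be sketched in Python. This is a minimal illustration, not the logic of wrds_setup.py: the function name `check_pgpass` is hypothetical, and the demonstration runs against a throwaway file so it never touches the real ~/.pgpass.

```python
import os
import stat
import tempfile
from pathlib import Path

WRDS_HOST = "wrds-pgdata.wharton.upenn.edu"  # host taken from the entry format above


def check_pgpass(path: Path) -> list[str]:
    """Return a list of problems found with the password file (empty if none)."""
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    # Only look for the host name; the password is never printed or returned.
    lines = path.read_text().splitlines()
    if not any(line.startswith(WRDS_HOST + ":") for line in lines):
        problems.append(f"no entry for {WRDS_HOST}")
    # On Unix, libpq ignores a password file that group or others can access.
    if os.name == "posix" and stat.S_IMODE(path.stat().st_mode) & 0o077:
        problems.append("permissions too open; run chmod 600")
    return problems


# Demonstration on a temporary file (hypothetical contents)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / ".pgpass"
    demo.write_text(f"{WRDS_HOST}:9737:wrds:USERNAME:PASSWORD\n")
    demo.chmod(0o644)
    too_open = check_pgpass(demo)   # flags the permissions on Unix/macOS
    demo.chmod(0o600)
    fixed = check_pgpass(demo)      # no problems once restricted to the owner
    missing = check_pgpass(Path(tmp) / "absent")
```

For a real check, call `check_pgpass(Path.home() / ".pgpass")` on macOS/Linux.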

If the user is working from a local checkout of this plugin and wants the helper script, it lives at:

plugins/wrds-data/skills/wrds-data/scripts/wrds_setup.py

Do NOT proceed with queries until the checks above pass (when using the helper script, until wrds_setup.py --check passes).

Step 1: Identify the Dataset

If the user specifies a library/table, proceed directly. Otherwise, help them find it.

Browse libraries and tables

import wrds
db = wrds.Connection()

# List all accessible libraries
libs = db.list_libraries()

# List tables in a library
tables = db.list_tables(library='crsp')

# Describe a table (columns, types)
schema = db.describe_table(library='crsp', table='dsf')

# Approximate row count
count = db.get_row_count('crsp', 'dsf')

# Preview data
sample = db.get_table('crsp', 'dsf', rows=5)

For common datasets (CRSP, Compustat, IBES, etc.), consult the reference:

Common datasets and query recipes (references/common_datasets.md): libraries, tables, standard filters, and variable glossary

Step 2: Build and Execute the Query

Use db.raw_sql() for all queries — it is the most flexible method.

data = db.raw_sql(
    "SELECT permno, date, ret FROM crsp.dsf WHERE date >= '2020-01-01'",
    date_cols=['date']
)

Key parameters

Parameter	Usage
`date_cols`	List of columns to parse as dates — always specify
`params`	Dict for parameterized queries: `%(name)s` syntax
`chunksize`	Process in chunks (default 500k rows); set `None` to disable
`return_iter`	`True` to get an iterator for very large downloads
`dtype_backend`	`"pyarrow"` for memory-efficient Arrow-backed DataFrames

Parameterized queries (prevent SQL injection)

params = {'tickers': ('AAPL', 'MSFT'), 'start': '2023-01-01'}
data = db.raw_sql("""
    SELECT a.permno, a.date, a.ret, b.ticker
    FROM crsp.dsf a
    JOIN crsp.stocknames b ON a.permno = b.permno
        AND a.date >= b.namedt AND a.date <= b.nameendt
    WHERE b.ticker IN %(tickers)s
        AND a.date >= %(start)s
""", params=params, date_cols=['date'])

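User-supplied tickers are easier to pass safely if they are normalized before they reach `params`. The helper below is a hypothetical sketch (the name `ticker_params` is not part of the wrds package); it builds the dict used by the `%(tickers)s` and `%(start)s` placeholders in the query above.

```python
from datetime import date


def ticker_params(tickers, start):
    """Build the params dict for a query using %(tickers)s and %(start)s."""
    # Trim, upper-case, and de-duplicate; a tuple is what the IN placeholder expects.
    cleaned = tuple(sorted({t.strip().upper() for t in tickers if t.strip()}))
    if not cleaned:
        # An empty tuple would render as IN (), which PostgreSQL rejects.
        raise ValueError("at least one ticker is required")
    date.fromisoformat(start)  # raises ValueError on a malformed date string
    return {"tickers": cleaned, "start": start}


params = ticker_params(["msft", " aapl ", "AAPL"], "2023-01-01")
# params == {"tickers": ("AAPL", "MSFT"), "start": "2023-01-01"}
```

The values are still sent as query parameters, so the normalization is for convenience and early error messages; the injection protection comes from the placeholders themselves.
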
Large downloads

For datasets exceeding ~1M rows, use chunked iteration:

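from pathlib import Path

Path('data').mkdir(exist_ok=True)  # to_parquet does not create missing directories

chunks = db.raw_sql(
    "SELECT * FROM crsp.dsf WHERE date >= '2000-01-01'",
    date_cols=['date'], chunksize=500000, return_iter=True
)
# Each chunk is a DataFrame; write one Parquet file per chunk
for i, chunk in enumerate(chunks):
    chunk.to_parquet(f'data/crsp_dsf_{i}.parquet')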
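
The loop above can be wrapped in a small reusable function. This is a sketch under assumptions: `save_chunks` is a hypothetical name, and the zero-padded index is a choice made here so that the files sort in download order. It accepts any iterable whose items have a `to_parquet` method, which is what the chunk iterator yields.

```python
from pathlib import Path


def save_chunks(chunks, out_dir, stem):
    """Write each chunk to out_dir/stem_NNNN.parquet and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the target directory if needed
    written = []
    for i, chunk in enumerate(chunks):
        path = out / f"{stem}_{i:04d}.parquet"
        chunk.to_parquet(path)
        written.append(path)
    return written
```

With the iterator from the example above, the call would be `save_chunks(chunks, 'data', 'crsp_dsf')`.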

Or download in one shot and save:

data = db.raw_sql("...", date_cols=['date'])
data.to_parquet('data/output.parquet')
# or
data.to_csv('data/output.csv', index=False)

Step 3: Save Output

Default conventions:

Save to the data/ directory in the project root (create if needed), matching the paths used in the examples above
Prefer Parquet for large datasets (faster, smaller, typed)
Use CSV if the user requests it or for small datasets
Name files descriptively: crsp_daily_2020_2023.parquet, compustat_annual.csv

Always close the connection when done:

db.close()
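
To make sure the connection is closed even when a query raises, the call can be wrapped with `contextlib.closing` from the standard library. The wrapper `fetch` below is a hypothetical sketch; it takes a connection factory as an argument so the snippet itself does not need WRDS credentials.

```python
from contextlib import closing


def fetch(connect, sql, **kwargs):
    """Open a connection, run one query, and close the connection even on error."""
    # closing() calls db.close() when the block exits, whether or not raw_sql raised.
    with closing(connect()) as db:
        return db.raw_sql(sql, **kwargs)
```

In practice the factory would be `wrds.Connection`, for example `fetch(wrds.Connection, query, date_cols=['date'])`. Opening a new connection per query is simple but slower than reusing one, so this fits one-off downloads best.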

Error Handling

NotSubscribedError — user lacks subscription to this library. Inform them to request access via their institution's WRDS coordinator.
SchemaNotFoundError — library name is wrong. Use db.list_libraries() to find the correct name.
Timeout on large queries — add date filters or use chunked downloads.
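
The first two cases above can be turned into user-facing hints with a small lookup. This is an illustrative sketch: `explain_error` is a hypothetical helper, and it matches on the exception class name (an assumption made here so the snippet runs without importing wrds) rather than on the classes themselves.

```python
# Hints keyed by exception class name, taken from the list above.
HINTS = {
    "NotSubscribedError": "No subscription to this library; ask your institution's "
                          "WRDS coordinator for access.",
    "SchemaNotFoundError": "Library name not found; call db.list_libraries() "
                           "to see the valid names.",
}


def explain_error(exc):
    """Return a hint for a known WRDS exception, or a generic fallback message."""
    # Walk the class hierarchy so subclasses of the named errors also match.
    for cls in type(exc).__mro__:
        if cls.__name__ in HINTS:
            return HINTS[cls.__name__]
    return f"Unexpected error: {type(exc).__name__}: {exc}"
```

A caller would wrap the query in `try`/`except Exception as exc` and show `explain_error(exc)` to the user.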

Notes

WRDS uses PostgreSQL under the hood; any valid PostgreSQL SQL works in raw_sql().
Always include date filters to avoid accidentally downloading entire multi-decade datasets.
The wrds package connects to wrds-pgdata.wharton.upenn.edu:9737 over SSL.
Credentials in ~/.pgpass are never exposed to the LLM — the wrds package reads them directly.
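
One way to keep every query date-bounded is to split a long range into calendar-year windows and run the query once per window. The generator below is a sketch; the name `year_windows` and the one-year granularity are choices made here, not something the wrds package prescribes.

```python
from datetime import date


def year_windows(start, end):
    """Yield (window_start, window_end) date pairs covering start..end, one per year."""
    if start > end:
        raise ValueError("start must not be after end")
    for year in range(start.year, end.year + 1):
        # Clip the first and last windows to the requested range.
        yield max(start, date(year, 1, 1)), min(end, date(year, 12, 31))


windows = list(year_windows(date(2020, 6, 15), date(2022, 3, 31)))
# windows == [(date(2020, 6, 15), date(2020, 12, 31)),
#             (date(2021, 1, 1), date(2021, 12, 31)),
#             (date(2022, 1, 1), date(2022, 3, 31))]
```

Each pair can be passed as `params={'start': lo, 'end': hi}` to a query that filters with `date BETWEEN %(start)s AND %(end)s`, which keeps individual requests small and makes a failed window easy to retry.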

Resources

scripts/

wrds_setup.py — Check environment and interactively configure WRDS credentials

references/

common_datasets.md — Common WRDS libraries, tables, query recipes, and variable glossary

