Ejecuta cualquier Skill en Manus
con un clic

Ejecuta cualquier Skill en Manus con un clic

$pwd:

alignagent-adaptive-learner-intelligence

Name: Alignagent Adaptive Learner Intelligence
Author: ndpvt-web

// Build multi-agent adaptive learning systems that diagnose knowledge gaps and recommend targeted resources. Implements the ALIGNAgent framework: Skill Gap Agent (proficiency estimation + concept-level diagnostic reasoning) and Recommender Agent (preference-aware resource retrieval aligned to deficiencies). Trigger phrases: - "Build an adaptive learning system" - "Create a personalized tutoring agent" - "Diagnose student knowledge gaps from quiz data" - "Build a skill gap analyzer for learners" - "Create an educational recommender that adapts to student performance" - "Implement a multi-agent pipeline for personalized education"

Ejecutar en Manus

$ git log --oneline --stat

stars:4

forks:0

updated:13 de febrero de 2026, 13:35

SKILL.md

readonly

related-skills.json

mismo repositorio

a2rag-adaptive-agentic-graph.md

from "ndpvt-web/arxiv-claude-skills"

Build adaptive, cost-aware Graph-RAG pipelines that route queries through escalating retrieval stages (local -> bridge -> global) with triple-check verification and provenance map-back. Use when: 'build a graph RAG pipeline', 'implement adaptive retrieval for knowledge graphs', 'cost-aware multi-hop question answering', 'add evidence verification to RAG', 'handle mixed-difficulty queries efficiently', 'graph retrieval with source text grounding'.

2026-02-134

adaptbpe-general-purpose-specialized.md

from "ndpvt-web/arxiv-claude-skills"

Adapt general-purpose BPE tokenizers into domain- or language-specialized tokenizers using the AdaptBPE post-training strategy. Replaces low-utility tokens with high-frequency domain-specific tokens to improve tokenization efficiency without retraining from scratch. Trigger phrases: "adapt tokenizer to domain", "specialize BPE for medical text", "optimize tokenizer for French", "reduce token fertility for code", "adapt vocabulary for legal documents", "domain-specific tokenizer"

2026-02-134

addressing-explainability-generative-ai.md

from "ndpvt-web/arxiv-claude-skills"

Explain generative AI outputs using the gSMILE perturbation-based attribution framework. Builds local surrogate models from controlled input perturbations and Wasserstein distance to produce token-level or word-level importance scores for LLM and diffusion model outputs. Triggers: 'explain why the model generated this', 'token attribution for prompt', 'which words in my prompt matter most', 'interpret generative model output', 'build explainability for my LLM pipeline', 'debug prompt influence on generation'

2026-02-134

agent-based-software-artifact-evaluation.md

from "ndpvt-web/arxiv-claude-skills"

Automatically evaluate software research artifacts (code repositories with READMEs) by constructing dependency-aware command graphs, building containerized environments, and executing instructions with structured error recovery. Use when asked to: 'evaluate this artifact', 'reproduce this paper's results', 'run this repo's README instructions', 'check if this artifact builds and runs', 'automate artifact evaluation', 'verify research reproducibility'.

2026-02-134

agentcgroup-understanding-controlling-os.md

from "ndpvt-web/arxiv-claude-skills"

Design and implement OS-level resource controls for sandboxed AI agents using hierarchical cgroups, eBPF enforcement, and tool-call-level resource management. Use when: 'set up cgroups for AI agent containers', 'control memory for coding agents', 'isolate tool-call resources with eBPF', 'manage multi-tenant agent resource limits', 'prevent OOM kills in agent sandboxes', 'configure agent resource policies with cgroup v2'.

2026-02-134

ai-agent-systems-supply.md

from "ndpvt-web/arxiv-claude-skills"

Build LLM-based multi-agent systems for supply chain inventory management using structured decision prompts and memory-retrieval (AIM-RM). Implements the beer game multi-echelon supply chain simulation with per-stage agents that use stepwise ordering prompts, safety-stock calculations, and Euclidean-distance memory retrieval of similar historical episodes. Use when asked to: "build a supply chain agent", "implement inventory management with LLMs", "create a beer game simulation with AI agents", "multi-agent ordering system", "AIM-RM memory retrieval agent", "supply chain decision prompt design".

2026-02-134

package.json

"author": "ndpvt-web"

"repository": "ndpvt-web/arxiv-claude-skills"

Abrir repositorio de GitHub Ver repositorios del creador

$ install --global

$ download --local

Ejecutar en Manus

$ useful --forSOC

TutoresEducación y bibliotecas25-3041L4

name

alignagent-adaptive-learner-intelligence

description

Build multi-agent adaptive learning systems that diagnose knowledge gaps and recommend targeted resources. Implements the ALIGNAgent framework: Skill Gap Agent (proficiency estimation + concept-level diagnostic reasoning) and Recommender Agent (preference-aware resource retrieval aligned to deficiencies). Trigger phrases: - "Build an adaptive learning system" - "Create a personalized tutoring agent" - "Diagnose student knowledge gaps from quiz data" - "Build a skill gap analyzer for learners" - "Create an educational recommender that adapts to student performance" - "Implement a multi-agent pipeline for personalized education"

ALIGNAgent: Adaptive Learner Intelligence for Gap Identification and Next-step Guidance

This skill enables Claude to build multi-agent educational systems that close the loop between assessment, diagnosis, and intervention. Based on the ALIGNAgent framework, the approach processes learner performance data (quiz scores, gradebook records, preferences) through a Skill Gap Agent that computes topic-level proficiency scores and performs concept-level diagnostic reasoning to identify specific misconceptions, then routes deficiencies to a Recommender Agent that retrieves preference-aware learning materials. The key innovation is the continuous feedback loop: interventions occur before advancing to subsequent topics, creating an iterative remediation cycle rather than a one-shot recommendation.

When to Use

When building a personalized learning platform that needs to diagnose what a student doesn't know and why
When the user wants to analyze quiz or assessment data to compute per-topic proficiency scores
When creating a tutoring system that adapts resource recommendations to individual learning gaps
When implementing a multi-agent pipeline where one agent diagnoses and another recommends
When the user needs to classify learner proficiency (High/Medium/Low) from gradebook data and validate against exam performance
When designing formative assessment cycles that loop diagnosis into intervention before progression

Key Technique

Two-Agent Architecture with Continuous Feedback Loop. ALIGNAgent separates concerns into a Skill Gap Agent and a Recommender Agent connected by a structured handoff. The Skill Gap Agent computes topic-level proficiency as rho_t = (1/|I_t|) * sum(g_i for i in I_t) where I_t is the set of assessment items for topic t and g_i is the normalized score on item i. It then applies a mastery threshold tau to produce a gap set G(s) = {(t, rho_t) | rho_t < tau}, sorted by ascending proficiency so the weakest areas get attention first. Beyond this quantitative layer, the agent performs concept-level diagnostic reasoning -- analyzing patterns in incorrect answers, distractor selection frequency, and cross-item reasoning to produce natural-language explanations of why a learner is struggling (e.g., "confuses AVL rotation direction" rather than just "low score on AVL trees").

Preference-Aware Resource Retrieval. The Recommender Agent constructs structured queries from the gap set and learner preferences (video vs. text, self-paced vs. structured), retrieves candidate resources, validates them against modality preferences and topic relevance, and returns up to K resources per gap. This is diagnostic-driven retrieval -- the query is shaped by the specific misconception, not just the broad topic.

Formative Loop, Not One-Shot. The critical design principle is that the pipeline re-executes after intervention. New assessment data updates the proficiency vector, and topics that remain below threshold trigger fresh recommendations. This makes ALIGNAgent a cycle, not a pipeline -- each iteration refines the learner model.

Step-by-Step Workflow

Define the domain schema. Create a topic taxonomy for your subject area as a structured map: {topic_id: {name, prerequisite_topics[], concept_tags[]}}. Each assessment item must be tagged with its topic and concept(s).
Ingest learner data. Parse three input streams into a unified learner profile: (a) gradebook records with per-item scores normalized to [0, 1], (b) quiz/assessment responses including selected answers (not just correctness), and (c) learner preferences (modality, pacing, prior stated goals).
Compute topic-level proficiency. For each topic t, calculate rho_t = mean(normalized_scores for items tagged with t). Assemble the full proficiency vector rho(s) = [rho_t1, rho_t2, ..., rho_tn].
Classify proficiency levels. Map each rho_t to a categorical label using thresholds: High (rho_t >= 0.8), Medium (0.5 <= rho_t < 0.8), Low (rho_t < 0.5). These thresholds should be configurable.
Identify skill gaps. Apply mastery threshold tau (default: 0.7) to produce the gap set G(s) = {(t, rho_t) | rho_t < tau}. Sort gaps by ascending rho_t so the weakest topics are prioritized.
Run concept-level diagnostic reasoning. For each topic in the gap set, analyze the learner's incorrect responses: which distractors were selected, which concepts those distractors test, and whether errors cluster around a specific misconception. Produce a natural-language diagnostic per gap (e.g., "Consistently confuses DFS with BFS traversal order; correctly identifies base cases but fails on recursive step for graph cycles").
Construct retrieval queries. For each diagnosed gap, build a structured query combining: topic name, specific misconception description, and learner modality preference. Example: {topic: "AVL Trees", misconception: "rotation direction after double rotation", preference: "video tutorial"}.
Retrieve and filter resources. Search for candidate resources matching each query. Filter by: (a) topic relevance to the specific misconception, (b) modality alignment with learner preferences, (c) content quality signals. Return up to K resources per gap, ranked by alignment score.
Generate learner summary. Produce a structured output containing: strengths (topics with High proficiency), gaps (topics below threshold with diagnostic explanations), recommended actions (ordered resource list per gap), and next assessment targets.
Close the feedback loop. After the learner engages with resources and completes a new assessment, re-run steps 3-9 with updated data. Track proficiency delta per topic to measure intervention effectiveness and adjust recommendations accordingly.

Concrete Examples

Example 1: Analyzing quiz performance in a Python programming course

User: "I have quiz data for 30 students across 5 Python topics. Help me build a system that identifies each student's weak areas and recommends resources."

Approach:

Parse the quiz CSV with columns: student_id, quiz_id, topic, question_id, score, max_score
Compute per-student, per-topic proficiency scores
Apply threshold to identify gaps
Generate diagnostic reasoning per gap
Match resources to gaps and preferences

Output structure:

{
  "student_id": "S042",
  "proficiency_vector": {
    "variables_types": 0.92,
    "control_flow": 0.78,
    "functions": 0.45,
    "data_structures": 0.33,
    "file_io": 0.60
  },
  "proficiency_labels": {
    "variables_types": "High",
    "control_flow": "Medium",
    "functions": "Low",
    "data_structures": "Low",
    "file_io": "Medium"
  },
  "skill_gaps": [
    {
      "topic": "data_structures",
      "proficiency": 0.33,
      "diagnostic": "Correctly uses lists but fails when indexing nested structures. Confuses dict key access with list indexing. Missed all questions involving dictionary comprehensions.",
      "resources": [
        {"title": "Python Data Structures - Nested Collections", "type": "video", "url": "..."},
        {"title": "Dictionary Comprehension Practice Problems", "type": "interactive", "url": "..."}
      ]
    },
    {
      "topic": "functions",
      "proficiency": 0.45,
      "diagnostic": "Understands function definition syntax but struggles with return vs print distinction. Errors cluster around scope-related questions -- treats local variables as global.",
      "resources": [
        {"title": "Python Scoping Rules Explained", "type": "article", "url": "..."},
        {"title": "Functions and Return Values - Step by Step", "type": "video", "url": "..."}
      ]
    }
  ],
  "strengths": ["variables_types"],
  "next_assessment": ["data_structures", "functions"]
}

Example 2: Building the multi-agent system in code

User: "Implement the ALIGNAgent two-agent pipeline as a Python module."

Approach:

Define data models for learner profiles, assessments, and proficiency vectors
Implement the Skill Gap Agent as a class with proficiency computation and LLM-based diagnostic reasoning
Implement the Recommender Agent with query construction and resource retrieval
Wire them together in a pipeline that accepts assessment data and returns a learner summary

from dataclasses import dataclass, field
from typing import Optional
import json

@dataclass
class AssessmentItem:
    question_id: str
    topic: str
    concepts: list[str]
    score: float          # normalized to [0, 1]
    selected_answer: str  # the actual answer chosen
    correct_answer: str

@dataclass
class LearnerProfile:
    student_id: str
    preferences: dict     # e.g., {"modality": "video", "pacing": "self-paced"}
    assessments: list[AssessmentItem] = field(default_factory=list)

@dataclass
class SkillGap:
    topic: str
    proficiency: float
    diagnostic: str       # natural-language explanation
    misconceptions: list[str]

class SkillGapAgent:
    def __init__(self, mastery_threshold: float = 0.7, llm_client=None):
        self.tau = mastery_threshold
        self.llm = llm_client

    def compute_proficiency(self, profile: LearnerProfile) -> dict[str, float]:
        topic_scores: dict[str, list[float]] = {}
        for item in profile.assessments:
            topic_scores.setdefault(item.topic, []).append(item.score)
        return {t: sum(s) / len(s) for t, s in topic_scores.items()}

    def identify_gaps(self, proficiency: dict[str, float]) -> list[tuple[str, float]]:
        gaps = [(t, rho) for t, rho in proficiency.items() if rho < self.tau]
        return sorted(gaps, key=lambda x: x[1])  # weakest first

    def diagnose(self, profile: LearnerProfile, gaps: list[tuple[str, float]]) -> list[SkillGap]:
        diagnosed = []
        for topic, rho in gaps:
            incorrect = [a for a in profile.assessments
                         if a.topic == topic and a.score < 1.0]
            # Use LLM for concept-level diagnostic reasoning
            prompt = self._build_diagnostic_prompt(topic, incorrect)
            reasoning = self.llm.generate(prompt, temperature=0)
            diagnosed.append(SkillGap(
                topic=topic, proficiency=rho,
                diagnostic=reasoning["explanation"],
                misconceptions=reasoning["misconceptions"]
            ))
        return diagnosed

    def _build_diagnostic_prompt(self, topic: str, incorrect_items: list) -> str:
        items_desc = json.dumps([{
            "question_id": i.question_id,
            "concepts": i.concepts,
            "selected": i.selected_answer,
            "correct": i.correct_answer
        } for i in incorrect_items], indent=2)
        return f"""Analyze these incorrect responses for topic "{topic}".
Identify specific misconceptions by examining which distractors were chosen and what conceptual errors they reveal.

Incorrect responses:
{items_desc}

Return JSON with:
- "explanation": one paragraph describing the pattern of errors
- "misconceptions": list of specific misconception strings"""


class RecommenderAgent:
    def __init__(self, max_resources_per_gap: int = 3, search_fn=None):
        self.k = max_resources_per_gap
        self.search = search_fn

    def recommend(self, gaps: list[SkillGap], preferences: dict) -> dict[str, list]:
        recommendations = {}
        for gap in gaps:
            query = self._build_query(gap, preferences)
            candidates = self.search(query)
            filtered = [r for r in candidates
                        if self._matches_preference(r, preferences)]
            recommendations[gap.topic] = filtered[:self.k]
        return recommendations

    def _build_query(self, gap: SkillGap, preferences: dict) -> str:
        modality = preferences.get("modality", "any")
        return f"{gap.topic} {gap.misconceptions[0]} {modality} tutorial"

    def _matches_preference(self, resource: dict, preferences: dict) -> bool:
        modality = preferences.get("modality")
        if not modality or modality == "any":
            return True
        return resource.get("type", "").lower() == modality.lower()


class ALIGNAgentPipeline:
    def __init__(self, mastery_threshold=0.7, max_resources=3,
                 llm_client=None, search_fn=None):
        self.skill_gap_agent = SkillGapAgent(mastery_threshold, llm_client)
        self.recommender_agent = RecommenderAgent(max_resources, search_fn)

    def run(self, profile: LearnerProfile) -> dict:
        proficiency = self.skill_gap_agent.compute_proficiency(profile)
        gaps = self.skill_gap_agent.identify_gaps(proficiency)
        diagnosed = self.skill_gap_agent.diagnose(profile, gaps)
        resources = self.recommender_agent.recommend(diagnosed, profile.preferences)

        return {
            "student_id": profile.student_id,
            "proficiency": proficiency,
            "strengths": [t for t, rho in proficiency.items() if rho >= 0.8],
            "gaps": [{
                "topic": g.topic,
                "proficiency": g.proficiency,
                "diagnostic": g.diagnostic,
                "misconceptions": g.misconceptions,
                "resources": resources.get(g.topic, [])
            } for g in diagnosed],
            "next_assessment_targets": [g.topic for g in diagnosed]
        }

Example 3: Retrofitting an existing quiz app with adaptive diagnostics

User: "I have a Flask quiz app that stores results in PostgreSQL. Add adaptive gap analysis after each quiz submission."

Approach:

Read the existing quiz submission endpoint and database schema
Add a post-submission hook that triggers the Skill Gap Agent
Store proficiency vectors in a new learner_proficiency table
Add a /api/student/<id>/gaps endpoint that returns diagnosed gaps with recommendations
Implement proficiency delta tracking across quiz attempts

# Added to existing Flask app -- new endpoint
@app.route("/api/student/<student_id>/gaps", methods=["GET"])
def get_student_gaps(student_id):
    # Fetch assessment history from DB
    assessments = db.session.query(QuizResponse).filter_by(
        student_id=student_id
    ).all()

    profile = LearnerProfile(
        student_id=student_id,
        preferences=get_learner_preferences(student_id),
        assessments=[to_assessment_item(a) for a in assessments]
    )

    pipeline = ALIGNAgentPipeline(
        mastery_threshold=0.7,
        llm_client=openai_client,
        search_fn=search_educational_resources
    )
    result = pipeline.run(profile)

    # Store updated proficiency for delta tracking
    store_proficiency_snapshot(student_id, result["proficiency"])

    return jsonify(result)

Best Practices

Do: Tag every assessment item with both a topic and fine-grained concept labels. The diagnostic reasoning quality depends entirely on concept-level metadata -- without it, you get "low score on Topic X" instead of actionable misconception identification.
Do: Set LLM temperature to 0 for proficiency estimation and diagnostic reasoning. The paper found deterministic outputs produce more consistent and reproducible classifications.
Do: Sort gaps by ascending proficiency and address them in order. The weakest areas benefit most from early intervention, and prerequisite topics often gate understanding of downstream topics.
Do: Track proficiency deltas across assessment cycles. A gap that persists after intervention signals that the recommended resources were ineffective and the diagnostic may need revision.
Avoid: Using only aggregate scores without analyzing selected answers. The concept-level diagnostic reasoning requires knowing which wrong answer was chosen, not just that the answer was wrong.
Avoid: Recommending resources based solely on topic match. The Recommender Agent must incorporate the specific misconception and the learner's modality preference -- a generic "learn about trees" video won't address a specific AVL rotation confusion.

Error Handling

Sparse assessment data: When a topic has fewer than 3 items, proficiency estimates are unreliable. Flag these with a confidence indicator and require additional assessment before diagnosing specific misconceptions.
LLM diagnostic hallucination: The diagnostic reasoning agent may invent misconceptions not supported by the response data. Validate diagnostics by checking that cited concepts actually appear in the incorrect items' concept tags.
Broken or irrelevant resources: Web-retrieved resources may be dead links or off-topic. Implement URL validation and content-relevance checking before presenting resources to learners.
Threshold sensitivity: A mastery threshold that is too high floods learners with gaps; too low misses real deficiencies. Start with tau=0.7 and calibrate by comparing against held-out exam performance.
Cold start problem: New learners with no assessment history cannot be profiled. Provide a brief diagnostic pre-assessment or default to the most common misconceptions for the cohort.

Limitations

The framework requires assessment items tagged with topic and concept metadata. Unstructured or untagged quiz data cannot be processed without a manual or LLM-assisted tagging step.
Validated only on small cohorts (11-14 students) in undergraduate CS courses. Effectiveness in other domains, age groups, or at scale is unverified.
Resource recommendation quality depends on the availability of modality-specific materials for niche misconceptions. Obscure topics may yield poor or no matches.
The proficiency formula uses simple averaging, which doesn't account for question difficulty, recency of attempts, or item discrimination. More sophisticated models (IRT, BKT) may be needed for high-stakes applications.
Concept-level diagnostic reasoning via LLMs can be inconsistent across models. GPT-4o significantly outperformed other models in the paper's evaluation.

Reference

ALIGNAgent: Adaptive Learner Intelligence for Gap Identification and Next-step guidance -- Focus on Section 3 (Framework Architecture), Algorithm 1 (Skill Gap identification), Algorithm 2 (Resource Recommendation), and Tables 1-2 (evaluation results showing 0.87-0.90 precision with GPT-4o agents).

alignagent-adaptive-learner-intelligence

Más de este repositorio

Más de este repositorio

ALIGNAgent: Adaptive Learner Intelligence for Gap Identification and Next-step Guidance

When to Use

Key Technique

Step-by-Step Workflow

Concrete Examples

Best Practices

Error Handling

Limitations

Reference

ALIGNAgent: Adaptive Learner Intelligence for Gap Identification and Next-step Guidance

When to Use

Key Technique

Step-by-Step Workflow

Concrete Examples

Best Practices

Error Handling

Limitations

Reference