This guide covers how to make DAAF your own: onboarding new datasets, authoring custom skills and agents, customizing your Python environment, and even creating entirely new workflow modes. Everything here is for your own use -- but if you'd like to share what you build with the broader DAAF community, see Sharing Extensions with the Community below.
Guided framework modification: For any of the extension tasks below, you can use DAAF's Framework Development mode -- just tell DAAF you want to create or modify a skill, agent, mode, template, or workflow, and it will scope the work, follow canonical templates, execute integration checklists, and run a multi-angle review pass to ensure consistency. You can even design entirely new engagement modes from scratch. Framework Development mode is especially useful for complex changes that touch multiple files.
The Extension Model -- Skills, Agents, and Data Onboarding
Here's the fundamental insight behind DAAF's extensibility: the framework is intended to separate what it knows from how it behaves. This is a really important distinction that makes the whole extension model work.
- Skills are structured knowledge documents. They tell DAAF's agents what they need to know about a specific topic -- a data source, a Python library, a visualization framework, a domain of expertise. Think of skills as extremely thorough, well-organized reference guides that an agent loads into its context when it needs specialized knowledge, and that can be easily shared or transferred across multiple agents.
- Agents are behavioral protocols. They tell a subagent how to behave -- what steps to follow, what to validate, when to stop, how to format output. Think of agents as detailed job descriptions that define a specific role in the pipeline (the code reviewer, the data planner, the report writer, etc.).
This separation is what makes DAAF extensible without being fragile. When you want DAAF to work with a new dataset, you generally don't need to touch the workflow, the validation logic, or the agent protocols at all. You just add a new skill that teaches the existing agents about the new data. The agents already know how to fetch, clean, transform, and analyze data -- they just need to be told the specifics of your data.
The Three Extension Paths
| Extension Type | What You're Adding | Tool to Use | Result |
|---|---|---|---|
| Data source | Knowledge about a specific dataset | Data Onboarding Mode | A new data-source-skill |
| Methodology | Knowledge about a statistical or analytical method | Framework Development Mode | A new methodology-skill |
| Domain expertise | Knowledge about a content area or field | Framework Development Mode | A new context-skill |
The most common extension path by far is adding new data sources. DAAF has a dedicated engagement mode for this: Data Onboarding Mode, which orchestrates a thorough profiling protocol and generates the skill documentation automatically. For methodology and domain expertise skills, the process is lighter-weight -- you point DAAF at documentation or literature and it drafts a skill for review.
Data Onboarding Mode (Step-by-Step)
Data Onboarding Mode is DAAF's built-in workflow for turning a raw dataset (or online data source) into a comprehensive data source skill. It automates the tedious but critical work of profiling every column, detecting coded values, checking data quality, and reconciling documentation against actual data. A 10-minute video tutorial ↗ gives you the intuition and overview for how Data Onboarding works.
Before You Start
- A data file or API access -- either a file in a supported format (Parquet, CSV, Excel, or TSV) or access to an API that serves the data. Public data sources are strongly preferred. If working with proprietary or sensitive data, be careful to abide by your organization's AI policy and data protection standards -- Claude will be examining the actual contents of the data, and analytical outputs (sample rows, statistics, summaries) will be sent to Anthropic's servers as part of the conversation. For data protected by FERPA, HIPAA, or other regulations, consult your institution's compliance team and review your specific Anthropic license terms before proceeding.
- Any available documentation -- codebooks, data dictionaries, README files, or documentation website URLs. Not strictly required, but they dramatically improve the resulting skill because the agent can cross-reference what the documentation says against what the data actually shows.
- A sense of how the data will be used -- what research questions it might inform, what domain it belongs to, and which columns are most important.
Onboarding Data from an API
If your data source is available via a REST API rather than a downloadable file, DAAF can handle the acquisition during Data Onboarding. You'll need:
- API documentation -- a URL to the API docs, or a description of how it works. DAAF will research the API, but documentation dramatically improves the resulting fetch scripts.
- An API key -- add your key to the environment_settings.txt file in your daaf-docker folder on the host machine.
- A sense of what to download -- which endpoint, what filters (date range, geography, etc.), and roughly how much data to expect.
DAAF will research the API, write a fetch script for your approval, download the data, and proceed with the standard profiling workflow. The fetch script is saved as a reproducible artifact.
Local vs. live access: During setup, DAAF will ask whether to download data once and work with a local copy (simpler, works offline) or always query the API live (keeps data current). You can change this preference later.
Complex APIs: If the API offers many endpoints and datasets (like Harvard Dataverse, or a large government open data platform), DAAF will suggest whether the API access documentation should live inside the data source skill (simpler, fine for most cases) or in a separate query/connector skill (better if you plan to onboard multiple datasets from the same API over time). This is always your call.
OAuth-protected APIs: DAAF handles simple API key authentication natively. If your API requires OAuth, you'll need to obtain a bearer/access token manually first and provide it as an environment variable.
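To make the two authentication styles above concrete, here is a minimal sketch of reading a credential from an environment variable and turning it into request headers. The variable names (MY_API_TOKEN, MY_API_KEY), the X-API-Key header, and the function name are all hypothetical; the header your API expects comes from its own documentation, and this is not the fetch script DAAF generates.

```python
import os


def build_auth_headers(env=None):
    """Return HTTP headers for a bearer token or a simple API key.

    Hypothetical names: MY_API_TOKEN (a bearer token you obtained manually)
    and MY_API_KEY (a simple key). Check your API's docs for the real header.
    """
    env = os.environ if env is None else env
    token = env.get("MY_API_TOKEN")
    if token:
        # OAuth-style APIs usually expect the token in an Authorization header.
        return {"Authorization": f"Bearer {token}"}
    key = env.get("MY_API_KEY")
    if key:
        # Assumed header name; many APIs use a query parameter instead.
        return {"X-API-Key": key}
    raise RuntimeError("No credential found; set MY_API_TOKEN or MY_API_KEY")
```

Passing a plain dict as env makes the function easy to try out without touching your real environment.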
Onboarding Multiple Related Files
If your data source comes as multiple related files -- for example, one file per year, or a set of files at different levels of aggregation (schools, districts, states) -- DAAF can profile them all together.
- Same structure or different? Files with the same columns (one per year, one per state) are combined into one dataset for profiling. Files with different structures (schools vs. districts) are profiled separately with cross-file relationship testing.
- One skill or many? By default, DAAF creates one unified skill covering all files. You can also opt for one skill per entity type for more granular documentation.
Provide all related files at intake rather than profiling them one at a time -- this lets DAAF test the relationships between files (join coverage, key integrity, temporal alignment) and document those relationships in the resulting skill.
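If "join coverage" and "key integrity" are unfamiliar terms, the sketch below shows the general idea with plain Python lists. The function names and the district_id example are hypothetical, and DAAF's own relationship tests may be more thorough than this.

```python
def join_coverage(child_keys, parent_keys):
    """Share of child rows whose key appears in the parent file."""
    parent = set(parent_keys)
    child = list(child_keys)
    if not child:
        return 0.0
    matched = sum(1 for key in child if key in parent)
    return matched / len(child)


def duplicate_keys(keys):
    """Keys that occur more than once (a key-integrity problem in a parent file)."""
    seen, dupes = set(), set()
    for key in keys:
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)


# Hypothetical example: schools reference districts by district_id.
district_ids = ["D1", "D2", "D3"]
school_district_ids = ["D1", "D1", "D2", "D9"]  # D9 has no matching district

print(join_coverage(school_district_ids, district_ids))  # 0.75
print(duplicate_keys(district_ids))  # []
```

A coverage well below 1.0, or any duplicates in what should be a unique key, is the kind of finding worth documenting in the resulting skill.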
Preparing Your Data
Place your data file anywhere accessible inside /daaf/ -- the ingest process will copy it into the research project's data/raw/ folder during setup. A common convention is /daaf/data/.
Getting files into the Docker volume: The easiest way is with the browser-based code editor -- run bash run_vscode.sh (or .\run_vscode.ps1 on Windows) from your daaf-docker folder, then drag and drop files. You can also use Docker Desktop's GUI (Containers → Files tab → right-click → Import).
- File size: Up to about 1GB without special handling. For larger files, DAAF will ask about a sampling strategy.
- File format: Parquet is ideal (fast, preserves types). CSV works fine but may have type inference quirks. Excel files work using openpyxl, included with DAAF.
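For a sense of what a sampling strategy for an oversized file can look like, here is a minimal sketch of reservoir sampling, which draws a uniform random sample in a single pass without loading the whole file. This is only one possible strategy, chosen for illustration; the function names are hypothetical and DAAF may propose something different for your data.

```python
import csv
import random


def reservoir_sample(rows, k, seed=0):
    """Keep a uniform random sample of k items from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def sample_csv(path, k, seed=0):
    """Sample k rows (as dicts) from a CSV file in one streaming pass."""
    with open(path, newline="", encoding="utf-8") as f:
        return reservoir_sample(csv.DictReader(f), k, seed)
```

Fixing the seed keeps the sample reproducible, which matters if you want the profile to be repeatable later.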
Where Your New Skill Will Fit In
When DAAF works with any data source, the orchestrator dispatches a subagent that loads the relevant skill into its context. The skill tells the subagent everything it needs: what variables exist, what coded values mean, known pitfalls, and how to access the data. The subagent uses that knowledge to do its job.
When you add a new data source skill, you're primarily adding knowledge at two points:
- Exploration (Stage 2-3): Your skill tells agents what data is available, what variables exist, and what caveats to watch for
- Context application (Stage 6): Your skill tells agents how to handle coded values, missing data patterns, and source-specific quirks during cleaning
Running Data Onboarding Mode
Just ask DAAF to ingest or profile a new dataset conversationally -- it will classify the request as Data Onboarding Mode automatically.
The Four Profiling Phases
- Structural Discovery -- Basic shape of the data (rows, columns, memory footprint, column types) and initial column-level profiling. This gives the agent a bird's-eye view including null rates, unique value counts, and basic distributions. It identifies which columns uniquely identify each row and which could be used to link to other datasets.
- Statistical Deep Dive -- Full distribution analysis for numeric columns, category enumeration for categorical columns, temporal pattern analysis, outlier detection. If the data has date/year columns or geographic identifiers, this phase also analyzes temporal coverage gaps and entity coverage.
- Relational Analysis -- Identifying potential key columns (high uniqueness suggests an identifier), foreign keys (naming patterns like _id suffixes), hierarchical relationships, cross-column dependencies, and detection of coded values -- those suspicious negative numbers like -1, -2, -9 that often mean "missing" or "suppressed" rather than being real values.
- Interpretation & Reconciliation -- The agent uses column names, value patterns, and domain conventions to make educated guesses about what each column means. Every interpretation is explicitly marked as [PRELIMINARY] -- the agent knows it's hypothesizing, not asserting. Column named fips? Probably a FIPS geographic code. Column with values 0 and 1? Probably a binary indicator, but is 1 "Yes" or "Male" or "Urban"? The agent will flag the ambiguity.
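The column-level checks described in these phases can be sketched in a few lines. The example below computes a null rate, a distinct-value count, and a list of suspected coded values for one column. The candidate set {-1, -2, -9} is taken from the examples above and is an assumption, not DAAF's actual detection rule; the function name is hypothetical.

```python
from collections import Counter

# Assumption: a small set of negative sentinels, based on the examples above.
SENTINEL_CANDIDATES = {-1, -2, -9}


def profile_column(values):
    """Null rate, distinct count, and suspected coded values for one column."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    suspected = sorted(
        v for v in counts
        if isinstance(v, (int, float)) and v in SENTINEL_CANDIDATES
    )
    return {
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "n_unique": len(counts),
        "suspected_codes": suspected,
    }


enrollment = [120, 340, -9, None, 87, -1, 340, -9]
print(profile_column(enrollment))
# {'null_rate': 0.125, 'n_unique': 5, 'suspected_codes': [-9, -1]}
```

Note that a flagged value is only a suspicion: whether -9 means "suppressed" or is a legitimate measurement is exactly the kind of question the review step asks you to settle.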
If documentation was provided, the profiling also runs Documentation Reconciliation: it parses the codebook or data dictionary, extracts every claim it can find, and verifies each claim against the actual data. Documentation says there are 50 columns? The agent checks. Codebook says state_code should be a string? The agent confirms or flags the mismatch. This catches the disturbingly common case where documentation is outdated or describes a different version of the data.
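As a rough picture of what reconciliation involves, the sketch below compares documented column types against observed ones and lists every disagreement. The dictionaries, type labels, and function name are hypothetical, and real codebooks make many more kinds of claims (value ranges, allowed categories, row counts) than this covers.

```python
def reconcile(documented, observed):
    """Compare documented column types with observed ones and list discrepancies."""
    issues = []
    for col, doc_type in documented.items():
        if col not in observed:
            issues.append(f"{col}: documented but missing from data")
        elif observed[col] != doc_type:
            issues.append(f"{col}: documented as {doc_type}, observed {observed[col]}")
    for col in observed:
        if col not in documented:
            issues.append(f"{col}: present in data but undocumented")
    return issues


# Hypothetical codebook claims vs. what the file actually contains.
documented = {"state_code": "str", "year": "int", "enrollment": "int"}
observed = {"state_code": "int", "year": "int", "school_id": "str"}

for issue in reconcile(documented, observed):
    print(issue)
# state_code: documented as str, observed int
# enrollment: documented but missing from data
# school_id: present in data but undocumented
```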
Reviewing the Profile Output
The agent returns a structured report with:
- Structural summary -- row/column counts, memory size, format
- Column summary -- type, null rate, unique count, and notes for every column
- Coded values detected -- which columns have potential coded values, and whether documentation confirms their meaning
- Quality assessment -- scores for completeness, documentation accuracy, and coded value coverage
- Preliminary interpretations -- the agent's best guesses for what columns mean, each flagged with a confidence level and basis
- Discrepancies found -- every case where documentation contradicted observed data, with evidence for both sides
- User review requested -- explicit questions for you to answer: which interpretations are correct, how to handle undocumented values, whether missing columns are expected
This review step is not optional. The whole point of marking interpretations as [PRELIMINARY] is that you need to confirm or correct them. The agent has done the mechanical work of profiling, but the semantic understanding -- what these columns actually mean in context -- requires your domain expertise. Take the time to go through the review questions carefully. Your answers directly determine the quality of the resulting skill.
Once you've provided feedback, the agent finalizes the skill and writes it to .claude/skills/[skill-name]/. From there, you can start a fresh session and ask DAAF to analyze it alongside other datasets. Running it through some simple paces first (see Testing) is strongly recommended.
Registering Your New Skill
Skills are automatically discovered via their YAML frontmatter -- every skill with a SKILL.md file in .claude/skills/{skill-name}/ appears in the system message at conversation start. No manual registration is needed. The key to good discoverability is writing a clear, descriptive description field in your skill's YAML frontmatter.
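If you want to see at a glance which skills have a usable description, a small script can list them. The sketch below is a local sanity check, not DAAF's discovery mechanism: it assumes the frontmatter is delimited by --- lines and uses simple one-line key: value pairs, and the function names are hypothetical.

```python
from pathlib import Path


def read_frontmatter(text):
    """Parse simple `key: value` pairs between the leading '---' lines.

    A simplification: multi-line values and quoting need a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def list_skills(skills_dir):
    """Map each skill folder name to its frontmatter description (or None)."""
    found = {}
    for skill_md in sorted(Path(skills_dir).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md.read_text(encoding="utf-8"))
        found[skill_md.parent.name] = meta.get("description")
    return found
```

Calling list_skills(".claude/skills") from the project root would then show any skill whose description is missing as None.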
Authoring Other Types of New Skills
Methodology Skills
To add knowledge about a statistical method, Python library, or analytical technique, use Framework Development Mode and mention the skill-authoring skill directly. It helps to point DAAF to the existing skills most similar to yours: for a Python library skill, reference the polars or plotnine skills as models; for something more methodological, reference the data-scientist skill.
Domain Expertise Skills
Domain expertise skills capture knowledge about a content area rather than tooling -- the nuances of interpreting graduation rate data, the policy context around school funding formulas, or the methodological considerations for panel data in education research.
What the skill-authoring skill guides you through
- Frontmatter requirements -- the YAML header every skill needs, including naming conventions (lowercase-hyphenated, 1-64 chars) and description best practices
- Body structure patterns -- different organizing patterns depending on whether the skill is workflow-based (sequential steps), task-based (tool collection), reference-based (standards/specs), or capabilities-based (features)
- Progressive disclosure -- how to keep the main SKILL.md under 500 lines by splitting detailed content into references/ files that load on demand
- Decision trees -- how to write effective navigation trees that help agents find what they need quickly
- Content limits -- SKILL.md body should stay under 500 lines and 5,000 words. Reference files have different economics: they load on-demand, so thoroughness is preferred over brevity
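The naming and size limits above are easy to check mechanically before you ask for a review. The sketch below is one way to do that. The regular expression is an assumed reading of "lowercase-hyphenated" (lowercase letters and digits separated by single hyphens), and the function name is hypothetical; the skill-authoring skill remains the authority on the exact rules.

```python
import re

# Assumed interpretation of "lowercase-hyphenated": a-z and 0-9, single hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_skill(name, body):
    """Return a list of limit violations for a skill name and SKILL.md body."""
    problems = []
    if not (1 <= len(name) <= 64) or not NAME_PATTERN.match(name):
        problems.append("name must be lowercase-hyphenated, 1-64 chars")
    if len(body.splitlines()) >= 500:
        problems.append("body should stay under 500 lines")
    if len(body.split()) >= 5000:
        problems.append("body should stay under 5,000 words")
    return problems


print(check_skill("my-data-source", "A short skill body."))  # []
print(check_skill("My_Skill", "A short skill body."))
# ['name must be lowercase-hyphenated, 1-64 chars']
```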
Adding a New Agent
Adding a new agent is an advanced task. Agents are deeply wired into the DAAF ecosystem -- they have producer/consumer relationships with other agents, they reference shared protocols, and they need to be discoverable by the orchestrator. If you're not comfortable with software architecture concepts, you can safely skip this section -- most customization needs are covered by data sources and skills above.
Adding a new agent is more complex than adding a skill, because agents define behavior rather than knowledge. Framework Development Mode with the agent-authoring skill guides the process through five phases:
- Design -- Get crystal clear on fundamentals before writing anything. The agent-authoring skill ensures you can answer five critical questions:
  - What does this agent do and why does it exist? (one sentence)
  - Which pipeline stage(s) does it operate in?
  - Which existing agents are most similar, and how does yours differ?
  - Does it need file-write access (general-purpose) or is it read-only (Plan)?
  - Will it need to invoke any skills?
- Author -- Write the agent definition following the canonical 12-section template (defined in agent_reference/AGENT_TEMPLATE.md). Required sections: Identity, Inputs, Core Behaviors, Protocol, Output Format, Boundaries, STOP Conditions, Anti-Patterns, Quality Standards, Invocation, References, and Consumers. The skill provides a self-validation checklist and targets 400-700 lines.
- Integrate -- Register the agent across the DAAF ecosystem. This is where the most things can go wrong. Integration follows three tiers:
  - Tier 1 (Mandatory): Register in .claude/agents/README.md (Agent Index, When to Use, Coordination Matrix, and Agent catalog)
  - Tier 2 (Conditional): Additional updates if the agent maps to a specific pipeline stage
  - Tier 3 (Conditional): Additional updates if the agent affects specific workflow areas
- Validate -- Verification checks to confirm cross-agent consistency and completeness.
- Human Review -- This is non-negotiable. Review the agent file yourself for accuracy, intention, completeness, and value before it's considered done.
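Before the human review, a quick structural check can catch a forgotten section. The sketch below looks for the twelve required section names listed above. It assumes each section appears as a markdown heading whose text matches the name exactly; the real template may number or word its headings differently, so treat this as a rough pre-check rather than the skill's own self-validation checklist.

```python
REQUIRED_SECTIONS = [
    "Identity", "Inputs", "Core Behaviors", "Protocol", "Output Format",
    "Boundaries", "STOP Conditions", "Anti-Patterns", "Quality Standards",
    "Invocation", "References", "Consumers",
]


def missing_sections(agent_markdown):
    """Return required section names that have no matching markdown heading."""
    headings = set()
    for line in agent_markdown.splitlines():
        if line.startswith("#"):
            # Strip the leading '#' characters and compare case-insensitively.
            headings.add(line.lstrip("#").strip().lower())
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


draft = "# Identity\n\nReviews fetch scripts.\n\n## Inputs\n\nA script path.\n"
print(missing_sections(draft))
```

For the draft above, the function reports the ten sections that still need to be written, starting with Core Behaviors.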
Key Resources
| Resource | Purpose |
|---|---|
| agent-authoring skill | Full workflow with integration checklist |
| agent_reference/AGENT_TEMPLATE.md | Canonical 12-section template |
| .claude/agents/README.md | Current agent landscape, commonly confused pairs, coordination matrix |
Agent definitions are stored in .claude/agents/. See the existing agents on GitHub ↗ for reference. For changes to existing agents, see the Contributing guide.
Testing Your New Extension End-to-End
Use the five test types below to verify that your extension works correctly. They are ordered from lightest to heaviest:
- Data Discovery Test -- Can DAAF find your new skill and understand what it's for?
  Prompt: "What data sources does DAAF know about? Can you tell me about [your new data source]?"
  If DAAF can't find the skill, verify that the YAML frontmatter has a clear description field and that SKILL.md is in .claude/skills/{skill-name}/. Note: this may also be a non-deterministic loading issue -- LLMs don't always load what they're told to load. Try the query again in a fresh session before assuming the skill content is wrong.
- Fetch Test -- Test that DAAF can actually retrieve and load the data.
  Prompt: "Can you fetch [your data source] for [year] and show me the first few rows and basic summary statistics?"
  This tests the data access pathway. The fetch should complete with a CP1 validation (shape, types, missingness checks). If CP1 fails, the dataset path in your skill may not match what's actually available.
- Context Test -- Test whether coded value mappings, missing data codes, and caveats are applied correctly during cleaning.
  Prompt: "Can you fetch and clean [your data source] for [year], making sure to handle any coded missing values and apply the source-specific caveats documented in the skill?"
  Watch the cleaning script that DAAF produces. It should reference the specific coded values and pitfalls documented in your skill.
- Full Pipeline Test -- The gold standard: run a simple research question through the entire pipeline.
  Prompt: "Using [your new data source], can you analyze [simple, well-defined research question]? Keep the scope narrow -- just verify the data flows through correctly."
  Pick a deliberately simple question. You're testing integration, not analytical sophistication.
- Methodology/Domain Skill Test -- For non-data-source skills, test that DAAF references your skill's guidance correctly.
  Prompt: "I'd like to run a [method from your new skill] analysis on [some existing DAAF data]. Can you walk me through the approach?"
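When you inspect the cleaning script during the Context Test, you are looking for logic along the lines of the sketch below: documented codes are turned into proper missing values, and the reason is kept. The mapping here is hypothetical (your skill defines the real codes and their meanings), and this is not the script DAAF would generate.

```python
from collections import Counter

# Hypothetical mapping that a skill might document for one column.
MISSING_CODES = {-1: "missing", -2: "not applicable", -9: "suppressed"}


def recode_missing(values, codes=MISSING_CODES):
    """Replace documented missing-data codes with None and tally the reasons."""
    cleaned, reasons = [], Counter()
    for value in values:
        if value in codes:
            cleaned.append(None)
            reasons[codes[value]] += 1
        else:
            cleaned.append(value)
    return cleaned, dict(reasons)


cleaned, reasons = recode_missing([120, -9, 87, -1, -9])
print(cleaned)   # [120, None, 87, None, None]
print(reasons)   # {'suppressed': 2, 'missing': 1}
```

If the generated script instead treats -9 as a real number (for example, by averaging it), the skill's coded-value documentation probably isn't being picked up.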
Sharing Extensions with the Community (Optional)
Everything above works entirely locally -- your extensions are yours, and they don't need to go anywhere else. But if you've created something useful and want to share it with other DAAF users, this section covers how.
Before You Submit
- Your extension passes at least a Data Discovery Test and a Fetch Test
- You've thoroughly reviewed Data Onboarding output and corrected any preliminary interpretations -- skills with [PRELIMINARY] markers still in place aren't ready for sharing
- The skill follows the appropriate template, has at least 2 decision trees, and includes a substantive Common Pitfalls section
- The skill references only publicly accessible data -- if it was built from proprietary data, make sure it doesn't leak confidential information
- The extension follows DAAF's naming conventions
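One item on this checklist is easy to verify automatically: leftover [PRELIMINARY] markers. The sketch below scans every markdown file in a skill folder and reports where the marker still appears. The function name is hypothetical, and it assumes skill content lives in .md files.

```python
from pathlib import Path


def find_preliminary_markers(skill_dir, marker="[PRELIMINARY]"):
    """Return (relative path, line number) for every line still containing the marker."""
    hits = []
    root = Path(skill_dir)
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

An empty list from find_preliminary_markers(".claude/skills/[skill-name]") means no markers remain; anything else tells you which lines to revisit.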
How to Submit
To share your extension with the DAAF community: fork the repository, add your skill files, update the registration entries, and submit a pull request. See the Contributing guide ↗ for the full workflow, quality standards, and review process.
If you're not comfortable with the pull request process, you can also open an issue ↗ describing your new skill and sharing the files, or post in the DAAF community channels ↗ -- the community can help get it integrated.
LEARNINGS.md: The Other Way to Contribute
Even without creating new skills, there's a contribution path that requires almost zero effort: sharing your LEARNINGS.md files. Every time DAAF completes a Full Pipeline project, it produces a LEARNINGS.md documenting everything it learned about data quirks, process issues, and methodology edge cases. These learnings are written to be immediately actionable.
You can also incorporate learnings directly into your own DAAF instance: start a new session and say "incorporate learnings" -- Framework Development mode will scan your project LEARNINGS.md files, present a consolidated backlog of framework improvements, and walk you through implementing them.
To share learnings with the broader community, open an issue ↗ with your LEARNINGS.md content. This is genuinely one of the most impactful things you can do -- every project run generates practical knowledge that benefits every future project.
Customizing Your Python Environment
DAAF ships with 50+ data science packages pre-installed (Polars, Plotnine, Plotly, scikit-learn, and more). Most users will never need to add packages. This section is for users who need a specific Python library that isn't already included.
The Recommended Path: Modify the Dockerfile
The best way to add packages is to ask DAAF to edit the Dockerfile and then rebuild the container. This is a multi-step process with one step that's easy to miss, so this section walks through it carefully.
Why this approach: packages are reproducible (anyone building from the Dockerfile gets the same environment), persistent (they survive container rebuilds and restarts), and permission-safe (Dockerfile installs run as root during build).
The Container-Host Boundary: There are two copies of the Dockerfile:
| Copy | Location | Who edits it | Who reads it |
|---|---|---|---|
| Volume copy | /daaf/Dockerfile inside the container | DAAF / Claude Code during sessions | Nothing -- working copy |
| Host copy | daaf-docker/ on your computer | You, via the rebuild script | docker compose up -d --build |
When DAAF edits the Dockerfile, it edits the volume copy. You need to copy it back to the host before rebuilding.
Step-by-Step:
- Ask DAAF to edit the Dockerfile. Inside your session, ask Claude to add your package (e.g., "I'd like to add networkx==3.4.2 to the Dockerfile"). DAAF will pick the appropriate install block and pause for your approval.
- Exit and rebuild. After DAAF finishes, follow this sequence:

  ```bash
  # Inside Claude Code
  /exit

  # Back in the container shell (appuser@xxxx:/daaf$)
  exit

  # From your host terminal, in your daaf-docker folder:
  bash rebuild_daaf.sh      # macOS / Linux
  .\rebuild_daaf.ps1        # Windows
  ```

  The rebuild script handles the tricky part automatically: it copies the updated Dockerfile from inside the container back to your host build directory, then rebuilds the Docker image.

  Why is this step needed? The Dockerfile lives in two places -- inside the Docker volume (where DAAF edited it) and in your daaf-docker/ folder (where docker compose reads it for builds). The rebuild script bridges this gap so the two copies stay in sync.
- Re-enter and verify. After the rebuild completes, re-enter the container and confirm the package is installed:

  ```bash
  # From your host terminal, in your daaf-docker folder:
  bash run_daaf.sh

  # Inside the container:
  pip list | grep networkx
  ```
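If you'd rather confirm the result from Python than from the shell, the standard library can report the installed version directly. The sketch below uses importlib.metadata; the helper names are hypothetical, and networkx==3.4.2 is just the example pin from the steps above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """True only if the package is installed at exactly the pinned version."""
    return installed_version(package) == pinned


# Prints the version if the rebuild picked up the package, otherwise None.
print(installed_version("networkx"))
print(matches_pin("networkx", "3.4.2"))
```

A None here after a rebuild usually means the image was built from the old host copy of the Dockerfile rather than the edited one.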
Adding System-Level Dependencies
Some Python packages require system-level libraries (e.g., libgdal for geospatial work). Add these to the apt-get install section in the Dockerfile, before the Python packages.
Common scenarios:
- I need a specific version of an already-installed package -- Change the version pin in the Dockerfile (e.g., polars==1.30.0 to polars==1.31.0). Be cautious: version changes can affect downstream dependencies.
- A package requires compilation and it's failing -- It likely needs system-level build dependencies. Check the package's installation docs for required system libraries and add them to the apt-get install block.
- Can I use apt-get or sudo inside the running container? -- No. System-level changes must go through the Dockerfile. This is a deliberate security constraint.
Dockerfile syntax tips:
- Every line in a RUN block except the last ends with a backslash (\)
- Versions are pinned with == (e.g., networkx==3.4.2) for reproducibility
- If you're unsure which version to pin, DAAF can look up the latest on PyPI
Runtime Installation for Quick Testing
For quick testing, you can install packages at runtime inside the container using uv pip install --user [package]. The --user flag is needed because the container runs as a non-root user. These installations are ephemeral -- they will be lost when the container is rebuilt or restarted. Think of runtime installs as a test drive:
- Install at runtime to test: uv pip install --user <package>
- Verify it works for your use case
- Add it to the Dockerfile and rebuild to make it permanent
Understanding the uv Package Manager
DAAF uses uv instead of pip for package management. uv is a fast, Rust-based Python package installer. Use uv pip install anywhere you'd normally use pip install.
The Dockerfile uses uv pip install --system (installs system-wide during build, when running as root). At runtime, since you're a non-root user, use uv pip install --user instead. Regular pip also works -- pip install --user <package> is equally valid.
You can ask DAAF to run a uv pip compile dry-run to test compatibility between all package versions before committing to the rebuild, especially for packages with many transitive dependencies.
Checking What's Already Installed
Run uv pip list inside the container to see every installed Python package and its version.
Notes on R, Stata, and Container Permissions
- Using R or Stata? DAAF is Python-based and does not include R. However, DAAF includes translation skills (r-python-translation and stata-python-translation) that can help you find Python equivalents for R or Stata operations you're familiar with.
- Can you use sudo inside the container? No. The container runs as a non-root user with all Linux capabilities dropped and privilege escalation blocked. All system-level software must be installed through the Dockerfile and built into the image.