GitHub Agentic Workflows is a Go-based GitHub CLI extension for writing agentic workflows in natural language using markdown files, running them as GitHub Actions.
gh-aw is a GitHub CLI extension (`gh aw`) that compiles markdown workflows into GitHub Actions. It is not the GitHub Copilot CLI (the `copilot` command). While workflows can use the Copilot CLI as an AI engine, gh-aw itself is a separate tool for workflow management and compilation.
- Use `gh aw` commands (e.g., `gh aw compile`, `gh aw run`) to work with agentic workflows
- Use `/agent` in GitHub Copilot Chat to invoke the unified `agentic-workflows` custom agent (specify your intent: create/debug/update/upgrade)
- The `copilot` CLI command is only used internally within workflows when specified as the engine
BE LAZY: Skills in skills/ provide detailed, specialized knowledge about specific topics. Only reference skills when you actually need their specialized knowledge. Do not load or reference skills preemptively.
When to use skills:
- You encounter a specific technical challenge that requires specialized knowledge
- You need detailed guidance on a particular aspect of the codebase (e.g., console rendering, error messages)
- You're working with a specific technology integration (e.g., GitHub MCP server, Copilot CLI)
When NOT to use skills:
- For general coding tasks that don't require specialized knowledge
- When the information is already available in this AGENTS.md file
- For simple, straightforward changes
Available Skills Directory: skills/
Each skill provides focused guidance on specific topics. Reference them only as needed rather than loading everything upfront.
🚨 BEFORE EVERY COMMIT - NO EXCEPTIONS:

```shell
make agent-finish  # Runs build, test, recompile, fmt, lint
```

Why this matters:

- CI WILL FAIL if you skip this step - this is automatic and non-negotiable
- Unformatted code causes immediate CI failures that block all other work
- This has caused 5 CI failures in a single day - don't be the 6th!
- The formatting check (`go fmt`) is strict and cannot be disabled

If you're in a hurry and `make agent-finish` takes too long, at minimum run:

```shell
make fmt        # Format Go, JavaScript, and JSON files
make test-unit  # Fast unit tests (~25s)
```

After making Go code changes (*.go files):

```shell
make fmt  # REQUIRED - formats Go code with go fmt
```

After making workflow changes (*.md files):

```shell
make recompile  # REQUIRED - recompile all workflow files after code changes
```

After making JavaScript changes (*.cjs files):

```shell
make fmt-cjs  # REQUIRED - ensures JavaScript is properly formatted
```

NEVER ADD LOCK FILES TO .GITIGNORE - `.lock.yml` files are compiled workflows that must be tracked.
ALWAYS REBUILD AFTER SCHEMA CHANGES:

```shell
make build  # Rebuild gh-aw after modifying JSON schemas in pkg/parser/schemas/
```

Schema files are embedded in the binary using `//go:embed` directives, so changes require rebuilding the binary.
ALWAYS ADD BUILD TAGS TO TEST FILES:
Every test file (*_test.go) must have a build tag at the very top of the file:

```go
// For unit tests (default):
//go:build !integration

// For integration tests (files with "integration" in the name):
//go:build integration
```

Rules:

- Files with "integration" in the filename get `//go:build integration`
- All other test files get `//go:build !integration`
- The build tag must be the first line of the file, followed by an empty line

To add build tags to all test files:

```shell
./scripts/add-build-tags.sh
```

ALWAYS RUN LINTERS AFTER ADDING TEST FILES:
When adding new test files (*_test.go), the unused linter may catch helper functions that are defined but never called. Always run linters after creating test files to catch these issues early.
```shell
make lint  # Catches unused, testifylint, misspell, unconvert issues
```

Common linting issues in test files:

- `unused`: Helper functions defined but never called
  - ❌ BAD: Defining `func hasInternalPrefix(key string) bool { ... }` but never using it
  - ✅ GOOD: Either use the function in tests or remove it
- `testifylint`: Assertion best practices
  - Always provide descriptive assertion messages
  - Use `require.*` for setup assertions that must pass
  - Use `assert.*` for test validations
  - Use `assert.Error(t, err, "msg")` not `assert.NotNil(t, err)`
  - Use `assert.NoError(t, err, "msg")` not `assert.Nil(t, err)`

Before committing test files:

```shell
make agent-finish  # REQUIRED - Full validation including lint
```

Example of correct test code:
```go
// ✅ CORRECT - Using helper functions
func TestCompile(t *testing.T) {
    compiler := NewCompiler()
    require.NotNil(t, compiler, "Compiler should be created")

    // Use helper function if defined
    err := compiler.Compile("test.md")
    assert.NoError(t, err, "Should compile valid workflow")
}
```

ALWAYS USE GITHUB MCP FOR GITHUB API ACCESS WITH COPILOT ENGINE:
The Copilot coding agent cannot directly access api.github.com. When using the copilot engine, you must configure the GitHub MCP server to access GitHub information (repositories, issues, pull requests, etc.).
CORRECT - Using GitHub MCP:
engine: copilot
tools:
github:
mode: remote # or "local" for Docker-based
toolsets: [default] # Enables repos, issues, pull_requests, etc.INCORRECT - Trying to access api.github.com directly:
engine: copilot
network:
allowed:
- "api.github.com" # ❌ This will NOT work - Copilot cannot access api.github.comKey points:
- The GitHub MCP server provides all necessary GitHub API functionality
- Use
toolsets: [default]for common operations, or specify toolsets like[repos, issues, pull_requests] - Both
mode: remote(hosted) andmode: local(Docker) work with Copilot - Never rely on direct
api.github.comaccess in Copilot workflows
See GitHub MCP Server Documentation for complete configuration details.
When instructed to "merge main", follow these steps WITHOUT asking for confirmation:

1. Start the merge from origin/main:

   ```shell
   git fetch origin main
   git merge origin/main
   ```

2. Fix merge conflicts in .go and .cjs files:

   - Manually resolve conflicts in Go files (`.go`)
   - Manually resolve conflicts in CommonJS files (`.cjs`)
   - Stage resolved files with `git add <file>`

3. Build and recompile lock files:

   ```shell
   make build      # Rebuild the binary
   make recompile  # Recompile all workflow lock files
   ```

4. Finish the merge:

   ```shell
   git commit  # Complete the merge (accept default merge message)
   ```

Important: Do NOT ask for confirmation when performing a merge - execute all steps automatically.
```shell
# Fresh clone setup
make deps      # ~1.5min first run
make deps-dev  # +5-8min for linter
make build     # ~1.5s
./gh-aw --help
```

JavaScript and shell script files are NOT embedded in the binary.
The architecture uses runtime file copying instead of embedded scripts:
Source of truth: `actions/setup/js/*.cjs` (manually edited, committed to git)

When modifying JavaScript files:

- Edit files in `actions/setup/js/` (source of truth)
- Test files (`*.test.cjs`) are co-located with source code in `actions/setup/js/`
- Run `make fmt-cjs` to format JavaScript files
- Run `make lint-cjs` to validate JavaScript files
- Files are used directly at runtime (no sync or embedding required)

Runtime flow:

- The `actions/setup` action copies files from `actions/setup/js/` to `/tmp/gh-aw/actions` at runtime
- Workflow jobs use these runtime files via `require()` statements
- No embedding via `//go:embed` - files are accessed directly from the actions directory

Source of truth: `actions/setup/sh/*.sh` (manually edited, committed to git)

When modifying shell scripts:

- Edit files in `actions/setup/sh/` (source of truth)
- Files are used directly at runtime (no sync or embedding required)

Runtime flow:

- The `actions/setup` action copies files from `actions/setup/sh/` to `/tmp/gh-aw/actions` at runtime
- Workflow jobs execute these shell scripts directly from `/tmp/gh-aw/actions`
- No embedding via `//go:embed` - files are accessed directly from the actions directory

Key points:

- `actions/setup/js/*.cjs` and `actions/setup/sh/*.sh` = Source of truth (manually edited, committed)
- `pkg/workflow/js/` = Contains only `safe_outputs_tools.json` (not synced .cjs files)
- `pkg/workflow/sh/` = NOT used for shell scripts (may contain generated files)
- Runtime copying: `actions/setup/` → `/tmp/gh-aw/actions` → used by workflows
- No `make sync-js-scripts` or `make sync-shell-scripts` targets (not needed)
```shell
make fmt           # Format code (run before linting)
make lint          # ~5.5s
make test-unit     # All unit tests (~3 min) - prefer selective tests
make test          # Full test suite (>5 min, very slow) - avoid locally
make recompile     # Recompile workflows
make agent-finish  # Complete validation

# Selective testing (preferred during development)
go test -v -run "TestName" ./pkg/package/        # Single test (BEST)
go test -v -run "TestFoo|TestBar" ./pkg/cli/     # Group of related tests
go test -v -run "Test.*Compile" ./pkg/workflow/  # Pattern matching

./gh-aw --help
./gh-aw compile
./gh-aw mcp list      # MCP server management
./gh-aw logs          # Download and analyze workflow logs
./gh-aw audit 123456  # Audit a specific workflow run
```

For comprehensive testing guidelines, patterns, and conventions, see scratchpad/testing.md.
Key testing principles:

- Use `require.*` for critical setup (stops test on failure)
- Use `assert.*` for test validations (continues checking)
- Write table-driven tests with `t.Run()` and descriptive names
- No mocks or test suites - test real component interactions
- Always include helpful assertion messages
Running all tests is slow. Always run the most selective tests possible to validate your changes quickly:
```shell
# ✅ BEST - Run specific test(s) by name
go test -v -run "TestMyFunction" ./pkg/cli/
go test -v -run "TestCompile" ./pkg/workflow/

# ✅ GOOD - Run related tests using pattern matching
go test -v -run "TestCompile|TestValidate" ./pkg/workflow/
go test -v -run "TestAudit.*" ./pkg/cli/            # All TestAudit* tests
go test -v -run "Test.*Validation" ./pkg/workflow/  # All validation tests

# ⚠️ SLOW - Entire package tests (avoid for large packages like cli/, workflow/)
go test -v ./pkg/cli/       # ~60-90s for large packages
go test -v ./pkg/workflow/  # Can be very slow

# ⚠️ SLOW (~3 min) - Only run when needed
make test-unit  # All unit tests

# 🐌 VERY SLOW (>5 min) - Avoid during development
make test  # Full test suite including integration tests
```

When to use each approach:

- Individual tests: While developing/debugging a specific feature (PREFERRED)
- Test pattern groups: When changes affect multiple related tests (PREFERRED)
- Entire package tests: Rarely needed - only for small packages or final validation
- `make test-unit`: Before committing (or use `make agent-finish`)
- `make test`: Rarely needed locally - CI runs this

Quick reference:

```shell
make test-unit      # All unit tests (~3 min)
make test           # Full test suite (>5 min, very slow)
make test-security  # Security regression tests
make agent-finish   # Complete validation before committing
```

```
cmd/gh-aw/          # CLI entry point
pkg/
├── cli/            # Command implementations
├── parser/         # Markdown frontmatter parsing
└── workflow/       # Workflow compilation
.github/workflows/  # Sample workflows (*.md + *.lock.yml)
```
Target size: 100-200 lines per validator
Hard limit: 300 lines (refactor if exceeded)
When to split a validator:
- File exceeds 300 lines
- File contains 2+ unrelated validation domains
- Complex cross-dependencies require separate testing
- Error messages span multiple concern areas
Naming convention: `{domain}_{subdomain}_validation.go`
Documentation: Minimum 30% comment coverage
Tests: Separate test file with integration tests for complex validators
Decision tree for splitting:

```
File > 300 lines? ──YES──> Should split
        │
        NO
        │
        ▼
Contains 2+ distinct domains? ──YES──> Should split
        │
        NO
        │
        ▼
Keep as-is
```
See scratchpad/validation-refactoring.md for step-by-step refactoring guide and examples.
ALWAYS use console formatting for user output:

```go
import "github.com/github/gh-aw/pkg/console"

// Success, info, warning, error messages
fmt.Fprintln(os.Stderr, console.FormatSuccessMessage("Success!"))
fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Info"))
fmt.Fprintln(os.Stderr, console.FormatWarningMessage("Warning"))
fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))

// Other types: CommandMessage, ProgressMessage, PromptMessage,
// CountMessage, VerboseMessage, LocationMessage
```

Error handling:

```go
// WRONG
fmt.Fprintln(os.Stderr, err)

// CORRECT
fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))
```

Logging Guidelines:

- ALWAYS use `fmt.Fprintln(os.Stderr, ...)` or `fmt.Fprintf(os.Stderr, ...)` for CLI logging
- NEVER use `fmt.Println()` or `fmt.Printf()` directly - all output should go to stderr
- Use console formatting helpers with `os.Stderr` for consistent styling
- For simple messages without console formatting: `fmt.Fprintf(os.Stderr, "message\n")`
- Exception: Structured output (JSON, hashes, graphs) goes to stdout for piping/redirection

Examples:

```go
// ✅ CORRECT - Diagnostic output to stderr
fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Processing..."))
fmt.Fprintf(os.Stderr, "Warning: %s\n", msg)

// ✅ CORRECT - Structured output to stdout
fmt.Println(string(jsonBytes)) // JSON output
fmt.Println(hash)              // Hash output
fmt.Println(mermaidGraph)      // Graph output

// ❌ INCORRECT - Diagnostic output to stdout
fmt.Println("Processing...")     // Should use stderr
fmt.Printf("Warning: %s\n", msg) // Should use stderr
```

ALWAYS use the logger package for debug logging:
```go
import "github.com/github/gh-aw/pkg/logger"

// Create a logger with namespace following pkg:filename convention
var log = logger.New("pkg:filename")

// Log debug messages (only shown when DEBUG environment variable matches)
log.Printf("Processing %d items", count)
log.Print("Simple debug message")

// Check if logging is enabled before expensive operations
if log.Enabled() {
    log.Printf("Expensive debug info: %+v", expensiveOperation())
}
```

Category Naming Convention:

- Follow the pattern: `pkg:filename` (e.g., `cli:compile_command`, `workflow:compiler`)
- Use colon (`:`) as separator between package and file/component name
- Be consistent with existing loggers in the codebase

Debug Output Control:

```shell
# Enable all debug logs
DEBUG=* gh aw compile

# Enable specific package
DEBUG=cli:* gh aw compile

# Enable multiple packages
DEBUG=cli:*,workflow:* gh aw compile

# Exclude specific loggers
DEBUG=*,-workflow:test gh aw compile

# Disable colors (auto-disabled when piping)
DEBUG_COLORS=0 DEBUG=* gh aw compile
```

Key Features:

- Zero overhead: Logs only computed when DEBUG matches the logger's namespace
- Time diff: Shows elapsed time between log calls (e.g., `+50ms`, `+2.5s`)
- Auto-colors: Each namespace gets a consistent color in terminals
- Pattern matching: Supports wildcards (`*`) and exclusions (`-pattern`)
When to Use:
- Non-essential diagnostic information
- Performance insights and timing data
- Internal state tracking during development
- Detailed operation flow for debugging
When NOT to Use:

- Essential user-facing messages (use console formatting instead)
- Error messages (use `console.FormatErrorMessage`)
- Success/warning messages (use console formatting)
- Final output or results (use stdout/console formatting)
For developing new CLI commands, follow these patterns and conventions. See scratchpad/cli-command-patterns.md for comprehensive guidance.
```go
package cli

import (
    "github.com/github/gh-aw/pkg/console"
    "github.com/github/gh-aw/pkg/logger"
    "github.com/spf13/cobra"
)

var commandLog = logger.New("cli:command_name")

// NewCommandNameCommand creates the command-name command
func NewCommandNameCommand() *cobra.Command { ... }

// RunCommandName executes the command logic (testable)
func RunCommandName(config Config) error { ... }

// Internal implementation
func validateInputs(...) error { ... }
```

| Element | Pattern | Example |
|---|---|---|
| Command file | `*_command.go` | `audit_command.go` |
| Test file | `*_command_test.go` | `audit_command_test.go` |
| Logger | `cli:command_name` | `logger.New("cli:audit")` |
| Functions | `NewXCommand()`, `RunX()` | `NewAuditCommand()`, `RunAuditWorkflowRun()` |
| Config struct | `XConfig` | `AuditConfig`, `CompileConfig` |
Common flags with helper functions (defined in flags.go):

```go
addEngineFlag(cmd)      // --engine/-e (Override AI engine)
addRepoFlag(cmd)        // --repo/-r (Target repository)
addOutputFlag(cmd, dir) // --output/-o (Output directory)
addJSONFlag(cmd)        // --json/-j (JSON output)
```

Reserved Short Flags: `-v` (verbose), `-e` (engine), `-r` (repo), `-o` (output), `-j` (json), `-f` (force/file), `-w` (watch)
```go
// ✅ CORRECT - Console formatted, error wrapping
if err != nil {
    fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))
    return fmt.Errorf("failed to process workflow: %w", err)
}

// ❌ INCORRECT - Plain error, no wrapping
if err != nil {
    fmt.Fprintln(os.Stderr, err)
    return err
}
```

```go
// ✅ CORRECT - All diagnostic output to stderr with console formatting
fmt.Fprintln(os.Stderr, console.FormatSuccessMessage("Compiled successfully"))
fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Processing workflow..."))
fmt.Fprintln(os.Stderr, console.FormatWarningMessage("File has changes"))
fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))

// ✅ CORRECT - Structured output to stdout for piping/redirection
fmt.Println(string(jsonBytes)) // JSON output
fmt.Println(hash)              // Hash output
fmt.Println(mermaidGraph)      // Graph output

// ❌ INCORRECT - Diagnostic output to stdout, no formatting
fmt.Println("Success")
fmt.Printf("Status: %s\n", status)
```

Output Routing Rules (Unix Conventions):

- Diagnostic output (messages, warnings, errors) → stderr
- Structured data (JSON, hashes, graphs) → stdout
- Rationale: Allows users to pipe/redirect data without diagnostic noise
```go
cmd := &cobra.Command{
    Use:   "command-name <arg>",
    Short: "Brief one-line description under 80 chars", // No period
    Long: `Detailed description with context and examples.

This command:
- Validates workflow files
- Checks GitHub Actions compatibility
- Reports errors with suggestions

` + WorkflowIDExplanation + `

Examples:
  gh aw command arg                 # Basic usage
  gh aw command arg -v              # Verbose output
  gh aw command arg --option value  # With options`,
    Args: cobra.ExactArgs(1),
    RunE: func(cmd *cobra.Command, args []string) error { ... },
}
```

Minimum 3 examples: Basic usage, common options, advanced usage
Every command needs comprehensive tests:

```go
func TestRunCommand(t *testing.T) {
    tests := []struct {
        name      string
        input     string
        expected  string
        shouldErr bool
    }{
        {
            name:      "valid input",
            input:     "test-workflow",
            expected:  "Success",
            shouldErr: false,
        },
        {
            name:      "empty input",
            input:     "",
            shouldErr: true,
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            result, err := RunCommand(tt.input)
            if tt.shouldErr {
                assert.Error(t, err)
            } else {
                assert.NoError(t, err)
                assert.Equal(t, tt.expected, result)
            }
        })
    }
}
```

Test coverage: Valid inputs, invalid inputs, edge cases, flag handling, error paths
When developing a new command:

- File named `*_command.go`
- Logger: `logger.New("cli:command_name")`
- `NewXCommand()` and `RunX()` functions defined
- Short description < 80 chars, no period
- Long description with context and 3+ examples
- Flags use standard short flags where applicable
- Input validation implemented early
- Console formatting for all user output
- All output to stderr (except JSON)
- Error messages actionable with suggestions
- Test file `*_command_test.go` created
- Table-driven tests for multiple scenarios
- Valid, invalid, and edge case tests

See: scratchpad/cli-command-patterns.md for complete specification with examples and anti-patterns
- Prefer many smaller files grouped by functionality
- Add new files for new features rather than extending existing ones
- Use console formatting instead of plain `fmt.*` for CLI output
- ALWAYS use `any` instead of `interface{}` - Use the modern `any` type alias (Go 1.18+) for consistency across the codebase
Go channels require explicit lifecycle management to prevent goroutine leaks and resource exhaustion. Follow these guidelines when working with channels:
Ownership Rules:

- Document ownership - Add a comment stating who closes the channel (required for every channel)
- Sender closes - The goroutine that sends on the channel must close it after the last send (use `defer close(ch)`)
- Never close on receiver side - Closing on the receiver side risks panic if the sender is still writing
- Exception: Broadcast channels - Signal channels used for coordination can be closed by the coordinator
Best Practices:
```go
// ✅ CORRECT - Channel closed by sender with defer
done := make(chan struct{})
go func() {
    defer close(done) // Sender closes after work completes
    // ... do work ...
}()
<-done // Receiver blocks until channel closes

// ✅ CORRECT - Buffered channel for single result
result := make(chan error, 1)
go func() {
    result <- doWork() // Send result (buffered, doesn't block)
    // No close needed - receiver reads exactly 1 value
}()
err := <-result

// ✅ CORRECT - Broadcast signal by closing
start := make(chan struct{})
for i := 0; i < 10; i++ {
    go func() {
        <-start // All goroutines wait
        // ... do work ...
    }()
}
close(start) // Broadcast to all waiting goroutines

// ✅ CORRECT - Timeout pattern for safety
done := make(chan struct{})
go func() {
    defer close(done)
    // ... work ...
}()
select {
case <-done:
    // Completed successfully
case <-time.After(5 * time.Second):
    // Timeout - goroutine may still be running
}
```

Signal Channels:

- Prefer `chan struct{}` over `chan bool` for signaling (zero memory overhead)
- Use `chan struct{}` when only the event matters, not the value
- Use buffered channels (`make(chan T, 1)`) when sender shouldn't block
- Buffered channels with fixed synchronization: No close needed if receiver reads exactly N values and exits without waiting for channel closure (e.g., `make(chan T, N)` with exactly N sends and N receives in a counted loop pattern)
Signal Handling (os.Signal):

```go
// ✅ CORRECT - Signal channels require signal.Stop(), not close()
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
defer signal.Stop(sigChan) // Cleanup signal handler
```

Anti-Patterns to Avoid:
```go
// ❌ WRONG - Channel never closed (goroutine leak)
done := make(chan struct{})
go func() {
    // ... work ...
    done <- struct{}{} // Goroutine blocks forever if receiver is gone
}()

// ❌ WRONG - Using chan bool for signaling (wastes memory)
done := make(chan bool) // Use chan struct{} instead

// ❌ WRONG - Closing on receiver side
done := make(chan struct{})
go func() {
    done <- struct{}{}
}()
<-done
close(done) // Panic if sender tries to send again!

// ❌ WRONG - No timeout protection
done := make(chan struct{})
go func() {
    // ... might hang forever ...
}()
<-done // Blocks forever if goroutine hangs
```

Testing with Race Detector: Always run tests with the race detector to catch channel-related issues:

```shell
make test  # Includes -race flag
go test -race ./...
```

Automatic Activation Output Transformations:
The compiler automatically transforms certain `needs.activation.outputs.*` expressions to `steps.sanitized.outputs.*` for compatibility with the activation job context.
Why this transformation occurs:
The prompt is generated within the activation job, which means it cannot reference its own needs.activation.* outputs (a job cannot reference its own needs outputs in GitHub Actions). The compiler automatically rewrites these expressions to reference the sanitized step, which computes sanitized versions of the triggering content.
Transformations:

- `needs.activation.outputs.text` → `steps.sanitized.outputs.text`
- `needs.activation.outputs.title` → `steps.sanitized.outputs.title`
- `needs.activation.outputs.body` → `steps.sanitized.outputs.body`

Important notes:

- Only `text`, `title`, and `body` outputs are transformed
- Other activation outputs (`comment_id`, `comment_repo`, `slash_command`) are NOT transformed
- Transformation uses word boundary checking to prevent incorrect partial matches (e.g., `text_custom` is not transformed)
- This is particularly important for runtime-import, where markdown can change without recompilation

Example:

```
Analyze this content: "${{ needs.activation.outputs.text }}"
```

Is automatically transformed to:

```
Analyze this content: "${{ steps.sanitized.outputs.text }}"
```

Implementation:

- Transformation happens in `pkg/workflow/expression_extraction.go::transformActivationOutputs()`
- Applied during expression extraction from markdown
- Transformations are logged for debugging when `DEBUG=workflow:expression_extraction`
Primary YAML Library: goccy/go-yaml v1.19.1
gh-aw uses goccy/go-yaml for YAML 1.1/1.2 compatibility with GitHub Actions. See scratchpad/yaml-version-gotchas.md for details on YAML version differences.
Standard YAML Library: go.yaml.in/yaml/v3 v3.0.4
For simple YAML marshaling/unmarshaling operations where YAML 1.1/1.2 compatibility is not critical, use the canonical go.yaml.in/yaml/v3 import path (not the deprecated gopkg.in/yaml.v3).
Migration from gopkg.in/yaml.v3:

- The deprecated `gopkg.in/yaml.v3` path has been migrated to the canonical `go.yaml.in/yaml/v3` path
- Both paths provide identical APIs (`Marshal`, `Unmarshal`, `Encoder`, `Decoder`)
- The canonical path provides better supply chain security and aligns with modern Go ecosystem practices
- Transitive dependencies may still use `gopkg.in/yaml.v3` (e.g., `github.com/cli/go-gh/v2`) - this is acceptable

When to use each library:

- `goccy/go-yaml`: Workflow frontmatter parsing, GitHub Actions YAML generation, any YAML 1.1/1.2 sensitive operations
- `go.yaml.in/yaml/v3`: Campaign specs, workflow statistics, simple configuration marshaling
Example:

```go
import "go.yaml.in/yaml/v3" // ✅ Use canonical path

// Simple marshaling example
data := map[string]any{"key": "value"}
yamlBytes, err := yaml.Marshal(data)
```

CRITICAL: When editing or generating YAML workflow files (.github/workflows/*.yml, *.lock.yml):
- NEVER copy-paste from colored terminal output - Always use `--no-color` or `2>&1 | cat` to strip colors
- Validate YAML before committing - The compiler automatically strips ANSI codes during workflow generation
- Check for invisible characters - Use `cat -A file.yml | grep '\[m'` to detect ANSI escape sequences
- Run `make recompile` - Always recompile workflows after editing .md files to regenerate clean .lock.yml files
Why this matters:
ANSI escape sequences (`\x1b[31m`, `\x1b[0m`, `\x1b[m`) are terminal color codes that break YAML parsing. They can accidentally be introduced through:
- Copy-pasting from colored terminal output
- Text editors that preserve ANSI codes
- Scripts that generate colored output
Example of safe command usage:

```shell
# ❌ BAD - May include ANSI color codes
npm view @github/copilot | tee output.txt

# ✅ GOOD - Strip colors before saving
npm view @github/copilot --no-color | tee output.txt
# OR
npm view @github/copilot 2>&1 | cat | tee output.txt
```

Prevention layers:
- Compiler sanitization: The workflow compiler (`pkg/workflow/compiler_yaml.go`) automatically strips ANSI codes from descriptions, sources, and comments using `stringutil.StripANSIEscapeCodes()`
- CI validation: The `validate-yaml` job in `.github/workflows/ci.yml` scans all YAML files for ANSI escape sequences before other jobs run
- Detection command: Run `find .github/workflows -name "*.yml" -o -name "*.yaml" | xargs grep -P '\x1b\[[0-9;]*[a-zA-Z]'` to check for ANSI codes
If you encounter ANSI codes in workflow files:

- Remove the ANSI codes from the source markdown file
- Run `make recompile` to regenerate clean workflow files
- The compiler will automatically strip any ANSI codes during compilation
Use appropriate type patterns to improve code clarity, maintainability, and type safety:
Semantic Type Aliases - Use for domain-specific primitives:
```go
// ✅ GOOD - Semantic meaning
type LineLength int
type Version string
type FeatureFlag string
type WorkflowID string
type EngineName string

const MaxExpressionLineLength LineLength = 120
const DefaultCopilotVersion Version = "0.0.374"
const MCPGatewayFeatureFlag FeatureFlag = "mcp-gateway"
const CopilotEngine EngineName = "copilot"

// All semantic types in pkg/constants provide String() and IsValid() methods
if MaxExpressionLineLength.IsValid() {
    fmt.Println(MaxExpressionLineLength.String()) // "120"
}

// Type-safe engine selection
engine := CopilotEngine
if engine.IsValid() {
    // Use engine with confidence
}
```

When to create semantic type aliases:
- ✅ DO use for domain concepts that are frequently used (versions, URLs, model names, job names)
- ✅ DO use when mixing different string/int types could cause bugs (prevents `JobName` being used as `StepID`)
- ✅ DO use when the type name adds clarity that a comment alone wouldn't provide
- ❌ DON'T use for one-off values or when the primitive type is already clear from context
- ❌ DON'T use when it adds ceremony without clarity (e.g., `type String string` is too generic)
Available semantic types in pkg/constants:

- `LineLength` - Character counts for formatting (e.g., `MaxExpressionLineLength`)
- `Version` - Software version strings (e.g., `DefaultCopilotVersion`)
- `FeatureFlag` - Feature flag identifiers (e.g., `SafeInputsFeatureFlag`)
- `URL` - URL strings (e.g., `DefaultMCPRegistryURL`)
- `ModelName` - AI model names (e.g., `DefaultCopilotDetectionModel`)
- `JobName` - GitHub Actions job identifiers (e.g., `AgentJobName`)
- `StepID` - GitHub Actions step identifiers (e.g., `CheckMembershipStepID`)
- `CommandPrefix` - CLI command prefixes (e.g., `CLIExtensionPrefix`)
- `WorkflowID` - Workflow identifiers/basename without .md extension (user-provided workflow names)
- `EngineName` - AI engine names (e.g., `CopilotEngine`, `ClaudeEngine`, `CodexEngine`, `CustomEngine`)

Available semantic types in pkg/workflow:

- `GitHubToolName` - GitHub tool names (e.g., "issue_read", "create_issue")
- `GitHubAllowedTools` - Typed slice of GitHub tool names with conversion helpers
- `GitHubToolset` - GitHub toolset names (e.g., "default", "repos", "issues")
- `GitHubToolsets` - Typed slice of GitHub toolset names with conversion helpers
Dynamic Types - Use `map[string]any` for truly dynamic data:

```go
// ✅ GOOD - Unknown structure at compile time
func ProcessFrontmatter(frontmatter map[string]any) error {
    // YAML/JSON with dynamic structure
}

// ✅ GOOD - Document why any is needed
// githubTool uses any because tool configuration structure
// varies based on engine and toolsets
func ValidatePermissions(permissions *Permissions, githubTool any)
```

When to use each pattern:

- Semantic type aliases: Domain concepts (lengths, versions, durations)
- `map[string]any`: YAML/JSON parsing, dynamic configurations
- Interfaces: Multiple implementations, polymorphism, testing
- Concrete types: Known structure, type safety

Avoid:

- Using `any` when the type is known
- Creating unnecessary type aliases that don't add clarity
- Large "god" interfaces with many methods
- Type name collisions (use descriptive, domain-qualified names)

See: scratchpad/go-type-patterns.md for detailed guidance and examples
The `FrontmatterConfig` struct in `pkg/workflow/frontmatter_types.go` is gradually migrating from `map[string]any` to strongly-typed fields:
Typed Configuration Fields:

- `Tools *ToolsConfig` - Tool and MCP server configurations
- `Network *NetworkPermissions` - Network access permissions
- `SafeOutputs *SafeOutputsConfig` - Safe output configurations
- `SafeInputs *SafeInputsConfig` - Safe input configurations
- `Sandbox *SandboxConfig` - Sandbox environment configuration
- `RuntimesTyped *RuntimesConfig` - Runtime version overrides (node, python, go, uv, bun, deno)
- `PermissionsTyped *PermissionsConfig` - GitHub Actions permissions (shorthand + detailed)

Legacy Map Fields (Deprecated but still supported):

- `MCPServers map[string]any` - Use `Tools` instead
- `Runtimes map[string]any` - Use `RuntimesTyped` instead
- `Permissions map[string]any` - Use `PermissionsTyped` instead
- `Jobs map[string]any` - Too dynamic to type (GitHub Actions job format)
- `On map[string]any` - Too complex to type (many trigger variants)
- `Features map[string]any` - Intentionally dynamic for feature flags
Example: Using Typed Runtimes

```go
// Parsing frontmatter with runtimes
frontmatter := map[string]any{
    "runtimes": map[string]any{
        "node":   map[string]any{"version": "20"},
        "python": map[string]any{"version": "3.11"},
    },
}
config, _ := ParseFrontmatterConfig(frontmatter)

// Access typed fields (no type assertions needed)
if config.RuntimesTyped != nil && config.RuntimesTyped.Node != nil {
    version := config.RuntimesTyped.Node.Version // Direct access
}

// Legacy field still works
if nodeRuntime, ok := config.Runtimes["node"].(map[string]any); ok {
    version := nodeRuntime["version"] // Requires type assertion
}
```

Example: Using Typed Permissions
```go
frontmatter := map[string]any{
    "permissions": map[string]any{
        "contents": "read",
        "issues":   "write",
    },
}
config, _ := ParseFrontmatterConfig(frontmatter)

// Access typed fields
if config.PermissionsTyped != nil {
    contents := config.PermissionsTyped.Contents // "read"
    issues := config.PermissionsTyped.Issues     // "write"
}
```

Backward Compatibility:
- Both typed and legacy fields are populated during parsing
- `ToMap()` prefers typed fields when converting back to `map[string]any`
- Existing code using legacy fields continues to work
- New code should prefer typed fields for compile-time safety
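The typed-first, legacy-fallback preference can be wrapped in a small accessor. The sketch below is standalone and hypothetical: the trimmed-down structs and the `nodeVersion` helper mirror the field names documented above but are not part of the real `pkg/workflow` package.

```go
package main

import "fmt"

// Minimal stand-ins for the real config types (illustrative only).
type NodeRuntime struct{ Version string }
type RuntimesConfig struct{ Node *NodeRuntime }
type FrontmatterConfig struct {
	RuntimesTyped *RuntimesConfig
	Runtimes      map[string]any // legacy, deprecated
}

// nodeVersion prefers the typed field and falls back to the legacy map.
func nodeVersion(c *FrontmatterConfig) (string, bool) {
	if c.RuntimesTyped != nil && c.RuntimesTyped.Node != nil {
		return c.RuntimesTyped.Node.Version, true
	}
	if m, ok := c.Runtimes["node"].(map[string]any); ok {
		if v, ok := m["version"].(string); ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	typed := &FrontmatterConfig{RuntimesTyped: &RuntimesConfig{Node: &NodeRuntime{Version: "20"}}}
	legacy := &FrontmatterConfig{Runtimes: map[string]any{"node": map[string]any{"version": "18"}}}
	v1, _ := nodeVersion(typed)
	v2, _ := nodeVersion(legacy)
	fmt.Println(v1, v2) // 20 18
}
```

New call sites get compile-time safety from the typed branch, while existing frontmatter parsed into the legacy map keeps working.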
For JavaScript files in `pkg/workflow/js/*.cjs`:

- Use `core.info`, `core.warning`, `core.error` (not `console.log`)
- Use `core.setOutput`, `core.getInput`, `core.setFailed`
- Avoid the `any` type; use specific types or `unknown`
- Run `make js` and `make lint-cjs` for validation
When modifying JSON schemas in `pkg/parser/schemas/`:

- Schema files are embedded using `//go:embed` directives
- MUST rebuild the binary with `make build` for changes to take effect
- Test changes by compiling a workflow: `./gh-aw compile test-workflow.md`
- Schema changes typically require corresponding Go struct updates
- `make agent-finish`: ~10-15s (excluding test-unit)
- `make deps`: ~1.5 min
- `make deps-dev`: ~5-8 min
- `make test-unit`: ~3 min (prefer selective tests)
- `make test`: >5 min (very slow - avoid locally)
- `make lint`: ~5.5s
- Selective test (single): ~1-5s
The documentation for this project is available in the docs/ directory. It includes information on how to use the CLI, API references, and examples.
It uses the Diátaxis framework and GitHub-flavored markdown with Astro Starlight for rendering.
See documentation skill for details.
This project is still in an experimental phase. When you are requested to make a change, do not add fallback or legacy support unless explicitly instructed.
When writing workflows that use cache-memory to persist data across runs, be aware of filename limitations:
Filename Requirements:

- No colons (`:`): GitHub Actions artifacts don't support colons due to NTFS filesystem limitations
- No special characters: avoid quotes (`"`, `'`), pipes (`|`), angle brackets (`<`, `>`), asterisks (`*`), or question marks (`?`)
- Use filesystem-safe formats: when including timestamps, use `YYYY-MM-DD-HH-MM-SS-sss` instead of ISO 8601
Examples:

```text
# ✅ GOOD - Filesystem-safe timestamp
/tmp/gh-aw/cache-memory/investigation-2026-02-12-11-20-45-458.json

# ❌ BAD - Contains colons (will fail artifact upload)
/tmp/gh-aw/cache-memory/investigation-2026-02-12T11:20:45.458Z.json
```

Why this matters:
- Cache-memory data is uploaded as GitHub Actions artifacts when threat detection is enabled
- Artifacts are stored on Windows-compatible filesystems (NTFS), which restrict certain characters
- Filenames with invalid characters cause `actions/upload-artifact` to fail
When writing workflow prompts:
- Explicitly instruct AI agents to use filesystem-safe timestamp formats
- Include examples of valid and invalid filenames
- Document this requirement in the "Cache Usage Strategy" section
```sh
gh aw mcp list                    # List workflows with MCP servers
gh aw mcp inspect workflow-name   # Inspect MCP servers
gh aw mcp inspect --inspector     # Web-based inspector
```

Default MCP Registry: Uses GitHub's MCP registry at https://api.mcp.github.com/v0.1 by default.
```yaml
---
engine: copilot # Options: copilot, claude, codex, custom
tools:
  playwright:
    version: "v1.41.0"
    network:
      allowed:
        - github.com
---
```

- Containerized browser automation
- Domain-restricted network access
- Accessibility analysis and visual testing
- Multi-browser support (Chromium, Firefox, Safari)
- Selective tests (PREFERRED): run individual tests or package tests during development:

  ```sh
  go test -v -run "TestSpecificFunction" ./pkg/cli/
  go test -v ./pkg/workflow/
  ```

- Unit tests (`make test-unit`): ~3 minutes - run before committing or via `make agent-finish`
- Full test suite (`make test`): >5 minutes, very slow - rarely needed locally; CI handles this
- Integration tests: included in `make test` - command behavior and binary compilation
- Workflow compilation tests: Markdown-to-YAML conversion
- Test agentic workflows: should be added to the `pkg/cli/workflows` directory
Recommended workflow:

- Run individual tests while developing: `go test -v -run "TestName" ./pkg/package/`
- Run related test groups after changes: `go test -v -run "TestFoo|TestBar" ./pkg/package/`
- Run `make agent-finish` before committing (includes `make test-unit`)
- Let CI run `make test` - don't wait for it locally
Avoid running entire package tests for large packages like `pkg/cli/` or `pkg/workflow/` during development - use selective test patterns instead.
```sh
make minor-release  # Automated via GitHub Actions
```

Before EVERY commit:
- ✅ Run `make agent-finish` (or at minimum `make fmt`)
- ✅ Verify no errors from the above command
- ✅ Only then commit and push
This is NOT optional - skipping this causes immediate CI failures.
- Go project with Makefile-managed build/test/lint
- Use `make test-unit` for fast development testing, `make test` for full coverage
- Use console formatting for user output
- Repository: `github/gh-aw`
- Include issue numbers in PR titles when fixing issues
- Read issue comments for context before making changes
- Use conventional commits for commit messages
- Do NOT commit explanation markdown files about the fixes
For investigating and resolving workflow issues:
- Workflow Health Monitoring - Comprehensive runbook for diagnosing missing-tool errors, authentication failures, MCP configuration issues, and safe-input/output problems. Includes step-by-step investigation procedures, resolution examples, and case studies from real incidents.
Skills provide specialized, detailed knowledge on specific topics. Use them only when needed - don't load skills preemptively.
- developer - Developer instructions, code organization, validation architecture, security practices
- console-rendering - Struct tag-based console rendering system for CLI output
- error-messages - Error message style guide for validation errors
- error-pattern-safety - Safety guidelines for error pattern regex
- error-recovery-patterns - Error handling patterns, recovery strategies, and debugging techniques
- github-script - Best practices for GitHub Actions scripts using github-script
- javascript-refactoring - Guide for refactoring JavaScript code into separate .cjs files
- messages - Adding new message types to safe-output messages system
- github-mcp-server - GitHub MCP server documentation and configuration
- github-issue-query - Query GitHub issues with jq filtering
- github-pr-query - Query GitHub pull requests with jq filtering
- github-discussion-query - Query GitHub discussions with jq filtering
- github-copilot-agent-tips-and-tricks - Tips for working with GitHub Copilot coding agent PRs
- copilot-cli - GitHub Copilot CLI integration for agentic workflows
- custom-agents - GitHub custom agent file format
- gh-agent-session - GitHub CLI agent session extension
- adding-new-engines - Comprehensive guide for adding new agentic engines (AI coding agents)
- temporary-id-safe-output - Adding temporary ID support to safe output jobs
- http-mcp-headers - HTTP MCP header secret support implementation
- documentation - Documentation guidelines using Diátaxis framework and GitHub-flavored markdown
- reporting - Report format guidelines using HTML details/summary tags
- dictation - Fixing text-to-speech errors in dictated text
- agentic-chat - AI assistant for creating task descriptions
- skillz-integration - Skillz MCP server integration with Docker
Remember: Be LAZY - only load a skill when you actually need its specialized knowledge. Don't reference skills preemptively.