Identifying Failed Resources
Resources in a failed state appear with [FAILED] when listed.
Automatic Recovery
Stuck Resource Detection
pragma-os automatically detects resources stuck in the processing state. If a provider fails to respond within the expected timeframe, the platform:
- Detects the unresponsive operation
- Retries the operation automatically
- If retries are exhausted, moves the resource to the failed state with an error message
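The detect, retry, then fail sequence above can be modeled in a few lines. This is a minimal sketch, not pragma-os code: the `Resource` class, `check_stuck` function, `MAX_RETRIES`, and `TIMEOUT_SECONDS` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETRIES = 3          # assumed retry budget, not a documented pragma-os value
TIMEOUT_SECONDS = 300.0  # assumed "expected timeframe"


@dataclass
class Resource:
    name: str
    state: str = "processing"
    started_at: float = 0.0
    retries: int = 0
    error: Optional[str] = None


def check_stuck(resource: Resource, now: float) -> Resource:
    """Retry a resource stuck in processing, or fail it once retries run out."""
    if resource.state != "processing":
        return resource  # only processing resources can be stuck
    if now - resource.started_at < TIMEOUT_SECONDS:
        return resource  # still within the expected timeframe
    if resource.retries < MAX_RETRIES:
        resource.retries += 1
        resource.started_at = now  # restart the clock for the retried operation
    else:
        resource.state = "failed"
        resource.error = f"provider did not respond after {MAX_RETRIES} retries"
    return resource


# Simulate periodic checks against a provider that never responds.
r = Resource("db", started_at=0.0)
for tick in (100.0, 400.0, 800.0, 1200.0, 1600.0):
    check_stuck(r, now=tick)
print(r.state, r.retries)  # failed 3
```

The first check falls inside the timeout and changes nothing. The next three each consume a retry, and the final check moves the resource to failed with an error message.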
Idempotent Apply
Re-applying a resource with the same configuration is safe. If the resource is already ready with identical config, pragma-os returns the existing resource without re-processing. This makes it safe to re-run pragma resources apply in scripts without side effects.
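To illustrate the idea, here is a small sketch of an idempotent apply. It assumes an in-memory `store` and an `apply` function, both invented for this example; the real platform processes resources asynchronously and its internals are not shown in this guide.

```python
from typing import Any, Dict, Tuple

# In-memory stand-in for the platform's resource store (hypothetical).
store: Dict[str, Dict[str, Any]] = {}


def apply(name: str, config: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
    """Return (resource, processed); processed is False when apply was a no-op."""
    existing = store.get(name)
    if existing and existing["state"] == "ready" and existing["config"] == config:
        return existing, False  # ready resource with identical config: nothing to do
    resource = {"name": name, "config": dict(config), "state": "ready"}
    store[name] = resource  # a real platform would process the change here
    return resource, True


first, processed_first = apply("db", {"size": "small"})
second, processed_second = apply("db", {"size": "small"})
print(processed_first, processed_second, first is second)  # True False True
```

The second call returns the same stored resource without processing it again, which is the property that makes repeated applies safe in scripts.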
Common Failure Scenarios
Configuration Errors
The most common cause of failure is invalid configuration:
- Missing required fields: a required config value is missing
- Invalid values: a value doesn’t match the expected format or constraints
- Permission errors: the provider doesn’t have access to create or modify the resource
Dependency Failures
A resource can fail or wait if its dependencies aren’t satisfied:
- Missing dependency: a referenced resource doesn’t exist yet (the resource waits in pending)
- Dependency failed: a dependency exists but is in the failed state
- Invalid field reference: a FieldReference points to a field that doesn’t exist in the dependency’s outputs
A resource proceeds only when its dependencies are in the ready state. Once a dependency reaches ready, pending dependents are automatically triggered.
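The three dependency problems can be told apart with a simple check. The sketch below is a diagnostic model only: the `Resources` snapshot type, the `check_dependency` function, and the resource and field names are hypothetical, and the order of the checks is an assumption rather than documented pragma-os behavior.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot: resource name -> (state, output field names).
Resources = Dict[str, Tuple[str, List[str]]]


def check_dependency(resources: Resources, dep: str, field: str) -> str:
    """Diagnose one field reference against the current resource snapshot."""
    if dep not in resources:
        return "missing dependency"
    state, outputs = resources[dep]
    if state == "failed":
        return "dependency failed"
    if state != "ready":
        return "dependency not ready"
    if field not in outputs:
        return "invalid field reference"
    return "ok"


snapshot: Resources = {
    "model": ("ready", ["endpoint"]),
    "database": ("failed", []),
}
print(check_dependency(snapshot, "model", "endpoint"))  # ok
print(check_dependency(snapshot, "model", "api_key"))   # invalid field reference
print(check_dependency(snapshot, "database", "url"))    # dependency failed
print(check_dependency(snapshot, "cache", "host"))      # missing dependency
```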
Provider Errors
Sometimes the underlying provider rejects the operation:
- Quota exceeded: you’ve hit a service limit
- Resource conflicts: a resource with that name already exists outside pragma
- Service unavailable: temporary provider outage
Using the Dead Letter Queue
When a resource operation fails after retries, it moves to the dead letter queue. This prevents failed operations from blocking other work and preserves the failure details for investigation.
List Failed Events
List the events in the dead letter queue to see all failed operations.
Inspect Event Details
Inspect an event to get the full error message and context. The details include:
- The resource that failed
- The full error message
- When the failure occurred
- The operation that was attempted
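As a rough mental model, a dead letter queue entry can be pictured as a record carrying the four details listed above. The `FailedEvent` and `DeadLetterQueue` classes below are hypothetical and exist only to illustrate the shape of the data; they are not the pragma-os event schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class FailedEvent:
    resource: str        # the resource that failed
    error: str           # the full error message
    failed_at: datetime  # when the failure occurred
    operation: str       # the operation that was attempted


class DeadLetterQueue:
    """In-memory stand-in for a queue of operations that failed after retries."""

    def __init__(self) -> None:
        self._events: List[FailedEvent] = []

    def push(self, event: FailedEvent) -> None:
        self._events.append(event)

    def events(self) -> List[FailedEvent]:
        return list(self._events)  # copy, so callers cannot mutate the queue

    def for_resource(self, name: str) -> List[FailedEvent]:
        return [e for e in self._events if e.resource == name]


dlq = DeadLetterQueue()
dlq.push(FailedEvent("model", "quota exceeded", datetime.now(timezone.utc), "create"))
for event in dlq.events():
    print(event.resource, event.operation, event.error)  # model create quota exceeded
```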
Retry Failed Events
After fixing the underlying issue, retry the failed operation.
Clear Resolved Events
Once you’ve addressed failures (or decided to abandon them), remove events from the queue.
Dependency Failure Cascades
When a resource fails, it affects downstream resources:
- Failed resources stay failed: they don’t retry automatically (only resources stuck in processing are auto-detected)
- Pending dependents wait: resources waiting for a failed dependency remain in pending until the dependency becomes ready
- Dependents are notified: when you fix a dependency and it reaches ready, all its dependents are automatically re-processed
For example, an agent resource that depends on model can’t proceed while model is failed. To recover:
- Fix the model configuration
- Re-apply to retry
- Once model reaches ready, agent automatically proceeds
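The recovery steps for model and agent can be sketched as a small state table. The `states` and `depends_on` dictionaries and the `mark_ready` function are illustrative assumptions; setting a dependent straight to ready stands in for the platform re-processing it.

```python
from typing import Dict, List

states: Dict[str, str] = {"model": "failed", "agent": "pending"}
depends_on: Dict[str, List[str]] = {"model": [], "agent": ["model"]}


def mark_ready(name: str) -> List[str]:
    """Mark a resource ready, then trigger pending dependents whose deps are all ready."""
    states[name] = "ready"
    triggered: List[str] = []
    for dependent, deps in depends_on.items():
        waiting_on_us = name in deps and states[dependent] == "pending"
        if waiting_on_us and all(states[d] == "ready" for d in deps):
            triggered.append(dependent)
            # Stand-in for re-processing; also cascades to the dependent's dependents.
            triggered.extend(mark_ready(dependent))
    return triggered


# Fixing and re-applying model eventually brings it to ready, which unblocks agent.
print(mark_ready("model"))  # ['agent']
print(states)               # {'model': 'ready', 'agent': 'ready'}
```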
Recovery Workflow
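When you encounter failures, follow this workflow: list the failed events, inspect the error details, fix the underlying issue, retry the failed operation, and clear the resolved events.
Preventing Failures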
Reduce failures by:
- Using draft mode: apply with --draft first to validate configuration, then re-apply without --draft to deploy
- Checking dependencies: use pragma resources describe to verify dependencies are ready
- Applying in bulk: apply all resources in a single YAML file; pragma-os resolves the dependency order automatically
- Monitoring the dead letter queue: check it regularly for early warning of issues
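Resolving a dependency order for a bulk apply is typically a topological sort. The sketch below uses Python's standard `graphlib` module on a hypothetical set of resources. It shows the general technique, not the pragma-os resolver, and the `database` and `secret` names are invented for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resources from one YAML file: name -> names it depends on.
dependencies = {
    "agent": {"model", "database"},
    "model": {"secret"},
    "database": set(),
    "secret": set(),
}

# static_order() yields each resource only after all of its dependencies.
apply_order = list(TopologicalSorter(dependencies).static_order())
print(apply_order)
```

Several valid orders can exist for the same graph, so the exact output may vary. The guarantee is only that every resource appears after the resources it depends on.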
Next Steps
Common Issues
Solutions to frequent problems.
Resource Lifecycle
Understanding resource states.