# Celery Task Debugging Made Easy
Celery with Redis as a broker is a popular combo. Reliable, fast, battle-tested.
Until you have 50,000 tasks stuck in an unknown state, workers consuming 100% CPU doing nothing useful, and Flower showing you everything is fine.
Time to look at what's actually in Redis.
## How Celery Uses Redis
Celery stores tasks in Redis using this structure:
```
celery                       # Default queue (List)
celery-task-meta-<task-id>   # Task result/state (String)
_kombu.binding.celery        # Queue bindings (Set)
unacked                      # Unacknowledged tasks (Hash)
unacked_index                # Delivery timestamps for unacked tasks (Sorted Set)
unacked_mutex                # Lock for unacked operations
```
If you're using multiple queues:
```
celery          # Default queue
high-priority   # Custom queue
notifications   # Custom queue
```
Each queue is a Redis List. Tasks are JSON blobs pushed to the list.
## Task Anatomy
A task in the queue looks like:
```json
{
  "body": "base64-encoded-payload",
  "content-encoding": "utf-8",
  "content-type": "application/json",
  "headers": {
    "id": "abc123-task-id",
    "task": "myapp.tasks.process_order",
    "lang": "py",
    "root_id": "abc123-task-id",
    "parent_id": null,
    "argsrepr": "(123,)",
    "kwargsrepr": "{}",
    "eta": null,
    "expires": null,
    "retries": 0
  },
  "properties": {
    "correlation_id": "abc123-task-id",
    "reply_to": "result-queue-id",
    "delivery_mode": 2,
    "delivery_tag": "uuid"
  }
}
```
The actual arguments are in the base64-encoded body. Yes, it's verbose. That's Celery's AMQP heritage showing.
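To recover the actual arguments, decode the body. Here is a minimal sketch assuming message protocol 2 and the JSON serializer, where the decoded body is `[args, kwargs, embed]`; the message below is a fabricated stand-in rather than something pulled from Redis:

```python
import base64
import json

# Hypothetical message, shaped like an entry in the `celery` list
# (in practice you'd fetch one with LRANGE celery 0 0).
payload = [[123], {}, {"callbacks": None, "errbacks": None, "chain": None, "chord": None}]
message = {
    "body": base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii"),
    "content-type": "application/json",
    "content-encoding": "utf-8",
}

# Reverse the encoding: base64 -> JSON -> [args, kwargs, embed]
args, kwargs, _embed = json.loads(base64.b64decode(message["body"]))
print(args, kwargs)  # [123] {}
```

The same two-step decode works on any message you copy out of a queue list, as long as it was serialized as JSON.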
## Common Debugging Scenarios
### Scenario 1: Tasks Not Processing
Tasks are sent but nothing happens. Workers are running. What's going on?
**Check 1: Are tasks in the queue?**

```shell
redis-cli LLEN celery
```

Returns 50000? The tasks are there but not being consumed.
**Check 2: Are workers listening to the right queue?**

Your task might be routed to a different queue:

```python
@app.task(queue='high-priority')
def important_task():
    pass
```

But workers are started with:

```shell
celery -A proj worker -Q celery
```

They're listening to `celery`, not `high-priority`. Tasks pile up. The fix is to start workers that consume every routed queue, e.g. `celery -A proj worker -Q celery,high-priority`.
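This mismatch is easy to check mechanically. A minimal sketch, with hypothetical task and queue names: collect the queues tasks are routed to and subtract the queues workers consume.

```python
# Queues your tasks are routed to (from task decorators / task_routes).
task_routes = {
    "myapp.tasks.important_task": {"queue": "high-priority"},
    "myapp.tasks.cleanup": {"queue": "celery"},
}
# Queues your workers consume (from: celery -A proj worker -Q celery).
worker_queues = {"celery"}

routed = {route["queue"] for route in task_routes.values()}
unserved = routed - worker_queues
print(unserved)  # queues that fill up with no consumer
```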
In Redimo:

- Monitor the pattern `celery*` (or your queue names) to see queue lengths at a glance
- Click into queues to see actual task payloads
### Scenario 2: Tasks Disappearing
Tasks are sent. Queue length stays at 0. No results. No errors.
Possible causes:
**Task expires before processing**

```python
@app.task(expires=60)  # Expires in 60 seconds
def my_task():
    pass
```

If workers are slow, tasks expire and vanish. Check `headers.expires` in task payloads.
**Acks late + worker crash**

With `acks_late=True`, tasks stay in Redis until the worker finishes. If workers crash mid-task, their deliveries sit in `unacked` until the visibility timeout expires, at which point they're requeued (or lost, depending on configuration).

Check:

```shell
redis-cli HLEN unacked
```

High number? Workers are crashing mid-task.
**Wrong serializer**

Producer uses JSON, consumer only accepts pickle (or vice versa). The worker rejects the message with a `ContentDisallowed` error and discards it; from the producer's side, the task just vanishes.

```python
# In celery config
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
```
### Scenario 3: Memory Growing
Redis memory keeps increasing. Celery tasks seem normal.
Likely culprit: task results not cleaned up.

Celery's default is to expire results after one day:

```python
# Default: results expire after 1 day
result_expires = 86400
```

If result expiry is disabled (`result_expires = None`), every `celery-task-meta-*` key sticks around forever.
Check in Redimo:

- Monitor `celery-task-meta-*` to see how many result keys exist
- Check whether they have a TTL set
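The TTL check itself is simple. A pure-Python sketch: the `(key, ttl)` pairs below are hypothetical stand-ins for what `SCAN` plus `TTL` against Redis would return; a TTL of -1 means the key has no expiry.

```python
# Hypothetical (key, ttl) pairs for celery-task-meta-* keys.
key_ttls = [
    ("celery-task-meta-aaa", 3600),   # will expire in an hour
    ("celery-task-meta-bbb", -1),     # no expiry: lives forever
    ("celery-task-meta-ccc", -1),     # no expiry: lives forever
]

# Keys with TTL -1 are the ones eating memory indefinitely.
no_expiry = [key for key, ttl in key_ttls if ttl == -1]
print(len(no_expiry), "result keys without TTL")
```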
Fix:

```python
# In celery config
result_expires = 3600  # 1 hour

# Or disable result storage entirely
task_ignore_result = True
```
### Scenario 4: Finding Failed Tasks
Where do failed tasks go?
Short answer: a task that raises an exception is still acknowledged, so it doesn't stay in the queue; the failure is recorded in the result backend. With `task_acks_late=True`, a task only remains in `unacked` if the worker dies before finishing, and it's eventually requeued, since the Redis transport has no native dead letter queue.
Check the result state:

```shell
redis-cli GET celery-task-meta-<task-id>
```

The state will be `FAILURE` with exception info.
In Redimo:

- Monitor `celery-task-meta-*` and filter results by the `status` field
- See the full exception traceback in the value
## Deep Dive: The Unacked Problem
Celery's visibility timeout mechanism:

1. Worker pops a task from the queue
2. The task goes into `unacked` (a Hash), with its delivery timestamp recorded in `unacked_index` (a Sorted Set)
3. Worker processes the task
4. Worker acknowledges; the task is removed from `unacked`

If the worker dies between steps 2 and 4, the task stays in `unacked`. A background process checks periodically and requeues unacked tasks whose visibility timeout has expired.
Problems arise when:

- `visibility_timeout` is too short → tasks requeued while still processing → duplicates
- `visibility_timeout` is too long → a crashed worker leaves tasks stuck
- Many workers dying → `unacked` grows huge
Check:

```shell
redis-cli HLEN unacked
redis-cli ZRANGE unacked_index 0 10 WITHSCORES
```

The scores in `unacked_index` are delivery timestamps. Old timestamps = stuck tasks.
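Given those timestamps, flagging stuck deliveries is a one-liner. A minimal sketch assuming kombu's layout, where delivery timestamps live in a sorted set keyed by delivery tag; the tags and timestamps below are hypothetical:

```python
# Hypothetical (delivery_tag, timestamp) pairs, as ZRANGE ... WITHSCORES returns.
visibility_timeout = 3600          # seconds, from broker_transport_options
now = 1_700_010_000.0              # use time.time() in practice
entries = [
    ("delivery-tag-1", 1_700_000_000.0),  # 10,000 s old: should have been requeued
    ("delivery-tag-2", 1_700_009_000.0),  # 1,000 s old: still in flight
]

# Anything older than the visibility timeout is stuck.
stuck = [tag for tag, ts in entries if now - ts > visibility_timeout]
print(stuck)
```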
## Task Result Storage
Celery result keys look like:

```
celery-task-meta-3f2b3c4d-5678-90ab-cdef-1234567890ab
```
The value is JSON:

```json
{
  "status": "SUCCESS",
  "result": {"order_id": 123, "total": 99.99},
  "traceback": null,
  "children": [],
  "date_done": "2025-01-15T10:30:45.123456",
  "task_id": "3f2b3c4d-..."
}
```
Or on failure:

```json
{
  "status": "FAILURE",
  "result": {
    "exc_type": "ValueError",
    "exc_message": ["Invalid input"],
    "exc_module": "builtins"
  },
  "traceback": "Traceback (most recent call last):\n..."
}
```
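When scanning many result keys, it helps to reduce each value to a one-line summary. `summarize` below is a hypothetical helper, not part of Celery; the payload mirrors the FAILURE shape above.

```python
import json

def summarize(raw: str) -> str:
    """Reduce a celery-task-meta-* value to a one-line summary."""
    meta = json.loads(raw)
    if meta["status"] == "FAILURE":
        exc = meta["result"]
        return f'{exc["exc_type"]}: {", ".join(exc["exc_message"])}'
    return meta["status"]

raw = json.dumps({
    "status": "FAILURE",
    "result": {"exc_type": "ValueError", "exc_message": ["Invalid input"], "exc_module": "builtins"},
    "traceback": "Traceback (most recent call last): ...",
})
print(summarize(raw))  # ValueError: Invalid input
```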
## Flower vs Redis Direct
Flower is excellent for:
- Real-time worker monitoring
- Task success/failure rates
- Easy task inspection
Flower is limited for:
- Bulk operations on Redis data
- Finding tasks by custom criteria
- Investigating queue-level issues
- Debugging serialization problems
When Flower doesn't give you answers, go to Redis.
## Practical Investigation Workflow
### 1. Overview

Monitor the pattern `*celery*` or your specific queue names.

Get counts for:

- Main queue (`celery`)
- Custom queues
- Result keys (`celery-task-meta-*`)
- Unacked tasks
### 2. Queue Health

For each queue:

- Length (`LLEN`)
- Sample tasks (`LRANGE <queue> 0 10`)
- Oldest task (`LINDEX <queue> -1`)
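Counting task types in a queue sample often reveals exactly what's backing it up. A sketch assuming the JSON serializer; the messages are fabricated stand-ins for `LRANGE celery 0 10` output, and the task names are hypothetical.

```python
import base64
import collections
import json

def fake_message(task_name: str) -> str:
    """Build a simplified stand-in for a queued Celery message."""
    body = base64.b64encode(json.dumps([[], {}, {}]).encode()).decode()
    return json.dumps({"body": body, "headers": {"task": task_name}})

samples = [
    fake_message("myapp.tasks.process_order"),
    fake_message("myapp.tasks.process_order"),
    fake_message("myapp.tasks.send_email"),
]

# The task name lives in headers.task, so no body decoding is needed to count.
counts = collections.Counter(json.loads(m)["headers"]["task"] for m in samples)
print(counts.most_common())
```

A queue dominated by one task type usually points at one slow or failing task rather than a broker-wide problem.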
### 3. Result Storage
- Total result keys
- Keys without TTL
- Failed task results
### 4. Unacked Analysis
- Total unacked count
- Age of oldest unacked task
- Pattern in unacked task types
## Configuration Tips
### Set Result Expiry

```python
result_expires = 3600  # 1 hour
```

`result_expires` is a global setting. For individual tasks whose results you never read, skip result storage per task instead:

```python
@app.task(ignore_result=True)
def my_task():
    pass
```
### Tune Visibility Timeout

```python
broker_transport_options = {
    'visibility_timeout': 43200,  # 12 hours for long tasks
}
```
### Consider Result Backend Alternatives

If you don't need task results:

```python
task_ignore_result = True
```

If you need results but not in Redis:

```python
result_backend = 'db+postgresql://user:pass@host/db'
```

Keep your broker Redis clean for actual task queuing.
### Use Priority Queues

```python
from kombu import Queue

task_queues = (
    Queue('high', routing_key='high'),
    Queue('default', routing_key='default'),
    Queue('low', routing_key='low'),
)
task_default_queue = 'default'
```
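With the queues defined, routing can be declared per task via `task_routes`. The task names below are hypothetical examples:

```python
# In celery config: map each task to the queue matching its urgency.
task_routes = {
    "myapp.tasks.charge_card": {"queue": "high"},
    "myapp.tasks.send_newsletter": {"queue": "low"},
}
```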
Monitor each queue separately. Know which is backing up.
## Quick Reference
| Celery Concept | Redis Key Pattern | Type |
|---|---|---|
| Default queue | `celery` | List |
| Custom queue | `<queue-name>` | List |
| Task result | `celery-task-meta-<id>` | String |
| Unacked tasks | `unacked` | Hash |
| Unacked timestamps | `unacked_index` | Sorted Set |
| Queue bindings | `_kombu.binding.*` | Set |
Celery abstracts Redis well. Until it doesn't. Knowing what's actually in your broker lets you debug issues that no amount of `--loglevel=DEBUG` will reveal.
Download Redimo and see your Celery tasks at the Redis level.