Scheduling & Error Handling
Learn to run Python automation scripts on a schedule and make them production-ready with error handling, logging, retry logic, and monitoring.
🔄 Recall Bridge: In the previous lesson, you automated email and notifications — sending reports and alerts from your scripts. Now let’s make your scripts run automatically on a schedule and survive errors gracefully.
A script that works when you run it manually is a tool. A script that runs on schedule, handles errors, and notifies you of problems is an automation system. This lesson bridges that gap.
Scheduling Options
| Tool | Platform | Best For |
|---|---|---|
| cron | macOS/Linux | Simple, reliable, built-in |
| Task Scheduler | Windows | Windows native scheduling |
| schedule (Python library) | All platforms | Readable schedules in Python code |
| APScheduler | All platforms | Advanced scheduling with persistence |
| launchd | macOS | macOS-specific, more features than cron |
Option 1: cron (macOS/Linux)
AI prompt:
Generate cron expressions for these schedules: (1) Every weekday at 8 AM, (2) Every Monday at 9 AM, (3) Every 6 hours, (4) First day of every month at midnight. Show me the full crontab entry for running a Python script at each schedule, including the full path to python3 and the script, and redirecting output to a log file.
Common cron patterns:
| Schedule | Cron Expression |
|---|---|
| Every day at 8 AM | 0 8 * * * |
| Weekdays at 8 AM | 0 8 * * 1-5 |
| Every Monday at 9 AM | 0 9 * * 1 |
| Every 6 hours | 0 */6 * * * |
| 1st of month at midnight | 0 0 1 * * |
Crontab entry:
# Edit crontab
crontab -e
# Entry format: minute hour day month weekday command
0 8 * * 1-5 /usr/bin/python3 /home/user/scripts/daily_report.py >> /home/user/logs/daily_report.log 2>&1
Option 2: Python schedule Library
pip install schedule
AI prompt:
Write a Python scheduler script using the schedule library that: (1) Runs daily_report() every weekday at 8 AM, (2) Runs weekly_summary() every Friday at 5 PM, (3) Runs price_check() every 6 hours, (4) Logs when each task starts and finishes, (5) Catches and logs errors without crashing the scheduler, (6) Sends an alert if any task fails. Keep the scheduler running indefinitely.
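A response to this prompt might look roughly like the sketch below. It assumes the third-party schedule library is installed; `send_alert` and the three empty task bodies are hypothetical placeholders, not part of any library. The point to notice is the `run_safely` wrapper: it logs start and finish, and it turns an exception into a log entry plus an alert, so one failing task does not stop the scheduler.

```python
import functools
import logging
import time

logger = logging.getLogger("scheduler")


def send_alert(message):
    """Hypothetical alert hook: swap in your own email or chat notification."""
    logger.critical("ALERT: %s", message)


def run_safely(task):
    """Wrap a task so failures are logged and alerted instead of raised."""
    @functools.wraps(task)
    def wrapper():
        logger.info("Starting %s", task.__name__)
        try:
            task()
        except Exception:
            # logger.exception records the full traceback at ERROR level
            logger.exception("Task %s failed", task.__name__)
            send_alert(f"{task.__name__} failed")
            return False
        logger.info("Finished %s", task.__name__)
        return True
    return wrapper


# Placeholder tasks: the real work lives in your own scripts.
def daily_report():
    pass


def weekly_summary():
    pass


def price_check():
    pass


def main():
    """Register the jobs and run forever. Call main() to start the scheduler."""
    import schedule  # third-party: pip install schedule

    logging.basicConfig(level=logging.INFO)
    # The library has no "weekday" shortcut, so register Monday to Friday.
    for day in ("monday", "tuesday", "wednesday", "thursday", "friday"):
        getattr(schedule.every(), day).at("08:00").do(run_safely(daily_report))
    schedule.every().friday.at("17:00").do(run_safely(weekly_summary))
    schedule.every(6).hours.do(run_safely(price_check))

    while True:
        schedule.run_pending()
        time.sleep(30)
```

Unlike cron, this scheduler only works while the Python process is running, so you still need something (a service manager, or cron's @reboot) to start it.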
Production-Ready Error Handling
The error handling hierarchy:
| Level | Handles | Example |
|---|---|---|
| Try/except per operation | Individual failures | One file fails, others continue |
| Retry with backoff | Temporary failures | Network timeout → retry in 30s |
| Failure notification | Exhausted retries | Email alert: “Script failed after 3 retries” |
| Heartbeat monitoring | Silent failures | “Script didn’t report success by 8:15 AM” |
AI prompt for robust error handling:
Add production-ready error handling to my automation script: (1) Wrap each major operation in try/except with specific exception types (not bare except), (2) Add retry logic for network operations: 3 retries with exponential backoff (30s, 60s, 120s), (3) Log all errors with full traceback to a rotating log file (max 10MB, keep 5 rotations), (4) If the script fails completely, send an alert email with the error details, (5) If the script succeeds, write a success marker file (heartbeat) with timestamp.
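To make item (2) concrete, here is a minimal sketch of retry with exponential backoff using only the standard library. The function name `retry`, its parameters, and the choice of `ConnectionError` and `TimeoutError` as retryable exceptions are assumptions for illustration; use whichever exception types your network library actually raises.

```python
import logging
import time

logger = logging.getLogger(__name__)


def retry(operation, retries=3, base_delay=30,
          exceptions=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(); on a listed exception, wait and try again.

    The delay doubles after each failure: 30s, 60s, 120s with the defaults.
    After the last retry fails, the exception is re-raised to the caller.
    """
    for attempt in range(retries + 1):  # one initial try plus `retries` retries
        try:
            return operation()
        except exceptions as exc:
            if attempt == retries:
                logger.error("Giving up after %d retries", retries, exc_info=True)
                raise
            delay = base_delay * 2 ** attempt
            logger.warning("Attempt %d failed (%s); retrying in %ds",
                           attempt + 1, exc, delay)
            sleep(delay)
```

The `sleep` parameter exists so you can pass a stand-in during tests instead of waiting several minutes. Because the final failure is re-raised, a top-level try/except in your script can catch it and send the alert email from item (4).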
Logging Setup
AI prompt for logging configuration:
Set up Python logging for my automation script: (1) Log to both console and file, (2) Console shows INFO and above, file shows DEBUG and above, (3) Log format includes timestamp, level, function name, and message, (4) Rotate log files daily, keep 30 days of logs, (5) Create a reusable setup_logging() function I can import into all my scripts.
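One possible shape for the `setup_logging()` function this prompt asks for is sketched below, using only the standard library. The log format string, the `logs` directory default, and the one-file-per-logger naming are assumptions you can change freely.

```python
import logging
import sys
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Timestamp, level, function name, message
LOG_FORMAT = "%(asctime)s | %(levelname)-8s | %(funcName)s | %(message)s"


def setup_logging(name, log_dir="logs"):
    """Return a logger that sends INFO+ to the console and DEBUG+ to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let handlers decide what to keep
    if logger.handlers:
        return logger  # already configured; avoid duplicate log lines

    formatter = logging.Formatter(LOG_FORMAT)

    console = logging.StreamHandler(sys.stderr)
    console.setLevel(logging.INFO)
    console.setFormatter(formatter)

    Path(log_dir).mkdir(parents=True, exist_ok=True)
    # Start a new file at midnight and keep the 30 most recent old files.
    file_handler = TimedRotatingFileHandler(
        Path(log_dir) / f"{name}.log",
        when="midnight",
        backupCount=30,
        encoding="utf-8",
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger
```

In a script you would call it once near the top, for example `log = setup_logging("daily_report")`, and then use `log.info(...)`, `log.warning(...)`, and so on.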
Log level guidelines:
| Level | Use For | Example |
|---|---|---|
| DEBUG | Detailed troubleshooting info | “Processing row 142 of 5000” |
| INFO | Normal operation milestones | “Report generated: 500 rows, saved to output.xlsx” |
| WARNING | Something unexpected but handled | “3 rows had missing emails, filled with default” |
| ERROR | Operation failed but script continues | “Failed to fetch page 15, skipping” |
| CRITICAL | Script cannot continue | “Database connection failed after 3 retries” |
Monitoring Your Automations
Simple heartbeat monitoring script:
AI prompt:
Write a monitoring script that checks if my automation scripts ran successfully: (1) Each script writes a “heartbeat” file after success: {script_name}_heartbeat.json with {“last_success”: timestamp, “records_processed”: count}, (2) The monitor checks all heartbeat files and alerts if any script’s last success is older than its expected schedule (daily scripts → alert if > 25 hours, hourly scripts → alert if > 90 minutes), (3) Generate a daily status summary: which scripts ran, when, how many records processed, any failures. Run this monitor every 30 minutes.
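The core of the heartbeat idea fits in two small functions, sketched here with the standard library only. The function names, the `directory` parameter, and the use of UTC ISO timestamps are assumptions; the file name and JSON keys follow the prompt above. Sending the alert and building the daily summary are left out.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def write_heartbeat(script_name, records_processed, directory="."):
    """Record a successful run; call this as the last step of a script."""
    path = Path(directory) / f"{script_name}_heartbeat.json"
    payload = {
        "last_success": datetime.now(timezone.utc).isoformat(),
        "records_processed": records_processed,
    }
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def find_stale(max_age_by_script, directory=".", now=None):
    """Return the scripts whose heartbeat is missing, unreadable, or too old."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for script_name, max_age in max_age_by_script.items():
        path = Path(directory) / f"{script_name}_heartbeat.json"
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            last_success = datetime.fromisoformat(data["last_success"])
        except (OSError, ValueError, KeyError, TypeError):
            # No file or a corrupt file counts as a failure too
            stale.append(script_name)
            continue
        if now - last_success > max_age:
            stale.append(script_name)
    return stale
```

A monitor could call `find_stale({"daily_report": timedelta(hours=25), "price_check": timedelta(minutes=90)})` every 30 minutes and send an alert whenever the returned list is not empty.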
✅ Quick Check: Your script uses a bare except: clause that catches ALL exceptions, including KeyboardInterrupt and SystemExit. Why is this a problem? (Answer: A bare except: catches everything, including the exceptions that are meant to stop the script: KeyboardInterrupt (Ctrl+C) and SystemExit (sys.exit()). Both derive from BaseException rather than Exception. Swallowing them makes the script hard to stop gracefully. Always catch specific exceptions, for example except (requests.RequestException, ValueError) as e:, or at minimum use except Exception as e:, which excludes KeyboardInterrupt and SystemExit.)
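The difference is easy to see in a small sketch. The `run_step` helper below is hypothetical; it handles the failures it expects, falls back to `except Exception` for surprises, and deliberately has no bare `except:`.

```python
def run_step(step):
    """Run one step, handling ordinary errors but not interrupts or exits."""
    try:
        step()
    except (ValueError, OSError) as exc:
        # Expected, recoverable failures: note them and move on
        return f"skipped: {exc}"
    except Exception as exc:
        # Unexpected bugs; KeyboardInterrupt and SystemExit are NOT caught here
        return f"unexpected {type(exc).__name__}: {exc}"
    return "ok"
```

If the user presses Ctrl+C while `step()` is running, the KeyboardInterrupt passes straight through `run_step` and stops the script, which is the behavior you want.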
Key Takeaways
- Production automation needs three layers of protection: retry logic for temporary failures (network timeouts resolve themselves), failure notifications for exhausted retries (you learn about problems immediately), and heartbeat monitoring for silent failures (script didn’t even run) — without all three, you’ll discover failures hours or days late
- Use Python’s logging module instead of print() for automation scripts: logging provides the timestamps, severity levels, file output, and rotation you need to debug failures in scripts running unattended at 3 AM. A basic setup takes only a few lines and saves hours of investigation
- Centralize scheduling and error handling as your automation grows — one scheduler script with consistent logging, retries, and alerts across all tasks is more maintainable than five independent scripts each with their own ad-hoc error handling
Up Next
In the final lesson, you’ll build your personalized automation toolkit — identifying your highest-value automation opportunities, creating your script portfolio, and establishing a maintenance routine.