Lesson 7 12 min

Scheduling & Error Handling

Learn to run Python automation scripts on a schedule and make them production-ready with error handling, logging, retry logic, and monitoring.

🔄 Recall Bridge: In the previous lesson, you automated email and notifications — sending reports and alerts from your scripts. Now let’s make your scripts run automatically on a schedule and survive errors gracefully.

A script that works when you run it manually is a tool. A script that runs on schedule, handles errors, and notifies you of problems is an automation system. This lesson bridges that gap.

Scheduling Options

Tool	Platform	Best For
cron	macOS/Linux	Simple, reliable, built-in
Task Scheduler	Windows	Windows native scheduling
schedule (Python library)	All platforms	Readable schedules in Python code
APScheduler	All platforms	Advanced scheduling with persistence
launchd	macOS	macOS-specific, more features than cron

Option 1: cron (macOS/Linux)

AI prompt:

Generate cron expressions for these schedules: (1) Every weekday at 8 AM, (2) Every Monday at 9 AM, (3) Every 6 hours, (4) First day of every month at midnight. Show me the full crontab entry for running a Python script at each schedule, including the full path to python3 and the script, and redirecting output to a log file.

Common cron patterns:

Schedule	Cron Expression
Every day at 8 AM	`0 8 * * *`
Weekdays at 8 AM	`0 8 * * 1-5`
Every Monday at 9 AM	`0 9 * * 1`
Every 6 hours	`0 */6 * * *`
1st of month at midnight	`0 0 1 * *`
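
To make the fields in these patterns concrete, here is a minimal Python sketch that expands a single cron field into the values it matches. The name expand_field is hypothetical, and the sketch covers only the basic syntax used in this table (`*`, `*/n`, `a-b`, and comma lists), not every extension that real cron implementations accept.

```python
def expand_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field (e.g. "*/6" or "1-5") into the sorted values it matches.

    low/high are the legal bounds for the field: 0-59 for minutes,
    0-23 for hours, 0-6 for weekdays (0 = Sunday).
    """
    values = set()
    for part in field.split(","):
        # An optional "/step" suffix means "every step-th value in the range".
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Hour field of "0 */6 * * *": every sixth hour
print(expand_field("*/6", 0, 23))   # [0, 6, 12, 18]
# Weekday field of "0 8 * * 1-5": Monday through Friday
print(expand_field("1-5", 0, 6))    # [1, 2, 3, 4, 5]
```

Reading a crontab line field by field this way is a quick sanity check before you save it: "Every 6 hours" fires at 00:00, 06:00, 12:00, and 18:00, not six hours after you install the entry.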

Crontab entry:

# Edit crontab
crontab -e

# Entry format: minute hour day month weekday command
0 8 * * 1-5 /usr/bin/python3 /home/user/scripts/daily_report.py >> /home/user/logs/daily_report.log 2>&1

Option 2: Python schedule Library

pip install schedule

AI prompt:

Write a Python scheduler script using the schedule library that: (1) Runs daily_report() every weekday at 8 AM, (2) Runs weekly_summary() every Friday at 5 PM, (3) Runs price_check() every 6 hours, (4) Logs when each task starts and finishes, (5) Catches and logs errors without crashing the scheduler, (6) Sends an alert if any task fails. Keep the scheduler running indefinitely.
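
A prompt like this might produce something along the following lines. This is a sketch, not a definitive implementation: guarded and start_scheduler are hypothetical names, the three task functions and the alert callback are assumed to be supplied by you, and the 30-second polling interval is an arbitrary choice.

```python
import logging
import time

logger = logging.getLogger("scheduler")


def guarded(task, alert=None):
    """Wrap task so an exception is logged (and optionally alerted) instead of
    propagating and killing the scheduler loop."""
    def wrapper():
        logger.info("Starting %s", task.__name__)
        try:
            task()
        except Exception:
            # logger.exception records the full traceback.
            logger.exception("Task %s failed", task.__name__)
            if alert is not None:
                alert(f"Task {task.__name__} failed; see the log for details")
            return False
        logger.info("Finished %s", task.__name__)
        return True
    return wrapper


def start_scheduler(daily_report, weekly_summary, price_check, alert=None):
    """Register the three jobs and loop forever. Requires: pip install schedule."""
    import schedule  # third-party; imported here so guarded() works without it

    # schedule has no "weekdays" shortcut, so register each weekday separately.
    for day in ("monday", "tuesday", "wednesday", "thursday", "friday"):
        getattr(schedule.every(), day).at("08:00").do(guarded(daily_report, alert))
    schedule.every().friday.at("17:00").do(guarded(weekly_summary, alert))
    schedule.every(6).hours.do(guarded(price_check, alert))

    while True:
        schedule.run_pending()
        time.sleep(30)  # polling interval; jobs fire within about 30 s of their time
```

Because every job is registered through guarded, a failure in one task is logged and reported but never stops the loop. Note that the schedule library runs inside a normal Python process, so the scheduler itself must be kept alive (and restarted after a reboot) by something else.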

Production-Ready Error Handling

The error handling hierarchy:

Level	Handles	Example
Try/except per operation	Individual failures	One file fails, others continue
Retry with backoff	Temporary failures	Network timeout → retry in 30s
Failure notification	Exhausted retries	Email alert: “Script failed after 3 retries”
Heartbeat monitoring	Silent failures	“Script didn’t report success by 8:15 AM”

AI prompt for robust error handling:

Add production-ready error handling to my automation script: (1) Wrap each major operation in try/except with specific exception types (not bare except), (2) Add retry logic for network operations: 3 retries with exponential backoff (30s, 60s, 120s), (3) Log all errors with full traceback to a rotating log file (max 10MB, keep 5 rotations), (4) If the script fails completely, send an alert email with the error details, (5) If the script succeeds, write a success marker file (heartbeat) with timestamp.
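
The retry step of that prompt could be sketched as follows. The helper name retry_with_backoff and its parameters are assumptions made for illustration; the injectable sleep argument exists only so the function can be tested without actually waiting.

```python
import logging
import time

logger = logging.getLogger(__name__)


def retry_with_backoff(func, retries=3, base_delay=30.0,
                       exceptions=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry up to `retries` times.

    Delays double each time: 30 s, 60 s, 120 s with the defaults.
    The last failure is re-raised so the caller can send an alert.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except exceptions as exc:
            if attempt == retries:
                logger.error("Giving up after %d retries: %s", retries, exc)
                raise
            delay = base_delay * 2 ** attempt
            logger.warning("Attempt %d failed (%s); retrying in %.0f s",
                           attempt + 1, exc, delay)
            sleep(delay)
```

Only the exception types you list are retried; anything else (a bug in your own code, for example) fails immediately, which is usually what you want. If you fetch pages with the requests library, pass exceptions=(requests.RequestException,), since its connection errors do not inherit from the built-in ConnectionError.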

Logging Setup

AI prompt for logging configuration:

Set up Python logging for my automation script: (1) Log to both console and file, (2) Console shows INFO and above, file shows DEBUG and above, (3) Log format includes timestamp, level, function name, and message, (4) Rotate log files daily, keep 30 days of logs, (5) Create a reusable setup_logging() function I can import into all my scripts.
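
One possible shape for the reusable setup_logging() function, using only the standard library. The exact format string and the choice of stderr for console output are assumptions; adjust them to taste.

```python
import logging
import sys
from logging.handlers import TimedRotatingFileHandler

LOG_FORMAT = "%(asctime)s %(levelname)-8s %(funcName)s: %(message)s"


def setup_logging(name, log_path):
    """Return a logger that writes INFO+ to the console and DEBUG+ to a file
    that rotates at midnight, keeping 30 old files."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    if logger.handlers:
        return logger  # already configured; avoid duplicate output

    formatter = logging.Formatter(LOG_FORMAT)

    console = logging.StreamHandler(sys.stderr)
    console.setLevel(logging.INFO)
    console.setFormatter(formatter)

    file_handler = TimedRotatingFileHandler(log_path, when="midnight",
                                            backupCount=30, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger
```

For size-based rotation instead (for example 10 MB with 5 backups), logging.handlers.RotatingFileHandler(log_path, maxBytes=10 * 1024 * 1024, backupCount=5) can be swapped in for the timed handler.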

Log level guidelines:

Level	Use For	Example
DEBUG	Detailed troubleshooting info	“Processing row 142 of 5000”
INFO	Normal operation milestones	“Report generated: 500 rows, saved to output.xlsx”
WARNING	Something unexpected but handled	“3 rows had missing emails, filled with default”
ERROR	Operation failed but script continues	“Failed to fetch page 15, skipping”
CRITICAL	Script cannot continue	“Database connection failed after 3 retries”
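
As a small illustration of these guidelines, the sketch below uses three of the levels inside one function. The function fill_missing_emails and the default address are hypothetical, chosen to mirror the WARNING example in the table.

```python
import logging

logger = logging.getLogger(__name__)


def fill_missing_emails(rows, default="unknown@example.com"):
    """Return rows with empty emails replaced by a default address.

    The input dictionaries are not modified.
    """
    cleaned, missing = [], 0
    for index, row in enumerate(rows, start=1):
        # Per-row detail is only useful when troubleshooting, so DEBUG.
        logger.debug("Processing row %d of %d", index, len(rows))
        if not row.get("email"):
            missing += 1
            row = {**row, "email": default}
        cleaned.append(row)
    if missing:
        # Unexpected but handled, so WARNING rather than ERROR.
        logger.warning("%d rows had missing emails, filled with default", missing)
    # A normal milestone, so INFO.
    logger.info("Cleaned %d rows", len(cleaned))
    return cleaned
```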

Monitoring Your Automations

Simple heartbeat monitoring script:

AI prompt:

Write a monitoring script that checks if my automation scripts ran successfully: (1) Each script writes a “heartbeat” file after success: {script_name}_heartbeat.json with {“last_success”: timestamp, “records_processed”: count}, (2) The monitor checks all heartbeat files and alerts if any script’s last success is older than its expected schedule (daily scripts → alert if > 25 hours, hourly scripts → alert if > 90 minutes), (3) Generate a daily status summary: which scripts ran, when, how many records processed, any failures. Run this monitor every 30 minutes.
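
The core of such a monitor can be sketched with the standard library alone. The function names write_heartbeat and find_stale are hypothetical, timestamps are assumed to be stored as ISO 8601 strings in UTC, and the alerting and daily summary steps are left out.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def write_heartbeat(directory, script_name, records_processed, now=None):
    """Record a successful run as {script_name}_heartbeat.json."""
    now = now or datetime.now(timezone.utc)
    payload = {"last_success": now.isoformat(), "records_processed": records_processed}
    path = Path(directory) / f"{script_name}_heartbeat.json"
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def find_stale(directory, max_age, now=None):
    """Return names of expected scripts whose heartbeat is missing or too old.

    max_age maps script name -> timedelta, e.g. 25 hours for a daily job.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for script_name, limit in max_age.items():
        path = Path(directory) / f"{script_name}_heartbeat.json"
        if not path.exists():
            stale.append(script_name)  # never ran, or the file was removed
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        last_success = datetime.fromisoformat(data["last_success"])
        if now - last_success > limit:
            stale.append(script_name)
    return stale
```

Each automation script calls write_heartbeat as its last step; the monitor calls find_stale with something like {"daily_report": timedelta(hours=25)} and alerts on any names returned. Driving the check from a list of expected scripts, rather than from whatever files exist, is what lets it catch a script that never ran at all.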

✅ Quick Check: Your script uses a bare except: clause that catches ALL exceptions, including KeyboardInterrupt and SystemExit. Why is this a problem? (Answer: A bare except: catches everything, including the exceptions that are supposed to stop the script: KeyboardInterrupt (Ctrl+C) and SystemExit (sys.exit()). That makes the script hard to stop gracefully. Always catch specific exceptions, such as except (requests.RequestException, ValueError) as e:, or at minimum use except Exception as e:, which excludes KeyboardInterrupt and SystemExit. Note that MemoryError is a subclass of Exception, so except Exception still catches it.)
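
The difference comes from Python's exception hierarchy, which can be checked directly. The helper run_guarded below is a hypothetical name used only for this illustration.

```python
import sys

# KeyboardInterrupt and SystemExit derive from BaseException, not Exception,
# so "except Exception" lets them through while a bare "except:" swallows them.
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(SystemExit, Exception))         # False
print(issubclass(ValueError, Exception))         # True


def run_guarded(task):
    """Run task, reporting ordinary errors but letting Ctrl+C and sys.exit() through."""
    try:
        task()
        return "ok"
    except Exception as exc:
        return f"failed: {exc!r}"
```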

Key Takeaways

Production automation needs three layers of protection: retry logic for temporary failures (most network timeouts resolve on their own), failure notifications for exhausted retries (you learn about problems right away), and heartbeat monitoring for silent failures (the script never ran at all). Without all three, you’ll discover failures hours or days late.
Use Python’s logging module instead of print() for automation scripts. Logging provides the timestamps, severity levels, file output, and rotation you need to debug failures in scripts running unattended at 3 AM; a basic setup takes only a few lines and saves hours of investigation.
Centralize scheduling and error handling as your automation grows. One scheduler script with consistent logging, retries, and alerts across all tasks is more maintainable than five independent scripts, each with its own ad-hoc error handling.

Up Next

In the final lesson, you’ll build your personalized automation toolkit — identifying your highest-value automation opportunities, creating your script portfolio, and establishing a maintenance routine.

Knowledge Check

1. Your script runs every morning at 8 AM via cron. One day it fails because the target website is temporarily down (503 error). The script logs the error and exits. You don't notice until 3 PM because you weren't checking the logs. How do you fix this?

A. Check the logs every morning to make sure the script ran successfully

B. Add monitoring at three levels: (1) Retry logic in the script: if the website returns a 503, retry 3 times with increasing delays (30s, 60s, 120s) before giving up. Most temporary outages resolve within minutes. (2) Failure notification: if all retries fail, send yourself an email/Slack alert: 'Daily report script failed at 8:03 AM. Error: 503 Service Unavailable after 3 retries.' (3) Heartbeat monitoring: use a service (or simple script) that expects a 'success' signal from your script. If no signal by 8:15 AM, alert you. This catches silent failures (the script didn't even start because cron was misconfigured, the machine was turned off, etc.)

C. Schedule the script to run twice: at 8 AM and again at 10 AM as a backup

2. Your automation script uses `print()` statements throughout to show progress and errors. A colleague says you should use Python's `logging` module instead. Is this a meaningful difference or just a style preference?

A. Just a style preference: print() and logging are basically the same thing

B. It's a meaningful difference for automation scripts. logging provides features that print() cannot: (1) Log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL): filter by severity without changing code. During development, see everything. In production, see only warnings and above. (2) Automatic timestamps: every log entry includes when it happened. (3) File output: log to a file for post-mortem analysis, not just the console (which disappears when the script ends). (4) Rotation: automatically create new log files daily or when file size exceeds a limit, keeping disk space manageable. (5) Context: include the function name, line number, and exception traceback automatically. For scripts that run unattended (cron, scheduled tasks), logging is essential because print() output goes nowhere useful

C. Only matters if you're writing a library; for scripts, print() is fine

3. You have 5 automation scripts that each run on different schedules. You're managing them with 5 separate cron entries. This is becoming hard to track. What's a better approach?

A. Keep using 5 cron entries, since cron is the standard tool for scheduling

B. Create a central scheduler script that manages all 5 tasks. Options: (1) Python's schedule library: define all schedules in one Python file: `schedule.every().day.at('08:00').do(run_daily_report)`. Easier to read than crontab syntax. (2) A configuration file: YAML or JSON listing each script, its schedule, retry policy, and notification settings. One place to see all your automations. (3) Keep using cron for the trigger, but add a wrapper script that handles logging, error notification, and success tracking for ALL scripts consistently. The key insight: once you add retries, notifications, and logging to each of 5 independent scripts, you are maintaining 15 separate pieces of error-handling code. A central system handles this once

C. Combine all 5 scripts into one big script that runs everything in sequence
