API Integration
Learn to connect Python scripts to REST APIs — authentication, pagination, error handling, and building reusable API wrappers with AI assistance.
🔄 Recall Bridge: In the previous lesson, you learned web scraping — extracting data from HTML pages. APIs are the structured, reliable alternative: instead of parsing HTML, you get clean JSON data directly from the service.
APIs are the backbone of modern automation. Instead of scraping a weather website, call the weather API. Instead of screen-scraping your project management tool, use its API. APIs give you structured data, stable interfaces, and explicit permission.
REST API Basics
Install the two libraries used throughout this lesson:

```bash
pip install requests python-dotenv
```
The four HTTP methods you’ll use:
| Method | Purpose | Example |
|---|---|---|
| GET | Retrieve data | Get weather forecast, list users |
| POST | Send/create data | Create a task, submit a form |
| PUT | Update data | Update user profile, modify settings |
| DELETE | Remove data | Delete a record, cancel a subscription |
Core requests patterns:
```python
import requests

api_key = "your-api-key"  # in practice, load this from an environment variable

# GET with query parameters
response = requests.get(
    "https://api.example.com/data",
    params={"city": "Tokyo", "units": "metric"},
    headers={"Authorization": f"Bearer {api_key}"},
)
data = response.json()

# POST with a JSON body
response = requests.post(
    "https://api.example.com/items",
    json={"name": "New Item", "quantity": 5},
    headers={"Authorization": f"Bearer {api_key}"},
)
```
Script 1: API Data Fetcher
AI prompt:
Write a Python script that fetches data from a REST API: (1) Read the API key from environment variables using python-dotenv, (2) Make GET requests with proper headers and parameters, (3) Handle common HTTP errors: 401 (unauthorized), 403 (forbidden), 404 (not found), 429 (rate limited), 500 (server error), (4) Parse the JSON response and save to CSV, (5) Add retry logic: retry failed requests up to 3 times with exponential backoff (1s, 2s, 4s). Include a .env.example file listing required environment variables.
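The retry core of such a script might look like this — a minimal sketch, not tied to any specific API; the status-code policy (which codes are worth retrying, which fail immediately) is an illustrative assumption:

```python
import time
import requests

TRANSIENT = {429, 500, 502, 503}  # worth retrying; auth and not-found errors are not

def fetch_with_retry(url, params=None, headers=None, max_retries=3):
    """GET a URL, retrying transient failures up to max_retries times
    with exponential backoff (1s, 2s, 4s)."""
    for attempt in range(max_retries + 1):
        try:
            response = requests.get(url, params=params, headers=headers, timeout=10)
        except (requests.ConnectionError, requests.Timeout):
            response = None  # network error: treat it like a transient failure
        if response is not None:
            if response.status_code in (401, 403):
                raise PermissionError(
                    f"Auth failed ({response.status_code}): check your API key")
            if response.status_code == 404:
                raise LookupError(f"Not found: {url}")
            if response.status_code not in TRANSIENT:
                response.raise_for_status()  # surface any other 4xx clearly
                return response.json()
        if attempt < max_retries:
            time.sleep(2 ** attempt)  # 1s, 2s, 4s
    raise RuntimeError(f"Giving up on {url} after {max_retries} retries")
```

Separating "retry this" codes from "fail fast" codes matters: retrying a bad API key three times just wastes twelve seconds before giving you the same error.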
Script 2: Paginated API Consumer
AI prompt:
Write a script that consumes a paginated REST API: (1) Start at the first page, (2) Follow pagination: the API returns a “next_page_token” field in each response — pass it as a query parameter to get the next page, (3) Collect all items across all pages into a single list, (4) Stop when there’s no “next_page_token” in the response, (5) Respect rate limits: maximum 60 requests per minute, (6) Save progress after each page (resume-safe if the script crashes), (7) Print progress: “Page 5 — 500 items collected so far”. Return the complete dataset as a pandas DataFrame.
Authentication Patterns
| Auth Type | How It Works | requests Code |
|---|---|---|
| API Key (header) | Key in a request header | `headers={"X-API-Key": key}` |
| Bearer Token | OAuth-style token | `headers={"Authorization": f"Bearer {token}"}` |
| API Key (query) | Key in URL parameters | `params={"api_key": key}` |
| Basic Auth | Username + password | `auth=("username", "password")` |
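Whichever scheme your API uses, attaching it once to a `requests.Session` saves repeating it on every call and reuses the underlying connection (the token below is a placeholder):

```python
import requests

# Attach auth once; every request made through this session carries the header.
session = requests.Session()
session.headers.update({"Authorization": "Bearer YOUR_TOKEN_HERE"})  # placeholder

# Now calls like session.get("https://api.example.com/data") are authenticated
# without passing headers= each time.
```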
Environment Variable Security
Create a `.env` file (add it to `.gitignore`):

```
WEATHER_API_KEY=your-key-here
GITHUB_TOKEN=ghp_xxxxxxxxxxxx
```
Create a `.env.example` (commit this — it shows the required variables without their values):

```
WEATHER_API_KEY=
GITHUB_TOKEN=
```
Load in your script:
```python
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.environ.get("WEATHER_API_KEY")
if not api_key:
    raise ValueError("WEATHER_API_KEY not set in .env file")
```
Script 3: Multi-API Data Pipeline
AI prompt:
Write a Python script that combines data from two APIs: (1) Fetch a list of cities from API_1, (2) For each city, fetch weather data from API_2, (3) Merge the results into a single dataset with columns from both APIs, (4) Handle: one API being down (use cached data if available), rate limits on both APIs (different limits), and missing data (some cities may not have weather data), (5) Save the combined result as CSV and Excel. This demonstrates the common pattern of orchestrating multiple API calls.
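The merge step (3) with missing-data handling (4) is typically a pandas left join. A minimal sketch — the input shapes and column names are illustrative assumptions, not the output of any real API:

```python
import pandas as pd

def merge_city_weather(cities, weather_by_city):
    """Merge a city list (API_1) with per-city weather dicts (API_2).
    Cities with no weather data are kept, with NaN in the weather columns."""
    cities_df = pd.DataFrame(cities)
    weather_df = pd.DataFrame(
        [{"city": name, **data} for name, data in weather_by_city.items()]
    )
    # how="left" keeps every city, even ones API_2 returned nothing for
    return cities_df.merge(weather_df, on="city", how="left")
```

The left join is the key design choice: an inner join would silently drop every city the weather API couldn't answer for, which is exactly the missing-data case the prompt asks you to handle.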
✅ Quick Check: An API returns this error:
`{"error": "rate_limit_exceeded", "retry_after": 30}`. What should your script do? (Answer: Wait the specified 30 seconds before retrying. Many APIs include a `retry_after` field or a `Retry-After` HTTP header telling you exactly how long to wait. Your error handling should check for this value and use it instead of a fixed backoff. AI prompt: “Add retry_after handling to my API error logic — check both the JSON response body and HTTP headers for retry timing.”)
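A small helper along those lines, checking the header first and then the JSON body (the field names match the example above; the 5-second fallback is an arbitrary assumption):

```python
def get_retry_delay(response, default=5.0):
    """Choose how long to wait after a 429: prefer the Retry-After header,
    fall back to a retry_after field in the JSON body, else a fixed default."""
    header = response.headers.get("Retry-After")
    if header and str(header).isdigit():
        return float(header)
    try:
        body = response.json()
        if isinstance(body, dict) and "retry_after" in body:
            return float(body["retry_after"])
    except ValueError:  # body was not valid JSON
        pass
    return default
```

Call `time.sleep(get_retry_delay(response))` in your 429 branch instead of a hardcoded backoff.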
Error Handling for API Scripts
| Error | Status Code | Your Script Should |
|---|---|---|
| Rate limited | 429 | Wait and retry (check Retry-After header) |
| Unauthorized | 401 | Check API key, raise clear error |
| Not found | 404 | Log and skip this resource |
| Server error | 500-503 | Retry with backoff (temporary issue) |
| Timeout | - | Retry with longer timeout |
| Network error | - | Retry, then fail with clear message |
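For the timeout row, one simple approach is to retry with progressively longer timeouts; a minimal sketch (the specific timeout values are arbitrary):

```python
import requests

def get_with_growing_timeout(url, timeouts=(5, 15, 30), **kwargs):
    """Retry a GET with progressively longer timeouts;
    re-raise the last Timeout if every attempt runs out of time."""
    last_error = None
    for timeout in timeouts:
        try:
            return requests.get(url, timeout=timeout, **kwargs)
        except requests.Timeout as exc:
            last_error = exc  # endpoint is slow: try again with more patience
    raise last_error
```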
Key Takeaways
- Never hardcode API keys in source code — use environment variables with python-dotenv: create a `.env` file (added to `.gitignore`) for your keys and a `.env.example` (committed) showing required variables; bots scan GitHub for leaked keys within minutes of accidental pushes
- Implement smart rate limiting: track request timestamps to maximize throughput within limits, and use exponential backoff (1s, 2s, 4s) with retry logic for 429 errors — every API has limits, and your script must respect them automatically
- APIs are more reliable than web scraping because they provide structured JSON data, versioned interfaces, and explicit permission — always check if a site has an API before building a scraper
Up Next
In the next lesson, you’ll automate email and notifications — sending scheduled reports, alerts, and status updates from your Python scripts.