Automating Excel reports with Python takes four elements: pandas to process the data, openpyxl to write the formatted file, smtplib to send it by email, and cron or schedule to run the script on a fixed schedule. What used to take your analyst two hours every Monday runs in 30 seconds with zero intervention. No retainers, no license lock-in, code in your name.
The Minimum Architecture
Four pieces, one per layer:
- Extraction: pandas reads sources (CSV, SQL, API, base Excel)
- Transformation: pandas groups, filters, calculates KPIs
- Writing: openpyxl or xlsxwriter generates the formatted Excel
- Distribution: smtplib or any email API sends to stakeholders
Recommended Stack
| Library | Purpose | When to Use It |
|---|---|---|
| pandas | Data manipulation | Always |
| openpyxl | Read and write xlsx with formatting | Executive reports with formatted cells |
| xlsxwriter | Write-only with dashboards and charts | Reports with embedded visualizations |
| matplotlib | Generate charts for embedding | Dynamic charts in Excel |
| schedule | Simple recurring tasks | Script running as a Python service |
| smtplib | Send email with attachment | Distribution to internal team |
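On machines without cron (a Windows box, for instance), the schedule library from the table can drive the same job as a long-running Python process. A minimal sketch, assuming your report logic is wrapped in a `run_report()` function (`schedule` is a third-party package: `pip install schedule`):

```python
import time

def run_report():
    # Placeholder for the extract/transform/write/send pipeline
    print("report generated")

def main():
    import schedule  # third-party: pip install schedule

    # Run every Monday at 07:00 local time
    schedule.every().monday.at("07:00").do(run_report)

    while True:          # keep the process alive as a simple service
        schedule.run_pending()
        time.sleep(60)   # poll once a minute
```

The trade-off versus cron: the process must stay up (a crash means no report), so run it under a supervisor like systemd if you go this route.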
Functional Script Skeleton
A weekly sales report that pulls from PostgreSQL, groups by channel, and sends by email every Monday at 7 AM:
```python
import os
import smtplib
from datetime import datetime, timedelta
from email.message import EmailMessage

import pandas as pd
from sqlalchemy import create_engine, text
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill

# 1. Extract (credentials come from the environment, never from the script)
engine = create_engine(os.environ["DATABASE_URL"])
today = datetime.now().date()
last_7d = today - timedelta(days=7)

# Parameterized query: never interpolate values into SQL with f-strings
query = text("""
    SELECT channel, SUM(amount) AS sales, COUNT(*) AS transactions
    FROM sales
    WHERE payment_date BETWEEN :start AND :end
    GROUP BY channel
    ORDER BY sales DESC;
""")
df = pd.read_sql(query, engine, params={"start": last_7d, "end": today})

# 2. Transform
df["avg_ticket"] = df["sales"] / df["transactions"]
df["pct_total"] = (df["sales"] / df["sales"].sum() * 100).round(1)

# 3. Write with formatting
report_file = f"weekly_report_{today}.xlsx"
df.to_excel(report_file, index=False, sheet_name="Sales")

wb = load_workbook(report_file)
ws = wb["Sales"]
header_fill = PatternFill(start_color="111827", fill_type="solid")
header_font = Font(color="FFFFFF", bold=True)
for cell in ws[1]:
    cell.fill = header_fill
    cell.font = header_font
wb.save(report_file)

# 4. Send by email
msg = EmailMessage()
msg["Subject"] = f"Weekly sales report - {today}"
msg["From"] = "reports@yourcompany.com"
msg["To"] = "ceo@yourcompany.com"
msg.set_content("Report attached.")

with open(report_file, "rb") as f:
    # The official MIME subtype for .xlsx attachments
    msg.add_attachment(
        f.read(),
        maintype="application",
        subtype="vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        filename=report_file,
    )

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as s:
    s.login("reports@yourcompany.com", os.environ["SMTP_APP_PASSWORD"])
    s.send_message(msg)
```
That script runs from cron like this (every Monday at 7 AM):

```
0 7 * * 1 /usr/bin/python3 /opt/scripts/weekly_report.py
```
The Real Case: KPIs in Code, Not in Cells
A multi-country distributor (case_028) needed advanced reporting for 100 franchises with 5 sections (Financials, Sales, Services, Complaints, System Usage). The typical mistake would have been letting the AI calculate KPIs from raw data — the right approach is the opposite.
What was built:
- 28 KPIs calculated in deterministic JavaScript code, not in AI responses
- AI only generates explanatory narrative on top of already-calculated numbers
- Browser-side compute (zero server CPU)
- Two-level pattern: headline KPI plus an AI-generated explanatory paragraph
- Immutable audit trail with verifiable SHA-256 hash chain
The principle applies identically to your weekly Python report: metrics are calculated in auditable code. If you ever add AI to generate narrative, feed it the already-calculated KPIs — don't ask it to calculate them. KPIs in code, not hallucinations.
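The hash-chain idea behind that audit trail fits in a few lines of Python (the case used JavaScript; the field names here are illustrative). Each record stores the hash of the previous one, so editing any historical entry breaks every hash after it:

```python
import hashlib
import json

def append_record(chain: list, record: dict) -> None:
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True) + prev_hash
    chain.append({**record, "prev_hash": prev_hash,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(chain: list) -> bool:
    """Recompute every hash; any edit to a past record fails verification."""
    prev_hash = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k not in ("hash", "prev_hash")}
        payload = json.dumps(body, sort_keys=True) + prev_hash
        if rec["prev_hash"] != prev_hash:
            return False
        if rec["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = rec["hash"]
    return True

chain = []
append_record(chain, {"kpi": "sales", "value": 125000})
append_record(chain, {"kpi": "avg_ticket", "value": 385.2})
print(verify(chain))        # True on an untouched chain
chain[0]["value"] = 999999  # tamper with history...
print(verify(chain))        # ...and verification now fails
```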
When Python Scales vs. When You Need More
Python with cron solves it if:
- Your report runs less than once a day
- All sources live in one place (one database, one base Excel)
- Recipients are fewer than 20 internal people
- You don't need navigable interactive dashboards
You need more infrastructure if:
- Your pipeline has 5+ steps with dependencies (use Airflow or Prefect)
- You distribute to 100+ external users (use your own web portal)
- You need interactive dashboards (combine with Looker Studio or Metabase)
- The explanatory narrative requires AI (combine with Anthropic Claude API)
7 Mistakes That Break Automated Reports
- Hardcoding credentials in the script (use environment variables or a secrets manager)
- Not handling database connection errors (wrap in try/except and notify on failure)
- Forgetting timezone handling (use pytz or explicit zoneinfo)
- Generating the file without a timestamp in the name (you'll overwrite the previous version)
- Relying on Gmail SMTP without an app password (regular passwords fail with 2FA enabled)
- Not logging each run (when it fails, you won't know when)
- Not validating row counts before sending (if the query returns 0 rows, don't send an empty report)
What Does Not Automating Cost You?
A financial analyst at a $30/hr loaded rate who spends 2 hours on a weekly sales report costs $60 per report, $3,120 per year, $9,360 over 3 years. And that's only the direct cost: the opportunity cost (the hours the analyst doesn't spend on deep analysis) is higher.
A Python script that runs on its own costs:
- 8 to 20 hours of initial development (one-time $240 to $600)
- $10 to $20/month in hosting if you deploy to the cloud
- 30 minutes per year in maintenance
It pays for itself in under 3 months. Your team gets back to real work.
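The break-even math is easy to check, taking the figures above at their high end:

```python
analyst_cost_per_week = 2 * 30  # 2 hours at $30/hr loaded rate
dev_cost = 600                  # high end of the one-time build
hosting_per_month = 20          # high end of cloud hosting

# Weekly savings converted to monthly, net of hosting
monthly_savings = analyst_cost_per_week * 52 / 12 - hosting_per_month
payback_months = dev_cost / monthly_savings
print(round(payback_months, 1))  # → 2.5
```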
Is Your Reporting Bleeding You Out?
If your team spends more than 10 hours a week generating manual Excel reports, you don't have a productivity problem — you have an architecture problem. What used to take 30 engineers and 18 months, we deliver in weeks. Automated reports are the easy part of the package.
Next Steps
If you have an internal technical team, start today with the Python skeleton above. In 4 hours you'll have your first report running. If your company has more than 15 recurring manual reports and your data lives across 5+ separate systems, you don't need more scripts — you need a unified Data Lake.
MAGIA / Core builds a Data Lake with automated reporting in 12 weeks — role-based dashboards, KPIs in code, and code in your name for a one-time $15,000. For individual professionals who need simple reporting integrated into their business, MAGIA / Solo delivers web, CRM, and analytics in 15 days for $4,500. 30-minute call, no pitch deck, real conversation about your operation.