1) Is the agent to be created using Python?
2) Does the Python program run under a Python interpreter on my computer?
3) I need a complete Python program with all the "bells and whistles."
4) Can I get ChatGPT to write the entire Python program?
1) Does the agent have to be created using Python?
No.
But Python is a great choice because it’s easy to write, easy to run on your own computer, and good for “automation + AI” (monitor folders, parse files, schedule jobs, call APIs).
Other options: JavaScript/Node, Power Automate, Zapier/Make, Apple Shortcuts, etc. But Python is the simplest “serious” option.
2) Does the Python program run under a Python interpreter on my computer?
Yes.
Typically, you install Python on your PC. You run the agent with python agent.py. If you want it to run “autonomously,” you keep it running, or you schedule it (Windows Task Scheduler/cron).
3) Do you need a complete program with “bells and whistles”?
Not at first.
A practical path: start small and reliable (watch a folder, detect new lab PDFs, summarize, save results). Add "bells and whistles" only after it works (trend tracking, alerts, thresholds, dashboards, email).
That said, I can give you a complete template with the common “real-world” stuff:
- configuration file
- logging
- a local database (so it remembers what it already processed)
- folder monitoring
- scheduled polling loop
- OpenAI call
- safe “no automatic portal login” design
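To make the "start small" idea concrete, here is a minimal sketch of the folder-monitoring core on its own, using only the standard library. The function and folder names are placeholders, not part of the full template below:

```python
import time
from pathlib import Path

def scan_new_pdfs(folder: Path, seen: set[str]) -> list[Path]:
    """Return PDFs in `folder` not handled before; remember them in `seen`."""
    new = [p for p in sorted(folder.glob("*.pdf")) if p.name not in seen]
    seen.update(p.name for p in new)
    return new

def watch(folder: Path, poll_seconds: int = 60) -> None:
    """Poll forever; swap the print for real processing."""
    seen: set[str] = set()
    while True:
        for pdf in scan_new_pdfs(folder, seen):
            print(f"New PDF: {pdf.name}")
        time.sleep(poll_seconds)
```

This in-memory `seen` set forgets everything on restart; the full template below replaces it with a SQLite table of file hashes so the agent stays idempotent across runs.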
4) Can ChatGPT write the entire Python program?
Yes—absolutely.
And you should expect an iterative process:
- You run it.
- You paste any error messages.
- I fix it.
That’s how you get from “template” to “rock solid.”
Also: OpenAI’s current recommended interface is the Responses API (not the older Chat Completions API), and the official SDK docs show how to install and use it. (OpenAI Developers)
A complete “Lab Monitor Agent” (folder-based autonomy)
What this agent does (autonomously)
- Watches a folder like C:\Labs\Incoming
- When a new PDF appears, it:
  - extracts text,
  - sends it to the model,
  - writes a structured summary to C:\Labs\Reports,
  - records the file hash in SQLite so it won’t re-process it.
This is autonomous monitoring without needing portal access.
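The "won't re-process" behavior rests on hashing file contents: identical bytes always produce the same SHA-256 digest, so even a renamed copy of an already-summarized report is skipped. A minimal sketch of that idempotency pattern (in-memory SQLite for illustration; the full program persists to disk):

```python
import hashlib
import sqlite3
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file's bytes in 1 MB chunks (works for large PDFs)."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def already_processed(conn: sqlite3.Connection, digest: str) -> bool:
    conn.execute("CREATE TABLE IF NOT EXISTS done (hash TEXT PRIMARY KEY)")
    row = conn.execute("SELECT 1 FROM done WHERE hash = ?", (digest,)).fetchone()
    return row is not None

def remember(conn: sqlite3.Connection, digest: str) -> None:
    conn.execute("INSERT OR IGNORE INTO done (hash) VALUES (?)", (digest,))
```

Keying on the hash rather than the filename is what makes re-downloads and renames harmless.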
Install prerequisites
1) Install Python packages
</> Bash
pip install openai pypdf python-dotenv
2) Set your OpenAI key in a .env file
Create a file named .env in the same folder as the script:
</> Code
OPENAI_API_KEY=your_key_here
OpenAI’s Python quickstart covers installing the SDK and using an API key. (OpenAI Developers)
The complete program: lab_agent.py
import os
import time
import json
import hashlib
import logging
import sqlite3
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Optional
from dotenv import load_dotenv
from pypdf import PdfReader
from openai import OpenAI
# ----------------------------
# Configuration
# ----------------------------
@dataclass
class Config:
    incoming_dir: Path = Path(r"C:\Labs\Incoming")
    reports_dir: Path = Path(r"C:\Labs\Reports")
    db_path: Path = Path("lab_agent.sqlite3")
    poll_seconds: int = 60
    model: str = "gpt-4o"  # choose any model available to your account
    max_chars_to_send: int = 120_000  # guardrail for very large PDFs

SYSTEM_INSTRUCTIONS = """You are a Lab Monitoring Agent for an older adult patient.
Your task: extract key lab values, flag notable abnormalities, and write a clear summary.
Rules:
- Be conservative: do not diagnose.
- If units/reference ranges are present, include them.
- Highlight changes if the report itself contains previous values.
- Create a structured output with: Summary, Key Values, Flags, Questions for Doctor.
- If the PDF text is messy, say so explicitly.
"""
# ----------------------------
# Utility functions
# ----------------------------
def setup_logging() -> None:
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s | %(levelname)s | %(message)s",
    )

def sha256_of_file(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def extract_text_from_pdf(pdf_path: Path) -> str:
    reader = PdfReader(str(pdf_path))
    parts = []
    for i, page in enumerate(reader.pages):
        try:
            parts.append(page.extract_text() or "")
        except Exception as e:
            parts.append(f"\n[Error extracting page {i}: {e}]\n")
    return "\n".join(parts).strip()

def ensure_dirs(cfg: Config) -> None:
    cfg.incoming_dir.mkdir(parents=True, exist_ok=True)
    cfg.reports_dir.mkdir(parents=True, exist_ok=True)
# ----------------------------
# Database (idempotency)
# ----------------------------
def db_connect(db_path: Path) -> sqlite3.Connection:
    conn = sqlite3.connect(str(db_path))
    conn.execute("""
        CREATE TABLE IF NOT EXISTS processed_files (
            file_hash TEXT PRIMARY KEY,
            file_name TEXT,
            processed_at TEXT
        )
    """)
    return conn

def is_processed(conn: sqlite3.Connection, file_hash: str) -> bool:
    cur = conn.execute("SELECT 1 FROM processed_files WHERE file_hash = ?", (file_hash,))
    return cur.fetchone() is not None

def mark_processed(conn: sqlite3.Connection, file_hash: str, file_name: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO processed_files (file_hash, file_name, processed_at) VALUES (?, ?, ?)",
        (file_hash, file_name, datetime.now().isoformat(timespec="seconds")),
    )
    conn.commit()
# ----------------------------
# OpenAI call (Responses API)
# ----------------------------
def summarize_with_openai(client: OpenAI, cfg: Config, pdf_text: str, file_name: str) -> str:
    # Guardrail: avoid sending huge documents
    if len(pdf_text) > cfg.max_chars_to_send:
        pdf_text = pdf_text[: cfg.max_chars_to_send] + "\n\n[TRUNCATED]\n"
    prompt = f"""Lab report file: {file_name}
Here is the extracted text:
-------------------------
{pdf_text}
-------------------------
"""
    # Responses API reference: https://platform.openai.com/docs/api-reference/responses
    resp = client.responses.create(
        model=cfg.model,
        input=[
            {"role": "system", "content": SYSTEM_INSTRUCTIONS},
            {"role": "user", "content": prompt},
        ],
    )
    # output_text is the SDK's convenience accessor for the concatenated text output.
    return getattr(resp, "output_text", "") or ""
# ----------------------------
# Main loop
# ----------------------------
def process_one_pdf(client: OpenAI, cfg: Config, conn: sqlite3.Connection, pdf_path: Path) -> Optional[Path]:
    file_hash = sha256_of_file(pdf_path)
    if is_processed(conn, file_hash):
        return None
    logging.info(f"New PDF detected: {pdf_path.name}")
    text = extract_text_from_pdf(pdf_path)
    if not text:
        logging.warning(f"No text extracted from {pdf_path.name}. (Scanned PDF?)")
        summary = (
            "Could not extract readable text from this PDF. "
            "It may be a scanned image. Consider exporting text or using OCR."
        )
    else:
        summary = summarize_with_openai(client, cfg, text, pdf_path.name)
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out_path = cfg.reports_dir / f"{pdf_path.stem}__summary__{timestamp}.txt"
    out_path.write_text(summary, encoding="utf-8")
    mark_processed(conn, file_hash, pdf_path.name)
    logging.info(f"Wrote summary: {out_path}")
    return out_path

def scan_for_pdfs(cfg: Config) -> list[Path]:
    return sorted([p for p in cfg.incoming_dir.glob("*.pdf") if p.is_file()])

def main() -> None:
    load_dotenv()
    setup_logging()
    cfg = Config()
    ensure_dirs(cfg)
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    conn = db_connect(cfg.db_path)
    logging.info("Lab Agent started.")
    logging.info(f"Watching: {cfg.incoming_dir}")
    logging.info(f"Reports: {cfg.reports_dir}")
    logging.info(f"Polling every {cfg.poll_seconds}s")
    try:
        while True:
            pdfs = scan_for_pdfs(cfg)
            for pdf_path in pdfs:
                try:
                    process_one_pdf(client, cfg, conn, pdf_path)
                except Exception as e:
                    logging.exception(f"Error processing {pdf_path.name}: {e}")
            time.sleep(cfg.poll_seconds)
    finally:
        conn.close()

if __name__ == "__main__":
    main()
How this answers your “autonomy” requirement
- You still decide where the data comes from (you put PDFs into the folder).
- But you do not have to tell the agent “analyze now.”
- It monitors and acts on its own schedule.
That’s “autonomous processing” without giving a program your portal password.
Next “bells and whistles” you can add (in priority order)
- Email / text alerts when certain keywords appear (e.g., “High”, “Low”, “Critical”).
- Trend tracking: store extracted lab values (BNP, creatinine, eGFR, etc.) in SQLite and graph them.
- OCR support if your portal PDFs are scanned images (common).
- Windows Task Scheduler service mode (start at boot, run invisibly).
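As a sketch of the trend-tracking idea: pull numeric values out of the summary (or the raw PDF text) with a regex and append them to a SQLite table, so each analyte accumulates a history you can graph. The pattern and the "Name: value unit" line format here are hypothetical; real lab reports vary a lot in layout:

```python
import re
import sqlite3

# Hypothetical pattern for lines like "Creatinine: 1.4 mg/dL".
VALUE_RE = re.compile(r"(?P<name>[A-Za-z ]+):\s*(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[\w/%]+)?")

def extract_values(text: str) -> list[tuple[str, float, str]]:
    """Return (name, value, unit) triples found in free text."""
    return [
        (m.group("name").strip(), float(m.group("value")), m.group("unit") or "")
        for m in VALUE_RE.finditer(text)
    ]

def record(conn: sqlite3.Connection, report_date: str, values) -> None:
    conn.execute("""CREATE TABLE IF NOT EXISTS labs (
        report_date TEXT, name TEXT, value REAL, unit TEXT)""")
    conn.executemany(
        "INSERT INTO labs (report_date, name, value, unit) VALUES (?, ?, ?, ?)",
        [(report_date, n, v, u) for (n, v, u) in values],
    )

def history(conn: sqlite3.Connection, name: str):
    """All recorded values for one analyte, oldest first."""
    return conn.execute(
        "SELECT report_date, value FROM labs WHERE name = ? ORDER BY report_date",
        (name,),
    ).fetchall()
```

A more robust variant would ask the model to return the values as JSON instead of regexing free text, but the storage-and-query side would look the same.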