# PDF Timetable Conversion Workflow

This guide shows how to use the Python command-line converter to transform your school's PDF timetables into the standard `schedule.json` format that **Schedule Lens / 課表透鏡** can query.

---

## Why a Local Converter?

**Schedule Lens** is hosted as a **static website** on Cloudflare Pages. It is designed for privacy: it runs entirely inside your browser and does not upload any data to external servers.

Because of this static client-side sandbox architecture:
1. Browsers cannot run a native **Python interpreter**.
2. Upstream parser scripts written in Python (using libraries like `pdfplumber`) must be executed locally on your computer.
3. Once converted, the resulting lightweight `schedule.json` can be imported into the web UI with a single click.

---

## Setup & Running the Converter

The converter scripts are vendored and wrapped in the [class-schedule/converter/](file:///d:/Projects/Website%20Tools/website-tools/class-schedule/converter/) directory.

### Quick Start (Windows)

1. Open your terminal in the converter directory:
   ```bash
   cd class-schedule/converter
   ```
2. Set up the Python virtual environment and install dependencies:
   ```bash
   python -m venv .venv
   .venv\Scripts\activate
   pip install -r requirements.txt
   ```
3. Run the converter on your timetable:
   ```bash
   python run_converter.py ./samples/課表.pdf --out ./output/schedule.json
   ```
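
Before importing the result into the web UI, it can help to confirm that the output file is well-formed JSON. The following is a minimal sketch, not part of the converter: `load_schedule` is a hypothetical helper name, and it only checks that the file parses, since the exact `schedule.json` schema is not described here.

```python
import json
from pathlib import Path


def load_schedule(path):
    """Read a converter output file and return the decoded JSON value.

    Raises json.JSONDecodeError if the file is not valid JSON.
    """
    # The file contains CJK text, so read it explicitly as UTF-8 rather
    # than relying on the Windows default code page.
    text = Path(path).read_text(encoding="utf-8")
    return json.loads(text)


# Example usage (path taken from the Quick Start command above):
#   data = load_schedule("output/schedule.json")
#   print(type(data).__name__)
```

If the call raises `json.JSONDecodeError`, the conversion most likely stopped partway and should be rerun.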

---

## Troubleshooting Guide

### 1. `ModuleNotFoundError: No module named 'pdfplumber'`
* **Cause**: Either the Python virtual environment is not activated, or the dependencies listed in `requirements.txt` have not been installed into it.
* **Solution**: Activate the virtual environment (`.venv\Scripts\activate`), then run `pip install -r requirements.txt`.
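
To tell which of the two causes applies, a short diagnostic run with the same `python` command can help. This is a sketch using only the standard library; the helper names `in_virtualenv` and `is_installed` are illustrative and do not come from the converter.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    # Inside an environment created by `python -m venv`, sys.prefix points
    # at the environment while sys.base_prefix points at the base install.
    return sys.prefix != sys.base_prefix


def is_installed(module_name: str) -> bool:
    """True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("virtual environment active:", in_virtualenv())
print("pdfplumber importable:", is_installed("pdfplumber"))
```

If the first line prints `False`, activate `.venv` first; if only the second prints `False`, reinstall the requirements.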

### 2. `ERROR: No sessions found.` / `ValueError: 找不到授課教師姓名`
* **Cause**: The timetable layout does not match the coordinate grid the parser expects, so it cannot locate the sessions or the teacher's name (the `ValueError` message translates to "instructor name not found").
* **Solution**: Check that the PDF is landscape A4. If your school's grid margins differ slightly, you may need to adjust the coordinate parameters (`DAY_COLUMNS` and the teacher height metrics) inside [extract_schedule_v2.py](file:///d:/Projects/Website%20Tools/website-tools/class-schedule/converter/vendor/hkhorazon-classschedule/extract_schedule_v2.py).
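
The landscape A4 check can be scripted with `pdfplumber`, which is already a dependency. The sketch below is an assumption-laden illustration: the helper names and the 2-point tolerance are hypothetical, and the parser itself may apply different rules.

```python
# A4 is 297 mm x 210 mm; at 72 points per inch that is about 841.89 x 595.28 pt.
A4_LANDSCAPE_PT = (841.89, 595.28)


def is_landscape_a4(width, height, tolerance=2.0):
    """True when a page size in points is within `tolerance` of landscape A4."""
    target_w, target_h = A4_LANDSCAPE_PT
    return abs(width - target_w) <= tolerance and abs(height - target_h) <= tolerance


def first_page_size(pdf_path):
    """Return (width, height) in points for the first page of a PDF."""
    import pdfplumber  # imported lazily so the pure check above needs no install

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[0]
        return float(page.width), float(page.height)


# Example usage (path taken from the Quick Start command):
#   width, height = first_page_size("samples/課表.pdf")
#   print(width, height, is_landscape_a4(width, height))
```

A portrait page or a different paper size suggests the coordinate parameters will need adjusting before the parser can find any sessions.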

### 3. `ERROR: Failed to parse PDF: ...` / Scanned Timetable Failure
* **Cause**: The parser cannot find vector text elements. The PDF is likely a scanned image or photo of the schedule.
* **Solution**: The converter does not support OCR. Run the document through a third-party OCR tool first, or manually create a compliant `schedule.json` by following the **JSON Format Help** shown in the Schedule Lens footer.
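
One way to confirm that a PDF lacks a text layer is to count the characters `pdfplumber` can extract from each page. This is a rough sketch with hypothetical helper names; a PDF that mixes scanned and vector pages would need a finer check.

```python
def looks_scanned(char_counts):
    """True when no page exposes any extractable characters."""
    return sum(char_counts) == 0


def count_chars_per_page(pdf_path):
    """Return the number of vector text characters found on each page."""
    import pdfplumber  # imported lazily so the pure check above needs no install

    with pdfplumber.open(pdf_path) as pdf:
        # page.chars is the list of character objects pdfplumber extracted.
        return [len(page.chars) for page in pdf.pages]


# Example usage (path taken from the Quick Start command):
#   counts = count_chars_per_page("samples/課表.pdf")
#   print(counts, "scanned?", looks_scanned(counts))
```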

### 4. `UnicodeEncodeError` in Windows Terminal
* **Cause**: The default code page of the Windows command prompt (e.g. CP950) cannot encode some of the characters the converter prints.
* **Solution**: The conversion itself still succeeds; only the terminal output fails. Ignore the encoding error and check the [output/](file:///d:/Projects/Website%20Tools/website-tools/class-schedule/converter/output/) directory for the completed `schedule.json`.
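
If you would rather suppress the error than ignore it, Python can be told to substitute characters the terminal cannot encode. The sketch below illustrates the idea with the standard library only; `safe_text` and `make_stdout_tolerant` are hypothetical helpers and are not part of `run_converter.py`.

```python
import sys


def safe_text(text, encoding):
    """Round-trip text through `encoding`, replacing unencodable characters with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def make_stdout_tolerant():
    """Ask stdout to replace unencodable characters instead of raising.

    Returns True if the stream supported reconfiguration (Python 3.7+).
    """
    if hasattr(sys.stdout, "reconfigure"):
        # Keep the terminal's encoding; change only the error handler.
        sys.stdout.reconfigure(errors="replace")
        return True
    return False


# An emoji has no CP950 representation, so it degrades to '?'.
print(safe_text("OK \U0001F600", "cp950"))
```

Calling `make_stdout_tolerant()` at the start of a script keeps later `print` calls from raising `UnicodeEncodeError`, at the cost of showing `?` for characters outside the active code page.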
