Information Extraction
Pilot Schedule Optimization Tool
Client
Self Development
Year
2025
Category
Natural Language Processing (NLP)
Service
Optimizing Operations

Tools / Languages Used
- Python (pdfminer, Pandas, NumPy)
- Excel for data exploration and filtering
- Regex for text parsing
- Jupyter Notebook for development
Technical Skills
- PDF scraping and information extraction
- Data cleaning and transformation
- Combinatorial logic and constraint-based filtering
- Automation and data structuring for decision support
Soft Skills
- Translating a real-world manual process into an automated solution
- Problem-solving and creative application of data skills
- Clear documentation and user-oriented design
- Collaboration and communication with end user (pilot scheduling context)
Step 1: Exploratory Data Analysis
- Reviewed monthly pilot bid packets provided in PDF format, each containing hundreds of possible flight pairings with dates, destinations, duty hours, and rest periods.
- Identified challenges: the PDFs were inconsistently formatted, and the data was embedded in text blocks that were not easily searchable.
- Determined the need for a structured dataset to filter flights by date and hour constraints.
Step 2: Solution Design
- Built a Python scraper using pdfminer and regex patterns to extract flight number, departure/arrival times, duty hours, and layover details from the PDFs.
- Structured the extracted data into a clean, tabular format using Pandas, then exported to Excel for accessibility.
- Designed logic to:
  - Exclude flights overlapping specific “off” dates.
  - Ensure compliance with FAA rest period rules (e.g., minimum 10 hours rest).
  - Meet required monthly duty hour minimums.
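The extraction step can be illustrated with a minimal sketch. It assumes the PDF text has already been pulled out as a plain string (for example with pdfminer's `extract_text`), and it uses a made-up line layout: the `SAMPLE_TEXT` format, the `PAIRING_RE` pattern, and the field names are all hypothetical, since real bid packets vary and would need their own patterns.

```python
import re
from datetime import datetime

# Hypothetical text layout standing in for pdfminer output; real packets differ.
SAMPLE_TEXT = """\
P1001 2025-03-04 06:15 2025-03-06 18:40 DUTY 17.5 LAYOVER DEN
P1002 2025-03-05 09:00 2025-03-05 21:30 DUTY 11.0 LAYOVER NONE
-- page 3 of 40 --
P1003 2025-03-10 13:45 2025-03-12 08:20 DUTY 16.2 LAYOVER ORD
"""

# One named group per field we want in the final table (assumed field names).
PAIRING_RE = re.compile(
    r"^(?P<pairing_id>P\d+)\s+"
    r"(?P<start>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s+"
    r"(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s+"
    r"DUTY\s+(?P<duty_hours>\d+(?:\.\d+)?)\s+"
    r"LAYOVER\s+(?P<layover>\w+)$"
)

TIME_FORMAT = "%Y-%m-%d %H:%M"


def parse_pairings(text):
    """Turn raw packet text into a list of dicts, one per pairing line."""
    rows = []
    for line in text.splitlines():
        match = PAIRING_RE.match(line.strip())
        if match is None:
            continue  # skip page markers, headers, and anything unrecognized
        row = match.groupdict()
        row["start"] = datetime.strptime(row["start"], TIME_FORMAT)
        row["end"] = datetime.strptime(row["end"], TIME_FORMAT)
        row["duty_hours"] = float(row["duty_hours"])
        rows.append(row)
    return rows


rows = parse_pairings(SAMPLE_TEXT)
```

A list of dicts like `rows` can be passed to `pandas.DataFrame(rows)` and written out with `DataFrame.to_excel`, which matches the tabular export described above.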
Step 3: Model Assessment
- Validated extracted data against sample manual entries for accuracy (>95% match).
- Tested multiple scenarios (e.g., different off-day combinations, partial weeks off).
- Ensured results returned valid flight pairings that met all rest and hour requirements.
- Added flags in Excel for quick visual validation (e.g., highlighting conflicts or insufficient hours).
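The kind of check behind those flags might look like the sketch below. It is not the tool's actual code: the `flag_schedule` function, the dict keys, and the sample pairings are assumptions made for illustration, and the 10-hour threshold is taken from the rest rule example mentioned earlier.

```python
from datetime import date, datetime, timedelta

MIN_REST = timedelta(hours=10)  # threshold from the rest rule example in the text


def flag_schedule(pairings, off_dates, min_duty_hours):
    """Return human-readable flags for a schedule; an empty list means no issues."""
    flags = []
    ordered = sorted(pairings, key=lambda p: p["start"])

    # Flag every calendar day of a pairing that lands on a requested off date.
    for p in ordered:
        day = p["start"].date()
        while day <= p["end"].date():
            if day in off_dates:
                flags.append(f"{p['id']}: overlaps off date {day.isoformat()}")
            day += timedelta(days=1)

    # Flag consecutive pairings separated by less than the minimum rest.
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt["start"] - prev["end"] < MIN_REST:
            flags.append(f"{prev['id']}->{nxt['id']}: rest below 10 hours")

    # Flag a schedule whose total duty time misses the monthly minimum.
    total = sum(p["duty_hours"] for p in ordered)
    if total < min_duty_hours:
        flags.append(f"total duty {total:.1f}h below minimum {min_duty_hours:.1f}h")
    return flags


# Made-up sample schedule with one issue of each kind.
schedule = [
    {"id": "A", "start": datetime(2025, 3, 4, 6, 0),
     "end": datetime(2025, 3, 5, 18, 0), "duty_hours": 20.0},
    {"id": "B", "start": datetime(2025, 3, 6, 2, 0),
     "end": datetime(2025, 3, 6, 14, 0), "duty_hours": 10.0},
]
flags = flag_schedule(schedule, {date(2025, 3, 5)}, min_duty_hours=40.0)
```

Each returned string could be written to a column next to the affected rows, so a conflict or an hours shortfall is visible at a glance in the spreadsheet.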
Step 4: Results / How It’s Used
- The final tool automatically outputs flight combinations that satisfy personal preferences and scheduling constraints.
- Reduced the manual schedule selection process from several hours to just minutes.
- Provided flexibility to test multiple “what-if” scenarios quickly (e.g., alternative days off).
- The approach can be adapted for other pilots with similar scheduling constraints or even scaled into a user interface in the future.
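A “what-if” run of this kind can be sketched as a brute-force search over combinations. Everything named here (`find_schedules`, `max_size`, the dict keys, the sample pairings) is hypothetical, and exhaustive enumeration is only one possible approach: it is workable for small candidate sets, but a real packet with hundreds of pairings would likely need pruning.

```python
from datetime import date, datetime, timedelta
from itertools import combinations

MIN_REST = timedelta(hours=10)  # threshold from the rest rule example in the text


def touches_off_date(pairing, off_dates):
    """True if any requested off date falls within the pairing's date span."""
    return any(pairing["start"].date() <= d <= pairing["end"].date()
               for d in off_dates)


def has_enough_rest(combo):
    """True if every consecutive pair in the combo is at least MIN_REST apart."""
    ordered = sorted(combo, key=lambda p: p["start"])
    return all(nxt["start"] - prev["end"] >= MIN_REST
               for prev, nxt in zip(ordered, ordered[1:]))


def find_schedules(pairings, off_dates, min_duty_hours, max_size=4):
    """List the ID sets of all combinations that satisfy every constraint."""
    candidates = [p for p in pairings if not touches_off_date(p, off_dates)]
    results = []
    for size in range(1, min(max_size, len(candidates)) + 1):
        for combo in combinations(candidates, size):
            if sum(p["duty_hours"] for p in combo) < min_duty_hours:
                continue
            if has_enough_rest(combo):
                results.append([p["id"] for p in combo])
    return results


def make(pid, start, end, hours):
    return {"id": pid, "start": start, "end": end, "duty_hours": hours}


# Made-up pairings; A and B are only 8 hours apart, so they cannot be combined.
PAIRINGS = [
    make("A", datetime(2025, 3, 3, 6, 0), datetime(2025, 3, 4, 18, 0), 20.0),
    make("B", datetime(2025, 3, 5, 2, 0), datetime(2025, 3, 6, 14, 0), 22.0),
    make("C", datetime(2025, 3, 8, 7, 0), datetime(2025, 3, 9, 19, 0), 21.0),
    make("D", datetime(2025, 3, 12, 5, 0), datetime(2025, 3, 13, 17, 0), 19.0),
]

# Three scenarios that differ only in the requested days off.
no_days_off = find_schedules(PAIRINGS, set(), min_duty_hours=60.0)
march_5_off = find_schedules(PAIRINGS, {date(2025, 3, 5)}, min_duty_hours=60.0)
march_12_off = find_schedules(PAIRINGS, {date(2025, 3, 12)}, min_duty_hours=60.0)
```

Changing only the `off_dates` argument is what makes scenario comparison quick: with this sample data, taking March 5 off still leaves one valid schedule, while taking March 12 off leaves none.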

