# SharePoint Video Scraper - Implementation & Discussion

Revision 1ceca3, Anonymous, 2026-04-13 07:00:51

## Overview
As of April 2025, the SharePoint video scraper for VISNA is deployed and operational. It scrapes and processes lecture videos that are hosted on SharePoint and shared through Canvas, and makes them available to VISNA.

## Current Capabilities

### Video Processing
- **Direct MP4 Downloads**: Can download videos directly uploaded to SharePoint as MP4 files
- **Canvas Integration**: Downloads videos hosted on SharePoint that are shared through Canvas weekly modules
- **Automated Processing**: Downloads, transcribes, vectorizes, and indexes videos for VISNA access
- **Nightly Updates**: VISNA's access to new content is refreshed nightly

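The processing bullet above names four stages without describing them. As one illustration of the vectorization stage, a transcript is commonly split into overlapping chunks before each chunk is embedded. The sketch below assumes word-based chunks; the function name `chunk_transcript`, the chunk size, and the overlap are hypothetical and are not taken from the deployed scraper.

```python
def chunk_transcript(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split a transcript into overlapping word windows (assumed strategy)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the transcript
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.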
### Limitations
- **No Panopto Support**: Cannot scrape videos from Panopto due to API limitations
- **Panopto Integration**: Not planned for implementation at this time

## Technical Implementation

### Architecture
- **Technology**: Async Playwright script
- **Method**: An automated browser agent navigates the pages
- **Access**: Uses the account provided by Kytec: scraperbot@student.isn.edu.au

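For readers unfamiliar with the approach, the following is a minimal sketch of how an async Playwright script can capture a file download. It is not the deployed script: the function name `download_video`, the page URL, and the selector are placeholders, and the sign-in step is omitted entirely.

```python
from pathlib import Path


async def download_video(page_url: str, download_selector: str, dest_dir: Path) -> Path:
    """Open a page, click a download control, and save the file it triggers."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    dest_dir.mkdir(parents=True, exist_ok=True)
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(accept_downloads=True)
        page = await context.new_page()
        await page.goto(page_url)
        # expect_download() waits for the download started by the click.
        async with page.expect_download() as download_info:
            await page.click(download_selector)
        download = await download_info.value
        target = dest_dir / download.suggested_filename
        await download.save_as(target)
        await browser.close()
    return target
```

It could be invoked with `asyncio.run(download_video(url, "text=Download", Path("downloads")))`, where the selector is a guess at what the page exposes and would need to match the real SharePoint player controls.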
### Security & Permissions
- **Scoped Access**: Bot can only access SharePoint videos made available through Canvas
- **No Direct Canvas Access**: Canvas access is delegated to a separate agent for permission separation
- **Restricted Navigation**: ScraperBot cannot navigate outside of assigned SharePoint folders
- **Document Restrictions**: Cannot access other documents unless explicitly granted permissions
- **Institution-Level Access**: Videos have institution-wide access permissions

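The restrictions above are described as account permissions. A scraper can additionally refuse to follow links outside its assigned folders as a second line of defence. The sketch below shows one way to do that with a URL allowlist; whether the deployed bot has such a check is not stated here, and the host and folder in `ALLOWED_PREFIXES` are made-up examples.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allowlist: (host, folder prefix) pairs the bot may visit.
ALLOWED_PREFIXES = [
    ("example.sharepoint.com", "/sites/lectures/"),
]


def is_allowed(url: str, allowed=ALLOWED_PREFIXES) -> bool:
    """Return True only for HTTPS URLs inside an allowlisted folder."""
    parts = urlparse(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Collapse "." and ".." segments so "/lectures/../hr" cannot slip through.
    path = posixpath.normpath(parts.path) + "/"
    return any(host == h and path.startswith(prefix) for h, prefix in allowed)
```

This is a simplified check: it does not decode percent-encoded path segments, so it complements the server-side permissions and does not replace them.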
### Workflow
1. Bot accesses Canvas links using Kytec-provided credentials
2. Navigates through SharePoint pages to reach video content
3. Downloads lecture videos
4. Processes videos (transcription, vectorization, indexing)
5. Updates VISNA's knowledge base for student queries

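The steps above can be sketched as a small orchestration loop. The version below assumes a JSON manifest of already processed URLs so that a nightly run only handles new videos. The manifest, the function name `run_nightly`, and the injected `download` and `process` callables are all hypothetical stand-ins for the real components.

```python
import asyncio
import json
from pathlib import Path
from typing import Awaitable, Callable


async def run_nightly(
    video_urls: list[str],
    download: Callable[[str], Awaitable[Path]],
    process: Callable[[Path], Awaitable[None]],
    manifest_path: Path,
) -> list[str]:
    """Download and process only the URLs not yet recorded in the manifest."""
    done = set(json.loads(manifest_path.read_text())) if manifest_path.exists() else set()
    newly_done = []
    for url in video_urls:
        if url in done:
            continue  # already handled on a previous run
        local_file = await download(url)  # workflow steps 1-3
        await process(local_file)         # workflow steps 4-5
        done.add(url)
        newly_done.append(url)
        # Persist after each video so a crash does not lose progress.
        manifest_path.write_text(json.dumps(sorted(done)))
    return newly_done
```

Passing the download and processing steps in as callables keeps the loop testable without a browser or a transcription model.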
## Recommended Usage Guidelines

### For Lecturers
- Host lectures on Microsoft Teams
- Use SharePoint-hosted video recordings for student access
- Use the video hosting method that Jian follows in ISN 402 (Psychological Assessment) as a reference

### Best Practices
- Primary recommendation: Transition to SharePoint as the default video hosting provider
- Consider phasing out Panopto dependency
- Create dedicated Canvas courses/units with organized video collections
- Ensure proper categorization by unit and week

## Future Considerations

### Proposed Improvements
- **Centralized Video Management**: Create a dedicated Canvas course containing all videos for easier scraping
- **Panopto Migration**: Copy Panopto videos to SharePoint for unified access
- **Workflow Optimization**: Categorize videos automatically by unit and week

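Automatic categorization by unit and week depends on how videos are named, which this page does not specify. As a rough sketch, assuming titles contain a unit code such as "ISN 402" and a "Week n" marker, the labels could be extracted with two regular expressions. The naming convention, `VideoLabel`, and `label_video` are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class VideoLabel(NamedTuple):
    unit: str
    week: int


# Assumed convention: three capital letters plus three digits, and "Week <n>".
_UNIT = re.compile(r"\b([A-Z]{3})\s?(\d{3})\b")
_WEEK = re.compile(r"\bweek\s?(\d{1,2})\b", re.IGNORECASE)


def label_video(title: str) -> Optional[VideoLabel]:
    """Return the unit and week found in a title, or None if either is missing."""
    unit = _UNIT.search(title)
    week = _WEEK.search(title)
    if not unit or not week:
        return None
    return VideoLabel(f"{unit.group(1)} {unit.group(2)}", int(week.group(1)))
```

Returning `None` for titles that do not match lets such videos be queued for manual review instead of being filed under a wrong unit.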
### Maintenance Requirements
- **Permission Auditing**: Periodic review with Kytec to ensure the permission structure remains appropriate
- **System Monitoring**: Continue monitoring; VISNA has so far run with minimal intervention and no downtime
- **Bug Fixes**: Ongoing maintenance for the document reviewer and UI improvements

## System Performance
- **Reliability**: Operating with minimal intervention for several weeks
- **Uptime**: No reported downtime since implementation
- **Integration**: Fully integrated with VISNA ecosystem

## Related Systems
- **VISNA**: AI assistant that utilizes the indexed video content
- **Canvas**: Learning management system integration
- **SharePoint**: Primary video hosting platform
- **L40S Server**: On-premises hosting infrastructure
