# SharePoint Video Scraper - Implementation & Discussion

## Overview
As of April 2025, the SharePoint video scraper for VISNA is deployed and operational. It downloads and processes lecture videos that are hosted on SharePoint and shared through Canvas, so that their content is available to VISNA.

## Current Capabilities

### Video Processing
- **Direct MP4 Downloads**: Downloads videos uploaded directly to SharePoint as MP4 files
- **Canvas Integration**: Downloads videos hosted on SharePoint that are shared through Canvas weekly modules
- **Automated Processing**: Downloads, transcribes, vectorizes, and indexes videos for VISNA access
- **Nightly Updates**: VISNA's access to new content is refreshed nightly
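
A nightly refresh only needs to process videos that have not been indexed yet. The sketch below shows one possible way to track that with a JSON manifest. The manifest file, its layout, and the helper names (`load_manifest`, `select_new_videos`, `save_manifest`) are assumptions for illustration and are not taken from the deployed scraper.

```python
import json
from pathlib import Path


def load_manifest(path: Path) -> set[str]:
    """Return the set of video URLs already indexed (empty if no manifest exists)."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def select_new_videos(discovered: list[str], indexed: set[str]) -> list[str]:
    """Keep discovery order; drop duplicates and anything already indexed."""
    seen = set(indexed)
    fresh = []
    for url in discovered:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def save_manifest(path: Path, indexed: set[str]) -> None:
    """Persist the indexed URLs as a sorted JSON list for stable diffs."""
    path.write_text(json.dumps(sorted(indexed), indent=2), encoding="utf-8")
```

A nightly job could call `select_new_videos` on the links found that night, process only the returned items, and then save the updated manifest.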

### Limitations
- **No Panopto Support**: Cannot scrape videos from Panopto due to API limitations
- **Panopto Integration**: Not planned for implementation at this time

## Technical Implementation

### Architecture
- **Technology**: Async Playwright script
- **Method**: Automated web browser agent navigation
- **Access**: Uses credentials provided by Kytec: scraperbot@student.isn.edu.au
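
As a rough illustration of what an async Playwright download step can look like, here is a minimal sketch. It is not the deployed script: the function names, the `download_selector` parameter, and the output directory are hypothetical, and sign-in with the bot account is deliberately left out. Only Playwright calls that exist in its Python async API are used (`async_playwright`, `page.expect_download`, `Download.save_as`).

```python
from pathlib import Path


def safe_destination(out_dir: Path, suggested_name: str) -> Path:
    """Strip any directory parts from the suggested name before saving."""
    name = Path(suggested_name).name or "video.mp4"
    return out_dir / name


async def download_video(page_url: str, download_selector: str, out_dir: Path) -> Path:
    """Open a page, click a (hypothetical) download control, and save the file."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(accept_downloads=True)
        page = await context.new_page()
        await page.goto(page_url)
        # Authentication is omitted here; a real run would sign in first.
        async with page.expect_download() as download_info:
            await page.click(download_selector)
        download = await download_info.value
        dest = safe_destination(out_dir, download.suggested_filename)
        await download.save_as(dest)
        await browser.close()
        return dest
```

The coroutine would be run with `asyncio.run(download_video(...))`. Using `expect_download` around the click avoids polling the file system to find out when the download has finished.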

### Security & Permissions
- **Scoped Access**: Bot can only access SharePoint videos made available through Canvas
- **No Direct Canvas Access**: Canvas access is delegated to a separate agent for permission separation
- **Restricted Navigation**: ScraperBot cannot navigate outside its assigned SharePoint folders
- **Document Restrictions**: Cannot access other documents unless explicitly granted permission
- **Institution-Level Access**: Videos have institution-wide access permissions
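
The restrictions above are enforced through account permissions. A scraper could additionally check each URL on the client side before visiting it, as a second line of defence. The sketch below is an assumption about how such a guard might look; the host name and folder prefix are placeholders, not the real tenant or folder layout.

```python
from posixpath import normpath
from urllib.parse import unquote, urlparse

# Placeholder values: the real tenant host and assigned folders are not documented here.
ALLOWED_HOST = "example.sharepoint.com"
ALLOWED_PREFIXES = ("/sites/lectures/",)


def is_in_scope(url: str) -> bool:
    """Return True only for HTTPS URLs on the allowed host inside an allowed folder."""
    parts = urlparse(url)
    if parts.scheme != "https" or parts.hostname != ALLOWED_HOST:
        return False
    # Decode percent-escapes and collapse ".." segments before comparing prefixes.
    path = normpath(unquote(parts.path)) + "/"
    return any(path.startswith(prefix) for prefix in ALLOWED_PREFIXES)
```

For example, `is_in_scope("https://example.sharepoint.com/sites/lectures/../hr/a.docx")` is rejected because the path is normalized before the prefix comparison.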

### Workflow
1. Bot accesses Canvas links using Kytec-provided credentials
2. Navigates through SharePoint pages to reach video content
3. Downloads lecture videos
4. Processes videos (transcription, vectorization, indexing)
5. Updates VISNA's knowledge base for student queries
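
The steps above can be pictured as an ordered chain of async stages that share one job record. The following is a minimal sketch under that assumption; `VideoJob`, `run_pipeline`, and the placeholder steps are hypothetical names, and the real stages (browser navigation, transcription, indexing) are replaced by stubs.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Tuple


@dataclass
class VideoJob:
    """State carried through the pipeline for one lecture video (hypothetical)."""
    canvas_link: str
    local_path: str = ""
    transcript: str = ""
    completed: List[str] = field(default_factory=list)


Step = Callable[[VideoJob], Awaitable[None]]


async def run_pipeline(job: VideoJob, steps: List[Tuple[str, Step]]) -> VideoJob:
    """Run each named step in order; an exception in a step stops the run."""
    for name, step in steps:
        await step(job)
        job.completed.append(name)
    return job


# Placeholder steps: real ones would drive the browser, call a transcriber, etc.
async def download(job: VideoJob) -> None:
    job.local_path = "/tmp/lecture.mp4"


async def transcribe(job: VideoJob) -> None:
    job.transcript = f"transcript of {job.local_path}"


async def main() -> VideoJob:
    job = VideoJob(canvas_link="https://canvas.example/modules/week-1")
    return await run_pipeline(job, [("download", download), ("transcribe", transcribe)])


if __name__ == "__main__":
    print(asyncio.run(main()).completed)
```

Recording the completed step names on the job makes it easy to see where a failed run stopped and to resume from that stage.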

## Recommended Usage Guidelines

### For Lecturers
- Host lectures on Microsoft Teams
- Use SharePoint-hosted video recordings for student access
- Use the video hosting setup from Jian's ISN 402 (Psychological Assessment) as a reference

### Best Practices
- Primary recommendation: Transition to SharePoint as the default video hosting provider
- Consider phasing out the dependency on Panopto
- Create dedicated Canvas courses/units with organized video collections
- Categorize videos consistently by unit and week

## Future Considerations

### Proposed Improvements
- **Centralized Video Management**: Create a dedicated Canvas course containing all videos for easier scraping
- **Panopto Migration**: Copy Panopto videos to SharePoint for unified access
- **Workflow Optimization**: Categorize videos automatically by unit and week
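
Automatic categorization by unit and week could start from the video title or file name. The sketch below assumes a naming convention such as `ISN402_Week03_Lecture.mp4` or `ISN 402 - Week 5.mp4`. That convention, the regular expression, and the function names are assumptions for illustration; actual titles may need a different pattern.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Assumed convention: a three-letter unit prefix with a three-digit number,
# followed later in the name by "week" and a one- or two-digit week number.
PATTERN = re.compile(
    r"(?P<unit>[A-Z]{3}\s?\d{3}).*?week\s*[_-]?\s*(?P<week>\d{1,2})",
    re.IGNORECASE,
)


def parse_unit_week(name: str) -> Optional[Tuple[str, int]]:
    """Return (unit, week) parsed from a file name, or None if it does not match."""
    match = PATTERN.search(name)
    if match is None:
        return None
    unit = re.sub(r"\s+", "", match.group("unit")).upper()
    return unit, int(match.group("week"))


def categorize(names: List[str]) -> Tuple[Dict[Tuple[str, int], List[str]], List[str]]:
    """Group names by (unit, week); names that do not match are returned separately."""
    grouped: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    unmatched: List[str] = []
    for name in names:
        key = parse_unit_week(name)
        if key is None:
            unmatched.append(name)
        else:
            grouped[key].append(name)
    return dict(grouped), unmatched
```

Returning the unmatched names separately lets a person review videos that do not follow the convention instead of silently dropping them.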

### Maintenance Requirements
- **Permission Auditing**: Periodic review with Kytec to ensure permission structure remains appropriate
- **System Monitoring**: Continue monitoring; VISNA has so far run with minimal intervention and no downtime
- **Bug Fixes**: Ongoing maintenance for the document reviewer and UI improvements

## System Performance
- **Reliability**: Operating with minimal intervention for several weeks
- **Uptime**: No reported downtime since implementation
- **Integration**: Fully integrated with the VISNA ecosystem

## Related Systems
- **VISNA**: AI assistant that utilizes the indexed video content
- **Canvas**: Learning management system integration
- **SharePoint**: Primary video hosting platform
- **L40S Server**: On-premises hosting infrastructure