Developing custom scanner, when does Scan() recursion happen?
I’m developing a custom scanner/metadata agent as kind of an exercise to teach myself Python.
I’ve found a handful of resources floating around the web, but no comprehensive documentation about scanning, so I’m trying to reverse-engineer my scanner based on other scanners I’ve found.
I suppose my biggest question for the moment is, when/where/how many times is the Scan method called?
I’ve found, in a number of places, the following test code:
if __name__ == '__main__':  # command line
    path = sys.argv[1]
    files = [os.path.join(path, file) for file in os.listdir(path)]
    media = []
    Scan(path[1:], files, media, [])
    print("Files detected: ", media)
If I point this to my own library structure, I could potentially get varying results. I’m just trying to understand.
So given this folder structure:
/
$PLEX_HOME/Library/Application Support/Plex Media Server/ <-- Plex main folder
(%USERPROFILE%\Local Settings\Application Data\Plex Media Server\) <-- Plex main folder on Windows
(/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/) <-- Plex main folder on Debian
Plug-ins/
PlexSportsAgent.bundle/
Contents/
Code/
__init__.py <-- Metadata Agent script
Resources/
DefaultPrefs.json
info.plist
Scanners/
Series/
PlexSportsScanner/ <-- Scanner script (not sure if I can put in a folder structure like this, but I'd like to keep it organized)
Data/
Leagues/ <-- Should this be moved somewhere else?
MLB/
Teams.json <-- Programmatically cached teams info
NBA/
Teams.json <-- Programmatically cached teams info
NFL/
Teams.json <-- Programmatically cached teams info
NHL/
Teams.json <-- Programmatically cached teams info
__init__.py
SportsDataIO.py
Teams.py
TheSportsDB.py
__init__.py <-- Defines the Scan method
mnt/Media/
Video/
Sports/ <-- Sports library
MLB/
2021/
NBA/
NFL/
2004-2005/
NFL.Super Bowl.XXXIX.Patriots.vs.Eagles.720p.HD.TYT.mp4
NFL.Super Bowl.XXXIX.Patriots.vs.Eagles.720p.HD.TYT.ts
2017-2018/
Super.Bowl.LII.2018.02.04.Eagles.vs.Patriots.1080p.HDTV.x264.Merrill-Hybrid-5.1-PHillySPECIAL.mkv
NHL/
UFC/
.plexignore
Phillies vs. Red Sox Game Highlights (7_10_21) _ MLB Highlights.mp4
yt1s.com - Phillies vs Cubs Game Highlights 70821 MLB Highlights.mp4
yt1s.com - Phillies vs Red Sox Game Highlights 7921 MLB Highlights.mp4
… the aforementioned script passes the following arguments to Scan(), presuming that sys.argv[1] is the path to the root of my Sports library:
Scan(path, files, media, subdirs, root=None)
path = "mnt/Media/Sports"
files = [
    "/mnt/Media/Sports/MLB",
    "/mnt/Media/Sports/NBA",
    "/mnt/Media/Sports/NFL",
    "/mnt/Media/Sports/NHL",
    "/mnt/Media/Sports/UFC",
    "/mnt/Media/Sports/.plexignore",
    "/mnt/Media/Sports/Phillies vs. Red Sox Game Highlights (7_10_21) _ MLB Highlights.mp4",
    "/mnt/Media/Sports/yt1s.com - Phillies vs Cubs Game Highlights 70821 MLB Highlights.mp4",
    "/mnt/Media/Sports/yt1s.com - Phillies vs Red Sox Game Highlights 7921 MLB Highlights.mp4"
]
subdirs = []
My assumptions here are that:
sys.argv[1] is fully qualified (just for argument’s sake).
files only has the surface depth of path. If so, when does the recursion take place? The way this is currently set up, this script will yield 5 directories and 4 files (3 Phillies games and a .plexignore file).
– If I pass this through to my own Scan(), which in turn gets filtered by VideoFiles.Scan(), I am ultimately left with 3 video files to chomp on.
– Am I responsible for recursing the remaining folder structures myself? Or is Scan() called multiple times from the outside for the next level of depth?
– If I am responsible for recursing myself, should the subdirs parameter be populated? Presumably with the 5 subfolders? Or should I take the test script at face value and discover the subfolders from the files parameter? Should the files parameter in the test script be filtered to include only files, not files AND folders (os.listdir(path) if os.path.isfile(file), or something like that)?
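To make that last option concrete, here is a minimal sketch of how the test script could split the directory listing into separate files and subdirs lists before calling Scan(). The helper name split_entries is hypothetical, and whether this matches what Plex itself passes in is exactly the open question, so treat it as an assumption rather than documented behavior.

```python
import os


def split_entries(path):
    """Return (files, subdirs): sorted full paths found directly under path.

    Hypothetical helper for the command-line test only; it is not part of
    any Plex API.
    """
    files, subdirs = [], []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            subdirs.append(full)
        elif os.path.isfile(full):
            files.append(full)
    return files, subdirs
```

With this in place, the test block would call something like Scan(path[1:], files, media, subdirs) with both lists populated, instead of passing folders inside files and an empty list for subdirs.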
As it currently stands, Scan() will only yield me 3 Phillies games, and whack everything else out of existence. This is why I’m confused.
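For testing outside of Plex, one way to see every level of the library is to have the command-line harness do the descending itself and call Scan() once per directory. The sketch below assumes that per-directory calling pattern purely for the sake of the test; the function name walk_and_scan is hypothetical, and the only thing taken from the scanner is the Scan(path, files, media, subdirs, root=None) signature shown above.

```python
import os


def walk_and_scan(root, scan):
    """Call scan(path, files, media, subdirs, root=root) for each directory.

    Test-harness sketch only: it assumes one call per directory, which is
    not confirmed Plex behavior.
    """
    media = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Path relative to the library root; empty string for the root itself.
        rel = os.path.relpath(dirpath, root)
        rel = "" if rel == "." else rel
        files = [os.path.join(dirpath, f) for f in sorted(filenames)]
        subdirs = [os.path.join(dirpath, d) for d in sorted(dirnames)]
        scan(rel, files, media, subdirs, root=root)
        # Descend only into the subdirectories the callback left in the list.
        dirnames[:] = [os.path.basename(d) for d in subdirs]
    return media
```

Usage would be along the lines of media = walk_and_scan(sys.argv[1], Scan). Because os.walk runs top-down by default, a scanner that removes entries from subdirs in place also stops the harness from visiting those folders.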
Can someone, perhaps a member of the Plex team, assist me in understanding?