i was recently made aware that scraping IMDb is not allowed according to their TOS. [http://xbmc.org/forum/showthread.php?p=268633](http://xbmc.org/forum/showthread.php?p=268633)
since the scraper in plex also originated in the work i and others did, i respectfully ask that it be disabled until the legalities etc. have been taken care of. i have already done this in xbmc; [http://xbmc.org/trac/changeset/17032](http://xbmc.org/trac/changeset/17032)
as a (possibly temporary) replacement, we switched to themoviedb.org as the default backend; [http://xbmc.org/trac/changeset/17035](http://xbmc.org/trac/changeset/17035)
if you do decide to honor my request, please register for your own api key on themoviedb.org and replace ours in tmdb.xml.
Not all the information that we (and thus you) scrape is factual. Simple examples are plots, reviews, ratings, votes and so on.
Given that, we’ve disabled the scraper and have contacted IMDb, both to let them know of the situation and to request permission to keep the scraper available to users.
I'd be sorry to see the imdb scraper go.. and... having a good look at their TOS, I don't think it needs to - they seem to be concerned about commercial use (highlighting is mine):
I am not even sure what the first line of the first paragraph means - it'd imply that you need a license to even view the site (but not to cache it!)... so I think we can ignore that safely enough. The implication in the second line is that to reproduce it they are only concerned about commercial use.
Similarly - if we consider the scraper to be a robot, then this only applies if it is for 'non-personal' or commercial use - which is not what we are doing.
This is rather interesting. I’d like to point out that IMDB provides what looks like their database for download here. So it’s feasible that someone could download the databases and host them on their own site. Has Boxee also been notified?
As a note to Ken this line does pop out:
Robots and Screen Scraping: You may not use data mining, robots, screen scraping, or similar data gathering and extraction tools on this site, except with our express written consent as noted below.
Allmusic.com also seems to have the same policy against this listed in their TOS. Was permission ever granted in that area?
Lastly there is the issue of user inconvenience: do we remove a critical feature from the software because of uncertainty about IMDB’s TOS? IIRC the IMDB scraper has been around for a few years now and IMDB has never raised an issue. This is something we will have to discuss internally before taking any action.
Also on top of all this is the issue of the other included scrapers that scrape PARTS of IMDB. For example, the sratim scraper also scrapes IMDB for ratings. So these, and any others that scrape parts of IMDB, will also need to be modified or removed.
I remember a while back, Allmusic would actually lock you out for a day or two if it detected scraping from your IP address. I used a scraper plug-in in JRiver Media Center that did just that. It's a bummer, but I guess that'll move us towards more open solutions in the long run...
Removing the IMDB scraper prior to getting any actual response from IMDB would be a massive blow for this project (XBMC included).
There is currently no site that offers the depth of information that IMDB does. While I’m not trying to rag on themoviedb.org, it’s got a proverbial Everest in front of it in order to get to IMDB’s level.
So far it doesn't look like we'll be removing the IMDB scraper from our package. I agree that it would be a huge setback for our project, and we would also have to remove or edit a number of other scrapers to become fully "compliant" (some of the other scrapers also get info from IMDB, as noted above). To the best of my knowledge, even team XBMC hasn't taken all these steps to ensure they are 100% compliant with imdb and allmusic. While Spiff (and others) have put a lot of work into writing the scrapers, they are released under the GPL, so we will have to respectfully decline the request to remove them from our packages.
That said, I'm glad to see themoviedb.org is gaining some traction. Hopefully we'll end up with a service as open and awesome as thetvdb.org in the near future.
Not with our help, though. The latest Plex version I have doesn't even have it included as an option.
I, for one, would rather support themoviedb.org than keep using IMDB, to tell the truth. A proper API and an initially open database are way preferable. As someone who has supported XBMC for years and Plex after that, I can readily accept that most of what we scrape from IMDB is there to feed our information addiction, but we seldom go further than art, fanart, name, year and, maybe, plot. In that order, too.
Moving to TMDB wouldn't be a "huge blow", as it hasn't been for XBMC (and those guys are even more obsessed about this than we could ever delude ourselves to think we are, they made the original scrapers). I have Plex on my iMac and XBMC on my original X-Box and I rather prefer what I'm getting there over IMDB (not even going into whether its data is not subject to license interpretations).
I don't mind Plex deciding not to remove IMDB, but I resent not being able to use TheMovieDB. I guess I could add the scrapers myself, but then I would lose them in every update.
Please, consider adding TheMovieDB. It can't hurt Plex and we already have some pretty useless scrapers there, so that can't be a reason.
The TheMovieDB scraper was removed in 0.7.8 because it was causing problems. It’s been put back into 0.7.9 (out now), along with a few other scraper & NFO related fixes.