Weird HTTP request hijinks.

TehCrucible · December 23, 2013, 11:15pm

Hey guys,

One of my channels broke on its own overnight. It seems that now, whenever I make an HTML.ElementFromURL request to that domain, I am only returned the following element:

Eg:

2013-12-24 09:01:27,261 (-4f863490) :  DEBUG (networking:172) - Requesting 'http://www.animeplus.tv/popular-anime/1'
2013-12-24 09:01:28,138 (-4f863490) :  INFO (__init__:57) - 
2013-12-24 09:01:28,139 (-4f863490) :  INFO (__init__:77) - No shows found! Check xpath queries.

Previously, the same code worked fine. Considering this error appeared overnight and the meta tag is seemingly redirecting me to nothing, could this be the website attempting to block my HTTP requests? Or something else? Is there any way around this? The actual source page doesn't include that particular meta tag, so I'm not even sure where it's coming from. Is there a tool I can use to see what's actually going on when we do these requests?

Channel code here. Thanks in advance.

dane22 · December 23, 2013, 11:22pm

The URL in a browser returned a list of Anime to me, so maybe you are blocked somehow.....

The only thing I can think of to dig into this is low-level analysis, meaning Wireshark

But that will still only tell you about what is going on in your case, and not the back-end logic

/Tommy

TehCrucible · December 24, 2013, 2:16am

Thanks Tommy, yeah the URL works fine in a browser for me as well. It's only when using the HTTP request from Plex that it fails, which is what I find strange. The log excerpt above is the result of me calling HTML.ElementFromURL, then using HTML.StringFromElement on that element, and finally logging the resulting string. It appears that all I'm getting from the initial request is the quoted line above. I'll check out Wireshark. Thanks for the link.
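As a lighter alternative to Wireshark for this kind of question, here is a minimal sketch that prints the request and response headers of a single fetch. It assumes standalone Python 3 with only the standard library, not the Plex framework, and the names make_debug_opener and dump_exchange are made up for illustration.

```python
import urllib.request


def make_debug_opener():
    """Return an opener that prints request and response headers to stdout."""
    return urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )


def dump_exchange(url, opener=None):
    """Fetch `url` once and return (status, header pairs, body text).

    Header pairs are returned as a list so that repeated headers such as
    Set-Cookie are not collapsed into one entry.
    """
    opener = opener or make_debug_opener()
    with opener.open(url, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, list(response.headers.items()), body
```

Calling dump_exchange with the channel URL should show whether the first response carries a Set-Cookie header and what the body really contains, which is the same information the log excerpt was trying to surface.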

TehCrucible · January 3, 2014, 11:52am

Ok, I'm completely stumped here. Would someone mind taking a look at my plugin (in the original post) and see if they can provide any insight? As far as I can tell, I'm doing everything right and I'm not getting anything back from my HTTP requests. Other channels/websites work fine using the same code? Is it possible that the devs behind the website can identify and block/redirect requests specifically from the Plex API?

meo · January 3, 2014, 12:55pm

Seems like the following is happening (logged from Safari):

1. The URL is requested.
2. The website returns the above HTML together with a Set-Cookie header.
3. The URL is requested again, but now with the cookie set from step 2.
4. The website returns a redirect (302) together with a new cookie (session id).
5. The redirected URL is requested with both cookies set from steps 2 and 4.
6. The "real" webpage is returned.

Using cURL gives the same result as the Plex API. I guess you have to build your own logic around this; it seems like a DoS (Denial of Service) protection mechanism?
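The request sequence listed above can be approximated outside Plex with a cookie jar and a bounded retry. This is only a sketch, assuming standalone Python 3 and its standard library; make_cookie_opener, urllib_fetch and fetch_like_browser are hypothetical names, and the retry limit of 3 is an arbitrary choice. urllib follows the 302 on its own, and the cookie processor resends whatever cookies the site has set.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Build an opener that stores and resends cookies, like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def urllib_fetch(opener):
    """Wrap an opener as a simple callable: URL in, body text out."""
    def fetch(url):
        with opener.open(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    return fetch


def fetch_like_browser(url, fetch, max_attempts=3):
    """Re-request `url` while the body is still a meta-refresh stub.

    `fetch` is any callable that takes a URL and returns the body as text.
    Gives up after `max_attempts` requests and returns the last body seen.
    """
    body = fetch(url)
    attempts = 1
    while 'http-equiv="refresh"' in body.lower() and attempts < max_attempts:
        body = fetch(url)
        attempts += 1
    return body
```

Usage would look like `opener, jar = make_cookie_opener()` followed by `fetch_like_browser(url, urllib_fetch(opener))`. Passing the fetch function in keeps the retry logic separate from the HTTP client, so the same loop could wrap a different client.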

TehCrucible · January 3, 2014, 1:44pm

Thanks for the quick reply! That would make sense. However I was under the impression that the Plex API would handle cookies for me? Could you recommend another plugin that uses session id cookies for me to experiment with?

meo · January 3, 2014, 2:32pm

Yes, you're right, the Plex API will handle the cookie part (I must have remembered it wrong). The problem is that the HTML code is telling the browser to refresh, and therefore we must do the same in the plugin.

I added the following routine to __init__.py:

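def HTMLElementFromURL(url):
	# First request: the site may answer with only a meta-refresh stub.
	request = HTTP.Request(url)
	if 'http-equiv="refresh"' in request.content:
		# Request the same URL again, bypassing the cache.
		request = HTTP.Request(url, cacheTime = 0)

	return HTML.ElementFromString(request.content)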

and replaced all calls to the standard 'HTML.ElementFromURL' with the above.

Not the nicest piece of code but it did work at least :)
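If the plain substring test ever proves too brittle (different quoting, upper-case attributes), the refresh tag could be detected with an HTML parser instead. This is a rough sketch, assuming standalone Python 3 and its standard library rather than the Plex framework; find_meta_refresh is a made-up helper name, not part of any Plex API.

```python
from html.parser import HTMLParser


class _RefreshFinder(HTMLParser):
    """Remember the content attribute of the first <meta http-equiv="refresh">."""

    def __init__(self):
        super().__init__()
        self.refresh_content = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased; values keep their case.
        if tag != "meta" or self.refresh_content is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "refresh":
            self.refresh_content = attributes.get("content") or ""


def find_meta_refresh(html):
    """Return the refresh tag's content value, or None if there is no such tag."""
    finder = _RefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refresh_content
```

A result other than None would mean the page is the refresh stub and should be requested again.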

Gerk · January 3, 2014, 4:44pm

It seems like their server (whoever is hosting that site) is actually not configured correctly ... they are returning raw PHP code within that request when I do a direct cURL of it ...

$ curl "http://www.animeplus.tv/popular-anime/"

TehCrucible · January 3, 2014, 9:30pm

Thanks guys; awesome work Meo. Works great. You're right, it's not pretty, but it'll do. Thanks a million.

system · December 21, 2019, 12:46am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
