python - How to prevent downloading HTML/text pages as .png
My program generates random links to a service that hosts images, grabs them, and downloads random images. The program makes a lot of requests, so it has to go through proxies.
Well, when the program is started, I give it the path to a fresh, large proxy list; however, some proxies do not connect to the website and return a custom HTML page, or the image service returns a message on the page: "You don't have permission to view this image." The program will still save the response, though, and download the page with a .png extension.
As a result, HTML/text pages are saved as .png files.
Is there a way I can prevent these pages from being downloaded, and download only actual images?
Thank you.
    if self.proxy != False:
        # make our requests go through the proxy
        self.opener.retrieve(url, filename)
    else:
        urllib.request.urlretrieve(url, filename)
I think you should change your logic.
If the proxy hits an error getting the page you asked for, it normally responds with an HTTP status code other than 200.
You should check, in order:
- that the HTTP status is 200
- that the Content-Type header returned is the correct type (in your case image/jpeg)
And for this type of task I suggest using the requests module.
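The two checks above could be sketched roughly as follows. This is only a minimal illustration, not the asker's actual code: it uses the standard library's urllib (as in the question) rather than requests, the names is_image_response and download_image are made up for the example, and it accepts any image/* type instead of image/jpeg alone. With requests, the same two values are available as response.status_code and response.headers["Content-Type"].

```python
import urllib.request


def is_image_response(status, content_type):
    """Return True only for a 200 response whose Content-Type is an image."""
    if status != 200:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return media_type.startswith("image/")


def download_image(url, filename, opener=None, timeout=10):
    """Save url to filename only if the response really is an image.

    `opener` is assumed to be a urllib.request.OpenerDirector, for example
    one built with a ProxyHandler (the proxy address below is hypothetical):
        urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"}))
    Returns True if a file was written, False otherwise.
    """
    opener = opener or urllib.request.build_opener()
    try:
        with opener.open(url, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type")
            if not is_image_response(response.status, content_type):
                return False  # HTML error page, permission message, etc.
            data = response.read()
    except OSError:
        # URLError and HTTPError (raised for 4xx/5xx replies) are OSError
        # subclasses, as are socket timeouts and refused connections.
        return False
    with open(filename, "wb") as f:
        f.write(data)
    return True
```

Because nothing is written to disk until both checks pass, a proxy's custom HTML page or the service's "permission" page never ends up as a .png file; the caller can simply move on to the next proxy or link when download_image returns False.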