In the process_request function, the proxy is passed to the request only if it has a proxy_user_pass; otherwise the function only logs that the proxy is being used and how many proxies are left. Does that mean that a proxy like https://176.37.14.252:8080 does not work?
This is the function:
    def process_request(self, request, spider):
        # Don't overwrite with a random one (server-side state for IP)
        if 'proxy' in request.meta:
            if request.meta["exception"] is False:
                return
        request.meta["exception"] = False
        if len(self.proxies) == 0:
            raise ValueError('All proxies are unusable, cannot proceed')

        if self.mode == Mode.RANDOMIZE_PROXY_EVERY_REQUESTS:
            proxy_address = random.choice(list(self.proxies.keys()))
        else:
            proxy_address = self.chosen_proxy

        proxy_user_pass = self.proxies[proxy_address]

        if proxy_user_pass:
            request.meta['proxy'] = proxy_address
            basic_auth = 'Basic ' + base64.b64encode(proxy_user_pass.encode()).decode()
            request.headers['Proxy-Authorization'] = basic_auth
        else:
            log.debug('Proxy user pass not found')
        log.debug('Using proxy <%s>, %d proxies left' % (
            proxy_address, len(self.proxies)))
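
For comparison, here is a minimal sketch of one possible fix, assuming the intent is that proxies without credentials should still be used. Scrapy routes a request through whatever address is stored in request.meta['proxy'], so the sketch sets that key unconditionally and adds the Proxy-Authorization header only when credentials exist. The helper name apply_proxy is hypothetical and not part of the middleware; plain dicts stand in for request.meta and request.headers so the snippet runs without Scrapy.

```python
import base64


def apply_proxy(meta, headers, proxy_address, proxy_user_pass=None):
    """Hypothetical helper: attach a proxy to request-like dicts.

    `meta` and `headers` stand in for request.meta and request.headers.
    """
    # Always set the proxy, whether or not credentials are present.
    meta['proxy'] = proxy_address

    # Only add the auth header when a "user:password" string is available.
    if proxy_user_pass:
        token = base64.b64encode(proxy_user_pass.encode()).decode()
        headers['Proxy-Authorization'] = 'Basic ' + token

    return meta, headers


# A proxy without credentials is still applied to the request.
meta, headers = apply_proxy({}, {}, 'https://176.37.14.252:8080')
print(meta['proxy'])
print('Proxy-Authorization' in headers)
```

Inside process_request, the equivalent change would be moving the request.meta['proxy'] assignment out of the `if proxy_user_pass:` branch; whether that matches the maintainers' intent is an assumption.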
ravillarreal changed the title from "How to check that a proxy is being used?" to "How to check that a proxy is really being used?" on Aug 16, 2018
I ran a test with this middleware: without a proxy_user_pass (I don't have one to test with), the proxy is not used:
    import scrapy


    class MyipSpider(scrapy.Spider):
        name = 'myip'
        start_urls = ['http://www.mon-ip.com']

        def parse(self, response):
            # The page reports the IP address it sees for the client.
            for ip in response.xpath('//*[@id="PageG"]'):
                yield {
                    'ip': ip.xpath('p[3]/span[2]//text()').extract_first(),
                }
gives:
    2018-08-28 15:17:10 [scrapy.proxies] DEBUG: Using proxy <https://pro.xy.add.ress:port>, x proxies left
    [...]
    2018-08-28 15:17:10 [scrapy.core.scraper] DEBUG: Scraped from <200 http://www.mon-ip.com>
    {'ip': 'my.ip.add.ress'}