question@mail.ru · 01.01.1970 03:00

Asynchronous HTTP calls with grequests

I have always used the requests library. But as far as I remember, with it you can send a request to only one URL and only then to the next one, in turn. grequests, according to its description, can send requests to at least 10 URLs simultaneously, asynchronously. But I don't know how to do it, although I have already tried. The file contains approximately 150 URLs. Code:

    import grequests

    simplesite = 'http://shost-craft.su'
    with open("C:\\cruelnetwork\\cruel.need\\wolfs.txt") as werewolves:
        array = [row.strip() + simplesite for row in werewolves]
    params = {'a': 'b', 'c': 'd'}
    rs = (grequests.post(u, data=params) for u in array)
    grequests.map(rs)
    print(rs)

I waited for about a minute, which I understand is a very long time, and I don't know how to solve this problem. Or is it supposed to take that long? At the end, this was printed on the screen: <generator object <genexpr> at 0x0000029AD8D29A98>. I want to send requests to 10 URLs at once, not in turn. Or can this not be done with this library?

answer@mail.ru · 01.01.1970 03:00

The grequests library is an asynchronous wrapper over regular requests. Accordingly, when you pass a batch of request objects to grequests.map(), you receive a list of response objects, something like this:

    [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, None, <Response [200]>]

You then work with them as with regular requests.Response objects.

For example, to see the result of the first request in your code, try this:

    import grequests

    simplesite = 'http://shost-craft.su'
    with open("C:\\cruelnetwork\\cruel.need\\wolfs.txt") as werewolves:
        array = [row.strip() + simplesite for row in werewolves]
    params = {'a': 'b', 'c': 'd'}
    rs = (grequests.post(u, data=params) for u in array)
    responses_list = grequests.map(rs)  # blocks until the whole batch is done
    print(responses_list[0].text)       # body of the first response
    print(rs)                           # still prints the generator object, not the responses
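One caveat with indexing responses_list directly: as the sample output above shows, the list can contain None for requests that did not complete, so responses_list[0].text may fail. A minimal sketch of filtering those out follows. The response objects here are hypothetical stand-ins built with SimpleNamespace so that the example runs without network access; with real code the list would come from grequests.map().

```python
from types import SimpleNamespace

# Hypothetical stand-ins for what grequests.map() returns:
# response-like objects for completed requests, None for failed ones.
responses_list = [
    SimpleNamespace(status_code=200, url="http://example.invalid/a", text="ok"),
    None,
    SimpleNamespace(status_code=404, url="http://example.invalid/b", text="not found"),
]

# Keep only the requests that actually produced a response.
completed = [r for r in responses_list if r is not None]
failed_count = len(responses_list) - len(completed)

for r in completed:
    print(r.status_code, r.url)
print("failed requests:", failed_count)
```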

If you specify exactly what you want to get, it may be possible to give more precise recommendations.

EDIT: under the hood, this library uses gevent with a task pool (it is worth reading more about gevent and about asynchrony in general). The map() call blocks until the entire batch has finished, but the tasks inside the batch do not block each other. You can control the pool size. I wrote "of the first request" above because I didn't bother with a loop. I can suggest this solution:

    import grequests

    simplesite = 'http://shost-craft.su'
    with open("C:\\cruelnetwork\\cruel.need\\wolfs.txt") as werewolves:
        array = [row.strip() + simplesite for row in werewolves]
    params = {'a': 'b', 'c': 'd'}
    rs = [grequests.post(u, data=params) for u in array]
    for r in grequests.imap(rs, size=10):
        print(r.status_code, r.url)
    print(rs)  # the list of prepared request objects, not the responses

size=10 means that ten tasks are put into the pool at once, and as soon as one of them completes, another one is added (useful in case of performance problems).

imap in a loop lets you see each result as soon as its task completes.

If you need truly parallel threads, then yes, just spawn threads or fork processes or something else; there are a lot of options.
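As an illustration of the thread-based option, here is a minimal sketch that uses the standard library's concurrent.futures instead of grequests. It keeps at most ten tasks in flight and handles each result as soon as it is ready, which mirrors the size=10 and imap behaviour described above. The fake_post function and the example.invalid URLs are hypothetical placeholders so that the sketch runs without network access; in real code the function would call requests.post.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_post(url):
    """Hypothetical stand-in for requests.post: sleeps instead of using the network."""
    time.sleep(0.05)
    return url, 200


# Placeholder URLs; in the question these would be read from the file.
urls = [f"http://example.invalid/{i}" for i in range(30)]

results = []
start = time.monotonic()
with ThreadPoolExecutor(max_workers=10) as pool:  # at most 10 requests in flight
    futures = [pool.submit(fake_post, u) for u in urls]
    for future in as_completed(futures):  # yields each future as it finishes
        url, status = future.result()
        results.append((url, status))
elapsed = time.monotonic() - start

print(len(results), "responses in", round(elapsed, 2), "s")
```

Run sequentially, the thirty simulated requests would take about 1.5 seconds; with ten workers they finish in a fraction of that.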

EDIT2: I apologize for my incompetence on the issue of response statuses. The situation is as follows. The grequests author does not use the error classes from requests, but does this:

    ...
    def send(self, **kwargs):
        """
        Prepares request based on parameter passed to constructor and optional ``kwargs``.
        Then sends request and saves response to :attr:`response`

        :returns: ``Response``
        """
        merged_kwargs = {}
        merged_kwargs.update(self.kwargs)
        merged_kwargs.update(kwargs)
        try:
            self.response = self.session.request(self.method, self.url, **merged_kwargs)
        except Exception as e:
            self.exception = e
            self.traceback = traceback.format_exc()
        return self

i.e. it catches any Exception and stores it on the request object instead of raising it. We can handle it this way:

    import grequests

    def exception_handler(request, exception):
        print("Request failed", request.url)  # report the URL that failed
        # print(str(exception))  # if you want more details

    simplesite = 'http://shost-craft.su'
    with open("C:\\cruelnetwork\\cruel.need\\wolfs.txt") as werewolves:
        array = [row.strip() + simplesite for row in werewolves]
    params = {'a': 'b', 'c': 'd'}
    rs = [grequests.post(u, data=params) for u in array]
    for r in grequests.imap(rs, size=10, exception_handler=exception_handler):
        print(r.status_code, r.url)

Now, if a request was not fulfilled for some reason, we will find out about it. You can only get the status_code of requests that ended without an Exception. That is, network-level failures (connection errors, timeouts and the like) do not make it into the final list, whereas HTTP error statuses such as 404 still arrive as ordinary responses, because requests does not raise an exception for them on its own. However, the question remains how to assign a status to the failed requests. My assumption is that you can inspect the exception, compare it with one of the error types, and then map it to a status yourself. But considering that all the sources I found simply ignore this issue, it's up to you.
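A minimal sketch of that assumption: the handler keeps the same (request, exception) signature as above and records a pseudo-status for every failed URL by looking at the exception's class hierarchy. The labels in STATUS_BY_ERROR, the classify helper and the failed dictionary are made up for illustration and are not part of grequests. Matching by class name is a deliberate simplification so that the sketch runs without requests installed; the fake request object is likewise a stand-in.

```python
from types import SimpleNamespace

# Hypothetical mapping from exception class names to pseudo-statuses.
# The labels are invented for this example; adjust them to your needs.
STATUS_BY_ERROR = {
    "Timeout": "timeout",
    "ConnectionError": "unreachable",
}

failed = {}  # url -> pseudo-status of requests that raised an exception


def classify(exception):
    """Return a pseudo-status based on the exception's class hierarchy."""
    for cls in type(exception).__mro__:
        if cls.__name__ in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls.__name__]
    return "error"  # anything we do not recognise


def exception_handler(request, exception):
    """Same signature as the handler above: remember why each URL failed."""
    failed[request.url] = classify(exception)


# Stand-in for a failed request object; only the .url attribute is used here.
fake_request = SimpleNamespace(url="http://example.invalid/")
exception_handler(fake_request, ConnectionError("connection refused"))
print(failed)
```

Passed as exception_handler=exception_handler to grequests.imap, such a handler would fill the failed dictionary while the loop prints the status codes of the requests that did get a response.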
