use requests library instead of 'curl' and obtain all comments #357

base: develop

Conversation
Tested in casparvl/software-layer#6, I can confirm it fixes the issue, since this report casparvl/software-layer#6 (comment) is complete (even though there are >> 100 comments and the build is comment 130-or-so). I'll still need to review the actual code, but this is promising at least :)
try:
    while url:
        response = requests.get(url, params={'per_page': 100})
You should specify a timeout argument - the default is None, which means it'll wait forever if it doesn't get a response. I'm not sure what a good value is, but I guess 10 would be pretty reasonable.
Suggested change:
-        response = requests.get(url, params={'per_page': 100})
+        response = requests.get(url, params={'per_page': 100}, timeout=10)
Should specify a timeout to avoid indefinite hangs.
We should also realize that we might hit the unauthenticated rate limit (60 requests/hour) quite easily. A token could help, but that would probably have to be set in the config of the bot... Some dummy code for how it would be used here:
headers = {}
if token:
    headers["Authorization"] = f"Bearer {token}"
...
response = requests.get(url, params={'per_page': 100}, headers=headers, timeout=10)
It is a bit weird, since this means the bot would need someone's token (i.e. it would identify as an individual)... but it could be an optional configuration. Having the capability at least means an easy fix (updating the bot config, rather than deploying new bot code) if we ever hit this. I leave it up to you if you think it's worth implementing in this PR - I'm also fine with doing it in a follow-up PR, or even just postponing until it is actually an issue.
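As a rough illustration of how such an optional token could be wired in (the config file name, the section/option names, and the environment-variable fallback below are all assumptions about the bot's setup, not its actual code):

import configparser
import os

# hypothetical config lookup - the bot's real config layout may differ
config = configparser.ConfigParser()
config.read('app.cfg')  # assumed config file name
token = config.get('github', 'token', fallback=None) or os.environ.get('GITHUB_TOKEN')

headers = {}
if token:
    # authenticated requests get a much higher rate limit than the
    # unauthenticated 60 requests/hour
    headers['Authorization'] = f'Bearer {token}'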
Another thing that could be considered is to wrap this in a retry loop, to make it more resilient against transient 5xx errors or network hiccups.
Again, pseudo (well: unchecked) code:
import logging
import time

import requests

log = logging.getLogger(__name__)

max_retries = 5  # assumed value; tune as needed
per_page = 100
# 'url' and 'headers' set up as in the snippets above

while url:
    for attempt in range(1, max_retries + 1):
        try:
            resp = requests.get(
                url,
                params={"per_page": per_page},
                headers=headers,
                timeout=10,  # avoid hanging forever
            )
            resp.raise_for_status()  # 4xx/5xx -> exception
            break  # success -> exit retry loop
        except requests.HTTPError as http_err:
            # 5xx are usually transient - retry; 4xx are not
            status = http_err.response.status_code
            if 500 <= status < 600 and attempt < max_retries:
                wait = 2 ** attempt
                log.warning(
                    "Transient HTTP %s, retry %d/%d after %ds",
                    status, attempt, max_retries, wait,
                )
                time.sleep(wait)
                continue
            # re-raise - caller will see the exception
            raise
        except requests.RequestException as req_err:
            # network glitches, timeouts, DNS failures, etc.
            if attempt < max_retries:
                wait = 2 ** attempt
                log.warning("Network error: %s - retry %d/%d after %ds",
                            req_err, attempt, max_retries, wait)
                time.sleep(wait)
                continue
            raise
Again, I'm not convinced we should do this in the current PR - we could also retry by just calling bot:status again. To me, a quick merge is more important, as it is very hard to get (complete) status overviews for builds with many targets (i.e. CPU+GPU), which is exactly where this functionality is most essential. Currently, it's almost impossible to check whether all CPU+GPU combinations for 2025.06 are covered.
fixes #356
The PR looks much bigger than it actually is. For most lines, only the indentation has been reduced by one level / 4 spaces (previous version: lines 1217 - 1356, new version: lines 1229 - 1368).
Other changes are:
- replaced the curl command with the requests library, plus using the Link header in the response to obtain the next page of the paginated response (if the PR has more than 100 comments), and handling an exception if any happens while querying GitHub's REST API (see the sketch below)
- renamed comments to all_comments
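For reference, a minimal sketch of what that Link-header pagination can look like with requests (function and variable names here are illustrative assumptions, not the PR's actual code):

import requests


def get_all_comments(pr_comments_url, headers=None, per_page=100):
    # illustrative only: pr_comments_url would be something like
    # https://api.github.com/repos/OWNER/REPO/issues/NUMBER/comments
    all_comments = []
    url = pr_comments_url
    params = {'per_page': per_page}
    while url:
        response = requests.get(url, params=params, headers=headers, timeout=10)
        response.raise_for_status()
        all_comments.extend(response.json())
        # requests parses the Link header into response.links; the 'next'
        # entry is absent on the last page, which ends the loop
        url = response.links.get('next', {}).get('url')
        params = None  # the 'next' URL already carries the query parameters
    return all_comments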