
Conversation

@trz42 (Contributor) commented Dec 2, 2025

fixes #356

The PR looks much bigger than it actually is: for most lines, only the indentation has been reduced by one level / 4 spaces (previous version: lines 1217-1356; new version: lines 1229-1368).

Other changes are:

  • imports the requests library
  • replaces the existing logic that relied on the curl command with the requests library, uses the Link header in the response to obtain the next page of the paginated response (if the PR has more than 100 comments), and handles any exception that happens while querying GitHub's REST API (see the sketch after this list)
  • renames the variable that contains all comments from comments to all_comments
  • removes old lines 1358-1359, which are no longer needed
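
For illustration, here is a minimal sketch of that pagination approach (the function name fetch_all_comments, the optional token parameter, and the exact structure are assumptions for illustration, not the PR's actual code; the PR additionally wraps the loop in a try/except). The key mechanism is that requests parses the Link response header into response.links, from which the URL of the next page can be read:

    import requests

    def fetch_all_comments(url, token=None):
        """Hypothetical sketch: collect all items of a paginated GitHub endpoint."""
        headers = {}
        if token:
            headers["Authorization"] = f"Bearer {token}"
        all_comments = []
        params = {"per_page": 100}  # only needed for the first request; the
                                    # 'next' URL already carries the query string
        while url:
            response = requests.get(url, params=params, headers=headers, timeout=10)
            response.raise_for_status()
            all_comments.extend(response.json())
            # requests parses the Link header into response.links; the last
            # page has no 'next' entry, which ends the loop
            url = response.links.get("next", {}).get("url")
            params = None
        return all_comments

Called with the URL of a PR's comments endpoint, this returns the complete comment list even when it spans multiple pages.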

@trz42 added the bug label Dec 2, 2025
@casparvl (Contributor) commented:

Tested in casparvl/software-layer#6, and I can confirm it fixes the issue, since this report casparvl/software-layer#6 (comment) is complete (even though there are >> 100 comments and the build is comment 130-or-so).

I'll still need to review the actual code, but this is promising at least :)


try:
    while url:
        response = requests.get(url, params={'per_page': 100})
@casparvl (Contributor) commented on this line:

You should specify a timeout argument: the default is None, which means it'll wait forever if it doesn't get a response. I'm not sure what a good value is, but I guess 10 would be pretty reasonable.

Suggested change:
-    response = requests.get(url, params={'per_page': 100})
+    response = requests.get(url, params={'per_page': 100}, timeout=10)

@casparvl (Contributor) left a comment:

Should specify a timeout to avoid indefinite hangs.

We should also realize that we might hit the unauthenticated rate limit (60 requests/hour) quite easily. A token could help, but that would probably have to be set in the config of the bot... Some dummy code showing how it would be used here:

    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
...
    response = requests.get(url, params={'per_page': 100}, headers=headers, timeout=10)

It is a bit weird, since this means the bot would need someone's token (i.e. it would identify as an individual)... but it could be an optional configuration. Having the capability at least means an easy fix (an update of the bot config, rather than new bot code) if we ever hit this. I leave it up to you whether you think it's worth implementing in this PR - I'm also fine with doing it in a follow-up PR, or even just postponing until it is actually an issue.

Another thing that could be considered is to wrap this in a retry loop, to make it more resilient against transient 5xx errors or network hiccups.

Again, pseudo (well: unchecked) code:

import logging
import time

import requests

log = logging.getLogger(__name__)
max_retries = 3        # assumed value; would come from a constant or the bot config
per_page = 100
headers = {}           # plus an Authorization entry if a token is configured

while url:             # 'url' starts at the first page, as in the PR
    for attempt in range(1, max_retries + 1):
        try:
            resp = requests.get(
                url,
                params={"per_page": per_page},
                headers=headers,
                timeout=10,                 # avoid hanging forever
            )
            resp.raise_for_status()         # 4xx/5xx -> exception
            break                           # success -> exit retry loop
        except requests.HTTPError:
            # 5xx are usually transient - retry; 4xx are not
            if 500 <= resp.status_code < 600 and attempt < max_retries:
                wait = 2 ** attempt         # exponential backoff
                log.warning(
                    "Transient HTTP %s, retry %d/%d after %ds",
                    resp.status_code, attempt, max_retries, wait,
                )
                time.sleep(wait)
                continue
            # re-raise - caller will see the exception
            raise
        except requests.RequestException as req_err:
            # network glitches, timeouts, DNS failures, etc.
            if attempt < max_retries:
                wait = 2 ** attempt
                log.warning("Network error: %s - retry %d/%d after %ds",
                            req_err, attempt, max_retries, wait)
                time.sleep(wait)
                continue
            raise
    # process resp and follow the Link header to the next page here

Again, I'm not convinced we should do this in the current PR - we could also retry by just calling bot:status again. To me, a quick merge is more important, as it is currently very hard to get (complete) status overviews for builds with many targets (i.e. CPU+GPU), which is exactly where this functionality is most essential. Currently, it's almost impossible to check whether all CPU+GPU combinations for 2025.06 are covered.



Successfully merging this pull request may close these issues.

Overview table from bot:status last_build of the bot is incomplete for large numbers of builds
