
Conversation

@trz42 (Contributor) commented Dec 2, 2025

fixes #356

The PR looks much bigger than it actually is: for most lines, only the indentation has been reduced by one level / 4 spaces (previous version: lines 1217-1356; new version: lines 1229-1368).

Other changes are:

  • imports the requests library
  • replaces the existing logic that relied on the curl command with the requests library, uses the Link header in the response to obtain the next page of the paginated response (if the PR has more than 100 comments), and handles any exception that happens while querying GitHub's REST API (see the sketch after this list)
  • renames the variable that contains all comments from comments to all_comments
  • removes old lines 1358-1359, which are no longer needed
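
For illustration, here is a minimal sketch of that pagination approach (the function name fetch_all_comments, the optional token parameter, and the exact structure are assumptions for illustration, not the PR's actual code; the PR additionally wraps the loop in a try/except). The key mechanism is that requests parses the Link response header into response.links, from which the URL of the next page can be read:

    import requests

    def fetch_all_comments(url, token=None):
        """Hypothetical sketch: collect all items of a paginated GitHub endpoint."""
        headers = {}
        if token:
            headers["Authorization"] = f"Bearer {token}"
        all_comments = []
        params = {"per_page": 100}  # only needed for the first request; the
                                    # 'next' URL already carries the query string
        while url:
            response = requests.get(url, params=params, headers=headers, timeout=10)
            response.raise_for_status()
            all_comments.extend(response.json())
            # requests parses the Link header into response.links; the last
            # page has no 'next' entry, which ends the loop
            url = response.links.get("next", {}).get("url")
            params = None
        return all_comments

Called with the URL of a PR's comments endpoint, this returns the complete comment list even when it spans multiple pages.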

@trz42 added the bug label Dec 2, 2025
@casparvl (Contributor) commented:

Tested in casparvl/software-layer#6, and I can confirm it fixes the issue, since this report casparvl/software-layer#6 (comment) is complete (even though there are >> 100 comments and the build is comment 130-or-so).

I'll still need to review the actual code, but this is promising at least :)


try:
    while url:
        response = requests.get(url, params={'per_page': 100})
@casparvl (Contributor) commented on this line:

You should specify a timeout argument: the default is None, which means it'll wait forever if it doesn't get a response. I'm not sure what a good value is, but I guess 10 would be pretty reasonable.

Suggested change:
-    response = requests.get(url, params={'per_page': 100})
+    response = requests.get(url, params={'per_page': 100}, timeout=10)

@casparvl (Contributor) left a comment:

Should specify a timeout to avoid indefinite hangs.

We should also realize that we might hit the unauthenticated rate limit (60 requests/hour) quite easily. A token could help, but that would probably have to be set in the config of the bot... Some dummy code showing how it would be used here:

    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
...
    response = requests.get(url, params={'per_page': 100}, headers=headers, timeout=10)

It is a bit weird, since this means the bot would need someone's token (i.e. it would identify as an individual)... but it could be an optional configuration. Having the capability at least means an easy fix (an update of the bot config, rather than new bot code) if we ever hit this. I leave it up to you whether you think it's worth implementing in this PR - I'm also fine with doing it in a follow-up PR, or even just postponing until it is actually an issue.

Another thing that could be considered is to wrap this in a retry loop, to make it more resilient against transient 5xx errors or network hiccups.

Again, pseudo (well: unchecked) code:

import logging
import time

import requests

log = logging.getLogger(__name__)
max_retries = 3        # assumed value; would come from a constant or the bot config
per_page = 100
headers = {}           # plus an Authorization entry if a token is configured

while url:             # 'url' starts at the first page, as in the PR
    for attempt in range(1, max_retries + 1):
        try:
            resp = requests.get(
                url,
                params={"per_page": per_page},
                headers=headers,
                timeout=10,                 # avoid hanging forever
            )
            resp.raise_for_status()         # 4xx/5xx -> exception
            break                           # success -> exit retry loop
        except requests.HTTPError:
            # 5xx are usually transient - retry; 4xx are not
            if 500 <= resp.status_code < 600 and attempt < max_retries:
                wait = 2 ** attempt         # exponential backoff
                log.warning(
                    "Transient HTTP %s, retry %d/%d after %ds",
                    resp.status_code, attempt, max_retries, wait,
                )
                time.sleep(wait)
                continue
            # re-raise - caller will see the exception
            raise
        except requests.RequestException as req_err:
            # network glitches, timeouts, DNS failures, etc.
            if attempt < max_retries:
                wait = 2 ** attempt
                log.warning("Network error: %s - retry %d/%d after %ds",
                            req_err, attempt, max_retries, wait)
                time.sleep(wait)
                continue
            raise
    # process resp and follow the Link header to the next page here

Again, I'm not convinced we should do this in the current PR - we could also retry by just calling bot:status again. To me, a quick merge is more important, as it is currently very hard to get (complete) status overviews for builds with many targets (i.e. CPU+GPU), which is exactly where this functionality is most essential. Currently, it's almost impossible to check whether all CPU+GPU combinations for 2025.06 are covered.



Successfully merging this pull request may close these issues.

Overview table from bot:status last_build of the bot is incomplete for large numbers of builds
