Skip to content

Conversation

@bernhardkaindl
Copy link
Contributor

@bernhardkaindl bernhardkaindl commented Sep 20, 2025

No functional change, only the function is split

collect_data has a cyclomatic complexity of 21. It is the 3rd-highest in xen-bugtool, and code review tools warn that it should be simplified.

  • The first small step would be to split it into smaller parts and add comments to make the code more digestible.
  • This makes it easier to review and write unit tests for the smaller parts (before making changes).

Below are the instructions for reviewing the diff without whitespace to confirm that it only split:

  • Only the indentation of the code changes in this PR
    • No functional change, shifted lines are not touched.

Commits:

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>

Pull Request Overview

Updates Python CI to use Python 3.11 and refactors the large collect_data() function by extracting file and function output collection into separate, more manageable functions.

  • Updates Python version from 3.10 to 3.11 across CI configuration files
  • Breaks down collect_data() function into smaller, focused functions for better maintainability
  • Updates test expectations to reflect the new function structure

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
xen-bugtool Refactors collect_data() by extracting proc file and function output collection into separate functions
tests/unit/test_output.py Updates test to expect new function name in stack traces
pytype.cfg Updates Python version to 3.11 for static type analysis
.pre-commit-config.yaml Updates pytype configuration to use local repo and Python 3.11
.github/workflows/main.yml Updates CI environment to use Python 3.11
.codecov.yml Updates coverage configuration flags to reference Python 3.11

No functional change, only the function is split

The changes without changing the indentation (white space) can be verified using:

- Install the `gh` CLI (e.g. from https://github.com/cli/cli/releases)
- `gh auth login`
- `git clone https://github.com/xenserver/status-report; cd cd status-report`
- `gh pr checkout 144`
- `git diff -w origin/master`

Excluding the other commits, the code change this shows (without showing the whitespace) is:
```py
diff --git a/xen-bugtool b/xen-bugtool
index b114f4f..fc7b483 100755
--- a/xen-bugtool
+++ b/xen-bugtool
@@ -713,10 +713,20 @@ def collect_data(subdir, archive):
     # Afterwards, traverse the directory specifications for files to add
     traverse_directory_specifications(directory_specifications, entries)

+    collect_proc_files_and_func_outputs(subdir, archive)
+
+
+def collect_proc_files_and_func_outputs(subdir, archive):
+    """Collect all /proc and function outputs and add them to the output archive"""
     # Then, loop over the files which were found and collect their contents
     for k, v in data.items():
         if "cmd_args" in v:
             continue  # commands processing has been moved to a different loop
+        collect_proc_file_and_func_output(subdir, archive, k, v)
+
+
+def collect_proc_file_and_func_output(subdir, archive, k, v):
+    """Collect a /proc file or function output and add it to the output archive"""
     name = construct_filename(subdir, k, v)
     cap = v["cap"]
     filename = v.get("filename")
@@ -726,6 +736,9 @@ def collect_data(subdir, archive):
             f = open(v["filename"], "rb")
             s = f.read(unlimited_data and -1 or caps[cap][MAX_SIZE])
             f.close()
+
+            # If unlimited_data is set, or the file is smaller than the max_size,
+            # or the file is empty, include it in the archive:
             if unlimited_data or caps[cap][MAX_SIZE] == -1 or \
                     cap_sizes[cap] < caps[cap][MAX_SIZE] or len(s) == 0:
                 v['output'] = StringIOmtime(s)

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
@bernhardkaindl bernhardkaindl force-pushed the breakup-collect_data-into-manageable-parts branch from b22f497 to 6200a75 Compare September 20, 2025 15:35
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@bernhardkaindl bernhardkaindl changed the title Refs/heads/breakup collect data into manageable parts Break collect_data() (2nd-largest function) into more manageable parts: Sep 20, 2025
@bernhardkaindl bernhardkaindl changed the title Break collect_data() (2nd-largest function) into more manageable parts: Break collect_data() (2nd-largest function) into more manageable parts Sep 20, 2025
@bernhardkaindl bernhardkaindl changed the title Break collect_data() (2nd-largest function) into more manageable parts Refactor: Split collect_data() into more smaller parts Nov 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant