Skip to content

ciq_tag: lib and command line tool to operate on CIQ commit tags#48

Open
pvts-mat wants to merge 3 commits intoctrliq:mainlinefrom
pvts-mat:commit-msg-lib
Open

ciq_tag: lib and command line tool to operate on CIQ commit tags#48
pvts-mat wants to merge 3 commits intoctrliq:mainlinefrom
pvts-mat:commit-msg-lib

Conversation

@pvts-mat
Copy link
Contributor

@pvts-mat pvts-mat commented Nov 30, 2025

About

A library and a command line tool to get, insert, modify and delete CIQ tags from commit messages.

Features:

  • Pure python 3

  • Minimal dependencies

    pip3 install more-itertools
    

    (can be avoided if required)

  • Tools-indepentent

    • Operates on string level, doesn't call git underneath (or any other external commands for that matter).
    • Input / output can be a stdin / stdout or a file.
  • Robust

    • Accounts for different alternative keywords found in the wild (cve-bf, cve-bugfix, cve-update, …)
    • Accounts for different keyword separators (space, :)
    • Deals with complex multiline tags like upstream-diff regardless of their formatting and indentation.
  • Handling the formatting

    • Automatic wrapping / unwrapping of long tag values.
    • Indenting / unindenting the values
    • Auto controlling of the empty lines separating the tags from the rest of commit message, ensuring readability.
  • Full-featured

    • All tools in place to operate on the commit's tags without editing the message by hand.
    • Values can be read from a file if some specific formatting is desired (eg. bullet points in the upstream-diff)
    • Support for multiple tags of the same type (eg. multiple jira tags).
  • Extendable.

    • New tags or alternative spellings of keywords can be very easily added.

By encapsulating the tags editing operations this tool allows for automating those revision history maintenance tasks which previously had to be done by hand or with the use of ad-hoc greps and seds..

Examples

Adding tags

Input

$ git log --pretty=%B -n 1 94d10a4dba0bc482f2b01e39f06d5513d0f75742
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error

tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.

[…]

Output

$ git log --pretty=%B -n 1 94d10a4dba0bc482f2b01e39f06d5513d0f75742 \
  | ./ciq_tag.py add cve CVE-2025-38089
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error

cve CVE-2025-38089

tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.

[…]

Automatic tracking of the proper tags order

(jira tag should be before cve)

$ git log --pretty=%B -n 1 94d10a4dba0bc482f2b01e39f06d5513d0f75742 \
  | ./ciq_tag.py add cve CVE-2025-38089 \
  | ./ciq_tag.py add jira VULN-71607
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error

jira VULN-71607
cve CVE-2025-38089

tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.

[…]

Adding multiple instances of the same tag if necessary

$ git log --pretty=%B -n 1 94d10a4dba0bc482f2b01e39f06d5513d0f75742 \
  | ./ciq_tag.py add cve CVE-2025-38089 \
  | ./ciq_tag.py add jira VULN-71607 \
  | ./ciq_tag.py add jira VULN-70617
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error

jira VULN-71607
jira VULN-70617
cve CVE-2025-38089

tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.

[…]

Modifying existing tags

Input

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Output

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py modify cve CVE-2025-32867
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-32867
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Return nonzero exit code if tag not found

(but produce the output unchanged)

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py modify cve-pre CVE-2025-32867
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]
$ echo $?
1

Use set to insert anyway if not exist

This differs from add which would always insert the tag, while set first attempts to modify the existing one and only if that didn't succeed - to insert the new instance.

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py set cve-pre CVE-2025-32867
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
cve-pre CVE-2025-32867
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Deleting tags

Input

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Basic usage

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py delete jira
vsock: Orphan socket after transport release

cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Handling the multi-line properties well

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py delete upstream-diff
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Clean all metadata

Shring the vertical separation if needed

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py delete jira \
  | ./ciq_tag.py delete upstream-diff \
  | ./ciq_tag.py delete cve \
  | ./ciq_tag.py delete commit-author \
  | ./ciq_tag.py delete commit
vsock: Orphan socket after transport release

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Getting the values of tags

Input

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Basic usage

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py get cve
CVE-2025-21756

Return error if tag not found

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py get cve-pre
$ echo ?$
1

Extract the raw upstream-diff

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py get upstream-diff
Indentation is different from upstream change because
              this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change
              content is the same.

Avoid the indentation

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py get upstream-diff --dedent
Indentation is different from upstream change because
this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
proto::close on close()").  Otherwise the change
content is the same.

Unwrap the text to abstract from formatting where it get in the way

$ git log --pretty=%B -n 1 be2f55fc817f6dfb1ca1d7cd267766764798c9f9 \
  | ./ciq_tag.py get upstream-diff --unwrap
Indentation is different from upstream change because this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on close()").  Otherwise the change content is the same.

Handle also other multiline properties

Input:

$ git log --pretty=%B -n 1 a99ddaf06a3c93645b0b2ab435d5be584e63733d
x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto

jira LE-2015
cve CVE-2024-2201
Rebuild_History Non-Buildable kernel-5.14.0-427.42.1.el9_4
commit-author Josh Poimboeuf <jpoimboe@kernel.org>
commit 36d4fe147c870f6d3f6602befd7ef44393a1c87a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-427.42.1.el9_4/36d4fe14.failed

Unlike most other mitigations' "auto" options, spectre_bhi=auto only
mitigates newer systems, which is confusing and not particularly useful.

[…]

Output:

$ git log --pretty=%B -n 1 a99ddaf06a3c93645b0b2ab435d5be584e63733d \
  | ./ciq_tag.py get Empty-Commit 
Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-427.42.1.el9_4/36d4fe14.failed

Autoformatting of values

Input

$ cat msg.txt
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Set upstream-diff right from the command line

$ cat msg.txt \
  | ./ciq_tag.py set --wrap upstream-diff 'Indentation is different from upstream change because this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on close()").  Otherwise the change content is the same.'
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because this
kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on
close()").  Otherwise the change content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Set indentation to a fixed value

$ cat msg.txt \
  | ./ciq_tag.py set --indent 4 --wrap upstream-diff 'Indentation is different from upstream change because this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on close()").  Otherwise the change content is the same.'
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because this
    kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on
    close()").  Otherwise the change content is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Set indent to align with the keyword

$ cat msg.txt \
  | ./ciq_tag.py set --indent -1 --wrap upstream-diff 'Indentation is different from upstream change because this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on close()").  Otherwise the change content is the same.'
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from upstream change because this
              kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke
              proto::close on close()").  Otherwise the change content
              is the same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

Control the width

$ cat msg.txt \
  | ./ciq_tag.py set --indent -1 --wrap --wrap-width 50 upstream-diff 'Indentation is different from upstream change because this kernel lacks 135ffc7becc8 ("bpf, vsock: Invoke proto::close on close()").  Otherwise the change content is the same.'
vsock: Orphan socket after transport release

jira VULN-53609
cve CVE-2025-21756
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb
upstream-diff Indentation is different from
              upstream change because this kernel
              lacks 135ffc7becc8 ("bpf, vsock:
              Invoke proto::close on close()").
              Otherwise the change content is the
              same.

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

[…]

@pvts-mat
Copy link
Contributor Author

No docs in the code - will be added if the tool finds acceptance. Same for the usage examples from the library POV

@roxanan1996
Copy link
Contributor

I am not sure how often I would use this tool, except for maybe some mistakes like modifying the upstream-diff comment later. The rest of the tags should be automated.

But, I think the library would be very useful to improve the current tooling. Can you separate the library from the CLI tool to make things easier?

@roxanan1996
Copy link
Contributor

pip3 install more-itertools

Add this as dependency in pyproject.toml. As explained here. I realized there's no readme with this, will add this soon.

And solve the formatting error in the pr_check.

Copy link
Contributor

@roxanan1996 roxanan1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will continue reviewing and testing tomorrow, but first the formatting has to be addressed.

ciq_tag.py Outdated
args = read_args()
for c in Parameter:
logging.debug(f"{c}: {c.get_val(args)}")
return CommandRoot.get_by_cmd(args.CommandRoot).process(args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why this was implemented from scratch? Why not use argparse or even better click (https://click.palletsprojects.com/en/stable/)
I want to change all argparse usages to click, but haven't had the time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is argparse. I didn't implement anything from scratch that wasn't implemented by argparse already. What I implemented was a translation from flexible data structures to inflexible argparse procedure calls.

With regards to click I didn't know about this tool. Maybe it provides what I needed, maybe not. I would have to evaluate it.

ciq_tag.py Outdated

DEFAULT_LOGLEVEL = "INFO"

LOGLEVEL = os.environ.get(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use this as a param to the tool (the logging level)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would limit the log setting ability to the CLI tool only. What abut python modules which would be including this one?
Some switches might be provided, but the modules would have to be aware of them and consciously propagate end-user's logging level wish down to the library.
Environment variables cross the application-library barrier effortlessly, so the module's users don't have to deal with these settings. For example, a ciq_tag_user.py program may look like

#!/usr/bin/env python3

import ciq_tag

print(ciq_tag.add_tag("""
sunrpc: handle SVC_GARBAGE during svc auth processing as auth error

tianshuo han reported a remotely-triggerable crash if the client sends a
kernel RPC server a specially crafted packet. If decoding the RPC reply
fails in such a way that SVC_GARBAGE is returned without setting the
rq_accept_statp pointer, then that pointer can be dereferenced and a
value stored there.
""", ciq_tag.CiqTag.CVE, "CVE-2025-38089"))

and calling it as

LOGS=DEBUG ./ciq_tag_user.py

would show debug messages from the ciq_tag library without engaging ciq_tag_user.py in it.

If the programs using ciq_tag wished they may "plug" themselves into this LOGS variable logic as well, of course, so they recognize the logging level wanted by the end user in the same way. It is a form of very simple global logging setup, generally advised for python logging.

Having said that, the logging setup I did was really poor and lazy. Rewrote it.

ciq_tag.py Outdated
return (True, (message[:deleted_tag_pos.keyword_start] +
omit_prefixing_empty_lines(message[deleted_tag_pos.boundary_end:])))
else:
return (False, message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked this properly, but my personal preference is to avoid multiple indents and do early returns if possible. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we both want the code to look good. My criteria for code aesthetic is based on how well it exposes the behavioral complexity of an algorithm. Nested if/else statements make the task of answering key questions like "what are the conditions for function f to return value y?" as simple as taking the conjunction of conditions up the branching hierarchy from the picked exit point, while the existence of early returns necessitate mental code execution, which is prone to errors, takes more time and unnecessary effort.

This particular code also doesn't deal with eliminating error cases which the early returns seem to be most suited for.

@pvts-mat
Copy link
Contributor Author

pvts-mat commented Dec 9, 2025

Thanks for looking into it @roxanan1996. Let me address the comments

I am not sure how often I would use this tool, except for maybe some mistakes like modifying the upstream-diff comment later. The rest of the tags should be automated.

But, I think the library would be very useful to improve the current tooling.

The library was the main goal. I wanted to create an abstraction layer over CIQ meta-data operations, hoping to reduce code duplication, wheel reinvention and the associated mental load. The development of the command line tool wasn't driven by some expected use cases, but rather guided by the principle of computational freedom. The aim was to transfer 1-to-1 the functionality of the library to the computation environment most natural and default for many tasks - the shell. I try not to mold the developed tools into imagined usecases because my imagination is limited and I do not wish to impose it on others.

Regarding the typical commit backporting workflow I don't think the tool is much useful, but it does open the door to automation of many kinds, which may be very useful in larger PRs like the recent netfilter one. For example, I could have kept all the metadata for the commits in some CSV table or any other structured form, where they would be much easier to manage, control and validate, then just "export" it to git commits as a branch. No upstream-diff would be missing then.

Additionaly the CLI frontend is very useful in testing the library, so I would have written it anyway.

Can you separate the library from the CLI tool to make things easier?

You mean the "things" here are the decision about what to keep and what to reject? Because if it's about usage then these two do not conflict in any way, you can import ciq_tag in python or ./ciq_tag.py in shell. I thought it's more concise and simple to keep them together. I may split it, but first I would need to know if CLI gets approval or not, otherwise there's no point - I would just drop it from ciq_tag.py.

@pvts-mat
Copy link
Contributor Author

pvts-mat commented Dec 9, 2025

Add this as dependency in pyproject.toml. As explained here. I realized there's no readme with this, will add this soon.

And solve the formatting error in the prcheck.

Done.
Also simplified the main lib function API, converting positional arguments to keywords.

@roxanan1996
Copy link
Contributor

Thanks for looking into it @roxanan1996. Let me address the comments

I am not sure how often I would use this tool, except for maybe some mistakes like modifying the upstream-diff comment later. The rest of the tags should be automated.

But, I think the library would be very useful to improve the current tooling.

The library was the main goal. I wanted to create an abstraction layer over CIQ meta-data operations, hoping to reduce code duplication, wheel reinvention and the associated mental load. The development of the command line tool wasn't driven by some expected use cases, but rather guided by the principle of computational freedom. The aim was to transfer 1-to-1 the functionality of the library to the computation environment most natural and default for many tasks - the shell. I try not to mold the developed tools into imagined usecases because my imagination is limited and I do not wish to impose it on others.

Regarding the typical commit backporting workflow I don't think the tool is much useful, but it does open the door to automation of many kinds, which may be very useful in larger PRs like the recent netfilter one. For example, I could have kept all the metadata for the commits in some CSV table or any other structured form, where they would be much easier to manage, control and validate, then just "export" it to git commits as a branch. No upstream-diff would be missing then.

Additionaly the CLI frontend is very useful in testing the library, so I would have written it anyway.

Can you separate the library from the CLI tool to make things easier?

You mean the "things" here are the decision about what to keep and what to reject? Because if it's about usage then these two do not conflict in any way, you can import ciq_tag in python or ./ciq_tag.py in shell. I thought it's more concise and simple to keep them together. I may split it, but first I would need to know if CLI gets approval or not, otherwise there's no point - I would just drop it from ciq_tag.py.

Yeah, completely agree, that's what I meant. It will help improving the existing tooling a lot.
And yesterday I ran into an issue where I had to modify the jira ticket for a commit, so the CLI tool may end up more useful than I thought. I see no issue with merging this, of course after proper review, but the idea is nice.

As for the separation, I think it's usually best practice to separate the lib and the tool, if we need to import the lib in another tool we have. I am 100% sure the lib would be used further, so it's just easier to do separate them now then in the future.

@roxanan1996
Copy link
Contributor

I am a bit busy today. but I'll continue looking at this tomorrow.

- Rewrote the tags parsing logic. The previous one was based on the
  on-demand, separate, atomic message parsing for each tag operation. It
  was kinda raw and liberal, avoiding building additional data
  structures. The goal was to keep it flexible, but it turned out to be
  too flexible, recognizing parts of upstream message as CIQ tags, like
  in the case of the commit 081056dc00a27bccb55ccc3c6f230a3d5fd3f7e0. In
  the new version the CIQ tags are required to consitute a continuous
  block at the very beginning of git message. The message is parsed in
  whole, all tags tracked no matter what operation on which of them is
  requested. An explicit list of tags for the given commit message is
  build, which allows for simplification of a lot of editing code. From
  the API perspective there is no longer a loose bag of functions but a
  CiqMsg class encapsulating the operations as modifier methods, with
  the `get_message()' getter method to obtain the end result of all
  carried out operations (adding/editing/deleting tags). The command
  line interface remained exactly the same.
- Extracted the CLI to a separate file
  ciq_tag.py: library
  ciq-tag.py: executable CLI
- Rewrote arguments parsing from argpars to click
@pvts-mat
Copy link
Contributor Author

  • Split lib and CLI to separate files
  • Used click instead of argparse
  • Rewrote the logic to handle problematic edge cases (see 081056dc00a27bccb55ccc3c6f230a3d5fd3f7e0 and its commit claimed to fix an issue introduced in 5.13, but it should actually line)
  • Encapsulated the API in a single class CiqMsg

@pvts-mat pvts-mat requested a review from roxanan1996 February 23, 2026 13:25
roxanan1996
roxanan1996 previously approved these changes Mar 2, 2026
Copy link
Contributor

@roxanan1996 roxanan1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi. Sorry it took so long.

I left some very minor comments.
Overall it looks ok to me and it would be useful to add it.
People can iterate over it once it gets used.

So feel free to push it.

ciq_tag.py Outdated
UPSTREAM_DIFF = (["upstream-diff"], True)

def __init__(self, keywords: List[str], multiline: bool = False):
assert len(keywords) > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should return an Exception here or sth, I use assert only for tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this particular condition is false it means the enum entry was defined wrong, so this assert is more of a buggy code canary than an error path of a regular program execution. Switched the asserts to exceptions though, if you prefer.

While we are at it - how much arguments checking do you want for the public functions? Currently there is almost none. All of them? Only the least obvious? Do you want the types checked too or just values?


JIRA = ["jira"]
SUBSYSTEM_SYNC = ["subsystem-sync"]
SUBSYSTEM_UPDATE = ["subsystem-update"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check if all of these are present? I am not familiar with all of them.

Also, since this is mostly for Cves, maybe start small and use the tags only relevant only for cve backports in lts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all of these tags were found in the wild, I didn't introduce to CiqTag any new tag upfront. Examples:

Part of the intent behind this lib was to systematize and catalogue the CIQ tags existing in the repos, so I included everything I could find. It's a good place to document them, for example, like what do they mean, when to use which, etc.

Another thing is that lack of completeness undermines the purpose of the library, which is simplifying programmer's work. If some tags are supported and some not then it puts on the user the burden of keeping the mental track of which tags the library can be used for and for which it cannot. This can be thought of as part of lib's implicit interface, which is simplified by completing the list of supported tags.

There is also a technical issue of the library needing to know about the lesser used tags to properly parse also the more common ones. In the example commit above, if subsystem-sync was not recognized it would signify the beginning of the message and the following commit line would be understood as part of the message as well. On the other hand if the tags were allowed to be all over the commit message and not in a single block then we would occasionally have a problem with messages like this one ctrliq/kernel-src-tree@081056d where there is an accidental line starting with commit word but it's not a CIQ tag. And if the block of consecutive tags wasn't recognized by explicitly identifying them, but by some typographic features like "forming one paragraph", then we would have troubles with messages without any tags (yet) - they would also have some paragraph in the beginning, although not of CIQ tags, which again could have some accidental false positives.



def process_in_out(input, output, result_to_output_map, ciq_msg_method, *method_args_pos, **method_args_key):
with open_input(input) as in_file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, input does not have to stay open after input_str is read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@pvts-mat
Copy link
Contributor Author

pvts-mat commented Mar 2, 2026

  • Addressed comments.

  • Simplified and enhanced the library

    • Removed unnecessary trim argument from the CiqMsg functions.
    • Removed unused get_indexed_tag_position function.
    • Added checking of the tag value if it doesn't contain the boundary defined for the given tag. (If it does it won't be parsed correctly next time). For example the upstream-diff cannot contain two line breaks.
  • Added more practical command seq to the CLI. Usage example

    git -C <kernel-src-tree--repo> log --format=%B -n 1 fa6be9cc6e80ec79892ddf08a8c10cabab9baf38 \
        | ./ciq-tag.py seq -w -t 4 \
                       -s jira VULN-155649 \
                       -s cve CVE-2022-50410 \
                       -s commit-author "Chuck Lever <chuck.lever@oracle.com>" \
                       -s commit fa6be9cc6e80ec79892ddf08a8c10cabab9baf38 \
                       -s upstream-diff "LTS 8.6 version of the function \`nfsd3_proc_read()' misses changes from 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960, cc9bcdad7773c295375e66c892c7ac00524706f2 and be63bd2ac6bbf8c065a0ef6dfbea76934326c352. Preserved the logic of what is assigned to \`resp->count' despite the missing changes."
    
    NFSD: Protect against send buffer overflow in NFSv3 READ
    
    jira VULN-155649
    cve CVE-2022-50410
    commit-author Chuck Lever <chuck.lever@oracle.com>
    commit fa6be9cc6e80ec79892ddf08a8c10cabab9baf38
    upstream-diff LTS 8.6 version of the function `nfsd3_proc_read()' misses
        changes from 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960,
        cc9bcdad7773c295375e66c892c7ac00524706f2 and
        be63bd2ac6bbf8c065a0ef6dfbea76934326c352. Preserved the logic of
        what is assigned to `resp->count' despite the missing changes.
    
    Since before the git era, NFSD has conserved the number of pages
    held by each nfsd thread by combining the RPC receive and send
    buffers into a single array of pages. This works because there are
    no cases where an operation needs a large RPC Call message and a
    large RPC Reply at the same time.
    
    Once an RPC Call has been received, svc_process() updates
    svc_rqst::rq_res to describe the part of rq_pages that can be
    used for constructing the Reply. This means that the send buffer
    (rq_res) shrinks when the received RPC record containing the RPC
    Call is large.
    
    A client can force this shrinkage on TCP by sending a correctly-
    formed RPC Call header contained in an RPC record that is
    excessively large. The full maximum payload size cannot be
    constructed in that case.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants