Separated paket.lock handling from NuGetComponentDetector to PaketCom… #1502
Open
Thorium wants to merge 8 commits into microsoft:main from Thorium:paket-component-detector
+1,259
−0
Commits (8)
2d097a1 Separated paket.lock handling from NuGetComponentDetector to PaketCom… (Thorium)
40a3e72 Merge branch 'main' into paket-component-detector (Thorium)
1059a11 Addressed to the code-review feedback of FernandoRojo (Thorium)
7e06225 Copilot suggestion commit (Thorium)
2b73aa3 New CoPilot feedback addressed as well (Thorium)
7595d73 Merge branch 'paket-component-detector' of https://github.com/Thorium… (Thorium)
7b72d3f Add isDevelopmentDependency detection for Paket. (Thorium)
db38ced Merge branch 'main' into paket-component-detector (Thorium)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| # Paket Detection | ||
|
|
||
| ## Requirements | ||
|
|
||
| Paket Detection depends on the following to successfully run: | ||
|
|
||
| - One or more `paket.lock` files. The [Paket detector][1] searches the scan directory for files with this exact name. | ||
|
|
||
| [1]: https://github.com/microsoft/component-detection/blob/main/src/Microsoft.ComponentDetection.Detectors/paket/PaketComponentDetector.cs | ||
|
|
||
| ## Detection Strategy | ||
|
|
||
| Paket Detection is performed by parsing any `paket.lock` files found under the scan directory. | ||
|
|
||
| The `paket.lock` file is a lock file that records the concrete dependency resolution of all direct and transitive dependencies of your project. It is generated by [Paket][2], an alternative dependency manager for .NET that is popular in both large-scale C# projects and small-scale F# projects. | ||
|
|
||
| [2]: https://fsprojects.github.io/Paket/ | ||
|
|
||
| ## What is Paket? | ||
|
|
||
| Paket is a dependency manager for .NET and Mono projects that provides: | ||
| - Precise control over package dependencies | ||
| - Reproducible builds through lock files | ||
| - Support for multiple package sources (NuGet, GitHub, HTTP, Git) | ||
| - Better resolution algorithm compared to legacy NuGet | ||
|
|
||
| The `paket.lock` file structure is straightforward and human-readable: | ||
| ``` | ||
| NUGET | ||
| remote: https://api.nuget.org/v3/index.json | ||
| PackageName (1.0.0) | ||
| DependencyName (>= 2.0.0) | ||
|
|
||
| GROUP Test | ||
| NUGET | ||
| remote: https://api.nuget.org/v3/index.json | ||
| NUnit (4.3.2) | ||
| ``` | ||
|
|
||
| ## Paket Detector | ||
|
|
||
| The Paket detector parses `paket.lock` files to extract: | ||
| - Resolved package names and versions recorded in the lock file | ||
| - Dependency relationships between packages as represented in the lock file | ||
| - Development dependency classification based on Paket group names | ||
|
|
||
| The detector does not authoritatively distinguish which packages were explicitly requested (from `paket.dependencies`) versus brought in transitively; it approximates this by treating packages that appear as dependencies of other packages as transitive. | ||
|
|
||
| Currently, the detector focuses on the `NUGET` section of the lock file, which contains NuGet package dependencies. Other dependency types (GITHUB, HTTP, GIT) are not currently supported. | ||
|
|
||
| ## How It Works | ||
|
|
||
| The detector: | ||
| 1. Locates `paket.lock` files in the scan directory | ||
| 2. Parses the file line by line, tracking the current GROUP context | ||
| 3. Identifies packages (4-space indentation) and their versions, keyed by group | ||
| 4. Identifies dependencies (6-space indentation) and their version constraints | ||
| 5. Records all packages as NuGet components | ||
| 6. Establishes parent-child relationships between packages and their dependencies | ||
| 7. Classifies packages as development dependencies based on their group name | ||
|
|
||
| ## Development Dependency Classification | ||
|
|
||
| Paket organizes dependencies into groups within `paket.lock`. The detector uses group names to classify packages as development (`isDevelopmentDependency: true`) or production (`isDevelopmentDependency: false`) dependencies. | ||
|
|
||
| **Well-known development groups** (case-insensitive): | ||
| - Exact matches: `Test`, `Tests`, `Docs`, `Documentation`, `Build`, `Analyzers`, `Fake`, `Benchmark`, `Benchmarks`, `Samples`, `DesignTime` | ||
| - Suffix matches: any group name ending with `Test` or `Tests` (e.g., `UnitTest`, `IntegrationTests`, `AcceptanceTests`, `E2ETest`) | ||
|
|
||
| **Production groups**: | ||
| - The default/unnamed group (packages before any `GROUP` line) | ||
| - `Main` | ||
| - Any group name not matching the well-known patterns above (e.g., `Server`, `Client`, `Shared`) | ||
|
|
||
| When the same package appears in multiple groups (e.g., `FSharp.Core` in both `Build` and `Server`), both occurrences are registered. The framework's merge logic ensures that if a package appears in **any** production group, it is ultimately classified as a production dependency. | ||
|
|
||
| ## Known Limitations | ||
|
|
||
| - This detector is currently **DefaultOff** and must be explicitly enabled | ||
| - Only NuGet dependencies from the `NUGET` section are detected | ||
| - GitHub, HTTP, and Git dependencies are not currently supported | ||
| - Without cross-referencing the `paket.dependencies` file, the detector cannot reliably distinguish between direct and transitive dependencies; it uses the dependency graph within the lock file to approximate this | ||
| - Development dependency classification is based on group names only; it does not cross-reference `paket.references` files to verify which packages are actually used by test vs. production projects (planned for a future iteration) | ||
| - The detector assumes the lock file format follows the standard Paket conventions | ||
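Review note: the "How It Works" steps in the documentation above can be sketched as a small standalone parser. This is a minimal illustration, not the PR's implementation. `PaketLockSketch` and `Parse` are hypothetical names, the two regexes mirror the ones in the detector, and the sketch assumes the layout the docs describe (resolved packages at 4 spaces, dependency constraints at 6 spaces).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the PR.
public static class PaketLockSketch
{
    // 4-space lines: resolved packages with pinned versions.
    private static readonly Regex PackageLine = new(@"^\s{4}(\S+)\s+\(([^)]+)\)");

    // 6-space lines: version constraints of the preceding package.
    private static readonly Regex DependencyLine = new(@"^\s{6}(\S+)\s+\(([^)]+)\)");

    public static (Dictionary<(string Group, string Name), string> Packages,
                   List<(string Group, string Parent, string Dependency)> Edges) Parse(string lockText)
    {
        // Note: default tuple equality is case-sensitive here, unlike the
        // case-insensitive comparer used in the PR.
        var packages = new Dictionary<(string Group, string Name), string>();
        var edges = new List<(string Group, string Parent, string Dependency)>();
        var group = string.Empty;   // empty string = default/unnamed group
        var section = string.Empty;
        string? parent = null;

        using var reader = new StringReader(lockText);
        while (reader.ReadLine() is { } line)
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue;
            }

            // Unindented lines are headers: either "GROUP <name>" or a section such as NUGET.
            if (!line.StartsWith(' '))
            {
                var header = line.Trim();
                if (header.StartsWith("GROUP ", StringComparison.OrdinalIgnoreCase) && header.Length > 6)
                {
                    group = header[6..].Trim();
                    section = string.Empty;
                }
                else
                {
                    section = header;
                }

                parent = null;
                continue;
            }

            // Only the NUGET section is handled, as in the PR.
            if (!section.Equals("NUGET", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            var pkg = PackageLine.Match(line);
            if (pkg.Success)
            {
                parent = pkg.Groups[1].Value;
                packages.TryAdd((group, parent), pkg.Groups[2].Value); // first occurrence wins
                continue;
            }

            var dep = DependencyLine.Match(line);
            if (dep.Success && parent is not null)
            {
                edges.Add((group, parent, dep.Groups[1].Value));
            }
        }

        return (packages, edges);
    }
}
```

Given a lock file with a default group and a `GROUP Test` block, `Parse` returns one entry per `(group, name)` pair plus one edge per 6-space line. A package can then be treated as explicit when it never appears as the `Dependency` of an edge in its own group, which is the approximation listed under Known Limitations.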
src/Microsoft.ComponentDetection.Detectors/paket/PaketComponentDetector.cs
275 changes: 275 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,275 @@ | ||
| namespace Microsoft.ComponentDetection.Detectors.Paket; | ||
|
|
||
| using System; | ||
| using System.Collections.Generic; | ||
| using System.IO; | ||
| using System.Text.RegularExpressions; | ||
| using System.Threading; | ||
| using System.Threading.Tasks; | ||
| using Microsoft.ComponentDetection.Contracts; | ||
| using Microsoft.ComponentDetection.Contracts.Internal; | ||
| using Microsoft.ComponentDetection.Contracts.TypedComponent; | ||
| using Microsoft.Extensions.Logging; | ||
|
|
||
| /// <summary> | ||
| /// Detects NuGet packages in paket.lock files. | ||
| /// Paket is a dependency manager for .NET that provides better control over package dependencies. | ||
| /// </summary> | ||
| // TODO: Promote to default-on (remove IDefaultOffComponentDetector) once validated in real-world usage. | ||
| public sealed class PaketComponentDetector : FileComponentDetector, IDefaultOffComponentDetector | ||
| { | ||
| private static readonly Regex PackageLineRegex = new(@"^\s{4}(\S+)\s+\(([^\)]+)\)", RegexOptions.Compiled); | ||
| private static readonly Regex DependencyLineRegex = new(@"^\s{6}(\S+)\s+\(([^)]+)\)", RegexOptions.Compiled); | ||
|
|
||
| /// <summary> | ||
| /// Well-known Paket group names that indicate development-time dependencies. | ||
| /// Exact matches (case-insensitive): test, tests, docs, documentation, build, analyzers, fake, | ||
| /// benchmark, benchmarks, samples, designtime. | ||
| /// Suffix matches (case-insensitive): groups ending with "test" or "tests" to cover names like | ||
| /// "unittest", "unittests", "integrationtest", "integrationtests", etc. | ||
| /// </summary> | ||
| private static readonly HashSet<string> ExactDevGroupNames = new(StringComparer.OrdinalIgnoreCase) | ||
| { | ||
| "test", "tests", "docs", "documentation", "build", "analyzers", "fake", | ||
| "benchmark", "benchmarks", "samples", "designtime", | ||
| }; | ||
|
|
||
| /// <summary> | ||
| /// Initializes a new instance of the <see cref="PaketComponentDetector"/> class. | ||
| /// </summary> | ||
| /// <param name="componentStreamEnumerableFactory">The factory for handing back component streams to File detectors.</param> | ||
| /// <param name="walkerFactory">The factory for creating directory walkers.</param> | ||
| /// <param name="logger">The logger to use.</param> | ||
| public PaketComponentDetector( | ||
| IComponentStreamEnumerableFactory componentStreamEnumerableFactory, | ||
| IObservableDirectoryWalkerFactory walkerFactory, | ||
| ILogger<PaketComponentDetector> logger) | ||
| { | ||
| this.ComponentStreamEnumerableFactory = componentStreamEnumerableFactory; | ||
| this.Scanner = walkerFactory; | ||
| this.Logger = logger; | ||
| } | ||
|
|
||
| /// <inheritdoc /> | ||
| public override IList<string> SearchPatterns => ["paket.lock"]; | ||
|
|
||
| /// <inheritdoc /> | ||
| public override string Id => "Paket"; | ||
|
|
||
| /// <inheritdoc /> | ||
| public override IEnumerable<string> Categories => | ||
| [Enum.GetName(typeof(DetectorClass), DetectorClass.NuGet)!]; | ||
|
|
||
| /// <inheritdoc /> | ||
| public override IEnumerable<ComponentType> SupportedComponentTypes => [ComponentType.NuGet]; | ||
|
|
||
| /// <inheritdoc /> | ||
| public override int Version => 2; | ||
|
|
||
| /// <summary> | ||
| /// Determines whether a Paket group name represents a development-time dependency group. | ||
| /// The unnamed/default group and "Main" are considered production groups. | ||
| /// </summary> | ||
| /// <param name="groupName">The group name from the paket.lock file, or empty string for the default group.</param> | ||
| /// <returns><c>true</c> if the group is a well-known development group; <c>false</c> otherwise.</returns> | ||
| internal static bool IsDevelopmentDependencyGroup(string groupName) | ||
| { | ||
| if (string.IsNullOrEmpty(groupName) || groupName.Equals("Main", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| return false; | ||
| } | ||
|
|
||
| if (ExactDevGroupNames.Contains(groupName)) | ||
| { | ||
| return true; | ||
| } | ||
|
|
||
| // Suffix matches: *test, *tests (e.g., UnitTest, IntegrationTests) | ||
| if (groupName.EndsWith("test", StringComparison.OrdinalIgnoreCase) || | ||
| groupName.EndsWith("tests", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| return true; | ||
| } | ||
|
|
||
| return false; | ||
| } | ||
|
|
||
| /// <inheritdoc /> | ||
| protected override async Task OnFileFoundAsync(ProcessRequest processRequest, IDictionary<string, string> detectorArgs, CancellationToken cancellationToken = default) | ||
| { | ||
| try | ||
| { | ||
| var singleFileComponentRecorder = processRequest.SingleFileComponentRecorder; | ||
| using var reader = new StreamReader(processRequest.ComponentStream.Stream); | ||
|
|
||
| // First pass: collect all resolved packages and their dependency relationships, keyed by group. | ||
| // In paket.lock, 4-space indented lines are resolved packages with pinned versions. | ||
| // 6-space indented lines are dependency specifications (version constraints) of the parent | ||
| // package; they are NOT resolved versions. The actual resolved version for each dependency | ||
| // will appear as its own 4-space entry elsewhere in the file. | ||
| // | ||
| // Packages are tracked per group because the same package may appear in multiple groups | ||
| // (e.g., FSharp.Core in both "Build" and "Server") potentially with different versions. | ||
| // Group names are also used to classify packages as development dependencies: well-known | ||
| // group names like "Test", "Build", "Docs", etc. indicate development-time dependencies. | ||
| // | ||
| // Limitation: without cross-referencing paket.dependencies or paket.references, we cannot | ||
| // perfectly distinguish between direct and transitive dependencies. We use the dependency | ||
| // graph within each group to approximate: packages that appear as dependencies of other | ||
| // packages are marked as transitive, and the rest are treated as explicit. | ||
|
|
||
| // Key: (groupName, packageName) -> version | ||
| var resolvedPackages = new Dictionary<(string Group, string Name), string>(GroupAndNameComparer.Instance); | ||
|
|
||
| // (groupName, parentName, dependencyName) | ||
| var dependencyRelationships = new List<(string Group, string ParentName, string DependencyName)>(); | ||
|
|
||
| var currentSection = string.Empty; | ||
| var currentGroupName = string.Empty; // empty string = default/unnamed group | ||
| string? currentPackageName = null; | ||
|
|
||
| while (await reader.ReadLineAsync(cancellationToken) is { } line) | ||
| { | ||
| if (string.IsNullOrWhiteSpace(line)) | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| // Check if this is a section header (e.g., NUGET, GITHUB, HTTP, GROUP, RESTRICTION, STORAGE) | ||
| if (!line.StartsWith(' ') && line.Trim().Length > 0) | ||
| { | ||
| var trimmed = line.Trim(); | ||
|
|
||
| // GROUP lines set the current group context; they are not a "section" like NUGET. | ||
| // The format is "GROUP <name>" and subsequent sections (NUGET, GITHUB, etc.) | ||
| // belong to this group until the next GROUP line. | ||
| if (trimmed.StartsWith("GROUP ", StringComparison.OrdinalIgnoreCase) && trimmed.Length > 6) | ||
| { | ||
| currentGroupName = trimmed[6..].Trim(); | ||
| currentSection = string.Empty; | ||
| currentPackageName = null; | ||
| } | ||
| else | ||
| { | ||
| currentSection = trimmed; | ||
| currentPackageName = null; | ||
| } | ||
|
|
||
| continue; | ||
| } | ||
|
|
||
| // Only process NUGET section for now | ||
| if (!currentSection.Equals("NUGET", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| // Check if this is a remote line (source URL) | ||
| if (line.TrimStart().StartsWith("remote:", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| continue; | ||
| } | ||
|
|
||
| // Check if this is a package line (4 spaces indentation) - these are resolved packages | ||
| var packageMatch = PackageLineRegex.Match(line); | ||
| if (packageMatch.Success) | ||
| { | ||
| currentPackageName = packageMatch.Groups[1].Value; | ||
| var currentPackageVersion = packageMatch.Groups[2].Value; | ||
|
|
||
| var key = (currentGroupName, currentPackageName); | ||
| if (!resolvedPackages.TryAdd(key, currentPackageVersion)) | ||
| { | ||
| this.Logger.LogDebug( | ||
| "Duplicate package {PackageName} found in group '{GroupName}' with version {Version}; keeping previously resolved version {ExistingVersion}", | ||
| currentPackageName, | ||
| currentGroupName, | ||
| currentPackageVersion, | ||
| resolvedPackages[key]); | ||
| } | ||
|
|
||
| continue; | ||
| } | ||
|
|
||
| // Check if this is a dependency line (6 spaces indentation) - these are version constraints | ||
| var dependencyMatch = DependencyLineRegex.Match(line); | ||
| if (dependencyMatch.Success && currentPackageName != null) | ||
| { | ||
| var dependencyName = dependencyMatch.Groups[1].Value; | ||
| dependencyRelationships.Add((currentGroupName, currentPackageName, dependencyName)); | ||
| } | ||
| } | ||
|
|
||
| // Build a set of package names (per group) that appear as dependencies of other packages | ||
| var transitiveDependencyNames = new HashSet<(string Group, string Name)>(GroupAndNameComparer.Instance); | ||
| foreach (var (group, _, dependencyName) in dependencyRelationships) | ||
| { | ||
| transitiveDependencyNames.Add((group, dependencyName)); | ||
| } | ||
|
|
||
| // Register all resolved packages with group-aware isDevelopmentDependency. | ||
| // If a package appears in multiple groups, it will be registered multiple times with | ||
| // potentially different isDevelopmentDependency values. The framework's AND-merge | ||
| // semantics ensure that if ANY registration says false (production), the final result | ||
| // is false -- preventing accidental hiding of production dependencies. | ||
| foreach (var ((group, name), version) in resolvedPackages) | ||
| { | ||
| var isDev = IsDevelopmentDependencyGroup(group); | ||
| var component = new DetectedComponent(new NuGetComponent(name, version)); | ||
| singleFileComponentRecorder.RegisterUsage( | ||
| component, | ||
| isExplicitReferencedDependency: !transitiveDependencyNames.Contains((group, name)), | ||
| isDevelopmentDependency: isDev); | ||
| } | ||
|
|
||
| // Register parent-child relationships using the dependency specifications | ||
| foreach (var (group, parentName, dependencyName) in dependencyRelationships) | ||
| { | ||
| var parentKey = (group, parentName); | ||
| var depKey = (group, dependencyName); | ||
|
|
||
| if (resolvedPackages.ContainsKey(depKey) && resolvedPackages.ContainsKey(parentKey)) | ||
| { | ||
| var isDev = IsDevelopmentDependencyGroup(group); | ||
| var parentVersion = resolvedPackages[parentKey]; | ||
| var parentComponentId = new NuGetComponent(parentName, parentVersion).Id; | ||
|
|
||
| var depVersion = resolvedPackages[depKey]; | ||
| var depComponent = new DetectedComponent(new NuGetComponent(dependencyName, depVersion)); | ||
|
|
||
| singleFileComponentRecorder.RegisterUsage( | ||
| depComponent, | ||
| isExplicitReferencedDependency: false, | ||
| parentComponentId: parentComponentId, | ||
| isDevelopmentDependency: isDev); | ||
| } | ||
| } | ||
| } | ||
| catch (Exception e) when (e is IOException or InvalidOperationException) | ||
| { | ||
| processRequest.SingleFileComponentRecorder.RegisterPackageParseFailure(processRequest.ComponentStream.Location); | ||
| this.Logger.LogWarning(e, "Failed to read paket.lock file {File}", processRequest.ComponentStream.Location); | ||
| } | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Case-insensitive equality comparer for (Group, Name) tuples used as dictionary keys. | ||
| /// </summary> | ||
| private sealed class GroupAndNameComparer : IEqualityComparer<(string Group, string Name)> | ||
| { | ||
| public static readonly GroupAndNameComparer Instance = new(); | ||
|
|
||
| public bool Equals((string Group, string Name) x, (string Group, string Name) y) | ||
| { | ||
| return StringComparer.OrdinalIgnoreCase.Equals(x.Group, y.Group) && | ||
| StringComparer.OrdinalIgnoreCase.Equals(x.Name, y.Name); | ||
| } | ||
|
|
||
| public int GetHashCode((string Group, string Name) obj) | ||
| { | ||
| return HashCode.Combine( | ||
| StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Group), | ||
| StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name)); | ||
| } | ||
| } | ||
| } | ||
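Review note: to make the group classification easy to check in isolation, here is a standalone restatement of the rule in `IsDevelopmentDependencyGroup`, plus a sketch of the merge behavior described in the code comments. `DevGroupSketch`, `IsDevGroup`, and `MergedIsDev` are hypothetical names. `MergedIsDev` only illustrates the AND-merge semantics as the PR describes them and is an assumption about the framework, not its actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, standalone restatement for illustration; the PR's real method is
// PaketComponentDetector.IsDevelopmentDependencyGroup (internal).
public static class DevGroupSketch
{
    private static readonly HashSet<string> ExactDevGroupNames = new(StringComparer.OrdinalIgnoreCase)
    {
        "test", "tests", "docs", "documentation", "build", "analyzers", "fake",
        "benchmark", "benchmarks", "samples", "designtime",
    };

    public static bool IsDevGroup(string groupName)
    {
        // The default (unnamed) group and "Main" are always production.
        if (string.IsNullOrEmpty(groupName) || groupName.Equals("Main", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        // Exact well-known names first, then the *test / *tests suffix rule.
        return ExactDevGroupNames.Contains(groupName)
            || groupName.EndsWith("test", StringComparison.OrdinalIgnoreCase)
            || groupName.EndsWith("tests", StringComparison.OrdinalIgnoreCase);
    }

    // Assumed merge rule, as described in the PR's comments: a package remains a
    // development dependency only if every group it appears in is a dev group.
    // (An empty sequence yields true; callers are expected to pass at least one group.)
    public static bool MergedIsDev(IEnumerable<string> groups) => groups.All(IsDevGroup);
}
```

With this rule, `FSharp.Core` appearing in both `Build` and `Server` merges to production, while a package seen only in `Build` and `Test` stays a development dependency. One thing to confirm: the suffix rule also classifies a group named `Latest` as development, because the name ends in `test`. If that is not intended, the suffix match may need a word-boundary check.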