6 changes: 6 additions & 0 deletions docs/detectors/README.md
@@ -76,6 +76,12 @@
| NuGetPackagesConfigDetector | Stable |
| NuGetProjectModelProjectCentricComponentDetector | Stable |

- [Paket](paket.md)

| Detector | Status |
| --------------------- | ---------- |
| PaketComponentDetector | DefaultOff |

- [Pip](pip.md)

| Detector | Status |
85 changes: 85 additions & 0 deletions docs/detectors/paket.md
@@ -0,0 +1,85 @@
# Paket Detection

## Requirements

Paket Detection depends on the following to successfully run:

- One or more `paket.lock` files under the scan directory, which the [Paket detector][1] searches for by file name.

[1]: https://github.com/microsoft/component-detection/blob/main/src/Microsoft.ComponentDetection.Detectors/paket/PaketComponentDetector.cs

## Detection Strategy

Paket Detection is performed by parsing any `paket.lock` files found under the scan directory.

The `paket.lock` file is a lock file that records the resolved versions of all direct and transitive dependencies of your project. It is generated by [Paket][2], an alternative dependency manager for .NET that is used in both C# and F# projects.

[2]: https://fsprojects.github.io/Paket/

## What is Paket?

Paket is a dependency manager for .NET and Mono projects that provides:
- Precise control over package dependencies
- Reproducible builds through lock files
- Support for multiple package sources (NuGet, GitHub, HTTP, Git)
- Better resolution algorithm compared to legacy NuGet

The `paket.lock` file structure is straightforward and human-readable:
```
NUGET
  remote: https://api.nuget.org/v3/index.json
    PackageName (1.0.0)
      DependencyName (>= 2.0.0)

GROUP Test
NUGET
  remote: https://api.nuget.org/v3/index.json
    NUnit (4.3.2)
```

## Paket Detector

The Paket detector parses `paket.lock` files to extract:
- Resolved package names and versions recorded in the lock file
- Dependency relationships between packages as represented in the lock file
- Development dependency classification based on Paket group names

The detector does not authoritatively distinguish which packages were explicitly requested (from `paket.dependencies`) versus brought in transitively; it approximates this by treating packages that appear as dependencies of other packages as transitive.
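
The approximation can be illustrated with a minimal sketch. The package names and the `resolved` / `dependencyEdges` variables are hypothetical and exist only for this example; the sketch models the rule described above rather than reproducing the detector's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input: resolved packages (4-space lines) and the
// parent -> dependency edges (6-space lines) from one group.
var resolved = new[] { "PackageName", "DependencyName", "Standalone" };
var dependencyEdges = new[] { (Parent: "PackageName", Dependency: "DependencyName") };

// Any package that appears as a dependency of another package is treated as transitive.
var transitive = new HashSet<string>(
    dependencyEdges.Select(e => e.Dependency),
    StringComparer.OrdinalIgnoreCase);

foreach (var name in resolved)
{
    var kind = transitive.Contains(name) ? "transitive" : "explicit";
    Console.WriteLine($"{name}: {kind}");
}
```

One consequence of this rule: a package that is listed in `paket.dependencies` and is also a dependency of another package would be reported as transitive.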

Currently, the detector focuses on the `NUGET` section of the lock file, which contains NuGet package dependencies. Other dependency types (GITHUB, HTTP, GIT) are not currently supported.

## How It Works

The detector:
1. Locates `paket.lock` files in the scan directory
2. Parses the file line by line, tracking the current GROUP context
3. Identifies packages (4-space indentation) and their versions, keyed by group
4. Identifies dependencies (6-space indentation) and their version constraints
5. Records all packages as NuGet components
6. Establishes parent-child relationships between packages and their dependencies
7. Classifies packages as development dependencies based on their group name
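
The parsing steps (2 to 4) can be sketched as a small standalone program. This is a condensed illustration under simplifying assumptions, not the detector itself: it reads from an in-memory string, skips component registration, and the variable names (`lockText`, `packages`, `edges`, and so on) are chosen for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative lock file content, same shape as the sample shown earlier.
var lockText = string.Join("\n", new[]
{
    "NUGET",
    "  remote: https://api.nuget.org/v3/index.json",
    "    PackageName (1.0.0)",
    "      DependencyName (>= 2.0.0)",
    "",
    "GROUP Test",
    "NUGET",
    "  remote: https://api.nuget.org/v3/index.json",
    "    NUnit (4.3.2)",
});

var packageLine = new Regex(@"^\s{4}(\S+)\s+\(([^)]+)\)");    // 4 spaces: resolved package
var dependencyLine = new Regex(@"^\s{6}(\S+)\s+\(([^)]+)\)"); // 6 spaces: version constraint

var packages = new List<(string Group, string Name, string Version)>();
var edges = new List<(string Group, string Parent, string Dependency)>();

var currentGroup = string.Empty;   // empty string = default/unnamed group
var currentSection = string.Empty; // NUGET, GITHUB, HTTP, ...
var currentParent = string.Empty;  // most recent 4-space package line

foreach (var line in lockText.Split('\n'))
{
    if (string.IsNullOrWhiteSpace(line))
    {
        continue;
    }

    if (!line.StartsWith(' '))
    {
        // Unindented lines are headers: GROUP switches group, anything else is a section.
        var trimmed = line.Trim();
        var isGroup = trimmed.StartsWith("GROUP ", StringComparison.OrdinalIgnoreCase) && trimmed.Length > 6;
        currentGroup = isGroup ? trimmed.Substring(6).Trim() : currentGroup;
        currentSection = isGroup ? string.Empty : trimmed;
        currentParent = string.Empty;
        continue;
    }

    if (!currentSection.Equals("NUGET", StringComparison.OrdinalIgnoreCase))
    {
        continue; // only NUGET sections are considered
    }

    var package = packageLine.Match(line);
    if (package.Success)
    {
        currentParent = package.Groups[1].Value;
        packages.Add((currentGroup, currentParent, package.Groups[2].Value));
        continue;
    }

    var dependency = dependencyLine.Match(line);
    if (dependency.Success && currentParent.Length > 0)
    {
        edges.Add((currentGroup, currentParent, dependency.Groups[1].Value));
    }
}

foreach (var p in packages)
{
    Console.WriteLine($"[{p.Group}] {p.Name} {p.Version}");
}
```

Lines that match neither pattern, such as the `remote:` line, are simply skipped. A six-space line cannot match the four-space pattern because the fifth character would have to be non-whitespace.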

## Development Dependency Classification

Paket organizes dependencies into groups within `paket.lock`. The detector uses group names to classify packages as development (`isDevelopmentDependency: true`) or production (`isDevelopmentDependency: false`) dependencies.

**Well-known development groups** (case-insensitive):
- Exact matches: `Test`, `Tests`, `Docs`, `Documentation`, `Build`, `Analyzers`, `Fake`, `Benchmark`, `Benchmarks`, `Samples`, `DesignTime`
- Suffix matches: any group name ending with `Test` or `Tests` (e.g., `UnitTest`, `IntegrationTests`, `AcceptanceTests`, `E2ETest`)

**Production groups**:
- The default/unnamed group (packages before any `GROUP` line)
- `Main`
- Any group name not matching the well-known patterns above (e.g., `Server`, `Client`, `Shared`)
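
As a rough illustration of these rules, the sketch below classifies a handful of group names. The local function `IsDevGroup` and the sample names are assumptions made for this example; the logic follows the rules listed above.

```csharp
using System;
using System.Collections.Generic;

var exactDevGroups = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "test", "tests", "docs", "documentation", "build", "analyzers", "fake",
    "benchmark", "benchmarks", "samples", "designtime",
};

// Returns true when the group name matches a well-known development pattern.
bool IsDevGroup(string groupName)
{
    // The default (unnamed) group and "Main" are always production.
    if (string.IsNullOrEmpty(groupName) || groupName.Equals("Main", StringComparison.OrdinalIgnoreCase))
    {
        return false;
    }

    return exactDevGroups.Contains(groupName)
        || groupName.EndsWith("test", StringComparison.OrdinalIgnoreCase)
        || groupName.EndsWith("tests", StringComparison.OrdinalIgnoreCase);
}

foreach (var name in new[] { "", "Main", "Build", "IntegrationTests", "Server", "Latest" })
{
    Console.WriteLine($"'{name}' -> {(IsDevGroup(name) ? "development" : "production")}");
}
```

Note that the suffix rule is purely textual: a group named `Latest` also ends with `test` and would therefore be classified as a development group.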

When the same package appears in multiple groups (e.g., `FSharp.Core` in both `Build` and `Server`), both occurrences are registered. The framework's merge logic ensures that if a package appears in **any** production group, it is ultimately classified as a production dependency.
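
The effect of the merge can be modelled with a short sketch. This is not the framework's implementation; it only assumes, as described above, that a component stays a development dependency when every registration says so. The component identifiers and version numbers are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical registrations: one entry per (component, group) occurrence.
var registrations = new List<(string Id, bool IsDev)>
{
    ("FSharp.Core 1.0.0", true),  // from group Build (development)
    ("FSharp.Core 1.0.0", false), // from group Server (production)
    ("NUnit 4.3.2", true),        // from group Test (development)
};

// AND-merge: a component is development-only if ALL of its registrations are.
var merged = registrations
    .GroupBy(r => r.Id)
    .ToDictionary(g => g.Key, g => g.All(r => r.IsDev));

foreach (var (id, isDev) in merged)
{
    Console.WriteLine($"{id}: isDevelopmentDependency = {isDev}");
}
```

Because components are identified by name and version, this merge would only apply when the same version is resolved in both groups; different versions remain separate components.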

## Known Limitations

- This detector is currently **DefaultOff** and must be explicitly enabled
- Only NuGet dependencies from the `NUGET` section are detected; `GITHUB`, `HTTP`, and `GIT` dependencies are not currently supported
- Without cross-referencing the `paket.dependencies` file, the detector cannot reliably distinguish between direct and transitive dependencies; it uses the dependency graph within the lock file to approximate this
- Development dependency classification is based on group names only; it does not cross-reference `paket.references` files to verify which packages are actually used by test vs. production projects (planned for a future iteration)
- The detector assumes the lock file follows the standard Paket layout: resolved packages are indented by four spaces and their dependency constraints by six; lines that do not match these patterns are skipped
@@ -0,0 +1,275 @@
namespace Microsoft.ComponentDetection.Detectors.Paket;

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ComponentDetection.Contracts;
using Microsoft.ComponentDetection.Contracts.Internal;
using Microsoft.ComponentDetection.Contracts.TypedComponent;
using Microsoft.Extensions.Logging;

/// <summary>
/// Detects NuGet packages in paket.lock files.
/// Paket is a dependency manager for .NET that provides better control over package dependencies.
/// </summary>
// TODO: Promote to default-on (remove IDefaultOffComponentDetector) once validated in real-world usage.
public sealed class PaketComponentDetector : FileComponentDetector, IDefaultOffComponentDetector
{
    private static readonly Regex PackageLineRegex = new(@"^\s{4}(\S+)\s+\(([^\)]+)\)", RegexOptions.Compiled);
    private static readonly Regex DependencyLineRegex = new(@"^\s{6}(\S+)\s+\(([^)]+)\)", RegexOptions.Compiled);

    /// <summary>
    /// Well-known Paket group names that indicate development-time dependencies.
    /// Exact matches (case-insensitive): test, tests, docs, documentation, build, analyzers, fake,
    /// benchmark, benchmarks, samples, designtime.
    /// Suffix matches (case-insensitive): groups ending with "test" or "tests" to cover names like
    /// "unittest", "unittests", "integrationtest", "integrationtests", etc.
    /// </summary>
    private static readonly HashSet<string> ExactDevGroupNames = new(StringComparer.OrdinalIgnoreCase)
    {
        "test", "tests", "docs", "documentation", "build", "analyzers", "fake",
        "benchmark", "benchmarks", "samples", "designtime",
    };

    /// <summary>
    /// Initializes a new instance of the <see cref="PaketComponentDetector"/> class.
    /// </summary>
    /// <param name="componentStreamEnumerableFactory">The factory for handing back component streams to File detectors.</param>
    /// <param name="walkerFactory">The factory for creating directory walkers.</param>
    /// <param name="logger">The logger to use.</param>
    public PaketComponentDetector(
        IComponentStreamEnumerableFactory componentStreamEnumerableFactory,
        IObservableDirectoryWalkerFactory walkerFactory,
        ILogger<PaketComponentDetector> logger)
    {
        this.ComponentStreamEnumerableFactory = componentStreamEnumerableFactory;
        this.Scanner = walkerFactory;
        this.Logger = logger;
    }

    /// <inheritdoc />
    public override IList<string> SearchPatterns => ["paket.lock"];

    /// <inheritdoc />
    public override string Id => "Paket";

    /// <inheritdoc />
    public override IEnumerable<string> Categories =>
        [Enum.GetName(typeof(DetectorClass), DetectorClass.NuGet)!];

    /// <inheritdoc />
    public override IEnumerable<ComponentType> SupportedComponentTypes => [ComponentType.NuGet];

    /// <inheritdoc />
    public override int Version => 2;

    /// <summary>
    /// Determines whether a Paket group name represents a development-time dependency group.
    /// The unnamed/default group and "Main" are considered production groups.
    /// </summary>
    /// <param name="groupName">The group name from the paket.lock file, or empty string for the default group.</param>
    /// <returns><c>true</c> if the group is a well-known development group; <c>false</c> otherwise.</returns>
    internal static bool IsDevelopmentDependencyGroup(string groupName)
    {
        if (string.IsNullOrEmpty(groupName) || groupName.Equals("Main", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        if (ExactDevGroupNames.Contains(groupName))
        {
            return true;
        }

        // Suffix matches: *test, *tests (e.g., UnitTest, IntegrationTests)
        if (groupName.EndsWith("test", StringComparison.OrdinalIgnoreCase) ||
            groupName.EndsWith("tests", StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        return false;
    }

    /// <inheritdoc />
    protected override async Task OnFileFoundAsync(ProcessRequest processRequest, IDictionary<string, string> detectorArgs, CancellationToken cancellationToken = default)
    {
        try
        {
            var singleFileComponentRecorder = processRequest.SingleFileComponentRecorder;
            using var reader = new StreamReader(processRequest.ComponentStream.Stream);

            // First pass: collect all resolved packages and their dependency relationships, keyed by group.
            // In paket.lock, 4-space indented lines are resolved packages with pinned versions.
            // 6-space indented lines are dependency specifications (version constraints) of the parent
            // package; they are NOT resolved versions. The actual resolved version for each dependency
            // will appear as its own 4-space entry elsewhere in the file.
            //
            // Packages are tracked per group because the same package may appear in multiple groups
            // (e.g., FSharp.Core in both "Build" and "Server") potentially with different versions.
            // Group names are also used to classify packages as development dependencies: well-known
            // group names like "Test", "Build", "Docs", etc. indicate development-time dependencies.
            //
            // Limitation: without cross-referencing paket.dependencies or paket.references, we cannot
            // perfectly distinguish between direct and transitive dependencies. We use the dependency
            // graph within each group to approximate: packages that appear as dependencies of other
            // packages are marked as transitive, and the rest are treated as explicit.

            // Key: (groupName, packageName) -> version
            var resolvedPackages = new Dictionary<(string Group, string Name), string>(GroupAndNameComparer.Instance);

            // (groupName, parentName, dependencyName)
            var dependencyRelationships = new List<(string Group, string ParentName, string DependencyName)>();

            var currentSection = string.Empty;
            var currentGroupName = string.Empty; // empty string = default/unnamed group
            string? currentPackageName = null;

            while (await reader.ReadLineAsync(cancellationToken) is { } line)
            {
                if (string.IsNullOrWhiteSpace(line))
                {
                    continue;
                }

                // Check if this is a section header (e.g., NUGET, GITHUB, HTTP, GROUP, RESTRICTION, STORAGE)
                if (!line.StartsWith(' ') && line.Trim().Length > 0)
                {
                    var trimmed = line.Trim();

                    // GROUP lines set the current group context; they are not a "section" like NUGET.
                    // The format is "GROUP <name>" and subsequent sections (NUGET, GITHUB, etc.)
                    // belong to this group until the next GROUP line.
                    if (trimmed.StartsWith("GROUP ", StringComparison.OrdinalIgnoreCase) && trimmed.Length > 6)
                    {
                        currentGroupName = trimmed[6..].Trim();
                        currentSection = string.Empty;
                        currentPackageName = null;
                    }
                    else
                    {
                        currentSection = trimmed;
                        currentPackageName = null;
                    }

                    continue;
                }

                // Only process NUGET section for now
                if (!currentSection.Equals("NUGET", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                // Check if this is a remote line (source URL)
                if (line.TrimStart().StartsWith("remote:", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                // Check if this is a package line (4 spaces indentation) - these are resolved packages
                var packageMatch = PackageLineRegex.Match(line);
                if (packageMatch.Success)
                {
                    currentPackageName = packageMatch.Groups[1].Value;
                    var currentPackageVersion = packageMatch.Groups[2].Value;

                    var key = (currentGroupName, currentPackageName);
                    if (!resolvedPackages.TryAdd(key, currentPackageVersion))
                    {
                        this.Logger.LogDebug(
                            "Duplicate package {PackageName} found in group '{GroupName}' with version {Version}; keeping previously resolved version {ExistingVersion}",
                            currentPackageName,
                            currentGroupName,
                            currentPackageVersion,
                            resolvedPackages[key]);
                    }

                    continue;
                }

                // Check if this is a dependency line (6 spaces indentation) - these are version constraints
                var dependencyMatch = DependencyLineRegex.Match(line);
                if (dependencyMatch.Success && currentPackageName != null)
                {
                    var dependencyName = dependencyMatch.Groups[1].Value;
                    dependencyRelationships.Add((currentGroupName, currentPackageName, dependencyName));
                }
            }

            // Build a set of package names (per group) that appear as dependencies of other packages
            var transitiveDependencyNames = new HashSet<(string Group, string Name)>(GroupAndNameComparer.Instance);
            foreach (var (group, _, dependencyName) in dependencyRelationships)
            {
                transitiveDependencyNames.Add((group, dependencyName));
            }

            // Register all resolved packages with group-aware isDevelopmentDependency.
            // If a package appears in multiple groups, it will be registered multiple times with
            // potentially different isDevelopmentDependency values. The framework's AND-merge
            // semantics ensure that if ANY registration says false (production), the final result
            // is false -- preventing accidental hiding of production dependencies.
            foreach (var ((group, name), version) in resolvedPackages)
            {
                var isDev = IsDevelopmentDependencyGroup(group);
                var component = new DetectedComponent(new NuGetComponent(name, version));
                singleFileComponentRecorder.RegisterUsage(
                    component,
                    isExplicitReferencedDependency: !transitiveDependencyNames.Contains((group, name)),
                    isDevelopmentDependency: isDev);
            }

            // Register parent-child relationships using the dependency specifications
            foreach (var (group, parentName, dependencyName) in dependencyRelationships)
            {
                var parentKey = (group, parentName);
                var depKey = (group, dependencyName);

                if (resolvedPackages.ContainsKey(depKey) && resolvedPackages.ContainsKey(parentKey))
                {
                    var isDev = IsDevelopmentDependencyGroup(group);
                    var parentVersion = resolvedPackages[parentKey];
                    var parentComponentId = new NuGetComponent(parentName, parentVersion).Id;

                    var depVersion = resolvedPackages[depKey];
                    var depComponent = new DetectedComponent(new NuGetComponent(dependencyName, depVersion));

                    singleFileComponentRecorder.RegisterUsage(
                        depComponent,
                        isExplicitReferencedDependency: false,
                        parentComponentId: parentComponentId,
                        isDevelopmentDependency: isDev);
                }
            }
        }
        catch (Exception e) when (e is IOException or InvalidOperationException)
        {
            processRequest.SingleFileComponentRecorder.RegisterPackageParseFailure(processRequest.ComponentStream.Location);
            this.Logger.LogWarning(e, "Failed to read paket.lock file {File}", processRequest.ComponentStream.Location);
        }
    }

    /// <summary>
    /// Case-insensitive equality comparer for (Group, Name) tuples used as dictionary keys.
    /// </summary>
    private sealed class GroupAndNameComparer : IEqualityComparer<(string Group, string Name)>
    {
        public static readonly GroupAndNameComparer Instance = new();

        public bool Equals((string Group, string Name) x, (string Group, string Name) y)
        {
            return StringComparer.OrdinalIgnoreCase.Equals(x.Group, y.Group) &&
                StringComparer.OrdinalIgnoreCase.Equals(x.Name, y.Name);
        }

        public int GetHashCode((string Group, string Name) obj)
        {
            return HashCode.Combine(
                StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Group),
                StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name));
        }
    }
}
@@ -16,6 +16,7 @@ namespace Microsoft.ComponentDetection.Orchestrator.Extensions;
using Microsoft.ComponentDetection.Detectors.Maven;
using Microsoft.ComponentDetection.Detectors.Npm;
using Microsoft.ComponentDetection.Detectors.NuGet;
using Microsoft.ComponentDetection.Detectors.Paket;
using Microsoft.ComponentDetection.Detectors.Pip;
using Microsoft.ComponentDetection.Detectors.Pnpm;
using Microsoft.ComponentDetection.Detectors.Poetry;
@@ -132,6 +133,9 @@ public static IServiceCollection AddComponentDetection(this IServiceCollection s
        services.AddSingleton<IComponentDetector, NuGetPackagesConfigDetector>();
        services.AddSingleton<IComponentDetector, NuGetProjectModelProjectCentricComponentDetector>();

        // Paket
        services.AddSingleton<IComponentDetector, PaketComponentDetector>();

        // PIP
        services.AddSingleton<IPyPiClient, PyPiClient>();
        services.AddSingleton<ISimplePyPiClient, SimplePyPiClient>();