Skip to content

[GOBBLIN-2246] Make Orchestrator MDM metric dimesnion configurable #4164

Open
adithya2754 wants to merge 3 commits intoapache:masterfrom
adithya2754:aditrao/configurable-mdm-dimensions
Open

[GOBBLIN-2246] Make Orchestrator MDM metric dimesnion configurable #4164
adithya2754 wants to merge 3 commits intoapache:masterfrom
adithya2754:aditrao/configurable-mdm-dimensions

Conversation

@adithya2754
Copy link
Contributor

@adithya2754 adithya2754 commented Feb 13, 2026

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):
    Updated jobSucceeded OTel metric emission to support runtime-configurable dimensions.
  • Orchestrator-controlled defaults via metrics.reporting.opentelemetry.jobSucceeded.dimensionsMap (OTel key → GaaSJobObservabilityEvent Avro field).
  • Baseline dimensions always present for backward compatibility.
  • Optional template/user extra dimensions (from job props) behind metrics.reporting.opentelemetry.jobSucceeded.extraDimensions.enabled gate.
  • Safety cap: metrics.reporting.opentelemetry.jobSucceeded.maxDimensions (default 20). If baseline+orchestrator exceeds the cap, orchestrator dims are dropped; template extras are included only if all fit.
  • Added/updated unit tests.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Added unit tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds runtime-configurable dimensions support for the jobSucceeded OpenTelemetry metric in GaaS observability events. The changes enable orchestrators to control which dimensions are emitted through configuration while maintaining backward compatibility with baseline dimensions.

Changes:

  • Introduced configurable dimension mapping via metrics.reporting.opentelemetry.jobSucceeded.dimensionsMap allowing orchestrators to map OTel dimension keys to GaaSJobObservabilityEvent Avro fields
  • Added support for per-job extra dimensions from job properties with all-or-nothing enforcement based on maxDimensions cap (default: 20)
  • Refactored tests to use try-with-resources pattern for proper resource cleanup

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
GaaSJobObservabilityEventProducer.java Core implementation adding dimension configuration logic, baseline dimension preservation, duplicate detection, and dimension cap enforcement
GaaSJobObservabilityProducerTest.java Updated existing tests to use try-with-resources; added 6 new tests covering dimension map configurations, baseline backfilling, duplicate handling, and maxDimensions enforcement
GsonUtils.java Added plain GSON instance for general-purpose JSON parsing without special adapters

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Blazer-007 Blazer-007 changed the title [GOBBLIN-2246] Aditrao/configurable mdm dimensions [GOBBLIN-2246] Make Orchestrator MDM metric dimesnion configurable Feb 16, 2026
if (value instanceof Number) {
builder.put(mdmDimensionKey, ((Number) value).longValue());
return;
}
Copy link
Contributor

@abhishekmjain abhishekmjain Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get a double, we will be sending a long value, is that intentional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree a dimension usually would not have floating values, but its better to call that out somewhere.


/**
* Reads the JSON string config for job succeeded dimensions map, which is expected to be in the format of
* {@code <mdm_dim_key>: <gaas_obs_event_field_key>}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check if this value can be easily passed from configs? Given that we haven't used such pattern yet, would be better to confirm once.

return jobSucceededMaxDimensions;
}

private static Map<String, String> dropDuplicateObsEventFieldMappings(Map<String, String> configuredMap,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really required?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants