Skip to content

Conversation

@ebarlas
Copy link
Contributor

@ebarlas ebarlas commented Oct 28, 2025

This change adds configuration options to the PKI Realm for principal extraction from the client certificate based on a relative distinguished name (RDN) attribute value.

The following Elasticsearch YAML configuration options are included. One or neither can be used, not both.

  • username_rdn_oid - the RDN attribute type OID to use, e.g. 2.5.4.3 (CN)
  • username_rdn_name - the RDN attribute name to use, e.g. CN

The implementation is based on ASN1 DER parsing of the X500 distinguished name via DerParser.

Examples:

DN OID Name Value
CN=John Doe, OU=Security Team, OU=Engineering, O=Elastic 2.5.4.3 CN John Doe
[email protected], CN=John, OU=Eng, O=Elastic 1.2.840.113549.1.9.1 EMAILADDRESS [email protected]
UID=abc123, CN=John Doe, OU=Eng, O=Elastic 0.9.2342.19200300.100.1.1 UID abc123
CN=John Smith+OU=Development, O=Acme Corp 2.5.4.3 CN John Smith
1.3.6.1.4.1.42069.1.2.1.1=Jane Doe, OU=Engineering, O=Elastic 1.3.6.1.4.1.42069.1.2.1.1 - Jane Doe
UID=abc123, OU=Eng, O=Elastic 2.5.4.3 CN -

@ebarlas ebarlas requested a review from a team October 28, 2025 05:20
@ebarlas ebarlas added >bug :Security/Security Security issues without another label Team:Security Meta label for security team labels Oct 28, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @ebarlas, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

Copy link
Contributor

@jfreden jfreden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Left some minor comments on the code but have some more general comments on this approach.

The lack of multi-value support is because of:

an RDN might have more than 1 value. The current code assumes that whatever is on the right hand side of the = is "the value" but if it's a multi valued RDN then it could be all the values separated by +.

I don't know how common having multi-valued RDNs is, but I'm not sure I understand why we don't want to support it? As long as we parse it properly and do the sub-parsing of the multi-valued RDN string properly we wouldn't hit the same issue again?

username_rdn.pattern - the regular expression to apply for principal extraction, default (.*)

Wouldn't this be useful for customers that want to switch to this RDN based approach, but still want to be able to be flexible in their PKI realm auth?

: getPrincipalFromSubjectDN(principalPattern, token, logger);
}

static String getPrincipalFromRdnAttribute(String principalRdnType, X509AuthenticationToken token, Logger logger) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these being static for the sake of testability is making the non-test code confusing since we have to pass the logger instance and principalRdnType instance.

Copy link
Contributor Author

@ebarlas ebarlas Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The approach is unusual, but it is consistent with the surrounding code. This method is modeled after the adjacent method getPrincipalFromSubjectDN:

static String getPrincipalFromSubjectDN(Pattern principalPattern, X509AuthenticationToken token, Logger logger) {
  ...
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the footprint isn't too large it might make sense to change that. I'll leave it up to you.

X500Principal certPrincipal = token.credentials()[0].getSubjectX500Principal();
String principal = RdnFieldExtractor.extract(certPrincipal.getEncoded(), principalRdnType);
if (principal == null) {
logger.debug(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be a warning since it's probably due to a misconfiguration most of the time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was also a case of remaining consistent with surrounding code. getPrincipalFromSubjectDN uses debug logging.

}
}

private static String doExtract(byte[] encoded, String oid) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand what's going on here. Each Asn1Object is a TLV (Type Length Value), and "constructed" means that it's another object, not a primitive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. My understanding is that "constructed" is a tag bit that denotes a sequence or set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can read chapter 12 of the ASN.1 book if you're particularly interested in becoming an expert in ASN.1, but yes, "constructed" is akin to "structured", that is a non-primitive ( a sequence, set or choice )

private static String doExtract(byte[] encoded, String oid) throws IOException {
DerParser parser = new DerParser(encoded);

DerParser.Asn1Object dnSequence = parser.readAsn1Object();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you can pass the expected type here so DerParser.Type.SEQUENCE.

To check my own understanding, this is the sequence of RDN key-values.

Copy link
Contributor Author

@ebarlas ebarlas Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great insight! According to ASN.1 X509 standard, the name is a SEQUENCE of RDNs, where each RDN is a SET. This approach could shave off a few lines of code, too, since the constructed check will be redundant.

DerParser sequenceParser = dnSequence.getParser();

while (true) {
DerParser.Asn1Object rdnSet = sequenceParser.readAsn1Object();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set here is the actual key-value object in the sequence. If multi-valued this could contain several attributes, but we currently only support one?

Copy link
Contributor Author

@ebarlas ebarlas Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. Each RDN is a SET. Exactly one attribute per RDN is allowed, although I'm relaxing that now.


if (oid.equals(attrOid)) {
try {
setParser.readAsn1Object();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of handling this with an exception, the parser could be updated to contain something like peekNextAsn10Object() maybe? It would require wrapping the underlaying InputStream in a BufferedInputStream I think but might be worth it? WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. But, I'm not too bothered by having to catch the EOF exception. I'll give it more thought.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took a while for me to understand that the exception being thrown was the "expected" execution path for a successful parsing. But I don't feel strongly about it, might be overkill to add all that.

key -> new Setting<>(key, DEFAULT_USERNAME_PATTERN, s -> Pattern.compile(s, Pattern.CASE_INSENSITIVE), Setting.Property.NodeScope)
);

public static final Setting.AffixSetting<Boolean> USERNAME_RDN_ENABLED_SETTING = Setting.affixKeySetting(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both pattern and RDN is enabled, should it fail with some validation error?

Copy link
Contributor Author

@ebarlas ebarlas Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands, the username pattern is always defined, with a default of CN=(.*?)(?:,|$).

So, I don't think there is a simple way to detect misconfiguration.

I could verify that a non-default pattern does not coexist with enabled true, but I'm not sure that adds much value.

@ebarlas
Copy link
Contributor Author

ebarlas commented Nov 5, 2025

I don't know how common having multi-valued RDNs is, but I'm not sure I understand why we don't want to support it? As long as we parse it properly and do the sub-parsing of the multi-valued RDN string properly we wouldn't hit the same issue again?

Agreed. Rejecting MV RDNs does seem arbitrary. I'll relax that constraint.

Wouldn't this be useful for customers that want to switch to this RDN based approach, but still want to be able to be flexible in their PKI realm auth?

Good question! I'm not sure. My interpretation is that the regex-based extraction exists because it has to deal with all of the surrounding text. I'm not sure if it's useful to do extractions within an RDN attribute value.

Along those line, I am also slightly concerned that a single OID (like CN) might be too limiting. We could use an RDN path expression of some kind that can specify more about the RDN placement. Or even a query expression! But, that also seems too complex without a clear need.

@ebarlas ebarlas self-assigned this Nov 7, 2025

public static final Setting.AffixSetting<Boolean> USERNAME_RDN_ENABLED_SETTING = Setting.affixKeySetting(
RealmSettings.realmSettingPrefix(TYPE),
"username_rdn.enabled",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this. We don't usually have enabled settings that control whether another setting is used - if a setting has a value then we treat it as enabled.

If the admin configures a RDN attribute name, then we should be able to infer that they want to use RDN based principal extraction.

public static final Setting.AffixSetting<String> USERNAME_RDN_TYPE_SETTING = Setting.affixKeySetting(
RealmSettings.realmSettingPrefix(TYPE),
"username_rdn.type",
key -> Setting.simpleString(key, "2.5.4.3", Setting.Property.NodeScope)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments below about RDN parsing, but the fact that using our own parsing is forcing us to work in OIDs is a problem. ES admins don't want to care about OIDs - they want to be able to specific cn or some other standard attribute name.

This feature is a lot less useful if we don't apply a schema to the OIDs.

Copy link
Contributor Author

@ebarlas ebarlas Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it may not be the most ergonomic, but is this really a problem in practice? The default is effectively CN and changes will likely be one of a very small number of other OID names. And the rare power user will be able to carry over their custom enterprise OID with ease.

/**
* Utility class to extract RDN field values from X500 principal DER encoding.
*/
public class RdnFieldExtractor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I proposed that we process RDNs, I didn't think we were going to need to write our own parser (but the alternative isn't really obvious)

We have multiple DN/RDN implementations already, they're not a perfect fit for what we want, but I think some of them can be made to work

  • javax.security.auth.x500.X500Principal has all the decoding we need (via sun.security.x509.X500Name) but doesn't expose it. It does however provide a way of turning it into a canonical string representation
  • com.unboundid.ldap.sdk.DN is a good DN representation that exposes RDNs and Attributes, but doesn't include a way to go from a DER byte[] to a DN. However, it can parse canonical strings, so it would be possible to turn a X500Principal into a DN via string encoding.
  • com.unboundid.util.ssl.cert.X509Certificate.decodeName can turn a ASN1Element into a DN but it's package private, so not useful.
  • org.cryptacular.x509.dn.NameReader.readX500Principal can turn a X500Principal into a RDNSequence, although I don't love making our dependency on cryptacular stronger, it does do what we need
  • org.bouncycastle.asn1.x500.X500Name is, in typical Bouncy Castle fashion, exactly what we want, but we try to limit our dependency on BC due to FIPS complexities.

I think we can avoid this custom parsing code entirely if we use either

  • org.cryptacular.x509.dn.NameReader.readX500Principal, or
  • new com.unboundid.ldap.sdk.DN(subject.getName(format)) (I'd need to do more investigation to know which format is going to suit us best).

We use DN in lots of other places, (DnRoleMapper, DistinguishedNameNormalizer) so it would have been my preference, if ldapsdk had an easy to use DER decoder. But parsing via an RFC based string representation isn't completely terrible.

If I had to pick, I guess I'd use cryptacular, since it does exactly what we need, and we can swap it out if we ever need to.

That said, it's not the worst thing to use our own parser (it's surprisingly short since we have a working DER parser), but we lose schema support, and I really don't think we can force OIDs onto our users. So, even if we stick with our own parsing we'll need to pull in a schema from one of those libraries so we can work with standard attribute names.

Copy link
Contributor Author

@ebarlas ebarlas Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that requiring schema definition awareness weakens the case for DerParser, although doing an initial mapping from schema definition attribute name to OID may be reasonable.

javax.naming.ldap.LdapName is another option and it doesn't require dependencies.

  • This parser operates on RFC 2253 text
  • It also provides a way to iterate over RDNs (via javax.naming.ldap.Rdn)
  • Conversion to RFC 2253 format (via X500Principal) results in hex-conversion of string values for unknown OIDs

Can we determine how the username pattern setting is used today? That might help inform the OID decision.

Copy link
Contributor Author

@ebarlas ebarlas Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hurdle that I ran into earlier with javax.naming.ldap.LdapName is that the RFC 2253 formatted text is a hexadecimal representation of the BER encoding. That representation is mirrored in the LdapName or DN object.

This is what led me to a direct interpretation of the DER bytes, which bypasses nuances with text formatting.

String s = "OID.1.3.6.1.4.1.50000.1.1=EMP-2024-42, CN=Jane Developer, OU=Engineering, O=Acme Corp";
X500Principal principal = new X500Principal(s);
System.out.println(principal.getName(X500Principal.RFC2253));
1.3.6.1.4.1.50000.1.1=#130b454d502d323032342d3432,CN=Jane Developer,OU=Engineering,O=Acme Corp

}
}

private static String doExtract(byte[] encoded, String oid) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can read chapter 12 of the ASN.1 book if you're particularly interested in becoming an expert in ASN.1, but yes, "constructed" is akin to "structured", that is a non-primitive ( a sequence, set or choice )

// ASN.1 type is the lower 5 bits of the identifier octet
// See: ITU-T X.690
private static final int ASN1_TYPE_SEQUENCE = 0x10;
private static final int ASN1_TYPE_SET = 0x11;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use DerParser.Type instead (but will need to add SET)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated ✅

@ebarlas
Copy link
Contributor Author

ebarlas commented Nov 11, 2025

@tvernum and @jfreden , I made another round of changes to address your feedback:

  • Configuration is now done with either username_rdn_oid or username_rdn_name, where the name approach supports standard LDAP names such as CN and UID
  • Dropped the enabled setting
  • Most significant RDN has priority
  • Any OID attribute name can be used, including non-LDAP OIDs and custom/non-standard OIDs

I also created a much simpler alternative PR that excludes OID support. It uses RFC 2253 text parsing instead of DER parsing: #137931

Copy link
Contributor

@jfreden jfreden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code looks good.

For the question on if we should go with this approach vs #137931 I do really like that we're leaning more on libraries in that PR where this is well tested, but I'm also not sure it's worth the trade off with:

Non-LDAP extensions, such as EMAILADDRESS, and custom OIDs are not supported.

I'm not sure what the current usage looks like and I guess that customers that want more custom behaviour could fall back to the regex approach, but why not provide a more flexible approach so they don't have to when we have it? I think the approach in this PR turned out pretty straight forward and minimal.

Would like to hear @tvernum opinion on this.

// When multiple RDNs with the same OID exist, should return the last one encountered in DER encoding
// Note: X.500 encoding reverses the order - the last attribute in the DN string is first in DER encoding
String ou = extractFromDN("CN=John Doe, OU=Security Team, OU=Engineering, O=Elastic", OID_OU);
assertThat(ou, is(equalTo("Security Team")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this test. I was wondering about this, I think it makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Security/Security Security issues without another label Team:Security Meta label for security team v8.19.8 v9.1.8 v9.2.2 v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants