Skip to content

Conversation

@Pjb518
Copy link

@Pjb518 Pjb518 commented Nov 10, 2025

Simplifies the implementation of toUnix and normalizePath. Adds a negative lookbehind for the DOUBLE_SLASH_RE constant. Removes the now unused DOUBLE_SLASH constant.

…ative lookbehind for the DOUBLE_SLASH_RE constant. Removes the now unused DOUBLE_SLASH constant.
@Pjb518
Copy link
Author

Pjb518 commented Nov 10, 2025

Not sure what is causing the tests to fail here. I've just run them locally on every version of node listed, and all tests pass in every case.

@43081j
Copy link
Collaborator

43081j commented Nov 10, 2025

some nonsense is going on with CI for sure

ill see if i can figure it out

@Pjb518
Copy link
Author

Pjb518 commented Nov 10, 2025

@43081j Small additional thing: is there any case where normalizePathToUnix produces a different result to normalizePath?

function normalizePath(path: Path): Path {
  if (typeof path !== 'string') throw new Error('string expected');

  return toUnix(sp.normalize(path));
}
const normalizePathToUnix = (path: Path) => toUnix(sp.normalize(toUnix(path)));

I'm pretty sure that inner call to toUnix is completely redundant, and these functions do the same thing, barring one providing a specific error message. If so, maybe one of these should just be made into an alias for the other function and marked for removal in a future version.

@Pjb518
Copy link
Author

Pjb518 commented Nov 10, 2025

One final thing, there's a TODO regarding refactoring normalizeIgnored that currently looks like this:

const normalizeIgnored =
  (cwd = '') =>
  (path: Matcher): Matcher => {
    if (typeof path === 'string') {
      return normalizePathToUnix(sp.isAbsolute(path) ? path : sp.join(cwd, path));
    } else {
      return path;
    }
  };

I propose creating an isStringMatcher utility, which can be used in a number of places.

function isStringMatcher(matcher: Matcher): matcher is string {
  return typeof matcher === 'string';
}

The normalizeIgnored function can then be updated to either:

const normalizeIgnored =
  (cwd = '') =>
  (path: Matcher): Matcher => {
    if (isStringMatcher(path)) {
      return normalizePathToUnix(getAbsolutePath(path, cwd));
    }

    return path;
  };

Or, if you prefer something more concise:

const normalizeIgnored =
  (cwd = '') =>
  (path: Matcher): Matcher =>
    isStringMatcher(path)
      ? normalizePathToUnix(getAbsolutePath(path, cwd))
      :  path;

Both approaches make use of the existing getAbsolutePath function that performs the same logic. Let me know if you want to take care of that, and I'll sort it in a second PR.

@43081j
Copy link
Collaborator

43081j commented Nov 10, 2025

they are likely the same but have existed for different amounts of time (so just tech debt)

we should unify them if we can. not necessarily in this PR as we need to figure out what would be affected by no longer throwing an error

for extracting the string type guard, do you have an example of where else it would be useful? by itself it doesn't seem to give us much benefit, especially since typeof narrows correctly too

@Pjb518
Copy link
Author

Pjb518 commented Nov 11, 2025

for extracting the string type guard, do you have an example of where else it would be useful? by itself it doesn't seem to give us much benefit, especially since typeof narrows correctly too

In terms of type narrowing, it doesn't provide any benefit. However, there are a lot of cases where we check the matcher type is a string, and I think it reads nicer if the implementation details are hidden behind a simple utility. I admit that for strings, this yields the smallest benefit, but moving something like the createPattern function to use utilities instead of matcher instanceof RegExp and typeof matcher === "string" reads a lot better to me.

function createPattern(matcher: Matcher): MatchFunction {
  if (isFunctionMatcher(matcher)) return matcher;
  if (isStringMatcher(matcher)) return (string) => matcher === string;
  if (isRegExpMatcher(matcher)) return (string) => matcher.test(string);
  if (isObjectMatcher(matcher)) {
    return (string) => {
      if (matcher.path === string) return true;
      if (matcher.recursive) {
        const relative = sp.relative(matcher.path, string);
        if (!relative) {
          return false;
        }
        return !relative.startsWith('..') && !sp.isAbsolute(relative);
      }
      return false;
    };
  }
  return () => false;
}

I also think that the isObjectMatcher utility should be made a little more strict, as this will allow more invalid objects to hit the () => false default. The current checks are also not sufficient to justify the return type. For example,

isMatcherObject(new Date())

will return true, and TS erroneously takes this to mean that new Date() is of type MatcherObject.

interface MatcherObject {
  path: string;
  recursive?: boolean;
}

This version, for example, would rule out arrays, dates, and several other object types, and it'll ensure that whenever true is returned, the object minimally satisfies the MatcherObject interface.

function isObjectMatcher(matcher: Matcher): matcher is MatcherObject {
  return typeof matcher === 'object' && isStringMatcher(matcher?.path);
}

EDIT: I suppose better yet would be this:

function isObjectMatcher(matcher: unknown): matcher is MatcherObject {
  return typeof matcher === 'object' && isStringMatcher((matcher as any)?.path);
}

EDIT 2: Here is what I envision the createPattern function looking like after some additional changes:

function createMatchFunctionFromObject(matcher: MatcherObject): MatchFunction {
  if (!matcher.recursive) return (string: string) => matcher.path === string;

  return (string: string) => {
    if (matcher.path === string) return true;
    const relative = sp.relative(matcher.path, string);
    return relative ? !relative.startsWith('..') && !sp.isAbsolute(relative) : false;
  };
}

function createPattern(matcher: Matcher): MatchFunction {
  if (isFunctionMatcher(matcher)) return matcher;
  if (isStringMatcher(matcher)) return (string) => matcher === string;
  if (isRegExpMatcher(matcher)) return (string) => matcher.test(string);
  if (isObjectMatcher(matcher)) return createMatchFunctionFromObject(matcher);
  return () => false;
}

Or even this:

function createPattern(matcher: Matcher): MatchFunction {
  if (isFunctionMatcher(matcher)) return matcher;
  if (isStringMatcher(matcher)) return createMatchFunctionFromString(matcher);
  if (isRegExpMatcher(matcher)) return createMatchFunctionFromRegExp(matcher);
  if (isObjectMatcher(matcher)) return createMatchFunctionFromObject(matcher);
  return () => false;
}

@43081j
Copy link
Collaborator

43081j commented Nov 13, 2025

you're basically suggesting runtime type checking, which i'm not sure i agree with.

we statically type this to already accept the right shape, so this typeof check is merely to narrow between the strong types we know we will receive: string | RegExp | MatchFunction | MatcherObject.

if someone passes an object which doesn't satisfy this type definition, i don't think we need to care, that's their own fault.

@Pjb518
Copy link
Author

Pjb518 commented Nov 13, 2025

you're basically suggesting runtime type checking, which i'm not sure i agree with.

we statically type this to already accept the right shape, so this typeof check is merely to narrow between the strong types we know we will receive: string | RegExp | MatchFunction | MatcherObject.

if someone passes an object which doesn't satisfy this type definition, i don't think we need to care, that's their own fault.

My proposed check differs from the current one only in that we check for a path attribute rather than ensuring the object is not a regular expression. The goal is less about runtime type checking and more about more effectively moving junk matchers into the "this is garbage" () => false path.

Checking for a string path property rules out regex just as well as checking instanceof, but it also catches other unusable object types. In practice, that means running the following chain far less for objects we could easily verify up front are not going to work:

if (matcher.path === string) return true;
if (matcher.recursive) {
  const relative = sp.relative(matcher.path, string);
  if (!relative) {
    return false;
  }
  return !relative.startsWith('..') && !sp.isAbsolute(relative);
}

@43081j
Copy link
Collaborator

43081j commented Nov 13, 2025

if (matcher.path === string) return true;
if (matcher.recursive) {
  const relative = sp.relative(matcher.path, string);
  if (!relative) {
    return false;
  }
  return !relative.startsWith('..') && !sp.isAbsolute(relative);
}

this only happens as part of the test function of a matcher object (not a RegExp)

  if (matcher instanceof RegExp) return (string) => matcher.test(string);
  if (typeof matcher === 'object' && matcher !== null) {
    return (string) => {
      if (matcher.path === string) return true;
      if (matcher.recursive) {
        const relative = sp.relative(matcher.path, string);
        if (!relative) {
          return false;
        }
        return !relative.startsWith('..') && !sp.isAbsolute(relative);
      }
      return false;
    };
  }

everything inside the anonymous function assumes matcher is a matcher object.

In practice, that means running the following chain far less for objects we could easily verify up front are not going to work:

so i don't understand this part. this function is only called for matcher objects, what am i missing? 🤔

Checking for a string path property rules out regex just as well as checking instanceof, but it also catches other unusable object types

we have strongly typed what our function accepts. if the user passes nonsense into it, i think its their own fault.

i feel like you're trying to guard against invalid inputs from users. but im saying we don't need to care because we've strongly defined what we accept and its your own fault if you pass something that isn't that. this is already correctly narrowing for the types we accept. we don't need to check if path exists because, if its not a RegExp, it must be a valid matcher object as per our types.

@Pjb518
Copy link
Author

Pjb518 commented Nov 13, 2025

if (matcher.path === string) return true;
if (matcher.recursive) {
  const relative = sp.relative(matcher.path, string);
  if (!relative) {
    return false;
  }
  return !relative.startsWith('..') && !sp.isAbsolute(relative);
}

this only happens as part of the test function of a matcher object (not a RegExp)

Yes, I understand that. The RegExp thing is actually not relevant, as I misremembered the condition for the matcher object branch. There's a utility defined right above called isMatcherObject, and I misremembered that being used in the chain of conditions. That utility looks like this:

const isMatcherObject = (matcher: Matcher): matcher is MatcherObject =>
  typeof matcher === 'object' && matcher !== null && !(matcher instanceof RegExp);

As you can see, it performs essentially the same check as the one in the conditional statement, but it also checks whether the object is a RegExp instance. It turns out, however, that this utility is used just about everywhere except in the chain of conditions inside createPattern.

That being said, the argument for my alternative condition is the same if we're trying to rule out null as well. Checking that matcher?.pattern is a string (really that it exists, but if we're already checking defined, why not check its the type we actually need) tells us that the object is not null, but it also tells us whether the object has the absolute bare minimum properties to potentially generate a useful matcher function.

For example, if we end up with a matcher object being passed in with an incorrect key, like so:

{ pth: "somePath", recursive: true }

Why should we give the user back a matcher function in this form:

(string) => {
  if (matcher.path === string) return true;
  if (matcher.recursive) {
    const relative = sp.relative(matcher.path, string);
    if (!relative) {
      return false;
    }
    return !relative.startsWith('..') && !sp.isAbsolute(relative);
  }
  return false;
};

When the updated check can give them the following, avoiding a bunch of useless work when the matcher function is used:

() => false;

i feel like you're trying to guard against invalid inputs from users. but im saying we don't need to care because we've strongly defined what we accept and its your own fault if you pass something that isn't that. this is already correctly narrowing for the types we accept. we don't need to check if path exists because, if its not a RegExp, it must be a valid matcher object as per our types.

If that's true, then why do we need to check for null, and why do we need that final default case where we provide () => false; as the matcher function where we couldn't trigger any of the conditions?

The thing is, we just say that the options we're getting from the user are Matchers, but I don't think we actually check that anywhere.

We have this type where the user is passing in config:

type FSWInstanceOptions = BasicOpts & {
  ignored: Matcher[]; // string | fn ->
  awaitWriteFinish: false | AWF;
};

which defines the options being passed into the FSWatcher constructor. And then everywhere down the line, we assume that we have a Set<Matcher> or a Matcher[], all the way down to createPattern. As far as I can see, the typing there is actually erroneous, as we don't know that we have something matching the Matcher type at all.

If I've missed something, I'm happy to be corrected.

@Pjb518
Copy link
Author

Pjb518 commented Nov 13, 2025

I will also say that this has nothing really to do with the current PR, so I'm happy to move this to somewhere more appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants