You have a string that holds a (syntactically) valid path to a file or folder on a Windows PC or network, and you want to extract the filename, if any, from the path. For example, you want to extract file.ext from c:\folder\file.ext.
[^\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Extracting the filename from a string known to hold a valid path is trivial, even if you don’t know whether the path actually ends with a filename.
The filename always occurs at the end of the string. It can’t contain any colons or backslashes, so it cannot be confused with folders, drive letters, or network shares, which all use backslashes and/or colons.
The anchor ‹$› matches at the end of the string (Recipe 2.5). The fact that the dollar also matches at embedded line breaks in Ruby doesn’t matter, because valid Windows paths don’t include line breaks. The negated character class ‹[^\\/:*?"<>|\r\n]+› (Recipe 2.3) matches the characters that can occur in filenames. Though the regex engine scans the string from left to right, the anchor at the end of the regex makes sure that only the last run of filename characters in the string will be matched, giving us our filename.
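As a rough illustration of the anchoring behavior described above, the following Python sketch runs the regex against a few sample paths. The sample paths and the names FILENAME_RE, paths, and matches are made up for this example, and the character class is assumed to exclude the backslash, the forward slash, and line breaks.

```python
import re

# The recipe's regex. A raw string keeps the backslashes intact for the
# regex engine; \\ inside the class stands for one literal backslash.
FILENAME_RE = re.compile(r'[^\\/:*?"<>|\r\n]+$')

# Hypothetical sample paths: a local path, a UNC path, and a relative path.
paths = [
    r"c:\folder\file.ext",
    r"\\server\share\folder\file.ext",
    r"folder\sub.dir\file",
]

# search() scans from left to right, but the $ anchor rejects every run of
# filename characters except the one that reaches the end of the string.
matches = [FILENAME_RE.search(p) for p in paths]

for path, m in zip(paths, matches):
    print(f"{path} -> {m.group()} (starts at offset {m.start()})")
```

Earlier runs such as "folder" consist of the same permitted characters, yet they are never returned because each is followed by a backslash rather than the end of the string.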
If the string ends with a backslash, as it will for paths that don’t specify a filename, the regex won’t match at all. When it does match, it will match only the filename, so we don’t need to use any capturing groups to separate the filename from the rest of the path.
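A minimal Python sketch of this behavior might look as follows. The helper name extract_filename is hypothetical, and the character class is assumed to exclude the backslash, the forward slash, and line breaks.

```python
import re
from typing import Optional

# The recipe's regex, written as a raw string so the backslashes survive.
FILENAME_RE = re.compile(r'[^\\/:*?"<>|\r\n]+$')


def extract_filename(path: str) -> Optional[str]:
    """Return the trailing filename, or None when the regex does not match."""
    match = FILENAME_RE.search(path)
    # group(0) is the overall match, so no capturing group is required.
    return match.group(0) if match else None


print(extract_filename("c:\\folder\\file.ext"))  # file.ext
print(extract_filename("c:\\folder\\"))          # None
```

Returning None for the failed match lets the caller distinguish a path without a filename from one whose filename was extracted.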
See Recipe 3.7 to learn how to retrieve text matched by the regular expression in your favorite programming language.
Follow Recipe 8.19 if you don’t know in advance that your string holds a valid Windows path.