8.23. Extract the Filename from a Windows Path

Problem

You have a string that holds a (syntactically) valid path to a file or folder on a Windows PC or network, and you want to extract the filename, if any, from the path. For example, you want to extract file.ext from c:\folder\file.ext.

Solution

[^\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Discussion

Extracting the filename from a string known to hold a valid path is trivial, even if you don’t know whether the path actually ends with a filename.

The filename always occurs at the end of the string. It can’t contain any colons or backslashes, so it cannot be confused with folders, drive letters, or network shares, which all use backslashes and/or colons.

The anchor $ matches at the end of the string (Recipe 2.5). The fact that the dollar also matches at embedded line breaks in Ruby doesn’t matter, because valid Windows paths don’t include line breaks. The negated character class [^\\/:*?"<>|\r\n]+ (Recipe 2.3) matches the characters that can occur in filenames. Though the regex engine scans the string from left to right, the anchor at the end of the regex makes sure that only the last run of filename characters in the string will be matched, giving us our filename.

If the string ends with a backslash, as it will for paths that don’t specify a filename, the regex won’t match at all. When it does match, it will match only the filename, so we don’t need to use any capturing groups to separate the filename from the rest of the path.
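
As a minimal sketch of how this regex might be applied, the following Python snippet wraps it in a small helper. The function name extract_filename and the sample paths are illustrative assumptions, not part of the recipe; the pattern itself is the one from the Solution.

```python
import re

# The recipe's regex: the last run of characters that are allowed in a
# Windows filename, anchored at the end of the string.
FILENAME_RE = re.compile(r'[^\\/:*?"<>|\r\n]+$')

def extract_filename(path):
    """Return the trailing filename of a valid Windows path, or None.

    Hypothetical helper: assumes `path` is already known to be a
    syntactically valid Windows path, as the recipe requires.
    """
    match = FILENAME_RE.search(path)
    # The overall match is the filename, so no capturing group is needed.
    return match.group(0) if match else None

print(extract_filename(r"c:\folder\file.ext"))         # file.ext
print(extract_filename(r"\\server\share\report.txt"))  # report.txt
print(extract_filename("c:\\folder\\"))                # None (no filename)
```

The last call passes a path that ends with a backslash, so the regex finds no match and the helper returns None instead of a filename.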

See Also

See Recipe 3.7 to learn how to retrieve text matched by the regular expression in your favorite programming language.

Follow Recipe 8.19 if you don’t know in advance that your string holds a valid Windows path.
