Thursday, November 5, 2015

Java: Poor performance of Files.newDirectoryStream() with wildcards

Here are the results of comparing two ways of finding a single file: fetching one result from Files.newDirectoryStream() and calling File.exists().

The wildcard pattern I used was "prefix_prefix_prefix____?.tmp". The results are almost the same when Files.newDirectoryStream() is called with a string argument that contains no wildcard (a direct match).

Test results 1:

measure exists speed in loop 1000 times - BEGIN
measure exists speed in loop 1000 times - END
Elapsed: 5 ms

Test results 2:

measure wildcards match speed in loop 1000 times - BEGIN
measure wildcards match speed in loop 1000 times - END
Elapsed: 80590 ms

Test environment:

Windows 7 system
NTFS file system
Folder with 90,000 zero-length files
Java 8
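
The two measurements above could be reproduced with a harness along these lines. This is a minimal sketch, not the code that produced the numbers above: the class name, the method names, the Lookup interface and the sample file name "prefix_prefix_prefix____1.tmp" are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical harness; none of these names come from the original test.
final class DirLookupBenchmark {

    // One lookup that may touch the file system.
    interface Lookup {
        boolean found() throws IOException;
    }

    // Direct check for one known file name via File.exists().
    static boolean existsLookup(Path dir, String fileName) {
        return dir.resolve(fileName).toFile().exists();
    }

    // Glob lookup that only asks whether there is at least one match.
    static boolean globLookup(Path dir, String glob) throws IOException {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            return stream.iterator().hasNext();
        }
    }

    // Runs the lookup `iterations` times and returns the elapsed time in ms.
    static long elapsedMillis(int iterations, Lookup lookup) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            lookup.found();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        // Directory to test; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        int iterations = 1000;

        // Assumed file name that fits the wildcard pattern used in this post.
        long existsMs = elapsedMillis(iterations,
                () -> existsLookup(dir, "prefix_prefix_prefix____1.tmp"));
        long globMs = elapsedMillis(iterations,
                () -> globLookup(dir, "prefix_prefix_prefix____?.tmp"));

        System.out.println("exists: " + existsMs + " ms");
        System.out.println("glob:   " + globMs + " ms");
    }
}
```

Pass the test folder as the first argument. Timings depend on the file system, on caching and on the number of entries in the folder, so absolute numbers will differ from the ones above.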

dir approach

I ran this command 1000 times at the Windows command prompt to check how dir performs with wildcards:

dir prefix_prefix_prefix____*.tmp

I ran it in a loop with:
FOR /L %i IN (1,1,1000) DO @dir prefix_prefix_prefix____*.tmp > nul

It finished in about 1000 ms. I suppose most of this time is spent by the command interpreter invoking the DIR command.

Conclusion

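Files.newDirectoryStream() applies no performance optimization for wildcards. In this test, 1000 glob lookups took 80590 ms (about 80 ms each) against 5 ms for 1000 File.exists() calls. The source code below shows why: the glob is only turned into a filter that is checked against each directory entry.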

Related information

File.list() also has poor performance.

Source code of newDirectoryStream

    public static DirectoryStream<Path> newDirectoryStream(Path dir, String glob)
        throws IOException
    {
        // avoid creating a matcher if all entries are required.
        if (glob.equals("*"))
            return newDirectoryStream(dir);

        // create a matcher and return a filter that uses it.
        FileSystem fs = dir.getFileSystem();
        final PathMatcher matcher = fs.getPathMatcher("glob:" + glob);
        DirectoryStream.Filter<Path> filter = new DirectoryStream.Filter<Path>() {
            @Override
            public boolean accept(Path entry)  {
                return matcher.matches(entry.getFileName());
            }
        };
        return fs.provider().newDirectoryStream(dir, filter);
    }
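
To see the effect of that filter, one can count how often it is invoked. The sketch below repeats the same steps as the method above (build a PathMatcher, wrap it in a DirectoryStream.Filter) but adds a counter. The class and method names are hypothetical, and the whole stream is drained so that the count covers every entry.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; not part of the JDK or of the original test.
final class GlobFilterCounter {

    // Returns { number of filter invocations, number of matching entries }.
    static int[] countFilterCalls(Path dir, String glob) throws IOException {
        PathMatcher matcher = dir.getFileSystem().getPathMatcher("glob:" + glob);
        AtomicInteger calls = new AtomicInteger();

        // Same filter logic as in newDirectoryStream(Path, String), plus a counter.
        DirectoryStream.Filter<Path> filter = entry -> {
            calls.incrementAndGet();
            return matcher.matches(entry.getFileName());
        };

        int matches = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, filter)) {
            // Drain the stream so every directory entry passes through the filter.
            for (Path ignored : stream) {
                matches++;
            }
        }
        return new int[] { calls.get(), matches };
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        int[] result = countFilterCalls(dir, "prefix_prefix_prefix____?.tmp");
        System.out.println("filter calls: " + result[0] + ", matches: " + result[1]);
    }
}
```

When the stream is drained, the filter runs once per entry in the folder regardless of how many entries match, which is consistent with the cost growing with folder size rather than with the number of matches.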
