Writing a custom Spliterator

Writing a custom Spliterator is not a daily task, but let's assume that we are working on a project that, for some reason, requires us to process strings that contain both ideographic characters (CJKV, short for Chinese, Japanese, Korean, and Vietnamese) and non-ideographic characters. We want to process these strings in parallel, which means that a string may be split only at positions that hold ideographic characters.
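To make the requirement concrete, the following is a minimal sketch of how the allowed split positions could be located. It assumes that the JDK's Character.isIdeographic() is an acceptable test for "ideographic"; the class name IdeographicPositions and the sample string are hypothetical and are not taken from the book's bundled code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the book's code
class IdeographicPositions {

    // Returns the char indices of str at which an ideographic character starts
    static List<Integer> find(String str) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < str.length(); ) {
            int codePoint = str.codePointAt(i);
            if (Character.isIdeographic(codePoint)) {
                positions.add(i);
            }
            // advance by 1 or 2 chars, depending on the code point
            i += Character.charCount(codePoint);
        }
        return positions;
    }

    public static void main(String[] args) {
        // made-up sample: \u6587 and \u5B57 are CJK ideographs
        System.out.println(find("ab\u6587cd\u5B57")); // prints [2, 5]
    }
}
```

Only the indices returned by find() would be valid split positions for such a string; every other index falls on a non-ideographic character.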

The default Spliterator will not behave this way, so we need to write a custom one. For this, we have to implement the Spliterator interface and provide implementations of its abstract methods (tryAdvance(), trySplit(), estimateSize(), and characteristics()). The implementation is available in the code bundled with this book. Consider opening the IdeographicSpliterator source code and keeping it close by while reading the rest of this section.

The core of the implementation is the trySplit() method. Here, we try to split the current string in half and then continue traversing it until we find an ideographic character, which becomes the split position. For checking purposes, we've added the following line:

System.out.println("Split successfully at character: " 
+ str.charAt(splitPosition));
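The splitting strategy described above can be sketched as follows. This is an illustrative approximation, not the book's IdeographicSpliterator: the class name IdeographicSplitSketch, the minimum size of 10 characters below which no split is attempted, the forward scan from the midpoint, and the chosen characteristics are all assumptions. The sketch also inspects one char at a time, so ideographs outside the Basic Multilingual Plane (encoded as surrogate pairs) are not recognized.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical sketch; see the book's bundled code for the real implementation
class IdeographicSplitSketch implements Spliterator<Character> {

    private static final int MIN_SIZE = 10; // assumed threshold

    private final String str;
    private final int end;   // exclusive upper bound of this range
    private int position;    // next index to traverse

    IdeographicSplitSketch(String str) {
        this(str, 0, str.length());
    }

    private IdeographicSplitSketch(String str, int start, int end) {
        this.str = str;
        this.position = start;
        this.end = end;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Character> action) {
        if (position >= end) {
            return false;
        }
        action.accept(str.charAt(position++));
        return true;
    }

    @Override
    public Spliterator<Character> trySplit() {
        int remaining = end - position;
        if (remaining < MIN_SIZE) {
            return null; // too small to be worth splitting
        }
        // start in the middle and move forward to the next ideographic char
        for (int split = position + remaining / 2; split < end; split++) {
            if (Character.isIdeographic(str.charAt(split))) {
                // hand off the prefix; this instance keeps [split, end)
                Spliterator<Character> prefix =
                    new IdeographicSplitSketch(str, position, split);
                position = split;
                return prefix;
            }
        }
        return null; // no ideographic character found, so no split
    }

    @Override
    public long estimateSize() {
        return end - position;
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE;
    }

    public static void main(String[] args) {
        // made-up sample string with two CJK ideographs
        String sample = "abcdefgh\u6587ijklmnop\u5B57qrstuvwx";
        String joined = StreamSupport
            .stream(new IdeographicSplitSketch(sample), true)
            .map(String::valueOf)
            .collect(Collectors.joining());
        System.out.println(joined.equals(sample)); // prints true
    }
}
```

In this sketch, the returned spliterator covers the prefix that ends just before the ideographic character, and the current instance continues from that character onward, so encounter order is preserved when the stream is collected.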

Now, let's consider a string containing ideographic characters:

String str = "Character Information  Development and Maintenance " 
+ "Project for e-Government MojiJoho-Kiban Project";

Now, let's create a parallel stream for this string and force IdeographicSpliterator to do its job:

Spliterator<Character> spliterator = new IdeographicSpliterator(str);
Stream<Character> stream = StreamSupport.stream(spliterator, true);

// force spliterator to do its job
stream.collect(Collectors.toList());

One possible output will reveal that the split takes place only at positions containing ideographic characters:

Split successfully at character: 
Split successfully at character: