One of the common operations of the Collectors API, introduced in Java 8, is collecting results into a container such as a List, Set or Map. The following example uses the collect() method to generate a HashSet containing unique numbers:
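A minimal sketch of such a collect() call, assuming Java 9+ for List.of; the class name UniqueNumbersExample and the method uniqueNumbers are illustrative only:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class UniqueNumbersExample {

    // Collects the numbers into a HashSet, which drops duplicates.
    static Set<Integer> uniqueNumbers(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.toCollection(HashSet::new));
    }

    public static void main(String[] args) {
        Set<Integer> unique = uniqueNumbers(List.of(1, 2, 2, 3, 3, 3));
        System.out.println(unique.size()); // prints 3
    }
}
```

Collectors.toCollection(HashSet::new) is used instead of Collectors.toSet() because toSet() makes no guarantee about the concrete Set type it returns.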
Now suppose we are given a string and need to compute some summaries on it: the number of uppercase chars, lowercase chars, digits and invalid chars it contains.
Running a separate reduce operation for each summary would result in more than one pass through the data, like this (here I am using the var keyword introduced in Java 10):
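A sketch of the multi-pass approach, under two assumptions: count() stands in for the reduce operation, and an "invalid" char is any char that is not uppercase, lowercase or a digit, so it can be derived from the other three counts. The class name MultiPassSummary is illustrative only:

```java
import java.util.Arrays;

class MultiPassSummary {

    // Returns {uppercase, lowercase, digits, invalid}.
    static long[] summarize(String text) {
        var uppercase = text.chars().filter(Character::isUpperCase).count(); // pass 1
        var lowercase = text.chars().filter(Character::isLowerCase).count(); // pass 2
        var digits = text.chars().filter(Character::isDigit).count();        // pass 3
        // Assumption: everything that fits none of the categories is "invalid".
        var invalid = text.length() - uppercase - lowercase - digits;
        return new long[] {uppercase, lowercase, digits, invalid};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(summarize("Hello World 2024!")));
        // prints [2, 8, 4, 3]
    }
}
```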
In the above code excerpt we are iterating three times through the string to compute the summaries. If we want to iterate just once over the given string and still get the desired results, we could fall back to the traditional imperative approach:
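A sketch of the imperative version, using the same assumed definition of "invalid" (any char that is not a digit, uppercase or lowercase); the class name SinglePassLoop is illustrative only:

```java
import java.util.Arrays;

class SinglePassLoop {

    // Returns {uppercase, lowercase, digits, invalid} after one loop over the string.
    static long[] summarize(String text) {
        long uppercase = 0;
        long lowercase = 0;
        long digits = 0;
        long invalid = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isDigit(c)) {
                digits++;
            } else if (Character.isUpperCase(c)) {
                uppercase++;
            } else if (Character.isLowerCase(c)) {
                lowercase++;
            } else {
                invalid++; // assumption: anything else counts as invalid
            }
        }
        return new long[] {uppercase, lowercase, digits, invalid};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(summarize("Hello World 2024!")));
        // prints [2, 8, 4, 3]
    }
}
```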
Besides the solution above, there is another option: writing your own custom stream collector. This custom collector can compute the number of uppercase chars, lowercase chars, digits and invalid chars in the given string in a single pass through the data. It can also run in parallel with the Streams API.
The custom stream collector shown here uses the chars() method of the String class, which returns an IntStream. The IntStream interface has a collect() method that performs a mutable reduction on the elements and returns the result in a container object.
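To illustrate the shape of IntStream.collect() before moving on to the custom container, here is a small sketch that uses a StringBuilder as the container. The three arguments are a supplier that creates the container, an accumulator that adds one element to it, and a combiner that merges two containers. The class name IntStreamCollectExample and the method digitsOf are illustrative only:

```java
class IntStreamCollectExample {

    // Keeps only the digits of the text, collected with a mutable reduction.
    static String digitsOf(String text) {
        StringBuilder digits = text.chars()
                .filter(Character::isDigit)
                .collect(
                        StringBuilder::new,                       // supplier: new empty container
                        (builder, c) -> builder.append((char) c), // accumulator: add one char
                        (left, right) -> left.append(right));     // combiner: merge two containers
        return digits.toString();
    }

    public static void main(String[] args) {
        System.out.println(digitsOf("a1b2c3")); // prints 123
    }
}
```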
The next example shows the container class code. It receives each char of the String in the accept() method and categorizes it as a digit, an uppercase char, a lowercase char or an invalid char.
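A sketch of what such a container class could look like. The class name and the accept() and combine() methods come from the text; the field names, the getters, the static of() helper and the rule that "invalid" means anything that is not a digit, uppercase or lowercase char are assumptions made for illustration:

```java
class CharSummaryStatistics {

    private long digits;
    private long uppercase;
    private long lowercase;
    private long invalid;

    // Accumulator: categorizes one char (IntStream passes it as an int).
    public void accept(int c) {
        if (Character.isDigit(c)) {
            digits++;
        } else if (Character.isUpperCase(c)) {
            uppercase++;
        } else if (Character.isLowerCase(c)) {
            lowercase++;
        } else {
            invalid++; // assumption: anything else counts as invalid
        }
    }

    // Combiner: folds another partial result into this one.
    public void combine(CharSummaryStatistics other) {
        digits += other.digits;
        uppercase += other.uppercase;
        lowercase += other.lowercase;
        invalid += other.invalid;
    }

    // Hypothetical convenience helper: feeds every char of the text to accept().
    public static CharSummaryStatistics of(String text) {
        CharSummaryStatistics stats = new CharSummaryStatistics();
        text.chars().forEach(stats::accept);
        return stats;
    }

    public long getDigits() { return digits; }
    public long getUppercase() { return uppercase; }
    public long getLowercase() { return lowercase; }
    public long getInvalid() { return invalid; }

    public static void main(String[] args) {
        CharSummaryStatistics stats = of("Hello World 2024!");
        System.out.println(stats.getUppercase()); // prints 2
        System.out.println(stats.getDigits());    // prints 4
    }
}
```

The design mirrors the JDK's own IntSummaryStatistics, which also exposes accept() and combine() so that it can serve as the container of a mutable reduction.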
The complete implementation of the collect() call, which produces the result of the reduction in a CharSummaryStatistics instance, is shown below. As each new char arrives, it is categorized in the CharSummaryStatistics.accept() method. The CharSummaryStatistics.combine() method is used to merge partial results when the stream runs in parallel.
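A self-contained sketch of the collect() call. To keep the block runnable on its own, it nests a compact version of the container class; the wrapper class CharSummaryCollectorDemo, the summarize() method and its parallel flag are assumptions made for illustration:

```java
class CharSummaryCollectorDemo {

    // Compact container: one counter per category.
    static final class CharSummaryStatistics {
        long digits;
        long uppercase;
        long lowercase;
        long invalid;

        void accept(int c) {
            if (Character.isDigit(c)) digits++;
            else if (Character.isUpperCase(c)) uppercase++;
            else if (Character.isLowerCase(c)) lowercase++;
            else invalid++; // assumption: anything else counts as invalid
        }

        void combine(CharSummaryStatistics other) {
            digits += other.digits;
            uppercase += other.uppercase;
            lowercase += other.lowercase;
            invalid += other.invalid;
        }
    }

    // Single pass over the chars; optionally runs as a parallel stream.
    static CharSummaryStatistics summarize(String text, boolean parallel) {
        var chars = parallel ? text.chars().parallel() : text.chars();
        return chars.collect(
                CharSummaryStatistics::new,      // supplier: one container per partial result
                CharSummaryStatistics::accept,   // accumulator: categorize each char
                CharSummaryStatistics::combine); // combiner: merge partial results
    }

    public static void main(String[] args) {
        CharSummaryStatistics stats = summarize("Hello World 2024!", true);
        System.out.printf("upper=%d lower=%d digits=%d invalid=%d%n",
                stats.uppercase, stats.lowercase, stats.digits, stats.invalid);
        // prints upper=2 lower=8 digits=4 invalid=3
    }
}
```

Because each partial result gets its own container from the supplier and the counters are only merged in combine(), no synchronization is needed in accept(), even in parallel mode.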
Have you ever written a custom Java stream collector? Please drop your comments here.
The complete source code can be found on GitHub.
As a reference for writing this post, I used an article that is part of a series about Streams; it is a good resource for learning how to aggregate and collect with Streams. All the articles from this series about Streams can be found here.