Friday, March 21, 2014

Stream and Lambda examples in Java 8

In this post, we'll see how the Stream concept works in Java 8, its characteristics, and how pipeline operations work in a Stream, with examples. This post will help you understand the basics of writing a Stream, its usefulness and how it works.


What is a Stream in Java
"Streams are wrappers around collections that support many convenient and high-performance operations expressed compactly but clearly with lambdas."
Streams are not collections: they do not manage their own data. Instead, they are wrappers around existing data structures. When you make or transform a Stream, it does not copy the underlying data. Instead, it just builds a pipeline of operations. How many times that pipeline will be invoked depends on what you later do with the stream.

Characteristics of Streams

  • No storage : A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
  • Designed for lambdas : Most Stream operations take lambdas as arguments.
  • Functional in nature : An operation on a stream produces a result, but does not modify its source. For example, filtering a Stream obtained from a collection produces a new Stream without the filtered-out elements, rather than removing elements from the source collection.
  • Possibly unbounded : While collections have a finite size, streams need not. Short-circuiting operations such as limit(n) or findFirst() can allow computations on infinite streams to complete in finite time.
  • Consumable : The elements of a stream are only visited once during the life of a stream. Like an Iterator, a new stream must be generated to revisit the same elements of the source.
  • Do not support indexed access : You can ask for the first element, but not the second or third or last element. But, see next bullet.
  • Laziness-seeking : Many Stream operations are postponed until it is known how much data is eventually needed.
  • Parallelizable : If you designate a Stream as parallel, then operations on it will automatically be done concurrently, without having to write explicit multi-threading code.
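
A few of these characteristics can be seen in a short sketch. The class and method names (StreamTraits, firstSquares, shortNames, secondUseFails) are made up for illustration; only the java.util.stream calls come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamTraits {

    // Possibly unbounded: Stream.iterate is infinite, limit(n) makes the result finite.
    static List<Integer> firstSquares(int n) {
        return Stream.iterate(1, i -> i + 1)
                     .map(i -> i * i)
                     .limit(n)
                     .collect(Collectors.toList());
    }

    // Functional in nature: filtering builds a new result, the source list is untouched.
    static List<String> shortNames(List<String> names) {
        return names.stream()
                    .filter(s -> s.length() <= 3)
                    .collect(Collectors.toList());
    }

    // Consumable: a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> s = Stream.of("a", "b");
        s.count();
        try {
            s.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Robert", "Joe");
        System.out.println(firstSquares(5));   // [1, 4, 9, 16, 25]
        System.out.println(shortNames(names)); // [Ann, Joe]
        System.out.println(names);             // [Ann, Robert, Joe]
        System.out.println(secondUseFails());  // true
    }
}
```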


How Stream operations and pipelines work

Stream operations are divided into two kinds:

  1. Intermediate operations
  2. Terminal operations

These are combined to form stream pipelines.


A stream pipeline consists of 3 parts:

  1. A source (such as a Collection, an array, a generator function, or an I/O channel).
  2. Followed by zero or more intermediate operations such as Stream.filter or Stream.map.
  3. A terminal operation such as Stream.forEach or Stream.reduce.
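
The three parts can be pointed out in a small pipeline. This is only a sketch; the class name PipelineParts and the method sumOfEvenSquares are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

class PipelineParts {

    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()                 // 1. source: a Collection
                      .filter(n -> n % 2 == 0)  // 2. intermediate operation
                      .map(n -> n * n)          // 2. intermediate operation
                      .reduce(0, Integer::sum); // 3. terminal operation
    }

    public static void main(String[] args) {
        // 2*2 + 4*4 + 6*6 = 56
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```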

Intermediate operations
Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate.
Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
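
To make the laziness visible, the following sketch records when the filter lambda actually runs. The class LazyFilter and its log list are assumptions for the demonstration; writing to a list from inside a lambda is done here only to observe the order of events and is not a recommended style.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilter {

    // Returns a log of what happened, in order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Stream<String> pending = Stream.of("a", "bb", "ccc")
                .filter(s -> {
                    log.add("filter " + s);  // side effect only to observe timing
                    return s.length() > 1;
                });
        log.add("pipeline built");           // nothing has been filtered yet
        List<String> result = pending.collect(Collectors.toList()); // traversal starts here
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        // [pipeline built, filter a, filter bb, filter ccc, result [bb, ccc]]
        System.out.println(run());
    }
}
```
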
Intermediate operations are further divided into two kinds:

  1. Stateless operations
  2. Stateful operations

Stateless operations, such as filter and map, retain no state from previously seen elements when processing a new element -- each element can be processed independently of operations on other elements.

Stateful operations, such as distinct and sorted, may incorporate state from previously seen elements when processing new elements. Stateful operations may need to process the entire input before producing a result. For example, one cannot produce any results from sorting a stream until one has seen all elements of the stream.

As a result, under parallel computation, some pipelines containing stateful intermediate operations may require multiple passes on the data or may need to buffer significant data. Pipelines containing exclusively stateless intermediate operations can be processed in a single pass, whether sequential or parallel, with minimal data buffering.
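
The difference shows up in the order of events on a sequential stream. In this sketch (the class StatefulTrace and the "in"/"out" labels are hypothetical), elements flow one by one through a stateless pipeline, while sorted() has to see every element before it can pass the first one on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulTrace {

    static List<String> trace(boolean withSort) {
        List<String> log = new ArrayList<>();
        Stream<Integer> s = Stream.of(3, 1, 2)
                                  .peek(n -> log.add("in " + n));
        if (withSort) {
            s = s.sorted();                   // stateful: buffers all elements first
        }
        s.forEach(n -> log.add("out " + n));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // [in 3, out 3, in 1, out 1, in 2, out 2]
        System.out.println(trace(true));  // [in 3, in 1, in 2, out 1, out 2, out 3]
    }
}
```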

Intermediate methods

  • map (and related mapToInt, flatMap, etc.)
  • filter
  • distinct
  • sorted
  • peek
  • limit
  • skip
  • parallel
  • sequential
  • unordered

Terminal operations
Terminal operations, such as Stream.forEach or IntStream.sum, may traverse the stream to produce a result or a side-effect. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used; if you need to traverse the same data source again, you must return to the data source to get a new stream.
In almost all cases, terminal operations are eager, completing their traversal of the data source and processing of the pipeline before returning. Only the terminal operations iterator() and spliterator() are not.

Processing streams lazily allows for significant efficiencies. It allows avoiding examining all the data when it is not necessary; for operations such as "find the first string longer than 1000 characters", it is only necessary to examine just enough strings to find one that has the desired characteristics without examining all of the strings available from the source. 
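
A rough way to observe this is to count how many elements are examined before findFirst() returns. The class LazySearch and the counter are assumptions made for this sketch, and the count shown applies to a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class LazySearch {

    // Counts the elements looked at before the first word longer than minLength is found.
    static int examinedBeforeMatch(List<String> words, int minLength) {
        AtomicInteger examined = new AtomicInteger();
        words.stream()
             .peek(w -> examined.incrementAndGet())
             .filter(w -> w.length() > minLength)
             .findFirst();
        return examined.get();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cccc", "dd", "eeeee");
        // "cccc" is the first match, so "dd" and "eeeee" are never examined.
        System.out.println(examinedBeforeMatch(words, 3)); // 3
    }
}
```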

Further, some operations are deemed short-circuiting operations.

  • An intermediate operation is short-circuiting if, when presented with infinite input, it may produce a finite stream as a result.
  • A terminal operation is short-circuiting if, when presented with infinite input, it may terminate in finite time.
Having a short-circuiting operation in the pipeline is a necessary, but not sufficient, condition for the processing of an infinite stream to terminate normally in finite time.
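
As an illustration, here are two short-circuiting terminal operations applied to infinite streams. The class ShortCircuit and its method names are made up for the sketch.

```java
import java.util.stream.Stream;

class ShortCircuit {

    // anyMatch stops as soon as one element satisfies the predicate.
    static boolean containsMultipleOf(int k) {
        return Stream.iterate(1, i -> i + 1)
                     .anyMatch(i -> i % k == 0);
    }

    // findFirst stops at the first element that survives the filter.
    static int firstPowerOfTwoAbove(int threshold) {
        return Stream.iterate(1, i -> i * 2)
                     .filter(i -> i > threshold)
                     .findFirst()
                     .get();
    }

    public static void main(String[] args) {
        System.out.println(containsMultipleOf(7));     // true
        System.out.println(firstPowerOfTwoAbove(100)); // 128
        // Necessary but not sufficient: Stream.iterate(1, i -> i + 1).anyMatch(i -> false)
        // contains a short-circuiting operation, yet it would never return.
    }
}
```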

Terminal methods

  • forEach
  • forEachOrdered
  • toArray
  • reduce
  • collect
  • min
  • max
  • count
  • anyMatch
  • allMatch
  • noneMatch
  • findFirst
  • findAny
  • iterator

Short-circuit methods

  • anyMatch
  • allMatch
  • noneMatch
  • findFirst
  • findAny
  • limit


How a Stream can be obtained
Streams can be obtained in a number of ways:
  1. From a Collection via the stream() and parallelStream() methods.
  2. From an array via Arrays.stream(Object[]).
  3. From static factory methods on the stream classes, such as Stream.of(Object[]), IntStream.range(int, int) or Stream.iterate(Object, UnaryOperator).
  4. The lines of a file can be obtained from BufferedReader.lines().
  5. Streams of file paths can be obtained from methods in Files.
  6. Streams of random numbers can be obtained from Random.ints().
  7. Numerous other stream-bearing methods in the JDK, including BitSet.stream(), Pattern.splitAsStream(java.lang.CharSequence), and JarFile.stream().
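
Several of these sources are sketched here side by side. The class StreamSources and the sample data are invented for the example; the factory methods themselves are the JDK ones listed above.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamSources {

    static long fromCollection() {
        return Arrays.asList("a", "b", "c").stream().count();
    }

    static int fromArray() {
        return Arrays.stream(new int[] {1, 2, 3}).sum();
    }

    static int fromRange() {
        return IntStream.range(0, 5).sum();           // 0 + 1 + 2 + 3 + 4
    }

    static List<Integer> fromIterate() {
        return Stream.iterate(1, i -> i * 2).limit(4).collect(Collectors.toList());
    }

    static long fromLines() {
        // A StringReader stands in for a real file here.
        return new BufferedReader(new StringReader("one\ntwo")).lines().count();
    }

    static List<String> fromPattern() {
        return Pattern.compile(",").splitAsStream("x,y,z").collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromCollection()); // 3
        System.out.println(fromArray());      // 6
        System.out.println(fromRange());      // 10
        System.out.println(fromIterate());    // [1, 2, 4, 8]
        System.out.println(fromLines());      // 2
        System.out.println(fromPattern());    // [x, y, z]
    }
}
```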

Here we look at the most basic one first, to understand Streams: from a Collection.

Collection is an interface that declares methods such as stream() and parallelStream(), which are available on all collections such as ArrayList, LinkedList, etc.
Stream is an interface with many useful methods. Before moving to our first example, let's understand a few methods of the Stream interface.








Here, Predicate<T> (the parameter type of methods such as filter) is a functional interface with a single abstract method:
boolean test(T t)
So wherever a Predicate is expected, we can pass a lambda expression that takes one input and returns a boolean result. If you are not familiar with lambda expressions, check this post.








Here, Consumer<T> (the parameter type of methods such as forEach) is a functional interface with a single abstract method:
void accept(T t)
So wherever a Consumer is expected, we can pass a lambda expression that takes one input and returns nothing. If you are not familiar with lambda expressions, check this post.

Examples to explain the above concept.
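
A minimal sketch of filter with a Predicate and forEach with a Consumer could look like this. The class FilterAndForEach, the method startingWith and the sample names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class FilterAndForEach {

    static List<String> startingWith(List<String> names, String prefix) {
        // Predicate: takes a String, returns a boolean.
        Predicate<String> matches = name -> name.startsWith(prefix);
        List<String> out = new ArrayList<>();
        // Consumer: takes a String, returns nothing (the boolean from add is ignored).
        Consumer<String> collect = out::add;
        names.stream().filter(matches).forEach(collect);
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Anna");
        System.out.println(startingWith(names, "A")); // [Alice, Anna]

        // The same idea with the lambdas written inline; prints Alice and Anna.
        names.stream()
             .filter(n -> n.length() > 3)
             .forEach(n -> System.out.println(n));
    }
}
```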









User defined Stream examples:
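
Streams work the same way on collections of your own classes. The Student class in this sketch is hypothetical (a name and an age, nothing more), as are StudentFilterDemo, olderThan and the sample data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StudentFilterDemo {

    // Hypothetical user-defined type used only for this example.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }

        @Override
        public String toString() { return name + "(" + age + ")"; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("Ann", 19), new Student("Bob", 22), new Student("Dan", 25));
    }

    // Keeps only the students older than minAge; the source list is not changed.
    static List<Student> olderThan(List<Student> students, int minAge) {
        return students.stream()
                       .filter(s -> s.getAge() > minAge)
                       .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(olderThan(sample(), 20)); // [Bob(22), Dan(25)]
    }
}
```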


















map() in Stream
Stream operations include map(), which applies a function across each element present within a Stream to produce a result out of each element. So, for example, we can obtain the age of each Student in the collection by applying a simple function to retrieve the age out of each Student.
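
Retrieving the age of each Student could be sketched as follows. The nested Student class, MapDemo and the sample data are assumptions, not the original example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapDemo {

    // Hypothetical Student type with just the fields the example needs.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("Ann", 19), new Student("Bob", 22), new Student("Dan", 25));
    }

    // map turns each Student into its age.
    static List<Integer> ages(List<Student> students) {
        return students.stream()
                       .map(Student::getAge)
                       .collect(Collectors.toList());
    }

    // map can also produce a different type, here an upper-case name.
    static List<String> upperCaseNames(List<Student> students) {
        return students.stream()
                       .map(s -> s.getName().toUpperCase())
                       .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(ages(sample()));           // [19, 22, 25]
        System.out.println(upperCaseNames(sample())); // [ANN, BOB, DAN]
    }
}
```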











ToIntFunction, ToDoubleFunction and ToLongFunction are functional interfaces that accept one argument and return an int, double or long result, respectively.
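
These interfaces are what mapToInt, mapToDouble and mapToLong expect. A small sketch, with the class PrimitiveMapping and the word list invented for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.function.ToIntFunction;
import java.util.function.ToLongFunction;

class PrimitiveMapping {

    static int totalLength(List<String> words) {
        ToIntFunction<String> length = String::length;       // String -> int
        return words.stream().mapToInt(length).sum();
    }

    static double totalHalfLength(List<String> words) {
        ToDoubleFunction<String> half = w -> w.length() / 2.0; // String -> double
        return words.stream().mapToDouble(half).sum();
    }

    static long longestLength(List<String> words) {
        ToLongFunction<String> length = w -> (long) w.length(); // String -> long
        return words.stream().mapToLong(length).max().getAsLong();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("java", "stream", "api");
        System.out.println(totalLength(words));     // 13
        System.out.println(totalHalfLength(words)); // 6.5
        System.out.println(longestLength(words));   // 6
    }
}
```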


Making Streams from Primitives

  • Stream.of(val1, val2, ...)
  • Stream.of(someArray)

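You always have to be cautious when making Streams from primitives. Stream.of(someArray) behaves as expected for an array of objects, but given a primitive array such as int[] it produces a Stream with a single element, the array itself.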
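
The pitfall is easiest to see by counting elements. In this sketch (the class PrimitiveStreams and its method names are made up), Stream.of applied to an int[] yields one element, while IntStream.of and Arrays.stream treat each int as an element.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PrimitiveStreams {

    // Stream.of(int[]) is a Stream<int[]> holding the array itself.
    static long countViaStreamOf(int[] values) {
        return Stream.of(values).count();
    }

    // IntStream.of(int...) streams the individual ints.
    static long countViaIntStream(int[] values) {
        return IntStream.of(values).count();
    }

    // Arrays.stream(int[]) also gives an IntStream.
    static int sumViaArrays(int[] values) {
        return Arrays.stream(values).sum();
    }

    // With an array of objects, Stream.of behaves as expected.
    static long countBoxed(Integer[] values) {
        return Stream.of(values).count();
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 3};
        System.out.println(countViaStreamOf(nums));              // 1
        System.out.println(countViaIntStream(nums));             // 3
        System.out.println(sumViaArrays(nums));                  // 6
        System.out.println(countBoxed(new Integer[] {1, 2, 3})); // 3
    }
}
```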










How it works
Suppose we sum a list of integers such as 1, 2, 3, 4, and 5 with reduce, starting from the seed 0. The seed 0 is added to 1, and the result (1) is stored as the accumulated value, which then serves as the left-hand operand when the next number in the stream is added (1+2). The result (3) is stored as the accumulated value and used in the next addition (3+3). The result (6) is stored and used in the next addition (6+4), and that result (10) is used in the final addition (10+5), yielding the final result 15.
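
The walk-through above can be reproduced with reduce(identity, accumulator). This is a sketch under the assumption of a sequential stream; the class ReduceTrace and the steps helper are invented so the intermediate additions can be printed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReduceTrace {

    // Sum with seed 0: ((((0+1)+2)+3)+4)+5
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, (acc, n) -> acc + n);
    }

    // Same reduction, but recording every addition (sequential stream only).
    static List<String> steps(List<Integer> numbers) {
        List<String> steps = new ArrayList<>();
        numbers.stream().reduce(0, (acc, n) -> {
            steps.add(acc + "+" + n + "=" + (acc + n));
            return acc + n;
        });
        return steps;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sum(numbers));   // 15
        System.out.println(steps(numbers)); // [0+1=1, 1+2=3, 3+3=6, 6+4=10, 10+5=15]
    }
}
```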





Parallelism
All stream operations can execute either in serial or in parallel. The stream implementations in the JDK create serial streams unless parallelism is explicitly requested.

For example, Collection has methods Collection.stream() and Collection.parallelStream(), which produce sequential and parallel streams respectively; other stream-bearing methods such as IntStream.range(int, int) produce sequential streams but these streams can be efficiently parallelized by invoking their BaseStream.parallel() method.
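
A minimal serial and parallel pair might look like this. The class SerialVsParallel and its methods are assumptions for illustration; summing ints is used because the result is the same in either mode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class SerialVsParallel {

    static int serialSum(List<Integer> numbers) {
        return numbers.stream()             // sequential stream
                      .mapToInt(Integer::intValue)
                      .sum();
    }

    static int parallelSum(List<Integer> numbers) {
        return numbers.parallelStream()     // parallel stream
                      .mapToInt(Integer::intValue)
                      .sum();
    }

    // A sequential source switched to parallel with BaseStream.parallel().
    static int parallelRangeSum() {
        return IntStream.range(1, 101).parallel().sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(serialSum(numbers));   // 15
        System.out.println(parallelSum(numbers)); // 15
        System.out.println(parallelRangeSum());   // 5050
    }
}
```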



The only difference between the serial and parallel versions of this example is the creation of the initial stream, using "parallelStream()" instead of "stream()".
When the terminal operation is initiated, the stream pipeline is executed sequentially or in parallel depending on the mode of the stream on which it is invoked. Whether a stream will execute in serial or parallel can be determined with the isParallel() method, and the mode of a stream can be modified with the BaseStream.sequential() and BaseStream.parallel() operations.
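
The mode can be inspected before any terminal operation runs. In this sketch (the class StreamMode is made up, and an ordinary list from Arrays.asList is assumed), the last call to parallel() or sequential() decides the mode of the whole pipeline.

```java
import java.util.Arrays;
import java.util.List;

class StreamMode {

    static List<Boolean> modes(List<Integer> list) {
        return Arrays.asList(
                list.stream().isParallel(),                         // false
                list.parallelStream().isParallel(),                 // true
                list.stream().parallel().isParallel(),              // true
                list.stream().parallel().sequential().isParallel()  // false: last call wins
        );
    }

    public static void main(String[] args) {
        System.out.println(modes(Arrays.asList(1, 2, 3))); // [false, true, true, false]
    }
}
```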

Most stream operations accept parameters that describe user-specified behavior. To preserve correct behavior, these behavioral parameters must be non-interfering, and in most cases must be stateless. Such parameters are always instances of a functional interface such as Function, and are often lambda expressions or method references.
Non-interfering means that, for most data sources, the data source is not modified at all during the execution of the stream pipeline.
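
A short sketch of what this means in practice, with the class NonInterference and its method names invented for the example. With an ArrayList, changing the list before the terminal operation starts is still fine, because execution only begins at the terminal operation; changing it from inside the pipeline would be interference, so the second method builds a new list instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonInterference {

    // The source is modified before the terminal operation begins, which is allowed.
    static String modifiedBeforeTerminal() {
        List<String> list = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> stream = list.stream();
        list.add("three");
        return stream.collect(Collectors.joining(" "));
    }

    // Instead of removing short words from 'source' inside the pipeline,
    // leave the source alone and collect the survivors into a new list.
    static List<String> withoutShortWords(List<String> source) {
        return source.stream()
                     .filter(s -> s.length() > 3)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(modifiedBeforeTerminal()); // one two three
        System.out.println(withoutShortWords(Arrays.asList("one", "two", "three"))); // [three]
    }
}
```
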

Related Posts
Basic of functional interface and Lambda in Java 8
Lambda and effectively final with examples in Java 8
How to use :: method reference in Java 8
Metaspace in Java 8


If you know anyone who has started learning Java, why not help them out! Just share this post with them. 
Thanks for studying today!...
