Yusong's Blog Don't Panic

Functional Parallelism and Java Stream API


Coursera - Parallel Programming in Java, Week 2

Functional Parallelism

Future Tasks

  • Future tasks are tasks that return a value.
  • A future (promise) object is a “handle” for accessing a task’s return value.
  • A pure function always returns the same output for the same input and has no side effects, so the order in which calls are evaluated doesn’t matter. That makes pure functions a good fit for parallel computing.
  • Assignment: a variable A can be assigned a reference to a future object returned by a task of the form future { ⟨ task-with-return-value ⟩ } (using pseudocode notation). The content of the future object is constrained to be single assignment (similar to a final variable in Java) and cannot be modified after the future task has returned.
  • Blocking read: the operation A.get() waits until the task associated with future object A has completed, and then propagates the task’s return value as the value returned by A.get(). Any statement S executed after A.get() can be assured that the task associated with A completed before S starts execution.
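The assignment and blocking-read operations above can be sketched in plain Java. This is a minimal illustration, not the course's own code: it assumes CompletableFuture as the future object, and the class and method names (FutureSketch, square, sumOfSquares) are made up for the example. It uses join(), which blocks like get() but throws unchecked exceptions.

```java
import java.util.concurrent.CompletableFuture;

class FutureSketch {
    // A pure function: the result depends only on the argument.
    static long square(long x) {
        return x * x;
    }

    static long sumOfSquares(long a, long b) {
        // Assignment: each variable receives a handle to a task with a return value.
        CompletableFuture<Long> fa = CompletableFuture.supplyAsync(() -> square(a));
        CompletableFuture<Long> fb = CompletableFuture.supplyAsync(() -> square(b));
        // Blocking read: join() waits for the task to finish, then returns its value.
        // Because square is pure, it does not matter which task runs first.
        return fa.join() + fb.join();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3, 4));
    }
}
```

Any statement placed after fa.join() can rely on the first task having completed, which is the guarantee described for A.get().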

Creating Future Tasks in Java’s Fork/Join Framework

Some key differences between future tasks and regular tasks in the FJ framework are as follows:

  1. A future task extends the RecursiveTask class in the FJ framework, instead of RecursiveAction as in regular tasks.

  2. The compute() method of a future task must have a non-void return type, whereas it has a void return type for regular tasks.

  3. A method call like left.join() waits for the task referred to by object left in both cases, but also provides the task’s return value in the case of future tasks.
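The three differences can be seen in a small array-sum task. This is a sketch under assumptions: the class name ArraySumTask, the helper sum, and the THRESHOLD cutoff are chosen for illustration and do not come from the course material.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Difference 1: a future task extends RecursiveTask<V>, not RecursiveAction.
class ArraySumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cutoff for sequential work
    private final int[] data;
    private final int lo; // inclusive
    private final int hi; // exclusive

    ArraySumTask(int[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    // Difference 2: compute() has a non-void return type.
    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = lo + (hi - lo) / 2;
        ArraySumTask left = new ArraySumTask(data, lo, mid);
        ArraySumTask right = new ArraySumTask(data, mid, hi);
        left.fork();                     // run the left half asynchronously
        long rightSum = right.compute(); // compute the right half in this thread
        // Difference 3: join() waits for left and also returns its value.
        return left.join() + rightSum;
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new ArraySumTask(data, 0, data.length));
    }
}
```

Calling ArraySumTask.sum(array) submits the root task to the common pool and blocks until the total is available.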

Memoization in parallelization

The memoization pattern lends itself easily to parallelization using futures by modifying the memoized data structure to store {(x1, y1 = future(f(x1))), (x2, y2 = future(f(x2))), ...}. The lookup operation can then be replaced by a get() operation on the future value, if a future has already been created for the result of a given input.
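One way to sketch this pattern in Java is a concurrent map from inputs to futures. Everything named here is an assumption for illustration: the class FutureMemo, the use of factorial as the expensive pure function f, and the evaluation counter, which exists only to show that f runs once per input.

```java
import java.math.BigInteger;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class FutureMemo {
    // The memo table stores (x, future(f(x))) pairs instead of (x, f(x)) pairs.
    private final ConcurrentHashMap<Integer, CompletableFuture<BigInteger>> cache =
            new ConcurrentHashMap<>();
    private final AtomicInteger evaluations = new AtomicInteger();

    // Hypothetical expensive pure function f.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Returns the existing future for x, or creates and stores one if absent.
    CompletableFuture<BigInteger> lookup(int x) {
        return cache.computeIfAbsent(x, k -> CompletableFuture.supplyAsync(() -> {
            evaluations.incrementAndGet();
            return factorial(k);
        }));
    }

    // The lookup becomes a blocking read on the stored future.
    BigInteger get(int x) {
        return lookup(x).join();
    }

    int evaluations() {
        return evaluations.get();
    }
}
```

A caller that asks for an input whose future already exists simply waits on that future, so concurrent requests for the same input share a single evaluation of f.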

Java Streams

To compute the average age of all active students using Java streams:

students.stream()
    .filter(s -> s.getStatus() == Student.ACTIVE)
    // mapToInt yields an IntStream, which is what provides average()
    .mapToInt(a -> a.getAge())
    .average();

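An important benefit of using Java streams when possible is that the pipeline can be made to execute in parallel by designating the source to be a parallel stream, i.e., by simply replacing students.stream() in the above code by students.parallelStream() or students.stream().parallel(). (Stream.of(students).parallel() has the same effect only when students is an array; for a collection, Stream.of would produce a stream whose single element is the collection itself.)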
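A self-contained version of the average-age pipeline, run in parallel, might look like the following. The Student class is a stand-in written for this sketch (the notes do not show its definition), and the int status constants and the AverageAge class are assumptions.

```java
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical minimal Student type, only as much as the pipeline needs.
class Student {
    static final int INACTIVE = 0;
    static final int ACTIVE = 1;

    private final int status;
    private final int age;

    Student(int status, int age) {
        this.status = status;
        this.age = age;
    }

    int getStatus() {
        return status;
    }

    int getAge() {
        return age;
    }
}

class AverageAge {
    // Returns an empty OptionalDouble when there are no active students.
    static OptionalDouble averageActiveAge(List<Student> students) {
        return students.parallelStream()
                .filter(s -> s.getStatus() == Student.ACTIVE)
                .mapToInt(Student::getAge)
                .average();
    }
}
```

The result is an OptionalDouble rather than a double because the average of an empty stream is undefined, so callers should check isPresent() or use orElse.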

Optional Reading:

  1. Article on “Processing Data with Java SE 8 Streams”

  2. Tutorial on specifying Aggregate Operations using Java streams

  3. Documentation on java.util.stream.Collectors class for performing reductions on streams

Determinism and Data Races

A parallel program is said to be functionally deterministic if it always computes the same answer when given the same input, and structurally deterministic if it always computes the same computation graph, when given the same input.

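Data races often lead to functional and/or structural nondeterminism, because a program with a data race may behave differently on the same input depending on how the conflicting memory accesses happen to be scheduled.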
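As a rough illustration of how a data race can break functional determinism, compare two ways of counting the elements of a parallel stream. The class and method names are invented for this sketch. The racy version may return a different value from run to run, so no particular output is claimed for it.

```java
import java.util.stream.IntStream;

class DeterminismSketch {
    // Racy: worker threads perform unsynchronized read-modify-write updates on
    // the same array cell, so increments can be lost and the result may vary.
    static int racyCount(int n) {
        int[] counter = {0};
        IntStream.range(0, n).parallel().forEach(i -> counter[0]++); // data race
        return counter[0];
    }

    // Race-free: no shared mutable state; the stream combines partial sums,
    // so the same input always yields the same answer.
    static int raceFreeCount(int n) {
        return IntStream.range(0, n).parallel().map(i -> 1).sum();
    }
}
```

The race-free version expresses the count as a reduction, which lets the stream library combine per-thread partial results without any shared variable.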

