Breadth-First Search with Java 8 Stream API

I recently had the problem to walk through some data and collect elements on the go. I thought it would be nice to use the (relatively) new Stream API in Java 8. After a short search in “the internet” I found a solution in this blog post for tree structures and in this one for recursing through a file system (which is a tree structure again). So, story told? No way! I needed to walk through graph structures (i.e., there might be cycles in it). In contrary, the solutions above will run forever, or more precisely they will terminate relatively fast with a StackOverflowException.

A solution is e.g. a breadth-first search (BFS). In essence, a BFS explores the search space level by level. On the go it stores the so called open list. open contains the nodes (on the next level) not explored, yet. Already processed nodes won’t be put into open again. This enables searching through structures with cycles.

Consider the following as an iterative solution to BFS using the Stream API:

class Node {
	Node(String name) {
		this.name = name;
	}

	String name;
	private List<Node> childs = new LinkedList<>();

	List<Node> getChilds() {
		return childs;
	}
}

class IterativeStreamBFS {
	public static void main(String args[]) {
		final Node first = new Node("first");
		Node second = new Node("second");

		first.getChilds().add(second);
		second.getChilds().add(first);

		final Set<Node> nodes = new LinkedHashSet<Node>();
		nodes.add(first);

		Set<Node> openSet = new HashSet<>(nodes);

		do {
			final Stream<Node> childNodes = openSet.stream()
					.map(Node::getChilds).flatMap(List::stream);
			openSet = childNodes.filter(p -> !nodes.contains(p))
					.collect(Collectors.toSet());
			nodes.addAll(openSet);
		}
		while(!openSet.isEmpty());
	}
}

This solution works, but has some downsides. First, it is not recursive, but iterative. This is not a bad thing for itself, but it is not very functional, something I wanted upfront when I decided to use the Stream API. Second, it is inefficient as it isn’t lazy and creates new openSet sets in each iteration. Third, (which is related to first) it puts the burden onto the programmer to handle all the moving parts (i.e. handling of the openSet and the implicit return value processes). For sure we could hide a lot of this in corresponding methods, but then there is a risk of becoming inflexible.

Looking back at the recursive solutions for trees mentioned before, I came up with this one:

class Node {

	//...

	Stream<Node> flattened() {
		final Set<Node> closedSet = Collections.synchronizedSet(new HashSet<>(1000));
		return flattened(closedSet).parallel();
	}

	private Stream<Node> flattened(Set<Node> closedSet) {
		if (!closedSet.add(this)) return Stream.empty();
		else return Stream.concat(Stream.of(this),
				childs.stream().flatMap(n -> n.flattened(closedSet)));
	}
}

class LazyRecursiveStreamBFS {
	public static void main(String args[]) {
		// ...
		first.flattened().collect(Collectors.toList());
	}
}

So, this solution is lazy and it does not produce intermediate open sets (although it creates a whole bunch of streams on the go). Further on, you can parallelize it and do further stream operations on it. You can e.g. use limit(), which reduces the overall cost significantly, as the exploration is done lazily. Last but not least, its usage is very easy.

Some words to the implementation: the initialization of the HashSet with an initial capacity of 1000 is for performance reasons (so that the set must not be resized so often, for bigger structures). I use a synchronizedSet, so that the execution can be parallelized. I use only atomic operations on the set, so that I do not need any further synchronization (see line 11).

Of course this solution can be enhanced to search through more complex structures, where not every explored object should be put into the stream. But be careful. If you search through objects that you don’t care and they may have cyclic structures, you have to do extra book-keeping for them, or you will end in endless loops (or stack overflows) again.

Exciting.

Leave a Reply

Your email address will not be published. Required fields are marked *