days
-4
-5
hours
0
-2
minutes
-4
0
seconds
-2
-3
search
Potential Bug?

Beware Of findFirst() And findAny()

Nicolai Parlog

Wooden Blocks with the text: Beware image via Shutterstock

After filtering a Java 8 Stream it is common to use findFirst() or findAny() to get the element that survived the filter. But that might not do what you really meant and subtle bugs can ensue.

So What’s Wrong With findFirst() And findAny()?

As we can see from their Javadoc (here and here) both methods return an arbitrary element from the stream – unless the stream has an encounter order, in which case findFirst() returns the first element. Easy.

A simple example looks like this:

public Optional<Customer> findCustomer(String customerId) {
	return customers.stream()
			.filter(customer -> customer.getId().equals(customerId))
			.findFirst();
}

Of course this is just the fancy version of the good old for-each-loop:

public Optional<Customer> findCustomer(String customerId) {
	for (Customer customer : customers)
		if (customer.getId().equals(customerId))
			return Optional.of(customer);
	return Optional.empty();
}

But both variants contain the same potential bug: they are built on the implicit assumption that there can only be one customer with any given ID.

Now, this might be a very reasonable assumption. Maybe this is a known invariant, guarded by dedicated parts of the system, relied upon by others. In that case this is totally fine.

Often the code relies on a unique matching element but does nothing to assert this.

But in many cases I see out in the wild, it is not. Maybe the customers were just loaded from an external source that makes no guarantees about the uniqueness of their IDs. Maybe an existing bug allowed two books with the same ISBN. Maybe the search term allows surprisingly many unforeseen matches (did anyone say regular expressions?).

Often the code’s correctness relies on the assumption that there is a unique element matching the criteria but it does nothing to enforce or assert this.

Worse, the misbehavior is entirely data-driven, which might hide it during testing. Unless we have this scenario in mind, we might simply overlook it until it manifests in production.

Even worse, it fails silently! If the assumption that there is only one such element proves to be wrong, we won’t notice this directly. Instead the system will misbehave subtly for a while before the effects are observed and the cause can be identified.

So of course there is nothing inherently wrong with findFirst() and findAny(). But it is easy to use them in a way that leads to bugs within the modeled domain logic.

Failing Fast

So let’s fix this! Say we’re pretty sure that there’s at most one matching element and we would like the code to fail fast if there isn’t. With a loop we have to manage some ugly state and it would look as follows:

public Optional<Customer> findOnlyCustomer(String customerId) {
	boolean foundCustomer = false;
	Customer resultCustomer = null;
	for (Customer customer : customers)
		if (customer.getId().equals(customerId))
			if (!foundCustomer) {
				foundCustomer = true;
				resultCustomer = customer;
			} else {
				throw new DuplicateCustomerException();
			}

	return foundCustomer
			? Optional.of(resultCustomer)
			: Optional.empty();
}

Now, streams give us a much nicer way. We can use the often neglected reduce, about which the documentation says:

Performs a reduction on the elements of this stream, using an associative accumulation function, and returns an Optional describing the reduced value, if any. This is equivalent to:

boolean foundAny = false;
T result = null;
for (T element : this stream) {
    if (!foundAny) {
        foundAny = true;
        result = element;
    }
    else
        result = accumulator.apply(result, element);
}
return foundAny ? Optional.of(result) : Optional.empty();

but is not constrained to execute sequentially.

Doesn’t that look similar to our loop above?! Crazy coincidence…

So all we need is an accumulator that throws the desired exception as soon as it is called:

public Optional<Customer> findOnlyCustomerWithId_manualException(String customerId) {
	return customers.stream()
			.filter(customer -> customer.getId().equals(customerId))
			.reduce((element, otherElement) -> {
				throw new DuplicateCustomerException();
			});
}

This looks a little strange but it does what we want. To make it more readable, we should put it into a Stream utility class and give it a nice name:

public static <T> BinaryOperator<T> toOnlyElement() {
	return toOnlyElementThrowing(IllegalArgumentException::new);
}

public static <T, E extends RuntimeException> BinaryOperator<T>
toOnlyElementThrowing(Supplier<E> exception) {
	return (element, otherElement) -> {
		throw exception.get();
	};
}

Now we can call it as follows:

// if a generic exception is fine
public Optional<Customer> findOnlyCustomer(String customerId) {
	return customers.stream()
			.filter(customer -> customer.getId().equals(customerId))
			.reduce(toOnlyElement());
}

// if we want a specific exception
public Optional<Customer> findOnlyCustomer(String customerId) {
	return customers.stream()
			.filter(customer -> customer.getId().equals(customerId))
			.reduce(toOnlyElementThrowing(DuplicateCustomerException::new));
}

How is that for intention revealing code?

This will materialize the entire stream.

It should be noted that, unlike findFirst() and findAny(), this is of course no short-circuiting operation and will materialize the entire stream. That is, if there is indeed only one element. The processing of course stops as soon as a second element is encountered.

Reflection

We have seen how findFirst() and findAny() do not suffice to express the assumption that there is at most one element left in the stream. If we want to express that assumption and make sure the code fails fast if it is violated, we need to reduce(toOnlyElement()).

You can find the code on GitHub and use it as you like – it is in the public domain.

Thanks to Boris Terzic for making me aware of this intention mismatch in the first place.

This post was originally published on codefx.org.

Author
Nicolai Parlog
Nicolai is a thirty year old boy, as the narrator would put it, who has found his passion in software development. He constantly reads, thinks, and writes about it, and codes for a living as well as for fun. Nicolai is the editor of SitePoint's Java channel, writes a book about Project Jigsaw, blogs about software development on codefx.org, and is a long-tail contributor to several open source projects. You can hire him for all kinds of things.

Leave a Reply

Be the First to Comment!

avatar
400
  Subscribe  
Notify of