Picking the right abstraction
Recently I had to adapt some older Java code to support a new requirement: an existing CSV report needed to include a user’s email address, translated from the user id. Pretty simple, but how does the CSV report generator translate the user id to an email address?
The obvious implementation is to simply pass the UserDao directly to the report class[1]:
package com.example.fizzbuzz.reports;

import com.example.fizzbuzz.dao.UserDao;
import com.example.fizzbuzz.domain.User;

public class FizzBuzzCsvReport extends CsvReport {
    private final UserDao userDao;

    public FizzBuzzCsvReport(UserDao userDao) {
        this.userDao = userDao;
    }

    public String[] headers() {
        return new String[] { "Fizzy", "Buzzy", "Email" };
    }

    public String[] row(FizzBuzz data) {
        User user = userDao.findById(data.getUserId());
        return new String[] {
            data.getFizz(),
            data.getBuzz(),
            user.getEmail()
        };
    }
}
The UserDao is injected into the constructor of the FizzBuzzCsvReport and used in the row method to translate the user id into an email address. Simple and probably what most Java code would look like. Unfortunately, this is not the right solution[2]. Let’s write a unit test:
package com.example.fizzbuzz.reports;

import static org.junit.Assert.*;
import static org.mockito.Mockito.*;

import org.junit.Test;

import com.example.fizzbuzz.dao.UserDao;
import com.example.fizzbuzz.domain.Role;
import com.example.fizzbuzz.domain.User;

public class FizzBuzzCsvReportTest {
    @Test
    public void test_row_generation() {
        UserDao userDao = mock(UserDao.class);
        when(userDao.findById("userid")).thenReturn(
            new User("userid", "user name", "user@example.com", new Role[] { Role.ADMINISTRATOR }));

        FizzBuzzCsvReport subject = new FizzBuzzCsvReport(userDao);
        String[] result = subject.row(new FizzBuzz("fizz", "buzz", "userid"));

        assertArrayEquals(
            new String[] { "fizz", "buzz", "user@example.com" },
            result);
    }
}
That’s quite a bit of overhead just to perform a simple check! Why is it so hard to write the test? We’re even programming to an interface, not an implementation!
The main problem is not that UserDao is not an abstraction (it is), but that it is the wrong abstraction for this usage. UserDao abstracts over how users are stored[3], and by passing the UserDao to the report we unnecessarily couple the report to the details of how users are represented and managed within the rest of our system. Note that using a dynamic language doesn’t really help either. The coupling would be looser (no need to agree on the exact type), but the report would still require an object that responds to the findById message with an object that responds to the getEmail message.
So what would be the right abstraction? Let’s go back to the new requirement: “given the user id, add the user’s email address”. What’s the simplest abstraction that could work here? Let’s just use a function:
package com.example.fizzbuzz.reports;

import com.google.common.base.Function;

public class FizzBuzzCsvReport extends CsvReport {
    private final Function<String, String> userIdToEmail;

    public FizzBuzzCsvReport(Function<String, String> userIdToEmail) {
        this.userIdToEmail = userIdToEmail;
    }

    public String[] headers() {
        return new String[] { "Fizzy", "Buzzy", "Email" };
    }

    public String[] row(FizzBuzz data) {
        return new String[] {
            data.getFizz(),
            data.getBuzz(),
            userIdToEmail.apply(data.getUserId())
        };
    }
}
Now the report is fully decoupled from our application’s user infrastructure and is much easier to test:
package com.example.fizzbuzz.reports;

import static org.junit.Assert.*;

import java.util.Collections;

import org.junit.Test;

import com.google.common.base.Functions;

public class FizzBuzzCsvReportTest {
    @Test
    public void test_row_generation() {
        FizzBuzzCsvReport subject = new FizzBuzzCsvReport(Functions.forMap(
            Collections.singletonMap("userid", "user@example.com")));

        String[] result = subject.row(new FizzBuzz("fizz", "buzz", "userid"));

        assertArrayEquals(
            new String[] { "fizz", "buzz", "user@example.com" },
            result);
    }
}
We use the handy Functions.forMap method to create a function from a map[4] and use this in our test. Compared to the previous version, setup boilerplate has been reduced by 57.1%[5].
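For readers on Java 8 or later, the same test setup can be sketched without Guava, because a method reference to Map.get already has the shape of a function from String to String. The sketch below is an illustration under assumptions, not the code from this post: it uses java.util.function.Function instead of Guava’s Function, and its nested FizzBuzzCsvReport is a cut-down, hypothetical stand-in whose row method takes plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

class MapAsFunctionSketch {

    // Cut-down, hypothetical stand-in for the report class in this post.
    // It depends on java.util.function.Function rather than Guava's Function.
    static class FizzBuzzCsvReport {
        private final Function<String, String> userIdToEmail;

        FizzBuzzCsvReport(Function<String, String> userIdToEmail) {
            this.userIdToEmail = userIdToEmail;
        }

        String[] row(String fizz, String buzz, String userId) {
            return new String[] { fizz, buzz, userIdToEmail.apply(userId) };
        }
    }

    static String[] exampleRow() {
        Map<String, String> emails =
            Collections.singletonMap("userid", "user@example.com");
        // emails::get maps a user id to an email address, so it can be
        // passed wherever a Function<String, String> is expected.
        // Unlike Functions.forMap(map), it returns null for an unknown key
        // instead of throwing an IllegalArgumentException.
        FizzBuzzCsvReport report = new FizzBuzzCsvReport(emails::get);
        return report.row("fizz", "buzz", "userid");
    }

    public static void main(String[] args) {
        // prints [fizz, buzz, user@example.com]
        System.out.println(Arrays.toString(exampleRow()));
    }
}
```

The different behaviour for missing keys is worth keeping in mind when choosing between the two in a test.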
Obviously, the actual production code (for example, the controller that lets the user download the CSV) will still have to adapt the UserDao to the Function interface to make use of the FizzBuzzCsvReport. This is straightforward and only needs to be defined once. With Java 8’s upcoming lambda support this will be even easier.
To summarize:
- Prefer simple, well-understood abstractions over home-grown variants (in this case a Function<String, String> versus a UserDao).
- By defining the FizzBuzzCsvReport in terms of a Function<String, String> userIdToEmail we make clear what the report needs and also limit what it can do (principle of least privilege). With the UserDao approach we wouldn’t know what exactly the report is using that DAO for. It could even be deleting users!
- Using an abstraction like Function gives you a huge library of pre-defined tools: adapting maps as functions, using memoization or caching, function composition, etc. Compare this to having to write your own caching adapter for a UserDao!
- Function is just the start. There are many other abstractions that are simple and widely applicable.
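The one-time adapter from UserDao to a function, together with the caching and composition mentioned in the list above, might look roughly like the following sketch. It assumes Java 8 or later and uses java.util.function.Function rather than the Guava interface from the listings. User and UserDao are reduced, hypothetical stand-ins for the real types, and memoize is a helper written for this illustration, not a library method.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class UserDaoAdapterSketch {

    // Reduced, hypothetical stand-ins for the domain types in this post.
    static class User {
        private final String email;
        User(String email) { this.email = email; }
        String getEmail() { return email; }
    }

    interface UserDao {
        User findById(String id);
    }

    // The adapter: defined once, close to where the report is wired up.
    static Function<String, String> userIdToEmail(UserDao userDao) {
        return userId -> userDao.findById(userId).getEmail();
    }

    // Generic caching wrapper. It works for any function, so nothing
    // here is specific to users or DAOs.
    static <A, B> Function<A, B> memoize(Function<A, B> f) {
        Map<A, B> cache = new ConcurrentHashMap<>();
        return a -> cache.computeIfAbsent(a, f);
    }

    static String demo() {
        int[] lookups = { 0 };
        // Fake DAO that counts how often it is asked for a user.
        UserDao dao = id -> {
            lookups[0]++;
            return new User("User@Example.com");
        };

        // Adapt the DAO, cache its results, then compose with a normalizer.
        Function<String, String> lookup = memoize(userIdToEmail(dao))
            .andThen(email -> email.toLowerCase(Locale.ROOT));

        lookup.apply("userid");
        lookup.apply("userid");
        return lookup.apply("userid") + " after " + lookups[0] + " lookup(s)";
    }

    public static void main(String[] args) {
        // prints user@example.com after 1 lookup(s)
        System.out.println(demo());
    }
}
```

The report itself never sees the cache or the DAO. It only receives a function from user id to email address, which is all it asked for.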
[1] Code simplified for explanatory reasons.
[2] There are many other possible solutions for this example[1].
[3] Abstracting over how users are stored was very useful when we replaced LDAP with a web based user management system.
[4] In many languages collections are automatically functions. Scala’s Map and other collections already extend scala.Function1, so you can pass a Map whenever a function is expected. Ruby 1.9’s Array, Hash, Proc (Ruby’s function class), and String classes all respond to the [] method, etc.
[5] 98% of all statistics are made up.