Advent of Functional PHP: Day 7

in PHPlast year (edited)

Advent of Code Day 7 this year is another problem that's more about the math than about the programming, so we won't see much in the way of new functional techniques. Still, there's some interesting bits in there.

Today we need to calculate the fuel costs of moving a bunch of crabs in submarines all into a line. (Don't ask. Really, don't ask.) Essentially we want to center-align a series of points using the least "cost" possible. Crab positions are represented by a single number, as crabs can only move horizontally. (Because crabs.)

The trick for today is realizing that the crabs don't matter; it's a distance-cost calculation. In part 1, the cost for a crab to move one space toward whatever alignment number we want to pick is 1.

Crabby programmers

The data input is again just a single line of comma-separated numbers, so reading that in is the same as yesterday:

$data = pipe($inputFile,
    file_get_contents(...),
    trim(...),
    explode(','),
    amap(intval(...)),
);

Now we need to determine to what number to move all of those numbers, such that the amount of change of each number adds up to the smallest possible amount. Naturally we could brute-force it, but that's usually unwise even if your CPU can handle it. In this case, there's a much easier way: the optimal number is the median of all the numbers.

Computing the median is easy if we have a sorted list, but we don't, so we have to sort it ourselves first. PHP's sort() function operates on an array in-place, so it's not really functional-friendly, but that's readily solved.

function numSort(array $a): array
{
    sort($a);
    return $a;
}

Next we can compute the median value of that list. However, if the list has an even number of values then the median will be a fractional index number. Our crabs are quantized, so they cannot move a half-crab. Fortunately, computing the cost of two numbers (one more and one less than the fractional median) is not too bad, cost-wise, so let's just do that.

function medians(array $sorted): array
{
    $idx = (count($sorted) + 1) / 2;
    return is_float($idx)
        ? [$sorted[floor($idx)], $sorted[ceil($idx)]]
        : [$sorted[$idx]];
}

That takes a sorted list and computes the index of the median value. If it's a floating point (ie, fractional), then we'll return a list of two numbers: the one right before and right after it. If not, we can just return the single value at that index, as an array so that the code that comes after is the same either way. (Remember Garfield's Law: One is a special case of many.)

Now we need to know how to compute the cost of moving all of the crabs to that position. That is the sum of the cost of moving each individual crab. The cost of an individual crab is trivially simple: It's just the diff between those two numbers.

function cost(int $x, int $target): int
{
    return abs($x - $target);
}

To get the cost for all crabs, we need to apply that function to each crab and then sum the results. "apply to each" means a map operation.

$costFinder = static fn (int $median): int => pipe($data,
    amap(static fn (int $i): int => cost($i, $median)),
    array_sum(...),
);

That looks a bit weird with the double anonymous function, but most of it is just syntax flotsam. $costFinder is a pipe operation that will map across the original $data list (the crab start points), and the function it applies to each one is cost(), with the $median provided. We can't as easily make it a named function because we need to close over the $data, which is what actually gets mapped. (Yet another case where a first class syntax for partial function application would be helpful.)

Now, we can take our list of possible target destinations, find each of their total costs, and go with whichever the smallest is. That means our full processing pipeline looks like this:

$totalDistance = pipe($data,
    numSort(...),
    medians(...),
    amap($costFinder(...)),
    min(...),
);

Crabs legs are expensive

Part 2 tweaks the problem a little. In this case, the cost of moving a crab to a given point isn't one per unit. It's an arithmetic sequence, so moving one space costs 1, moving a second space costs 2 (3 total), a third space costs 3 (6 total), etc.

That means two changes to the program.

  • The cost calculation is different.
  • The median is no longer the destination.

Let's consider the first change first. The cost for a crab to move from its start position to some other position is the sum of the numbers 1 through the total spaces it has to move; the specific spaces it's moving from/to don't matter, just the difference.

Fortunately, there's a known shortcut for adding up all the numbers in a given range. (If you don't know it, that's OK. I had forgotten it too, until I looked it up.) Our new cost function, then, determines the difference between start and end points, then counts the sums from 1 to that number.

function cost(int $x, int $target): int
{
    $steps = abs($x - $target);
    return ($steps * (1 + $steps)) / 2;
}

Now, what's our new target? Because the costs are weighted, the median doesn't work. What does approximate the target, however, is the average, or mean. (I don't think it's a guaranteed exact answer, but a statistical approximation that is accurate enough when dealing with our size of data.) The average is also going to have the same fractional problem as the median, but we can solve it in the same way by just checking the neighboring integers:

function average(array $list): array
{
    $avg = array_sum($list) / count($list);
    return is_float($avg)
        ? [floor($avg), ceil($avg)]
        : [$avg];
}

We also no longer care about a sorted list, so we can leave that out. Our final pipeline now looks like this:

$totalDistance = pipe($data,
    average(...),
    amap($costFinder(...)),
    min(...),
);

The full code is on GitHub as always.


Want to know more about functional programming and PHP? Read the whole book on the topic: Thinking Functionally in PHP.


Thinking Functionally in PHP