Advent of Functional PHP: Day 1

in PHP2 years ago (edited)

Today's challenge asks us to interpret a list of numbers. In the first part, our goal is to determine how many elements in the list are larger than their immediate predecessor.

The imperative way would be to toss it in a foreach() loop and track some state along the way. But we want to be functional and avoid "track some state," because the whole point of functional programming is to avoid tracking state, as tracking state is error prone.

When looking at a foreach-style operation that has some state, my first inclination is to look at a reduce operation. A reduce operation walks over a list and performs the same operation (function) on each item, using the output of the previous iteration as an input. That is, each step takes the output of the previous operation and the next element, and produces an output. It's quite elegant.

However, reduction operations only allow you to pass forward one intermediary result. In our case, the input to a given operation is the current element and its predecessor. That can be squeezed into a reduction operation if we try, but it feels awkward to me, so I'm going to allow some for-style loops. As long as they're controlled, that's OK.

Let's think about the steps involved in our problem:

  1. Read the input file in from disk.
  2. Break it up into a series of integers (or numeric strings, which for PHP is close enough).
  3. Determine which of those elements is larger than its predecessor.
  4. Count the number of elements that pass the test in step 3.
  5. Profit!

Pipe steps

Any time we see a "list of steps" like that, we should start thinking of functional composition. Functional programming makes heavy use of functional composition. Functional composition is, in practice, nothing more than sticking two functions end to end, so that the output of one becomes the input to the next. Do that multiple times and you get a pipeline, or "pipe," of functions, that you can then use as a single function.

Conceptually, you could write it as funcD(funcC(funcB(funcA($input)))). However, dear god why would you want to. That's gross. Many languages have a dedicated syntax to allow you to write something like this instead:

$input |> funcA |> funcB |> funcC |> funcD

Or sometimes (as in Haskell), you write it backwards:

newFunc = funcD . funcC . funcB . funcA

(The backwards version reads closer to what the naive infix version looks like, but I find the |> version to be much easier to read.)

PHP, sadly, lacks a |> operator. (There was an RFC for one in PHP 8.1, but it was declined.) However, it can be roughly emulated in user-space without too much trouble.

One way of looking at the pipe is as composing multiple functions together, then calling the result with some input. For that, we would need a compose() function like this:

function compose(callable ...$fns): callable
{
    return static function (mixed $arg) use ($fns): mixed  {
        foreach ($fns as $fn) {
            $arg = $fn($arg);
        }
        return $arg;
    };
}

That takes a list of callables and produces a new function that will take one argument, then pass it to the first function, then pass its result to the second, and its result to the third, and so on until the last result is just returned. Then a pipe() function is as simple as:

function pipe(mixed $arg, ...$fns): mixed
{
    return compose($fns)($arg);
}

Piece of cake! However, in my own benchmarks it's measurably faster to just combine them:

function pipe(mixed $arg, callable ...$fns): mixed
{
    foreach ($fns as $fn) {
        $arg = $fn($arg);
    }
    return $arg;
}

That's the version of pipe() available in Crell/fp, so we'll use that.

Reading the data

Now that we can string functions together, what are the steps to string together? First off is reading in the data. That consists of three steps: Reading the file, trimming off any excess whitespace, and breaking the file into an array of numbers.

The first part is easy: file_get_contents(). The second part is also easy, trim(). And the third part is easy, explode(). Yay, we have three trivially easy tasks we can string together! Let's try it:

$inputFile = __DIR__ . '/input.txt';

$result = pipe($inputFile,
    file_get_contents(...),
    trim(...),
    explode(PHP_EOL, ...),
);

It starts off simple enough. PHP 8.1 introduces a syntax for first-class-callables, which lets us turn any function into a callable by calling it with .... That means file_get_contents(...) is (approximately) equivalent to fn (...$args) => file_get_contents(...$args). pipe() will then take inputFile and pass it to file_get_contents(). The result of that (a string) is passed to trim(). The result of that... is a problem, because pipe() only works on single-parameter functions, and explode() is a multi-parameter function. It takes a string to explode and the string to explode on.

The naive and verbose way to solve that is to make our own closure for explode(), like so:

fn (string $s): array => explode(PHP_EOL, $s);

This process is called "partial application," because we are partially "applying" arguments to a function, producing a new function that takes fewer arguments. Some functional languages (e.g., Haskell or F#) have such functionality built in as part of their core syntax. PHP does not, and an RFC to add a flexible partial application syntax was also declined for 8.1. That leaves us with just the manual approach above.

Of course, manual approaches naturally beg to be automated, which is why Crell/fp includes a utility that does just that:

namespace Crell\fp;

function explode(string $delimiter): callable
{
    return static fn (string $s): array => \explode($delimiter, $s);
}

That's the exact same thing as above, just wrapped up into a utility function that returns a new function. That means we can use that instead, just by useing it so we get our version instead:

use function Crell\fp\explode;

$result = pipe($inputFile,
    file_get_contents(...),
    trim(...),
    explode(PHP_EOL),
);

And done. We now have our data all ready for processing.

Paring up

The second part of the task is to take that input data and determine how many entries are larger than their immediately previous entry, with the first item always being no.

As noted above, we could just foreach over the list and keep a counter. However, part 2 of the challenge makes that a bit more difficult, and the goal here is to be functional, so let's not do that. Instead, let's consider the problem again from the other end: The end result we want is a count of things, so our last step will be counting. What are we counting? Things that match a certain criteria. What's that criteria? That they're larger than their precedessor.

Thinking backwards, we come up with a different approach: The last step is obvious: we want to count() some list. The next to last step is "find things that match some criteria," which sounds a lot like a filter operation. What are we filtering? A list of pairs, where each pair is an element and its predecessor. Now that we have our outline, let's write it forwards.

Given a list of numbers, we want to produce a list of pairs of numbers. Again, while this could be done with a reducing operation that feels clunky to me, so I'm just going to use a boring foreach loop:

function pairUp(array $vals): array
{
    $ret = [];
    foreach ($vals as $i => $val) {
        $ret[] = [$val, $vals[$i - 1] ?? PHP_INT_MAX];
    }
    return $ret;
}

Is this functional? Well... depends on your point of view. Internally, not really. There's a mutable variable, manual iteration, etc. But in PHP, which doesn't have an engine that optimizes away recursion and such, it's probably the most efficient way. Externally, it very much is: Data goes in, data comes out, no other effects happen. That's good enough.

Two things to note: We're pairing up each number with its predecessor. The first item has no predecessor, but we know we want it to always be a "no," so we declare its predecessor to be PHP_MAX_INT on the assumption that will always be larger than the input data.

Also, it would be possible to instead use the key of the array for the value and the value of each item for the predecessor (or vice versa). That would probably be a bit more memory efficient, but PHP's filter operation for arrays doesn't handle keys. (PHP's array handling functions are actually quite awful, when you consider that they sometimes know that arrays are not arrays and sometimes don't.) We could write our own filter operation, but I chose to change the pair structure. You're free to do it the other way around if you prefer. Both work.

Home stretch

Now that we have our pairing function, we're nearly done. The last missing piece is filtering within a composed pipe, which PHP's native array_filter() function cannot handle for the same reason as explode(). The solution is, again, to write a little wrapper, which is exactly what Crell\fp has. It actually has two:

function afilter(?callable $c = null): callable
{
    $c ??= static fn (mixed $v, mixed $k = null): bool => (bool)$v;
    return static function (iterable $it) use ($c): array {
        if (is_array($it)) {
            return array_filter($it, $c);
        }
        $result = [];
        foreach ($it as $k => $v) {
            if ($c($v, $k)) {
                $result[$k] = $v;
            }
        }
        return $result;
    };
}

function itfilter(?callable $c = null): callable
{
    $c ??= static fn (mixed $v, mixed $l): bool => (bool)$v;
    return static function (iterable $it) use ($c): iterable {
        foreach ($it as $k => $v) {
            if ($c($v, $k)) {
                yield $k => $v;
            }
        }
    };
}

There's four cases to consider:

  1. The input is an array and we want an array at the end.
  2. The input is an iterable and we want an array at the end.
  3. The input is an array and we want an iterable at the end.
  4. The input is an iterable and we want an iterable at the end.

Options 3 and 4 are effectively the same: The itfilter() function returns a new function that closes over a filtering function and takes an iterable to process, then runs a simple foreach() loop over it, yielding each entry iff it passed the filter check, preserving keys. If no filtering function is provided, a truthi-ness check is used as a default, just like in array_filter().

The afilter() function handles the cases where we want an array result. It has the same default filtering function. If the input is an array, then we can just fall back on the built-in array_filter() for performance. If not, then we foreach() over the incoming iterable manually, build up a result, and return that.

All of these are just giving us easy ways to stick a filter operation into a pipeline, with either an array or lazy generator as the output. In our specific case we have an array coming in and want an array coming out, so it's really just a simple wrapper around array_filter() at the end of the day. I included the full explanation for completeness, and because I expect it will come up again in future challenges.

Now, we can filter out pairs, then count the result:

$result = pipe($inputFile,
    file_get_contents(...),
    trim(...),
    explode(PHP_EOL),
    pairUp(...),
    afilter(static fn($v): bool => $v[0] > $v[1]),
    count(...),
);

Bam! Done. And quite pretty if I do say so myself.

Sliding averages

Part 2 of the challenge gets a little bit trickier. Instead of comparing numbers directly, it wants us to compare sliding averages of the last three elements in order to minimize noise. Howe can we do that?

The first observation is that... everything we've done so far still applies. We don't need to modify any existing functions. That wouldn't be the case for an imperative solution where the counting is all mushed up with the iteration and filtering into a single loop. We'd have to rewrite the whole thing. This is where functional composition really shines: When you want to change things.

We still have the same input setup. We also still have the same pair-and-filter-and-count logic. The only difference is that the list we're pairing-filtering-counting has changed. That means we need to convert a list of values into a list of sliding averages.

Ah ha! That sounds like an operation that takes a list and returns a list, which sounds highly pipe-able. We'll call it makeSums(), and stick it in right here:

$result = pipe($inputFile,
    file_get_contents(...),
    trim(...),
    explode(PHP_EOL),
    makeSums(...), /// <-- Right here
    pairUp(...),
    afilter(static fn($v): bool => $v[0] > $v[1]),
    count(...),
);

That was easy enough. Now all we need to do is write a function named makeSums() that takes a list of numbers and returns a list of rolling averages. As before, there's probably a reduction-based way to do so but in PHP, it's likely not worth the effort. So let's just do it the old-fashioned way:

function makeSums(array $vals): array
{
    $sum = [];
    $count = count($vals);
    for ($i = 2; $i < $count; ++$i) {
        $sum[] = $vals[$i] + $vals[$i - 1] + $vals[$i - 2];
    }
    return $sum;
}

The function is pure, so from the system's point of view it's still functional, or functional enough. Note that we're starting as $i = 2, (the third element), so we will always have two previous values to compute from.

And there you have it. A functional, or nearly-entirely-functional anyway, solution to today's problem, along with a tour of PHP 8.1 and functional composition along the way. Exciting, eh?

You can find my full source code on GitHub. See you tomorrow for the next challenge.


Want to know more about functional programming and PHP? Read the whole book on the topic: Thinking Functionally in PHP.


Thinking Functionally in PHP

Sort:  

Congratulations @crell! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s):

You received more than 4250 upvotes.
Your next target is to reach 4500 upvotes.

You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP

Check out the last post from @hivebuzz:

Feedback from the December 1st Hive Power Up Day
Hive Power Up Month Challenge - Winners List