Using PSR-3 placeholders properly

in PHPlast year (edited)

In the last 2 years or so, I've run into a number of projects that claim to use the PSR-3 logging standard as published by the PHP Framework Interoperability Group (PHP-FIG, or just FIG). Unfortunately, it's quite clear that those responsible for the project have not understood PSR-3 and how it is intended to work. This frustrates me greatly, as PSR-3's design addresses a number of issues that these projects are not benefiting from, and it reduces interoperability between projects (which was the whole point in the first place).

Rather than just rant angrily online (fun as it is, it doesn't actually accomplish anything), many of my PHP community colleagues encouraged me to blog about using PSR-3 properly. So, here we are.

The Short, Short version

If you just want the final point, here it is.

If you're writing this:

$logger->info("User $userId bought $productName");

Then you're doing it wrong, abusing PSR-3, and may have a security attack vector. You need to switch to doing this instead:

$logger->info("User {userId} bought {productName}", [
   'userId' => $userId,
   'productName' => $productName,
]);

And if your logger isn't handling that properly, it means you have it misconfigured and need to fix your configuration.

If your project's documentation is telling you to do the first one, then your project's documentation is wrong, and it should be fixed.

If you want to understand why, read on.

Backstory and credentials

If you don't care about the history or who I am to dare to talk about "you're doing it wrong," go ahead and skip this section.

PSR-3 was FIG's first "interface" specification, back in 2013. In all honesty, it's not FIG's best work, even if it is the most widely-used. We were still not entirely sure how best to do interface specs, FIG's process at the time was still rather slip-shot, etc. We've gotten a lot better since then, especially under the FIG v3 model we use today, but that's another matter. The point is, that PSR-3 does have a bit too much weasel room in it, which is undoubtedly at least a part of the problem.

PSR-3 was spearheaded by Jordi Boggiano, who among other things is/was the maintainer of Monolog, the logging framework used by Symfony, Laravel, and many, many others. A number of people were involved in its design, however, myself included.

At the time, I was the FIG representative for the Drupal project (I'm still heavily involved in FIG, but not Drupal), and used the experience of Drupal's own logger to drive a number of features in PSR-3. In particular, the placeholder functionality that so many projects seem to be unaware of. I later also rewrote Drupal's logging framework to implement PSR-3 and, nearly a decade later, rewrote much of TYPO3's logging system to conform to PSR-3 properly.

(For the sake of completeness and modesty, I'll note that including LoggerAwareInterface was my idea, which is a mistake I have regretted ever since. More on that later.)

Context and placeholders

The $context array in the PSR-3 spec serves double-duty. One, it is a place to provide "extra contextually-relevant stuff" with a log message that the caller thinks may be useful. In particular, the exception key is reserved for passing the exception that caused a log message to be needed, if any. Any other keys are left up to the caller, and may be anything. Per the spec:

This is meant to hold any extraneous information that does not fit well in a string. The array can contain anything.

However, there is one particular use for them that encompasses nearly all remaining cases: Interpolating into the message string, via placeholders.

From the spec:

The message MAY contain placeholders which implementors MAY replace with values from the context array.

Placeholder names MUST correspond to keys in the context array.

Placeholder names MUST be delimited with a single opening brace { and a single closing brace }. There MUST NOT be any whitespace between the delimiters and the placeholder name.

Placeholder names SHOULD be composed only of the characters A-Z, a-z, 0-9, underscore _, and period .. The use of other characters is reserved for future modifications of the placeholders specification.

Implementers MAY use placeholders to implement various escaping strategies and translate logs for display. Users SHOULD NOT pre-escape placeholder values since they can not know in which context the data will be displayed.

That second "MAY" in the first line is part of the problem, but the intent here is clear: Context values should not be injected into the message string at call time. The meta-document for the spec (which was added much later, because PSR-3 was written before FIG invented meta-documents) notes:

## Static log messages

It is the intent of this specification that the message passed to a logging method always be a static value. Any context-specific variability (such as a username, timestamp, or other information) should be provided via the $context array only, and the string should use a placeholder to reference it.

The intent of this design is twofold. One, the message is then readily available to translation systems to create localized versions of log messages. Two, context-specific data may contain user input, and thus requires escaping. That escaping will be necessarily different if the log message is stored in a database for later rendering in HTML, serialized to JSON, serialized to a syslog message string, etc. It is the responsibility of the logging implementation to ensure that $context data that is shown to the user is appropriately escaped.

There's not much I can add to the metadoc's reasoning. Passing potentially user-provided data into a log message is a security risk, and the escaping mechanism needed for them is not known at call time. Doing this, therefore, is a security risk:

$logger->info("Failed user login for {$_POST['username']}.");

just as much as doing that in a raw SQL query is. Just as you should always use prepared statements for SQL, you should always use placeholders with a log message, for the same reasons.

Instead, do this:

$logger->info("Failed user login for {username}.", [
   'username' => $_POST['username'],
]);

for all the same reasons.

How did we get here?

I suspect there's a couple of reasons why I keep seeing projects misusing PSR-3, and not using placeholders.

  1. The original spec was annoyingly unclear about it, and used lots of MAY and SHOULD weasel-worlds. This is on FIG; we should have been a lot more specific about what placeholders were for, and that supporting them was mandatory. Hopefully the metadoc text (quoted above) helps to clarify that.
  2. In Monolog, that interpolation doesn't happen by default. That's sensible, in a way, because you only want to run that interpolation when writing to the log file sometimes; if you're writing to a database for later display in HTML, for instance, you want to do that interpolation later on display, not when writing out to the log. However, it does mean that the default behavior often does not include it. You need to enable the PsrLogMessageProcessor yourself.
  3. Symfony's logging documentation doesn't include any up-front examples of using dynamic values at all. It's not wrong, it just skips over that part and spends more time talking about configuring the logging backend. While context values are mentioned, they're not mentioned as placeholders. (UPDATE: This has been fixed)
  4. Laravel's logging documentation, however, is actively wrong. It includes examples such as this, which actively encourage developers to concatenate random values into the log message. This is, as explained above, bad practice and potentially a security risk:
class UserController extends Controller
 {
     /**
      * Show the profile for the given user.
      */
     public function show(string $id): View
     {
         Log::info('Showing the user profile for user: '.$id);
 
         return view('user.profile', [
             'user' => User::findOrFail($id)
         ]);
     }
 }

(UPDATE: This has been fixed)

As most application developers will not deal with PSR-3 directly, only via their framework (which is fine, and appropriate), most developers are introduced to PSR-3 only via the Symfony or Laravel docs... both of which lead them to the wrong conclusion!

What can we do about it?

Each of these causes has a straightforward solution.

  1. FIG can do a better job of publicizing the "correct" usage. That's why the metadoc was added, and why this blog post exists. (Later specs were a lot better about explaining the "Why" behind things.)
  2. Monolog could do a better job of highlighting the placeholder settings in its documentation, and how to flip that on.
  3. Symfony's docs team can include a better example that includes a placeholder on the logging page. It doesn't have to be long, just one more example at the top showing how the placeholder works. (UPDATE: They did!)
  4. Laravel's docs team can fix their examples to not recommend insecure behavior, and show a good use of placeholders instead. (UPDATE: They did!)

If you're reading this, it means we've already done step 1.

I would therefore put the call to the Symfony and Laravel docs teams to update their documentation, too. That may involve adjusting the Monolog default configuration for those frameworks as well, which is fine. These two pages are where most PHP developers "learn how to log." Please, teach them how to do so securely!

An apology for LoggerAwareInterface

As an aside, I want to apologize for LoggerAwareInterface. The intent was to allow for easier auto-wiring of services that needed a logger. From my original suggestion on the matter:

Something else that occurred to me: In Symfony, there are a couple of "Aware" interfaces, such as ContainerAwareInterface, which define standard setter-injection methods. They're not complicated, but allow for post-instantiation setup. Should we provide a LoggerAwareInterface? Something like:
interface LoggerAwareInterface {
   public function setLogger(LoggerInterface $logger);
   // I don't think we need a getter in this case, but I don't mind it.
}

Which would then allow for:

$obj = $container->getByWhatever($service_requested);
   if ($obj instanceof LoggerAwareInterface) {
   $obj->setLogger($this->logger);
   }
}

In my defense, modern practices around DI Containers and good OO practices in PHP were still in their infancy in 2013, and I was still learning rapidly myself. It was an off-the-cuff suggestion based on being only recently exposed to Symfony 2. However, these days nearly all DI Containers support auto-wiring, making such one-off hacky mechanisms entirely unnecessary. That also allows greater control over using different logging channels (different logging objects) for different services, via many mechanisms, and results in a cleaner object graph.

As the person who originally proposed LoggerAwareInterface, I am asking you to please never use it again. Thank you.

Sort:  

Thanks for your contribution to the STEMsocial community. Feel free to join us on discord to get to know the rest of us!

Please consider delegating to the @stemsocial account (85% of the curation rewards are returned).

You may also include @stemsocial as a beneficiary of the rewards of this post to get a stronger support. 
 

As an addendum, thank you to the Symfony docs folks who have already addressed the highlighted issue and updated their Logging documentation accordingly.

https://reddit.com/r/PHP/comments/11k2o6i/on_the_use_of_psr3_placeholders/
The rewards earned on this comment will go directly to the people sharing the post on Reddit as long as they are registered with @poshtoken. Sign up at https://hiveposh.com.

And now Laravel has updated their configuration to default new projects to interpolating PSR-3 placeholders, too!

The Laravel docs have now also been updated accordingly!