Function-Generating Folds

April 20, 2023 - performance Haskell

Introduction

It is well-known that foldl / foldl' can be defined via foldr (Hutton 1999). Indeed, this is the default implementation in Data.Foldable. However, there are two interesting things about the “foldl-as-foldr” pattern that I want to emphasize in this post:

  1. This pattern is optimized into a tail-recursive loop by GHC
  2. “foldl-as-foldr” generalizes to top-down folds that support early termination.

The Basics

Before we dive into the above points, let’s briefly go over some basics.

foldr supports early termination

Consider the following example:

module Main where

import Prelude hiding (product, foldr)

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z = go
  where
    go [] = z
    go (x:xs) = f x (go xs)

product :: [Int] -> Int
product = foldr (\a b -> if a == 0 then 0 else a * b) 1

main :: IO ()
main = print (product (2 : 3 : 0 : undefined))

This program will print 0, as it doesn’t force the next cons cell, undefined, after encountering 0. Inlining the final call to go we have:

(\a b -> if a == 0 then 0 else a * b) 0 (go undefined)

which \beta-reduces to

if 0 == 0 then 0 else a * go undefined

Note that a saturated call to go with a non-empty list \beta-reduces to a call to the passed in function f. Thus, the caller of foldr determines if the loop continues by forcing the recursive term b in f.

foldl' doesn’t support early termination

module Test where

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f = go
  where
    go z [] = z
    go z (x:xs) =
      let z' = f z x
      in z' `seq` go z' xs

The caller of foldl' has no such control. A saturated call to go with a non-empty list \beta-reduces to a call to go 1. Thus, even if f doesn’t force anything then we still traverse the spine of the list.

Optimizing “foldl-as-foldr”

To demonstrate how GHC optimizes foldl' when defined through foldr, consider the following example:

module Test where

import Prelude hiding (foldr)

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z =
  let go xs0 = case xs0 of
        [] -> z
        x:xs -> f x (go xs)
  in go


foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f z0 xs0 =
  let g a b z =
        let z' = f z a
        in z' `seq` b z'
  in foldr g id xs0 z0

Dumping the core we see exactly the same core that would be generated for the standard foldl' definition:

foldl' :: forall b a. (b -> a -> b) -> b -> [a] -> b
foldl'
  = \ (@b_ave)
      (@a_avf)
      (f_au5 :: b_ave -> a_avf -> b_ave)
      (z0_au6 :: b_ave)
      (xs0_au7 :: [a_avf]) ->
      joinrec {
        go_syk [Occ=LoopBreaker] :: [a_avf] -> b_ave -> b_ave
        [LclId[JoinId(2)], Arity=2, Str=<S,1*U><L,U>, Unf=OtherCon []]
        go_syk (xs1_agz :: [a_avf]) (eta_B0 :: b_ave)
          = case xs1_agz of {
              [] -> eta_B0;
              : x_agA xs_agB ->
                case f_au5 eta_B0 x_agA of z'_auc { __DEFAULT ->
                jump go_syk xs_agB z'_auc
                }
            }; } in
      jump go_syk xs0_au7 z0_au6

To understand how this is achieved, let’s break this down into a sequence of transformations. We use foldl rather than foldl' for simplicity.

The “let-float-from-application” transformation is described by the first example in §3.3 of (Peyton Jones, Partain, and Santos 1996). The analysis GHC performs to determine if a term should be \eta-expanded is described in (Breitner 2018).

Generalized foldl-as-foldr

Defining foldl as foldr is an application of the general technique of returning a function from foldr (See \text{\S}5 of (Hutton 1999)). We can use this technique to define folds that blend the benefits of foldl and foldr. For example, we can define tail-recursive loops that may terminate early. Defining loops in this way is more than a convenience, as these functions may benefit from foldr/build fusion (Gill, Launchbury, and Peyton Jones 1993) (Breitner 2018).

An important observation about this function-generating technique should be made regarding the b parameter in: \textit{foldr} \; (\lambda \; a \; \textcolor{red}{b} \; st. \; \ldots) \; \ldots The observation is that this parameter becomes a tail recursive call in the generated loop.

To demonstrate this consider the following:

foldl as foldr unconditionally calls b in the argument to foldr, so it unconditionally tail calls itself. If it instead called b conditionally then we have a left fold that can terminate early.

Product example

Let’s return to the foldr product example:

module Test where

product :: [Int] -> Int
product = foldr (\a b -> if a == 0 then 0 else a * b) 1

We have the early termination that we want, but this function isn’t tail-recursive.

Dumping the core and focusing just on the worker of the worker/wrapper transformation we see:

Rec {
-- RHS size: {terms: 21, types: 11, coercions: 0, joins: 0/0}
Test.$wproduct [InlPrag=NOUSERINLINE[2], Occ=LoopBreaker]
  :: [Int] -> GHC.Prim.Int#
[GblId, Arity=1, Caf=NoCafRefs, Str=<S,1*U>, Unf=OtherCon []]
Test.$wproduct
  = \ (w_s1AY :: [Int]) ->
      case w_s1AY of {
        [] -> 1#;
        : y_a1yn ys_a1yo ->
          case y_a1yn of { GHC.Types.I# x_a1zf ->
          case x_a1zf of wild2_Xr {
            __DEFAULT ->
              case Test.$wproduct ys_a1yo of ww_s1B1 { __DEFAULT ->
              GHC.Prim.*# wild2_Xr ww_s1B1
              };
            0# -> 0#
          }
          }
      }
end Rec }

It is plain in the core that we recursively call $wproduct then multiply the result of the recursive call, ww_s1B1, by the list element wild2_Xr.

If we want to optimize this function by making it tail-recursive while retaining the early-termination, we can utilize the function-generating foldr approach.

module Test where

product :: [Int] -> Int
product xs0 = foldr f id xs0 1
  where
    f :: Int -> (Int -> Int) -> Int -> Int
    f nextNum tailRecursiveCall productSoFar = case nextNum == 0 of
      True -> 0
      False ->
        let productSoFar' = productSoFar * nextNum
        in productSoFar' `seq` tailRecursiveCall productSoFar'

Dumping the core for the worker again we see the desired result: a tail-recursive left fold over the list that terminates early if it encounters 0.

Rec {
-- RHS size: {terms: 21, types: 11, coercions: 0, joins: 0/0}
Test.product_$sgo1 [Occ=LoopBreaker]
  :: GHC.Prim.Int# -> [Int] -> Int
[GblId, Arity=2, Caf=NoCafRefs, Str=<L,U><S,1*U>m, Unf=OtherCon []]
Test.product_$sgo1
  = \ (sc_s1JB :: GHC.Prim.Int#) (sc1_s1JA :: [Int]) ->
      case sc1_s1JA of {
        [] -> GHC.Types.I# sc_s1JB;
        : y_a1Gg ys_a1Gh ->
          case y_a1Gg of { GHC.Types.I# x_a1H8 ->
          case x_a1H8 of wild2_Xs {
            __DEFAULT ->
              Test.product_$sgo1 (GHC.Prim.*# sc_s1JB wild2_Xs) ys_a1Gh;
            0# -> lvl_r1JX
          }
          }
      }
end Rec }

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
lvl_r1JX :: Int
[GblId, Caf=NoCafRefs, Str=m, Unf=OtherCon []]
lvl_r1JX = GHC.Types.I# 0#

References

Breitner, Joachim. 2018. “Call Arity.” Computer Languages, Systems &Amp; Structures 52 (June): 65–91. https://doi.org/10.1016/j.cl.2017.03.001.
Gill, Andrew, John Launchbury, and Simon L. Peyton Jones. 1993. “A Short Cut to Deforestation.” Proceedings of the Conference on Functional Programming Languages and Computer Architecture, July. https://doi.org/10.1145/165180.165214.
Hutton, Graham. 1999. “A Tutorial on the Universality and Expressiveness of Fold.” Journal of Functional Programming 9 (4): 355–72. https://doi.org/10.1017/s0956796899003500.
Peyton Jones, Simon, Will Partain, and André Santos. 1996. “Let-Floating.” Proceedings of the First ACM SIGPLAN International Conference on Functional Programming - ICFP ’96. https://doi.org/10.1145/232627.232630.

  1. Don’t be confused by seq. It is a compiler builtin to control laziness and doesn’t affect the fact that go is tail-recursive. This can be observed in the core generated by this definition of foldl':

    foldl' :: forall b a. (b -> a -> b) -> b -> [a] -> b
    foldl'
      = \ (@ b_atX)
          (@ a_atY)
          (f_agd :: b_atX -> a_atY -> b_atX)
          (eta_B2 :: b_atX)
          (eta1_B1 :: [a_atY]) ->
          joinrec {
            go_svc [Occ=LoopBreaker] :: b_atX -> [a_atY] -> b_atX
            [LclId[JoinId(2)], Arity=2, Str=<L,U><S,1*U>, Unf=OtherCon []]
            go_svc (z_agf :: b_atX) (ds_duS :: [a_atY])
              = case ds_duS of {
                  [] -> z_agf;
                  : x_agh xs_agi ->
                    case f_agd z_agf x_agh of z'_atC { __DEFAULT ->
                    jump go_svc z'_atC xs_agi
                    }
                }; } in
          jump go_svc eta_B2 eta1_B1
    ↩︎