Skip to content

Fix a space leak in list traversal#27

Open
treeowl wants to merge 1 commit intohaskellari:masterfrom
treeowl:list-fusion
Open

Fix a space leak in list traversal#27
treeowl wants to merge 1 commit intohaskellari:masterfrom
treeowl:list-fusion

Conversation

@treeowl
Copy link

@treeowl treeowl commented Dec 26, 2022

Previously, we used zip to define itraverse for lists. This led to two problems:

  1. Because the zip fused with the index generator, it could not fuse with the argument.

  2. I ran into situations where the zip didn't fuse with the index generator, so my code ended up actually building and saving [0..] as a CAF. That's a nasty space leak, as well as slow.

Writing itraverse for lists using foldr directly seems to clear up these issues. Unboxing the counter manually should prevent Int boxes from being allocated when the passed function doesn't need them.

itraverseListStarting :: forall f a b. Applicative f => Int -> (Int -> a -> f b) -> [a] -> f [b]
itraverseListStarting (I# i0) f = \xs -> foldr go stop xs i0
where
go x r !i = liftA2 (:) (f (I# i) x) (r (i +# 1#))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be possible to do without relying on more unsafe (Safe Haskell wise) modules. I won't import GHC.Exts.

Copy link
Collaborator

@phadej phadej Dec 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And apparently this is one additional instance where -ffull-lazyness is a pessimisation.

I won't accept this kind of patch without a reference to a GHC issue, which IMO this is. The [0..] in itraverse implementation shouldn't be memoized and GHC should have a means to express that, or it should not use -ffull-lazyness by default.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phadej Please propose an alternative to GHC.Exts that I can use.

Copy link
Collaborator

@phadej phadej Dec 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we unbox the counter by hand? GHC /can/ do that itself, but for some
reason it only happens with @-O2@, and we use the standard @-O1@.

Report a GHC issue. We should know why we write code the way we do. It might be a GHC bug, or it might be an inherent limitation (i.e. GHC cannot do better). We should know the reason.

Copy link
Collaborator

@phadej phadej Dec 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we used `zip` to define `itraverse` for lists. This led to
two problems:

1. Because the zip fused with the index generator, it could *not* fuse
   with the argument.

2. I ran into situations where the zip *didn't* fuse with the index
   generator, so my code ended up actually building *and saving* `[0..]`
   as a CAF. That's a nasty space leak, as well as slow.

Writing `itraverse` for lists using `foldr` directly seems to clear up
these issues. Unboxing the counter manually should prevent `Int` boxes
from being allocated when the passed function doesn't need them.
Copy link
Collaborator

@phadej phadej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize:

  • I'm not a fan of needing to unbox Int manually. I'd like to understand what makes -O2 do it for us.

  • FunctorWithIndex and FoldableWithIndex instances are not using zip [0..] pattern, and it makes sense to be uniform. The imap and ifoldr are not written using foldr, so either:

    • They should be rewritten with foldr as well
    • Or itraverse can be written using manual recursion too (and probably GHC will be smart enough to unbox Int then). This is "worse" because list fusion won't fire, but I actually don't care, as list fusion is unreliably anyway. (And e.g. RULES dance around GHC.Base.map is too complicated).

    So I lean towards the latter option: writing itraverse for lists with explicit recursion.

    The itraverseListStarting helper is good, as it's useful in NonEmpty instance as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants