20 March 2014

Frame heap idea for Ruby

After listening to @srawlins great talk about new allocation stats features from @tmm1 in ruby 2.1, I got thinking about garbage collection in ruby.  BTW, MWRC 2014 is awesome so far!

I was wondering if there is some hybrid model that takes advantage of the call stack patterns that exist in server-side apps, or in apps that do a lot of iterative processing work.

If I could annotate a certain stack frame as a collection barrier, then I could write something that would modify collection behavior at that call boundary.  On the way in, I could pre-allocate per-call heaps that would be used up to capacity by all code from that point in the call graph on down.  On the way out, collection could be triggered to get rid of the garbage.

The benefits I can see here are as follows:
  • there is greater locality of reference in the heap for objects allocated within a certain call graph
  • objects that survive the frame heap collection would be merged into the global heap
  • objects that overflow the frame heap space would either cause another linked frame heap to be allocated, or would overflow to the global heap
  • there is greater opportunity for compacting the heap, per-call-graph collections imply compaction
  • different collection barriers can be collected in parallel, because unless the call graphs won't overlap, even if they happen concurrently, so concurrent collection logic is potentially lock-free

Possible points of annotation as collection barrier:
  • rack request
  • http client request
  • json parse / save
  • File.readlines w/ block
  • long-running loop{}s early in the stack

My thinking about heap and garbage collection is formed by what I've seen happen over the years in the HotSpot JVM heap structures.  First there was mark/sweep, with large pauses.  Then generational garbage collection came along, optimized for collecting new garbage, and only doing mark/sweep when an object has survived many early collection cycles.  Then there was the garbage-first collector which did memory region analysis, prioritizing entire regions of heap that are likely to contain the most garbage.

Ruby has seen some garbage collection innovation recently, and there is more to come in ruby 2.2.  However, maybe this is a novel idea that can help out.   Or maybe this idea isn't new and has been tried and has failed, but it was at least worth a try to get this out there.

2 comments:

  1. That's a little tricky. Ruby's current GC can't move things around in memory because not all references have write barriers. C-extensions, and some parts of the interpreter hold onto addresses in memory, which means it'd be hard to have a more typical generational GC with separate heaps. They are working on adding write barriers, but I'd guess it'll take a while for them to be adopted.

    ReplyDelete
    Replies
    1. I see, thank you for explaining that. It would take a change in how references are treated.

      Delete