When Your Hash Becomes a String: Hunting Ruby’s Million-to-One Memory Bug

Closer to Code (mensfeld.pl)

Table of Contents

  • 1 The Impossible Error
  • 2 Investigating the musl Hypothesis
  • 3 The Moment Everything Stopped Making Sense
  • 4 Down the Rabbit Hole
  • 5 Reproducing the Bug
  • 6 The Microsecond Window
  • 7 What This Means for Ruby's Memory Model
  • 8 The Fix and The Future
  • 9 Lessons From the Hunt
  • 10 Acknowledgments
  • 11 The Bottom Line

Every developer who maintains Ruby gems knows that sinking feeling when a user reports an error that shouldn't be possible. Not "difficult to reproduce", but truly impossible according to everything you know about how your code works.

That's exactly what hit me when a Karafka user's error tracker logged 2,700 identical errors in a single incident:

NoMethodError: undefined method 'default' for an instance of String
vendor/bundle/ruby/3.4.0/gems/karafka-rdkafka-0.22.2-x86_64-linux-musl/lib/rdkafka/consumer/topic_partition_list.rb:112 FFI::Struct#[]
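
Read literally, the message says that #default was sent to a String. Ruby defines that method on Hash, not on String, so the same class of error is easy to trigger on purpose. The snippet below is only a minimal sketch of that: the variable names are hypothetical and it does not reproduce the actual bug or any rdkafka or FFI code.

```ruby
# Hash defines #default (the fallback value for missing keys).
expected = { "offset" => 0 }
hash_default = expected.default # nil, since no default was configured

# Hypothetical stand-in for an object that should have been a Hash
# but turned out to be a String.
actual = "offset"

error = nil
begin
  actual.default # String has no #default method
rescue NoMethodError => e
  error = e
end

puts error.class       # the exception class that was raised
puts error.name        # the name of the missing method
puts error.message     # exact wording differs between Ruby versions
```

In ordinary Ruby code this only happens when the wrong object is passed in, which is why seeing it inside code that always handles a Hash looked impossible.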

The error meant that something was calling #default on a String. I had never used a #default