TJ Singleton Software Engineer, Baptist Preacher

1 Jul 2010

Regex Result Access Benchmark

The question came up on forrst.com: which of the following two styles of accessing the results of a regex match is preferred?

"qqq100601.txt"[/\A([a-z]+)/, 1]
"qqq100601.txt".match(/\A([a-z]+)/)[0]

So I benchmarked them and was surprised by how large the performance difference was. Except on JRuby, the array-style access (the first form) is the clear winner.

Benchmark and Raw Results
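For anyone who wants to repeat the comparison, here is a minimal sketch using Ruby's standard Benchmark module. It is not the original benchmark script: the helper names (via_index, via_match) and the iteration count are assumptions chosen for illustration, and the timings will vary by interpreter and version.

```ruby
require "benchmark"

FILENAME   = "qqq100601.txt"
PATTERN    = /\A([a-z]+)/
ITERATIONS = 200_000 # assumed count; raise or lower it for your machine

# Array-style access: ask the string for capture group 1 directly.
def via_index(str)
  str[PATTERN, 1]
end

# Match-style access: build a MatchData, then read from it.
def via_match(str)
  str.match(PATTERN)[0]
end

Benchmark.bm(14) do |x|
  x.report("String#[]")    { ITERATIONS.times { via_index(FILENAME) } }
  x.report("String#match") { ITERATIONS.times { via_match(FILENAME) } }
end
```

Running the same file under each interpreter you care about (MRI, JRuby, Rubinius) gives one row per style to compare.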

Update: I went ahead and ran this through RubyProf on 1.8.7. It turns out that String#[] with a regexp argument (str[/REGEXP/]) is optimized to a single method call and doesn't instantiate a MatchData object. String#match delegates to Regexp, which instantiates a MatchData, and the result is then read from it, for a total of three method calls. So the real savings are less object churn and fewer method calls.
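To make the difference concrete, this short sketch shows what each style hands back at the Ruby level. It only demonstrates the visible return values; it does not reproduce the profiler output, and the variable names are just for illustration.

```ruby
filename = "qqq100601.txt"
pattern  = /\A([a-z]+)/

# Array-style: a single call that returns the captured String.
direct = filename[pattern, 1]

# Match-style: the first call returns a MatchData object...
match = filename.match(pattern)
# ...and a second call reads the matched text out of it.
via_match = match[0]

puts direct.class        # String
puts match.class         # MatchData
puts direct == via_match # true
```

With this pattern the whole match (index 0) and the first capture group (index 1) happen to be the same text, which is why both styles give the same answer here.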

Comments (4)
  1. Thanks, this is good to know.

    Why is this?

  2. I figured it would be something like that. Thanks for the explanation. So what’s the story with JRuby and Rubinius?

