July 1, 2010 (4 comments)
Regex Result Access Benchmark
A question came up on forrst.com about which of the following two styles of accessing the results of a regex match is preferred:
"qqq100601.txt"[/\A([a-z]+)/, 1]
"qqq100601.txt".match(/\A([a-z]+)/)[0]
So I benchmarked it and was surprised by how large the performance difference was. Except on JRuby, the array-style access (the first form, String#[] with a regexp) is the clear winner.
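For anyone who wants to repeat the comparison, here is a minimal sketch of such a benchmark using Ruby's standard Benchmark module. It is not the original benchmark script; the helper names index_style and match_style and the iteration count are assumptions made for illustration, and timings will vary by Ruby implementation and version.

```ruby
require 'benchmark'

FILENAME   = "qqq100601.txt"
PATTERN    = /\A([a-z]+)/
ITERATIONS = 100_000 # arbitrary count, chosen only for illustration

# Array-style access: String#[] with a regexp and a capture-group index.
# (index_style is a hypothetical helper name.)
def index_style(str)
  str[PATTERN, 1]
end

# Match-style access: String#match followed by MatchData#[].
# (match_style is a hypothetical helper name.)
def match_style(str)
  str.match(PATTERN)[0]
end

# Print user/system/total/real timings for each style.
Benchmark.bm(12) do |x|
  x.report("str[re, 1]") { ITERATIONS.times { index_style(FILENAME) } }
  x.report("match[0]")   { ITERATIONS.times { match_style(FILENAME) } }
end
```

Running the same file under each interpreter of interest gives a rough side-by-side comparison. For this input both helpers return the same string, so only the access style differs.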
Update: I went ahead and ran this through ruby-prof on Ruby 1.8.7. It turns out that #[/REGEXP/] is optimized to one method call and doesn't instantiate a MatchData object. String#match delegates to Regexp, which instantiates a MatchData, and the result is then read from that object, for a total of three method calls. So the real savings are less object churn and fewer method calls.
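The difference in what each style hands back can be seen in a short sketch. It only inspects return values at the Ruby level and says nothing about what a given interpreter allocates internally; the variable names are illustrative assumptions.

```ruby
str = "qqq100601.txt"
re  = /\A([a-z]+)/

# Array-style access returns the captured String directly.
by_index = str[re, 1]

# String#match returns a MatchData object, which is then indexed
# in a separate call to get at the text.
by_match = str.match(re)

puts by_index.class # String
puts by_match.class # MatchData
puts by_match[0]    # whole match: qqq
puts by_match[1]    # first capture group: qqq

# When nothing matches, both forms return nil, so chaining [0]
# onto String#match would raise NoMethodError.
no_match = "100601.txt"
puts no_match[re, 1].inspect   # nil
puts no_match.match(re).inspect # nil
```

In this example the whole match and the first capture group happen to be identical, which is why [0] and the group index 1 give the same answer in the two snippets being compared.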


July 1st, 2010 - 11:38
Thanks, this is good to know.
Why is this?
July 1st, 2010 - 12:27
I was curious about this too, but didn’t take the time to investigate last night. I updated the post with results from ruby-prof. Thanks for motivating me.
July 1st, 2010 - 20:33
I figured it would be something like that. Thanks for the explanation. So what’s the story with JRuby and Rubinius?
July 2nd, 2010 - 02:04
I don’t know enough about the specific implementations of JRuby and Rubinius to say.