Regex Result Access Benchmark
The question came up on forrst.com about which of the following two styles of accessing the results of a regex match was preferred:
"qqq100601.txt"[/\A([a-z]+)/, 1]
"qqq100601.txt".match(/\A([a-z]+)/)[0]
So I benchmarked it and was surprised by how large the performance difference was. Except on JRuby, the array-style access is the clear winner.
Update: I went ahead and ran this through RubyProf on 1.8.7. It turns out that #[/REGEXP/] is optimized to one method call and doesn't instantiate the MatchData object. String#match is delegated to Regexp, which instantiates MatchData and then accesses the result, for a total of 3 method calls. So the real savings come from less object churn and fewer method calls.
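The comparison above can be sketched with Ruby's standard Benchmark library. This is a minimal sketch, not the original benchmark script: the iteration count and report labels are my own choices, and it uses index 1 on the MatchData so both styles read the same capture group (for this pattern, index 0 yields the same string).

```ruby
require 'benchmark'

filename   = "qqq100601.txt"
pattern    = /\A([a-z]+)/
iterations = 100_000 # arbitrary count chosen for this sketch

# Sanity check: both styles extract the same leading run of letters.
bracket_result = filename[pattern, 1]
match_result   = filename.match(pattern)[1]

Benchmark.bm(14) do |bm|
  # String#[] with a regexp and a capture index
  bm.report("String#[]")    { iterations.times { filename[pattern, 1] } }
  # String#match, then index into the returned MatchData
  bm.report("String#match") { iterations.times { filename.match(pattern)[1] } }
end
```

Timings will vary by Ruby implementation and version, so treat any single run as indicative only.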
Speeding up the Haversine Formula in Ruby
A friend of mine works a lot with geographical data. He was clustering large sets of coordinates into 0.25-mile groups and sent me a small sample to play with. By small, I mean 8,736 coordinates. His real data set is much, much larger.
He had started off by using GeoKit, but I noticed that the only thing he was using it for was calculating distance. His code was nicely encapsulated, but this is basically what it was doing:
point_a = Geokit::LatLng.new(39.416454,-118.841204)
point_b = Geokit::LatLng.new(39.476181,-118.783931)
point_a.distance_to(point_b, :units => :miles)
I assumed there was some overhead involved in the library, so I wrote my own implementation of the haversine formula. I did see a slight performance increase, but not much. Geokit seemed to do a pretty good job at staying efficient.
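For reference, a pure Ruby haversine calculation might look like the sketch below. It is not the implementation from the repository: the method name, the keyword argument, and the Earth radius of 3958.8 miles are assumptions made for illustration.

```ruby
# Great-circle distance in miles between two latitude/longitude pairs
# given in degrees. The default radius is an assumed mean Earth radius.
def haversine_miles(lat1, lon1, lat2, lon2, radius: 3958.8)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad

  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2

  # Guard against floating point drift pushing the value just above 1.
  2 * radius * Math.asin(Math.sqrt([a, 1.0].min))
end

distance = haversine_miles(39.416454, -118.841204, 39.476181, -118.783931)
puts distance
```

The two points are the same ones used in the Geokit example, so the result should land close to what distance_to reports, allowing for a slightly different radius constant.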
Enter RubyInline. RubyInline is a gem that allows you to write other languages inside your Ruby code. It comes with out-of-the-box support for C/C++. I had been wanting to play with RubyInline for a while, but nothing had come across my plate that needed that level of optimization.
I translated the haversine formula to C++ using RubyInline and the results were amazing. On my machine I was able to reduce the time it took to calculate 1,000,000 distances from 11.247408 seconds to 0.679327 seconds. It only took me a few minutes to rewrite the method, and from an API standpoint it is identical to the pure Ruby version.
The next time you're doing work that is easily optimized with C, like this math formula, give it a shot and reap the rewards. The code and benchmarks are on GitHub.
The Danger of the Silent Fail
Today I came across a bug in a Rails app I wrote. This thing was driving me crazy. I was generating a random 10-digit number for my model; however, about one-third of the time in production the number was getting set to the same thing, 2147483647.
I wrote some specs but couldn't reproduce the behavior. I was pretty frustrated. I logged into the server and went into the console to test it out, and it was failing. I'd update the number, save it, reload the instance and there to my dismay was the haunting 2147483647.
What was it? Locally I was using SQLite to run the tests, while on the server I was using MySQL. 2147483647 is the largest 32-bit signed integer. I was overflowing the column, but MySQL was silently accepting the larger number and clamping it to 2147483647.
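The arithmetic is easy to check in Ruby. The sketch below only imitates the clamping described above for a signed 32-bit column; clamp_to_int32 is a hypothetical helper written for illustration, not anything Rails or MySQL provides.

```ruby
# Imitates storing a value in a signed 32-bit integer column that
# clamps out-of-range values instead of raising an error.
def clamp_to_int32(value)
  int32_max = 2**31 - 1   # 2147483647
  int32_min = -(2**31)    # -2147483648
  [[value, int32_max].min, int32_min].max
end

puts clamp_to_int32(9_876_543_210) # a 10-digit number that overflows
puts clamp_to_int32(1_234_567_890) # a 10-digit number that still fits
```

Any 10-digit number above 2147483647 collapses to the same stored value, which is why the overflowing records all looked identical.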
I would have assumed that trying to set a value greater than the column could hold would raise an exception. Sadly, this wasn't the first time MySQL has silently failed on me. It probably won't be the last. Still, it goes to show the frustration that can result from unexpected behavior.
Sometimes programming problems are best solved AFK
This afternoon I ran into a problem. I was working on adding in some new form fields on an edit screen. The fields were for an associated model. I was using fields_for in the view and accepts_nested_attributes_for in the model. Everything looked great, but my specs were still failing. The form just wasn't updating the association.
I desk checked the code. I tried it in script/console. I checked the controller logs. I tried everything I could think of. I struggled for an hour or so before I called in for help. By the end of the day, neither a more experienced developer nor I could figure out why it wasn't working. We were ready to blame plugins, Rails, or anything else. Looking over the code, everything looked correct.
Calling @my_model.my_association_attributes= directly worked, but the controller was never calling it! Can you guess what the culprit was? The model had attr_accessible set!
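To make the failure mode concrete, here is a plain Ruby stand-in for whitelist-based mass assignment. It is not Rails code and not how Rails implements attr_accessible; the class, constant, and attribute names are hypothetical and exist only to show why the direct call worked while the form submission did not.

```ruby
# Hypothetical model that filters mass-assigned keys through a whitelist.
class WhitelistedModel
  ACCESSIBLE = [:name].freeze # stands in for `attr_accessible :name`

  attr_accessor :name
  attr_reader :association_data

  # Direct call: always works, like calling the writer by hand in the console.
  def my_association_attributes=(attrs)
    @association_data = attrs
  end

  # Mass assignment: keys missing from the whitelist are dropped silently.
  def attributes=(params)
    params.each do |key, value|
      next unless ACCESSIBLE.include?(key.to_sym)
      public_send("#{key}=", value)
    end
  end
end

model = WhitelistedModel.new
model.attributes = { name: "widget", my_association_attributes: { title: "x" } }
puts model.name                   # assigned, because :name is whitelisted
puts model.association_data.inspect # nil, because the key was filtered out
```

In this sketch, adding :my_association_attributes to the whitelist is what lets the nested data through.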
So is the moral of the story to add the association_attributes on models with attr_accessible? Not exactly. See, I didn't solve the problem at my desk. I solved the problem on the way home from church about two and a half hours later. I wasn't even thinking about the problem when it popped into my head.
Sometimes the right course of action is to just step away from the problem for a while and let our minds work on it. A week or so ago, I had struggled for a couple of hours trying to figure out how to solve a particular problem cleanly. I had about half a dozen ways to solve it, but thinking of actually implementing any of them made me feel dirty. I knew there was a good way to solve this problem.
How did I come up with a solution? I slept. When I woke up, brushed my teeth, and fired up TextMate the next morning, in my head was an elegant (and obvious) solution to the problem at hand. Sometimes our minds just need a break.
Next time you get frustrated, step away for a bit. Let your mind relax and refresh. AFK isn't as bad as it sounds, and sometimes it's medicinal.
You Failed. Autotest with a voice.
While listening to a talk from Ruby Conf 08, I heard Joe Martinez mention the say command in OS X. I figured I'd pop open my .autotest file and wire it up so it'd give me some motivation for when my specs go red.
Here is what I added:
Autotest.add_hook :red do |autotest|
  `say -v "Good News" "You're doing it wrong"`
end

