Implicit Scope published on Oct 28, tagged with ruby, rails, work

No one can deny that rails likes to do things for you. The term “auto-magically” comes to mind. This can be a blessing and a curse.

For the most part, rails tries to give you “outs” – a few hoops here and there that, if jumped through, will let you do things in different or more manual ways. Sometimes, though, it doesn’t.

Find In Batches

One of the many ORM helpers provided by rails is find_in_batches. It will repeatedly query the database with a limit, ordering by primary key and picking up after the last id it saw, handing you chunks of records to work through in sequence. Perfect for processing a very large result set in constant memory.

Order.find_in_batches(:batch_size => 10) do |orders|
  orders.length # => 10

  orders.each do |order|

    # yay order!

  end
end

The problem is that any conditions you add to find_in_batches are inherited by any and all sql performed within its block. This is called “implicit scope” and there’s no way around it.

Why is this an issue? I’m glad you asked, here’s a real life example:

#
# SELECT * from orders
# WHERE orders.status = 'pending'
# ORDER BY orders.id LIMIT 10;
#
# picking up after the last id seen each time round
#
Order.find_in_batches(:batch_size => 10,
                      :conditions => "orders.status = 'pending'") do |orders|

  orders.each do |order|
    #
    # UPDATE orders SET orders.status = 'timing_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending'; <-- oh-hey implicit scope
    #
    order.update_attribute(:status, 'timing_out')

    #
    # some long-running logic to actually "time out" the order...
    #

    #
    # UPDATE orders SET orders.status = 'timed_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending';
    #
    order.update_attribute(:status, 'timed_out')
  end
end

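Do you see the problem? The second update fails because it can’t find the order: the implicit scope still demands status = 'pending', and the first update already changed it. The first update only succeeded by coincidence, since the order happened to still be pending.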
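To make the mechanics concrete, here is a toy simulation in plain ruby. It is not ActiveRecord: ToyOrders, its with_scope and its update are made-up stand-ins that only mimic the idea of a scope being merged into every statement run inside a block.

```ruby
# Toy stand-in for a table with scoping. NOT ActiveRecord; every name
# here is hypothetical and exists only to illustrate the behavior.
class ToyOrders
  def initialize(rows)
    @rows   = rows
    @scopes = []
  end

  # Everything run inside the block inherits these conditions.
  def with_scope(conditions)
    @scopes.push(conditions)
    yield
  ensure
    @scopes.pop
  end

  # "UPDATE ... WHERE id = ? AND <every active scope>".
  # Returns the number of rows that matched.
  def update(id, attrs)
    conditions = @scopes.inject({ :id => id }) { |acc, scope| acc.merge(scope) }
    matched = @rows.select { |row| conditions.all? { |key, value| row[key] == value } }
    matched.each { |row| row.merge!(attrs) }
    matched.length
  end
end

orders  = ToyOrders.new([{ :id => 1, :status => 'pending' }])
results = []

orders.with_scope(:status => 'pending') do
  results << orders.update(1, :status => 'timing_out') # still pending, so it matches
  results << orders.update(1, :status => 'timed_out')  # no longer pending, so nothing matches
end

results # => [1, 0]
```

The second update touches zero rows, which is the same silent miss as in the real example above.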

Workaround

I would love to find a simple remove_implicit_scope macro that can get around this issue, but it’s just not there.

I even went so far as to put the update logic in a Proc or lambda hoping to bring in a binding without the implicit scope – no joy.

I had to resort to simply not using find_in_batches.

At the time, I just rewrote that piece of the code to use a while true loop. Thinking about it now, I realize I could’ve factored it out into my own find_in_batches; also, I could put it in a module so you can extend it in your model to have the better (IMO) behavior…

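module FindNoScope

  def find_in_batches(options = {})
    options = options.dup                          # don't mutate the caller's hash
    limit   = options.delete(:batch_size) || 1000  # 1000 is rails' own default
    offset  = 0

    loop do
      chunk = all(options.merge(:limit => limit, :offset => offset))

      break if chunk.empty?

      yield chunk

      # advance to the next page; this has to happen inside the loop.
      # caveat: if the block changes rows so they stop matching the
      # conditions, offset paging will skip records.
      offset += limit
    end
  end

end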

class Order < ActiveRecord::Base
  extend FindNoScope

  # ...

end

Note that the above was written blind, is completely untested, and will likely not work

Ruby Eval published on Oct 25, tagged with ruby

Ruby’s instance_eval and class_eval are awesome tricks of the language that can really cut down on redundant code or let you do truly dynamic things that you’d have never thought possible.

There’s one piece of confusion around these methods that each book I’ve read goes about explaining in a slightly different way. None of them really clicked for me, so why not write my own?

The two entirely accurate but seemingly paradoxical statements are these:

Use class_eval to define instance methods

Use instance_eval to define class methods

The reason for the backwards-ness is often explained something like this:

x.class_eval treats x as a Class, so any methods you create will be instance methods.

x.instance_eval treats x as an instance, so any methods you create will be class methods.

Well that’s clear as mud…

My take

Here’s how I think about it:

Any methods you define inside of x.instance_eval will be as if you had defined them on the instance x.

Any methods you define inside of x.class_eval will be as if you had written them in the body of the class x.

Examples should help…

class_eval

Here’s an example of class_eval

class MyClass
  def my_method
    "foo"
  end
end

MyClass.class_eval do
  def my_other_method
    "bar"
  end
end

c = MyClass.new
c.my_other_method
# => "bar"

This is exactly as if you had done the following:

class MyClass
  def my_method
    "foo"
  end

  # oh... the files are /inside/ the computer!
  def my_other_method
    "bar"
  end
end

c = MyClass.new
c.my_other_method
# => "bar"

So we used class_eval to define an instance method. Just like the book said.

Funny thing is, you can easily use class_eval to define class methods too.

class MyClass
end

MyClass.class_eval do
  def self.foo
    "foo"
  end
end

MyClass.foo
# => "foo"

So I think that whole mindset is incorrect. What matters is the context your code is evaluated in, not what you intend to define.

instance_eval

Similarly, here’s how I think when I’m writing something with instance_eval:

c = MyClass.new

# notice we act *on* an instance
c.instance_eval do
  def my_other_other_method
    "baz"
  end
end

c.my_other_other_method
# => "baz"

# we've written that method *on* c, so it only exists for that 
# *instance*...
d = MyClass.new
d.my_other_other_method
# => NoMethodError (undefined method)

This code is equivalent to:

c = MyClass.new

# definition on c
def c.my_other_other_method
  "baz"
end

c.my_other_other_method
# => "baz"

In the second form, it’s clearer that the method only exists on that specific instance.

One other way to look at it is this:

Methods defined with class_eval will be available to every instance of that class (making them instance methods).

Methods defined with instance_eval will only be available to that specific instance; why they’re called “class methods”, I do not know.
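One possible way to square that with the “class methods” name: a class is itself an object (an instance of Class), so when the receiver of instance_eval happens to be a class, “only available to that specific instance” means “only available on the class itself”. A minimal sketch, using a made-up Widget class:

```ruby
# Widget is a hypothetical class used only for illustration.
class Widget
end

# The receiver here is the class object itself, so the method is
# defined *on* Widget, exactly like writing `def Widget.build`.
Widget.instance_eval do
  def build
    "built by #{name}"
  end
end

Widget.build
# => "built by Widget"

# Instances of Widget don't get it; it lives on the class object only.
Widget.new.respond_to?(:build)
# => false
```

So “use instance_eval to define class methods” is just the “defined on the instance x” rule from above, applied when x is a class.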

Anyway, hope this helps…

Test Driven Development published on Oct 2, tagged with linux, ruby, work

With my recent job shift, I’ve found myself in a much more sophisticated environment than I’m used to with respect to Software Engineering.

At my last position, there wasn’t much existing work in the X++ realm. We were breaking new ground and no one cared about elegance; if you got the thing working, more power to you.

Here, it’s slightly different.

People here are working in a sane, documented, open-source world; and they’re good. Everyone is acutely aware of what’s good design and what’s not. There’s a focus on elegant code, industry standards, solid OOP principles, and most importantly, we practice Test Driven Development.

I’m completely new to this method for development, and I gotta say, it’s quite nice.

Now, I’m not going to say that this is the be-all-end-all of development styles (I’m a functional, strictly-typed, compiler-checked code guy at heart), but I do find it quite interesting – and effective.

So why not do a write-up on it?

Test Framework

The prerequisite for doing anything in TDD is a good test framework. Luckily, ruby is pretty strong in this area. The way it works is the following:

You subclass Test::Unit::TestCase and define methods whose names start with test_, in which you exercise system logic and make assertions about the results; then you run that class.

The framework looks for those methods named test_whatever and runs them “as tests”. Running a method as a test means that errors (unexpected exceptions) and failures (any of your assertions not holding) will be logged and displayed at the end as part of the “test report”.
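As a rough illustration of what such a test class looks like, here is a minimal sketch. The order_total method and the OrderTotalTest class are invented for the example; only Test::Unit::TestCase and assert_equal come from the framework.

```ruby
require 'test/unit'

# Hypothetical code under test: sums the prices of an order's line items.
def order_total(prices)
  prices.inject(0) { |sum, price| sum + price }
end

# Each method whose name starts with test_ is picked up and run as a test.
class OrderTotalTest < Test::Unit::TestCase
  def test_sums_line_item_prices
    assert_equal 60, order_total([10, 20, 30])
  end

  def test_empty_order_totals_zero
    assert_equal 0, order_total([])
  end
end
```

Running the file with ruby should execute both tests when the script exits and print a summary of tests, assertions, failures and errors.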

All of these test classes can be run automatically by a build-bot and (depending on your test coverage) you get good visibility into what’s working and what’s not.

This is super convenient and empowering in its own right. In a dynamic language like ruby, tests are the only way you have any level of confidence that your most recent code change doesn’t blow up in production.

So now that you’ve got this ability to write and run tests against your code base, here’s a wacky idea: write the tests first.

Test Driven

It’s amazing what this approach does to the design process.

I’ve always been the type that just starts coding. I’m completely comfortable throwing out 6 hours’ worth of code and starting over. I know my “first draft” isn’t going to be right (though it will be useful). I whole-heartedly believe in refactoring. But most importantly, I need to code to sketch things out. It’s how I’ve always worked.

TDD is sort of the same thing. You do a “rough sketch” of the functionality you’ll add simply by writing tests that enforce that functionality.

You think of this opaque object – a black box. You don’t know how it does what it does, but you’re trying to test it doing it.

This automatically gives you an end-user perspective. You now focus solely on the interface, the input and the output.

This is a wise position to design from.

You also tend to design small self-contained pieces of functionality. Methods that don’t care about state, return the same output for a given input, and generally do one simple thing. Of course, you do this because these are the easiest kind of methods to test.
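To show why those methods are the easy ones to test, here is a small made-up contrast. RunningDiscount and discounted_price are hypothetical names, and the discount rule is invented; amounts are whole numbers to keep the arithmetic exact.

```ruby
# Stateful: the answer depends on how many calls came before this one,
# so a test has to replay that history first.
class RunningDiscount
  def initialize
    @orders_seen = 0
  end

  def price_for(amount)
    @orders_seen += 1
    @orders_seen > 2 ? amount - amount / 10 : amount
  end
end

# Stateless: everything the method needs is passed in, so the same
# input always gives the same output and a test is a single line.
def discounted_price(amount, previous_orders)
  previous_orders >= 2 ? amount - amount / 10 : amount
end

discounted_price(100, 0)
# => 100
discounted_price(100, 2)
# => 90
```

Testing the third order through RunningDiscount means calling price_for three times in the right sequence; with discounted_price you just pass 2.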

So, out of sheer laziness, you design a cohesive, easy to use, and completely simple interface, an API.

Now you just have to “plumb it up”. Hack until the tests pass, and you’re done. That might be an over-simplification, but it’s not off by much…

Come to think of it, this is exactly the type of design Haskell favors. With gratuitous use of undefined, the super-high-level logic of a Haskell program can be written out with named functions to “do the heavy lifting”. If you make these functions simple enough and give them descriptive enough names, they practically write themselves.

So that’s TDD (at least my take on it). So far, I like it.