
Implicit Scope published on Oct 28, tagged with ruby, rails, work

No one can deny that rails likes to do things for you. The term “auto-magically” comes to mind. This can be a blessing and a curse.

For the most part, rails tries to give you “outs” – a few hoops here and there that, if jumped through, will let you do things in different or more manual ways. Sometimes though, it doesn’t.

Find In Batches

One of the many ORM helpers provided by rails is find_in_batches. It will repeatedly query the database with a limit and offset, handing you chunks of records to work through in sequence. Perfect for processing a very large result set in constant memory.

Order.find_in_batches(:batch_size => 10) do |orders|
  orders.length # => 10

  orders.each do |order|

    # yay order!

  end
end

The problem is that any conditions you add to find_in_batches are inherited by any and all sql performed within its block. This is called “implicit scope” and there’s no way around it.

Why is this an issue? I’m glad you asked, here’s a real life example:

#
# SELECT * from orders
# WHERE orders.status = 'pending'
# LIMIT 0, 10;
#
# adjusting LIMIT each time round
#
Order.find_in_batches(:batch_size => 10,
                      :conditions => "orders.status = 'pending'") do |orders|

  orders.each do |order|
    #
    # UPDATE orders SET orders.status = 'timing_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending'; <-- oh-hey implicit scope
    #
    order.update_attribute(:status, 'timing_out')

    #
    # some long-running logic to actually "time out" the order...
    #

    #
    # UPDATE orders SET orders.status = 'timed_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending';
    #
    order.update_attribute(:status, 'timed_out')
  end
end

Do you see the problem? The second update fails because it can’t find the order due to the implicit scope. The first update was only successful due to coincidence.

Workaround

I would love to find a simple remove_implicit_scope macro that can get around this issue, but it’s just not there.

I even went so far as to put the update logic in a Proc or lambda hoping to bring in a binding without the implicit scope – no joy.

I had to resort to simply not using find_in_batches.

At the time, I just rewrote that piece of the code to use a while true loop. Thinking about it now, I realize I could’ve factored it out into my own find_in_batches; also, I could put it in a module so you can extend it in your model to have the better (IMO) behavior…

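module FindNoScope

  def find_in_batches(options = {})
    # work on a copy so the caller's hash isn't modified
    options = options.dup
    limit   = options.delete(:batch_size) || 1000
    options.merge!(:limit => limit)

    offset = 0

    while true
      chunk = all(options.merge(:offset => offset))

      break if chunk.empty?

      yield chunk

      # move on to the next page; this has to happen inside the loop
      offset += limit
    end
  end

end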

class Order < ActiveRecord::Base
  extend FindNoScope

  # ...

end

Note that the above was written blind, is completely untested, and will likely not work.
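The paging loop itself can at least be sanity-checked without a database. What follows is a minimal sketch, not a tested Rails implementation: FakeOrder is a hypothetical stand-in that only mimics all(:limit, :offset) over an in-memory array, and the default batch size of 1000 is an assumption. The thing to notice is that the offset advances inside the loop.

```ruby
# Offset-based batching with no implicit scope around the block.
module FindNoScope
  def find_in_batches(options = {})
    options = options.dup                       # leave the caller's hash alone
    limit   = options.delete(:batch_size) || 1000
    offset  = 0

    loop do
      chunk = all(options.merge(:limit => limit, :offset => offset))
      break if chunk.empty?

      yield chunk
      offset += limit                           # next page
    end
  end
end

# Hypothetical stand-in for a model; it only understands :limit and :offset.
class FakeOrder
  extend FindNoScope

  ROWS = (1..25).to_a

  def self.all(options = {})
    ROWS[options[:offset], options[:limit]] || []
  end
end

sizes = []
FakeOrder.find_in_batches(:batch_size => 10) { |chunk| sizes << chunk.length }
sizes # => [10, 10, 5]
```

One caveat with any offset-based loop: if the block changes rows so that they no longer match the conditions (as the status updates above do), the result set shrinks underneath the offset and later pages will skip records. Paging by id instead of by offset avoids that.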

Developing In OS X published on Oct 13, tagged with work, mac

As everyone who happens upon this site probably knows, I prefer to develop software in linux. The toolset is just better. Having and being proficient with a good shell is an invaluable tool for working with files. And regardless of what windows-ey, gui-IDE-ey developers like to say – software development is working with plain text files.

My work computer is now a Macbook. It’s about a million steps in the right direction from my last work-provided computer, but it’s still not linux.

That said, it’s damn close. It’s a unix variant originally based on BSD, it’s got a good shell and just about every linux tool I’ve grown accustomed to can be easily and quickly installed and utilized here.

So, this post is intended to describe the things I’ve installed and configured to get my development environment the way I like it on this platform.

The Terminal

It all starts with the terminal… and Terminal.app ain’t it. For a long time, I used iTerm simply because it supported 256 colors which no other Mac terminal does (to be fair, and to paraphrase Jon Stewart, #IDontHaveFactsToBackThatUp)

Recently, I noticed a general lag when scrolling line by line in commandline-vim inside iTerm. This was unacceptable and prompted me to try working in MacVim for some time.

MacVim was fine and all, but then I found there was an iTerm2. There’s no lag in the newer terminal version, the preferences pane seems more thought out, and it’s just generally better. So go out and download iTerm2 as your terminal-of-choice on the Mac.

The Multiplexer

A terminal multiplexer offers a number of benefits. Of these, the biggest ones in my opinion are:

  1. Detach and reattach sessions

If you work in a multiplexer, your terminal never closes. All of your work is bundled up in this workspace-terminal that’s running inside and on top of your real terminal. If your ssh connection dies, your terminal crashes, or you actively “detach”, your work is still sitting there in that workspace. You can pull it up and reattach it to some other terminal whenever you want.

You can also have multiple named sessions which you can detach and reattach to shift gears or just stay generally organized.

  2. Split into regions

In linux I have a great tiling window manager. My desktop can be neatly split into multiple terminals where I can spread out my work.

I don’t really have that on the Mac. I tried for a while to get a good WM going in X11, but it just never clicked. So as an alternative, I can use a multiplexer to split one full-screen iTerm instance into any number of tabs, and/or vertical and horizontal regions.

I typically leave one half-term column for vim (which itself can be split any number of ways) then use the other side for running a tail -f on the log, a mysql console, and possibly autotest or watchr.

  3. Keyboard driven navigation

Navigation between regions, copy/paste, and everything else is completed by fully configurable key bindings. Not needing to reach for the mouse is a huge productivity win for me.

So, what multiplexer?

Well, in my opinion screen does 1 and 3 great. It’s what I use and will always use on linux – when I have the WM to do the screen-splitting.

However, tmux owns in the screen-splitting department. So on the Mac, I recommend tmux. Google around for a good tmux.conf and spend some time with the manpage; you won’t regret it.

The Editor

In my opinion there is no alternative to a good vim setup. Luckily, it works just fine on the Mac. In fact, my vim-config worked without any modifications whatsoever.

If anyone’s interested, here are the plugins I currently roll with:

ls ~/.vim/bundle
additional-surroundings
command-t
haskellmode
hoogle
nerdcommenter
simplefold
supertab
surround
vim-endwise
vim-fugitive
vim-git
vim-rails
vim-ruby

And if you’re not using pathogen, get to googlin.

The Other Stuff

Pretty much any unix commandline utility can be installed via ports or homebrew. I recommend grabbing GNU coreutils so you’ve got a better ls and friends. bash-completion and proctools are two others that will make things feel a bit more linux-ey.

Also do yourself a favor and upgrade bash to 4.0. It comes with globstar which itself is more than worth it.

The Bottom Line

Learn to live in a terminal – use an editor and utilities that fit there. Use a multiplexer like tmux or screen in a quality terminal like iTerm2.

Test Driven Development published on Oct 2, tagged with linux, ruby, work

With my recent job shift, I’ve found myself in a much more sophisticated environment than I’m used to with respect to Software Engineering.

At my last position, there wasn’t much existing work in the X++ realm. We were breaking new ground and no one cared about elegance; if you got the thing working – more power to you.

Here, it’s slightly different.

People here are working in a sane, documented, open-source world; and they’re good. Everyone is acutely aware of what’s good design and what’s not. There’s a focus on elegant code, industry standards, solid OOP principles, and most importantly, we practice Test Driven Development.

I’m completely new to this method for development, and I gotta say, it’s quite nice.

Now, I’m not going to say that this is the be-all-end-all of development styles (I’m a functional, strictly-typed, compiler-checked code guy at heart), but I do find it quite interesting – and effective.

So why not do a write-up on it?

Test Framework

The prerequisite for doing anything in TDD is a good test framework. Luckily, ruby is pretty strong in this area. The way it works is the following:

You subclass Test::Unit::TestCase and define methods that start with test_ where you execute system logic and make assertions about certain results, and then you run that class.

The framework looks for those methods named as test_whatever and runs them “as tests”. Running a method as a test means that errors and failures (any assertion that doesn’t hold) will be logged and displayed at the end as part of the “test report”.
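To make that concrete, here is a minimal sketch of such a test class. The Cart class and its methods are hypothetical, invented purely for illustration; only Test::Unit::TestCase and assert_equal come from the framework.

```ruby
require 'test/unit'

# Hypothetical class under test, made up for this example.
class Cart
  def initialize
    @prices = []
  end

  # Returns self so calls can be chained.
  def add(price)
    @prices << price
    self
  end

  def total
    @prices.inject(0) { |sum, price| sum + price }
  end
end

class CartTest < Test::Unit::TestCase
  # Any method whose name starts with test_ is run as a test.
  def test_total_starts_at_zero
    assert_equal 0, Cart.new.total
  end

  def test_total_sums_added_prices
    cart = Cart.new.add(5).add(7)
    assert_equal 12, cart.total
  end
end
```

Running the file with ruby is enough: requiring test/unit arranges for the test classes to run when the script exits, and the report is printed at the end.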

All of these test classes can be run automatically by a build-bot and (depending on your test coverage) you get good visibility into what’s working and what’s not.

This is super convenient and empowering in its own right. In a dynamic language like ruby, tests are the only way you have any level of confidence that your most recent code change doesn’t blow up in production.

So now that you’ve got this ability to write and run tests against your code base, here’s a wacky idea, write the tests first.

Test Driven

It’s amazing what this approach does to the design process.

I’ve always been the type that just starts coding. I’m completely comfortable throwing out 6 hours worth of code and starting over. I know my “first draft” isn’t going to be right (though it will be useful). I whole-heartedly believe in refactorings, etc. But most importantly, I need to code to sketch things out. It’s how I’ve always worked.

TDD is sort of the same thing. You do a “rough sketch” of the functionality you’ll add simply by writing tests that enforce that functionality.

You think of this opaque object – a black box. You don’t know how it does what it does, but you’re trying to test it doing it.

This automatically gives you an end-user perspective. You now focus solely on the interface, the input and the output.

This is a wise position to design from.

You also tend to design small self-contained pieces of functionality. Methods that don’t care about state, return the same output for a given input, and generally do one simple thing. Of course, you do this because these are the easiest kind of methods to test.

So, out of sheer laziness, you design a cohesive, easy to use, and completely simple interface, an API.

Now you just have to “plumb it up”. Hack until the tests pass, and you’re done. That might be an over-simplification, but it’s not off by much…
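A rough sketch of that order of work, under the assumption of a hypothetical Slug module (invented for this example, not something from a real code base): the tests come first and only talk about input and output, and the implementation is whatever makes them pass.

```ruby
require 'test/unit'

# Step 1: the tests, written before any implementation exists.
# They describe the black box purely by input and output.
class SlugTest < Test::Unit::TestCase
  def test_lowercases_and_hyphenates
    assert_equal 'implicit-scope', Slug.for('Implicit Scope')
  end

  def test_drops_trailing_punctuation
    assert_equal 'developing-in-os-x', Slug.for('Developing In OS X!')
  end
end

# Step 2: the plumbing, hacked on until the tests pass.
# Slug is hypothetical; it keeps no state, so the same input
# always gives the same output.
module Slug
  def self.for(title)
    title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  end
end
```

Because Slug.for keeps no state, each test is a one-liner, which is exactly the pressure toward small, simple methods described above.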

Come to think of it, this is exactly the type of design Haskell favors. With gratuitous use of undefined, the super-high-level logic of a Haskell program can be written out with named functions to “do the heavy lifting”. If you make these functions simple enough and give them descriptive enough names, they practically write themselves.

So that’s TDD (at least my take on it). So far, I like it.