Rodrigo Flores's Corner: Code, Cats, Books, Coffee

Book Review: JavaScript: The Good Parts

After reading Douglas Crockford's chapter in Coders at Work (and his criticism of JavaScript), and after several friends recommended it, I got eager to read his book about JavaScript: JavaScript: The Good Parts. So I spent the last Carnival holiday reading it.

I think the main purpose of this book is not to go really deep into all JavaScript features, but to give an overview of the good features of the language (and also the most common pitfalls). It could also serve as an introduction to JavaScript for someone who knows their way around programming (i.e. knows what a for, a while and a switch mean), as it briefly covers JavaScript syntax (but does not go deep into it). It also covers arrays, objects, regular expressions and how to deal with exceptions.

Object-oriented JavaScript

As JavaScript is normally used for small page manipulations (like filling a cities select input through AJAX after selecting a state), it can be written as a procedural script. But as things evolve, more and more code like this ends up being added, and like a snowball your JavaScript code ends up becoming a mess, and every time you have to touch it you start considering a rewrite.

As an object-oriented language, JavaScript makes it possible to split responsibilities into objects and to use important concepts from the object-oriented world, like encapsulation and inheritance. The book also gives a good overview of how prototypes work and how to use them to create objects based on other objects.
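
A minimal sketch of creating an object based on another object via its prototype (the object names here are my own illustration, not examples from the book):

```javascript
// A plain object acting as a prototype.
var cat = {
  name: "generic cat",
  says: function () {
    return this.name + " says meow";
  }
};

// Create a new object whose prototype is `cat`: properties not
// found on `felix` are looked up on `cat` at runtime.
var felix = Object.create(cat);
felix.name = "Felix"; // shadows the prototype's `name`

felix.says();             // "Felix says meow" (method found on the prototype)
cat.isPrototypeOf(felix); // true
```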

Functions as objects

One of the greatest features of JavaScript is that functions are objects (just like a number, a string or an array), so it is possible to store functions in a variable, pass them as arguments or make functions return functions. If you have ever listened to events, you probably did this by passing a function to be executed when an event is triggered.
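
A quick sketch of what "functions are values" buys you (the names here are illustrative):

```javascript
// Store a function in a variable.
var double = function (n) { return n * 2; };

// Pass a function as an argument (here, to Array.prototype.map).
var doubled = [1, 2, 3].map(double); // [2, 4, 6]

// Return a function from a function.
function multiplier(factor) {
  return function (n) { return n * factor; };
}
var triple = multiplier(3);
triple(5); // 15
```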

The book explains the benefits of this property: cascading, currying, how to create a memoization mechanism, and callbacks. It also explains important concepts about functions: dealing with an arbitrary number of arguments, closures, and how scopes work in JS.
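
A memoization mechanism like the one the book builds can be sketched as a wrapper that caches results (this `memoize` helper is my own simplified version, not the book's exact code):

```javascript
// Wrap a one-argument function so repeated calls with the same
// argument reuse the cached result instead of recomputing it.
function memoize(fn) {
  var cache = {};
  return function (arg) {
    if (!(arg in cache)) {
      cache[arg] = fn(arg);
    }
    return cache[arg];
  };
}

var calls = 0;
var square = memoize(function (n) {
  calls += 1;
  return n * n;
});

square(4); // computed: 16
square(4); // cached: 16, and `calls` is still 1
```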

Language downsides and JSLint

These chapters are in the appendices, but I think they deserve a mention, as they explain why caring about good code in JavaScript is important (and how JSLint can help you with this).

JavaScript has a lot of pitfalls: a naive a = 1 expression, when a was not declared, turns a into a global variable; the typeof behaviour for NaN is strange (who would guess that typeof NaN is "number"?); weak typing misleads ('1' + 1 being '11'); and the evil == does type coercion on both sides of the expression, ending up doing nasty things like evaluating both '0' == false and 0 == false to true. The book goes through all of these and explains the problems with them.
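
These pitfalls are easy to demonstrate in a console:

```javascript
// typeof NaN is the string "number", oddly enough.
typeof NaN === "number"; // true

// Weak typing: + concatenates when either operand is a string.
'1' + 1; // '11'

// == coerces both sides, with surprising results.
0 == false;   // true
'0' == false; // true
'0' === 0;    // false: === does no coercion, which is why the book prefers it

// An assignment to an undeclared name creates a global variable.
function leak() { a = 1; } // calling leak() would define a global `a`
```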

Fortunately, there is a tool (created by the author) to catch these problems before your code goes to production: JSLint. You can use it to check for some traps in your JavaScript code (and you can also specify when you are doing some things on purpose, so JSLint will not complain about them). The book has an appendix that explains JSLint's options.

API Documentation

One part of this book that I think could be improved is the API documentation, a big method overview. I think book pages are not the right place for API documentation: it is really boring to read, and although you may learn a thing or two, you end up forgetting most of it a few days later.

There is a chapter that goes through a small set of methods on all the standard types: Strings, Arrays, Numbers, Regexps, etc. It is really well written and contains good examples, but I don't think a book about the language's good parts should do a method overview.

Conclusion

This book is a must-read if you touch any JavaScript code: it is normally easy to write JS code, but difficult to write good-quality JavaScript, due to its innumerable traps and to object-oriented and functional features that are unknown to a large part of the developer community. Even if you consider yourself fluent in JavaScript and write it for a living, you will surely learn a new trick or something new about the language by reading this book.

The author wrote an article about how JavaScript is misunderstood, which is also worth reading if you are not a big JS fan.

Letting Git Bisect Help You

Git has a wealth of tools that can improve our workflow. One of them is bisect, which is not as well known since it doesn't get used much, but it can prove very useful when you need to find the exact point where a good branch became a bad one. Which commit broke something? That's where bisect comes into play.

Bisect is based on the binary search algorithm. Given a sorted array of elements, binary search returns the index where an element can be found (or reports that it doesn't exist in the collection), relying on the following invariant: once you have compared an element, the outcome of the comparison lets you discard all the elements that come before or after it.

If you’ve ever been to a library, you may have noticed that books are sorted by author within a particular section. Say you’re looking for a book whose author’s last name is Martin, and you have just found out the author of the last book on a given shelf starts with an F, you can then be sure the right shelf is somewhere after this one. By following this method, you can narrow your search until you find the book you’re looking for.

Binary search works the same way. If you're searching for an element x in an array with n elements, you pick the element at position n/2 and compare it with x. If x is greater, you repeat the search on the sub-array starting at (n/2)+1 and ending at n; if x is smaller, on the sub-array starting at 1 and ending at n/2, and you proceed recursively. There are interactive demonstrations of this algorithm online.
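
The description above translates almost directly into code (a sketch using 0-based indices instead of the 1-based ones in the text):

```javascript
// Classic binary search: returns the index of `x` in the sorted
// array `xs`, or -1 if `x` is not present.
function binarySearch(xs, x) {
  var lo = 0;
  var hi = xs.length - 1;
  while (lo <= hi) {
    var mid = Math.floor((lo + hi) / 2);
    if (xs[mid] === x) return mid;
    if (xs[mid] < x) lo = mid + 1; // discard the left half
    else hi = mid - 1;             // discard the right half
  }
  return -1;
}

binarySearch([2, 3, 5, 7, 11, 13], 11); // 4
binarySearch([2, 3, 5, 7, 11, 13], 4);  // -1
```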

As for the number of comparisons, it is on average far smaller than in the naive search algorithm (which compares every element in the array), making binary search very fast for finding elements in a sorted collection.

Bisect

Bisect uses binary search to find the commit (i.e. the diff) that introduced a particular change to your branch.

To be able to know where a bug appeared, first you need a way to detect it. You can write a test for it – or, if that’s not possible, try to have at least a list of steps that reproduce it.

With this test in hand, you can start bisecting your code. Remember you must have two commits: one where you’re sure the bug exists and one where you’re sure that the bug does not exist. You can use hashes and tags to point to these commits. When you bisect you mark a commit as good (the test passes) or bad (the test fails).

In the diagram above (green points are good commits, red ones are bad), you can see the steps that bisect performs: it jumps to the middle of the commits after the current one (when it is good) or to the middle of the commits before it (when it is bad), stopping when it finds the first bad commit (in our case, commit number 6) after 3 steps.

I have generated a git repository with 1024 commits, each appending an integer (from 1 to 1024) to a text file, one integer per line. Our mission is to find which commit introduced the integer 1013 (suppose having this number in the file is a bug).

First, we find a way to quickly determine whether this number is present. Let’s call this the test. Doing this is very simple:

$ grep 1013 file.txt

With this command we can start doing our bisect:

$ git bisect start

Running the test on master returns the undesired result, so we will mark this commit as bad.

$ git bisect bad

Now, let's point out a good commit: suppose the first commit (7c0dcfa) is free of this bug. We can either check out that commit and mark it as good, or run git bisect good with the commit hash.

$ git bisect good 7c0dcfa
Bisecting: 511 revisions left to test after this (roughly 9 steps)
[8950f7db7e7cad0b2dc394ff9b75fc3d38c9d72a] added 512

A short way to roll those three commands into one is this:

$ git bisect start master 7c0dcfa

The start command accepts the bad and the good commits (in this order) as arguments.

Nice, git bisect has put us in the middle of the collection. Running the test here doesn’t reveal anything, so we mark it as good.

$ git bisect good
Bisecting: 255 revisions left to test after this (roughly 8 steps)
[a01ba83f3500b48da97c5f5c33052623aaa4161a] added 768

Again, the test returns the desired state. Good it is.

$ git bisect good
Bisecting: 127 revisions left to test after this (roughly 7 steps)
[4a4a668bf3363d09af5fd1906bc4272aacdb4495] added 896

Good again.

$ git bisect good
Bisecting: 63 revisions left to test after this (roughly 6 steps)
[9059c5b8b898159e8d1d797bff3b1febd1fd6a1c] added 960

Take a look at the message: apart from the current commit and how many revisions lie between it and the first bad one, it tells you roughly how many more comparisons to expect: 6. As the test is good again, we mark this commit as good.

$ git bisect good
Bisecting: 31 revisions left to test after this (roughly 5 steps)
[0c844d0b33ef297b742206ebc293f4925705b083] added 992

Again, we’re in a good state.

$ git bisect good
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[0ee17eb17bd96b321a01c73eb13a8929a68b1239] added 1008

4 steps remaining, and we’re in a good state.

$ git bisect good
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[dfb1e71736dcfffa2a30aecd7299f45f757c057e] added 1016

But now, running the test shows us the non-desired result. So, let’s mark it as bad.

$ git bisect bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[6e6d08c374df5162fed65fed82859b69f86b936e] added 1012

We’re near the end and we have a desired result, so let’s mark it as good.

$ git bisect good
Bisecting: 1 revision left to test after this (roughly 1 step)
[1d23b7045a8accd254efa859d7fc66f1f58a59f0] added 1014

Almost at the end, we have a bad commit.

$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[458eab0eb8d808e16d98ec7039a7c53855dd9ed6] added 1013

Last step and bad it is.

$ git bisect bad
458eab0eb8d808e16d98ec7039a7c53855dd9ed6 is the first bad commit
commit 458eab0eb8d808e16d98ec7039a7c53855dd9ed6
Author: Rodrigo Flores <mail@rodrigoflores.org>
Date:   Tue Oct 21 22:31:05 2014 -0200

    added 1013

:100644 100644 7bc3db7f48a43ccf1a8cc7c26146912cc88c1009 b393a2138a96c1530f41f701ab43cca893226976 M  file.txt

So, now we finally have the commit that introduced the number 1013, as we wanted. You can review the steps taken by using git bisect log.

git bisect start
# bad: [740cdf012013dc41a39b41d4b09b57a970bfe38f] added 1024
git bisect bad 740cdf012013dc41a39b41d4b09b57a970bfe38f
# good: [7c0dcfa7514379151e0d83ffbf805850d2093538] added 1
git bisect good 7c0dcfa7514379151e0d83ffbf805850d2093538
# good: [8950f7db7e7cad0b2dc394ff9b75fc3d38c9d72a] added 512
git bisect good 8950f7db7e7cad0b2dc394ff9b75fc3d38c9d72a
# good: [a01ba83f3500b48da97c5f5c33052623aaa4161a] added 768
git bisect good a01ba83f3500b48da97c5f5c33052623aaa4161a
# good: [4a4a668bf3363d09af5fd1906bc4272aacdb4495] added 896
git bisect good 4a4a668bf3363d09af5fd1906bc4272aacdb4495
# good: [9059c5b8b898159e8d1d797bff3b1febd1fd6a1c] added 960
git bisect good 9059c5b8b898159e8d1d797bff3b1febd1fd6a1c
# good: [0c844d0b33ef297b742206ebc293f4925705b083] added 992
git bisect good 0c844d0b33ef297b742206ebc293f4925705b083
# good: [0ee17eb17bd96b321a01c73eb13a8929a68b1239] added 1008
git bisect good 0ee17eb17bd96b321a01c73eb13a8929a68b1239
# bad: [dfb1e71736dcfffa2a30aecd7299f45f757c057e] added 1016
git bisect bad dfb1e71736dcfffa2a30aecd7299f45f757c057e
# good: [6e6d08c374df5162fed65fed82859b69f86b936e] added 1012
git bisect good 6e6d08c374df5162fed65fed82859b69f86b936e
# bad: [1d23b7045a8accd254efa859d7fc66f1f58a59f0] added 1014
git bisect bad 1d23b7045a8accd254efa859d7fc66f1f58a59f0
# bad: [458eab0eb8d808e16d98ec7039a7c53855dd9ed6] added 1013
git bisect bad 458eab0eb8d808e16d98ec7039a7c53855dd9ed6
# first bad commit: [458eab0eb8d808e16d98ec7039a7c53855dd9ed6] added 1013

In this example we have 1024 commits, and finding the culprit took 10 steps. If we had twice as many commits, it would take us only one more step to find the desired commit, as the worst-case efficiency is O(log n).

Even so, it is a tedious task to run the same command 10 times. So, let’s think about a way to automate it.

Automating

Git bisect can use a script's exit code to determine whether a revision is good or bad: 0 means good; non-zero means bad. Luckily, grep exits with 0 if it finds a line matching the desired pattern and 1 if it doesn't. You can check this by running echo $? after running the test.

This is the opposite of what bisect expects (the bug being present means bad), so we need a shell script that negates the result (I'm not used to writing shell scripts, so please let me know if there's a cleaner way to do this).

#!/bin/sh
# Exit 1 (bad) when the bug marker is present, 0 (good) otherwise.
if grep -q 1013 file.txt; then
  exit 1
else
  exit 0
fi

Let's save it to a file called test.sh and make it executable (chmod +x test.sh).

Then, after marking the good and the bad commits, you just need to run:

git bisect run ./test.sh
running ./test.sh
Bisecting: 255 revisions left to test after this (roughly 8 steps)
[a01ba83f3500b48da97c5f5c33052623aaa4161a] added 768
running ./test.sh
Bisecting: 127 revisions left to test after this (roughly 7 steps)
[4a4a668bf3363d09af5fd1906bc4272aacdb4495] added 896
running ./test.sh
Bisecting: 63 revisions left to test after this (roughly 6 steps)
[9059c5b8b898159e8d1d797bff3b1febd1fd6a1c] added 960
running ./test.sh
Bisecting: 31 revisions left to test after this (roughly 5 steps)
[0c844d0b33ef297b742206ebc293f4925705b083] added 992
running ./test.sh
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[0ee17eb17bd96b321a01c73eb13a8929a68b1239] added 1008
running ./test.sh
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[dfb1e71736dcfffa2a30aecd7299f45f757c057e] added 1016
running ./test.sh
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[6e6d08c374df5162fed65fed82859b69f86b936e] added 1012
running ./test.sh
Bisecting: 1 revision left to test after this (roughly 1 step)
[1d23b7045a8accd254efa859d7fc66f1f58a59f0] added 1014
running ./test.sh
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[458eab0eb8d808e16d98ec7039a7c53855dd9ed6] added 1013
running ./test.sh
458eab0eb8d808e16d98ec7039a7c53855dd9ed6 is the first bad commit
commit 458eab0eb8d808e16d98ec7039a7c53855dd9ed6
Author: Rodrigo Flores <mail@rodrigoflores.org>
Date:   Tue Oct 21 22:31:05 2014 -0200

    added 1013

:100644 100644 7bc3db7f48a43ccf1a8cc7c26146912cc88c1009 b393a2138a96c1530f41f701ab43cca893226976 M  file.txt
bisect run success

And then, you’ve found your broken commit without hassle.

Conclusion

You probably won't use it every day. Maybe once a week, or even once a month. But git bisect can give you a hand in finding a broken commit. Keeping the practice of making small commits, while not required, helps this process a lot: finding the culprit commit won't be of much help if you're forced to dig through a gigantic diff. So having small commits may help you in the long run, even if that means having many commits.

What about you? Do you use bisect frequently?

Kudos to my friends @mdepolli and @p_balduino for the kind reviews!

Ruby Patterns: Webservice object

This is a series about Ruby Patterns, explaining some common uses of Ruby syntax. The second post is about a webservice-backed object. I like to call it a pattern because it is very common and tends to repeat (in a non-duplicated way) in applications based on a Service Oriented Architecture. Of course, this code may be too sophisticated for a script as small as this one, but it can be a good way to handle things in a more complex application.

So, you're given the task of writing a class that accesses a webservice and returns its info (e.g. the GitHub repos for a given organization). A simplistic implementation could look like this:

require 'faraday'
require 'json'

def retrieve_repos_for(org)
  connection = Faraday.new(:url => 'https://api.github.com') do |faraday|
    faraday.adapter  Faraday.default_adapter
  end

  response = JSON.parse(connection.get("/orgs/#{org}/repos").body)
  response
end

retrieve_repos_for('github').each do |repo|
  puts repo['clone_url']
end

Obviously, this is an example of a procedural implementation, so let's make it more object-oriented.

require 'faraday'
require 'json'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      connection = Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
      end

      response = JSON.parse(connection.get("/orgs/#{@organization}/repos").body)
      response
    end
  end
end

Github::Organization.new('github').repos.each do |repo|
  puts repo['clone_url']
end

Nice, we now have it inside a class. But we can extract some private methods here.

require 'faraday'
require 'json'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      response = JSON.parse(connection.get(repos_url).body)
      response
    end

    private

    def connection
      Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
      end
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end
  end
end

Github::Organization.new('github').repos.each do |repo|
  puts repo['clone_url']
end

Well, the public methods seem more concise now, and we have extracted some methods that can be reused more easily. But there is a flaw: if we call the repos method twice, it will make two requests. This is easy to solve: just add some memoization.

require 'faraday'
require 'json'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      @repos ||= JSON.parse(connection.get(repos_url).body)
    end

    private

    def connection
      Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
      end
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end
  end
end

Github::Organization.new('github').repos.each do |repo|
  puts repo['clone_url']
end

We're almost done here. I'm still not satisfied with JSON.parse(connection.get(repos_url).body); it seems like a complex line. Let's extract some methods here.

require 'faraday'
require 'json'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      @repos ||= get(repos_url)
    end

    private

    def connection
      Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
      end
    end

    def get(url)
      JSON.parse(connection.get(url).body)
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end
  end
end

Github::Organization.new('github').repos.each do |repo|
  puts repo['clone_url']
end

The repos method seems simple enough now, and we have moved the parsing responsibility to the get method. But we can get rid of that by delegating the parsing to someone else. There is a great gem called faraday-middleware that parses the body for us, based on the content type header, and returns a hash, so let's use it.

require 'faraday'
require 'faraday_middleware'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      @repos ||= get(repos_url)
    end

    private

    def connection
      @connection ||= Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
        faraday.response :json, :content_type => /\bjson$/
      end
    end

    def get(url)
      connection.get(url).body
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end
  end
end

I've also added memoization to the connection (we don't need to instantiate a new one every time).

Two days later, a new requirement arrives: also fetch the organization info from the API. This implementation makes it really easy:

require 'faraday'
require 'faraday_middleware'

module Github
  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      @repos ||= get(repos_url)
    end

    def info
      @info ||= get(info_url)
    end

    private

    def connection
      @connection ||= Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
        faraday.response :json, :content_type => /\bjson$/
      end
    end

    def get(url)
      connection.get(url).body
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end

    def info_url
      "/orgs/#{@organization}"
    end
  end
end

org = Github::Organization.new('github')

puts org.info['name']
org.repos.each do |repo|
  puts repo['clone_url']
end

Neat! It is indeed really easy to add support for new endpoints to our class. But I think it has too much responsibility: it is also dealing with the connection to the API. Let's extract a new class that does that and refer to it in a client method.

require 'faraday'
require 'faraday_middleware'

module Github
  class Client
    def initialize
      @connection = Faraday.new(:url => 'https://api.github.com') do |faraday|
        faraday.adapter  Faraday.default_adapter
        faraday.response :json, :content_type => /\bjson$/
      end
    end

    def get(url)
      @connection.get(url).body
    end
  end

  class Organization
    def initialize(organization)
      @organization = organization
    end

    def repos
      @repos ||= client.get(repos_url)
    end

    def info
      @info ||= client.get(info_url)
    end

    private

    def client
      @client ||= Github::Client.new
    end

    def repos_url
      "/orgs/#{@organization}/repos"
    end

    def info_url
      "/orgs/#{@organization}"
    end
  end
end

org = Github::Organization.new('github')

puts org.info['name']
org.repos.each do |repo|
  puts repo['clone_url']
end

Now we have a pretty simple class, which I finally consider a final implementation: it moves the parsing responsibility to another place, and now I only have to specify the endpoints and get (or post/put/patch/delete) them. A further improvement might be to add a condition to do something when an endpoint returns a 404.

What about you? Would you recommend another improvement? Do you use something similar?