(james pozdena) Blog Tokyo Cabinet benchmarking

18 Feb 2009

Tonight I did a little more hacking with Tokyo Cabinet. For a current project I’m looking for a super fast, ultra light db that I can quickly dump CSVs into and then run some simple queries against to get data out, and I thought Tokyo Cabinet would be a great candidate. To be sure, I decided to pit Tokyo Cabinet against Rufus-Tokyo and DataMapper (using do_postgres).

The results were a little surprising. Tokyo Cabinet’s B-tree database engine is amazingly fast, eating 10,000 inserts in less than 0.15 seconds. The hash database engine (through Rufus-Tokyo) came in second, taking a little under half a second of user CPU time (0.57 seconds wall-clock) for the same 10,000 inserts. And coming in roughly 90 times slower than the B-tree and roughly 25 times slower than the hash, going by user CPU time, is DataMapper with a little over 11 seconds.

That’s all well and good, and I assumed I would get about the same results when I tried to pull data out, but Rufus and DataMapper switched places. Although Tokyo Cabinet’s B-tree database engine was still the fastest, its hash engine was now slower than DataMapper.

This came as quite a disappointment. Tokyo Cabinet is so easy to use, and I really loved how easily the databases could be created and destroyed. I can’t use the B-tree engine because it just won’t allow me to do the queries I want to do, and I can’t justify using a database that is 43 times slower than its competitor (by user CPU time). I’m hoping another API (maybe the dm-tokyo-cabinet-adapter) will be faster, but I think that’s enough benchmarking for one night.

The details are below:

Insert 10,000 records from CSV (using FasterCSV)

                      user       system     total      real
Tokyo: B-tree         0.120000   0.020000   0.140000   ( 0.142824)
Tokyo: Rufus          0.450000   0.110000   0.560000   ( 0.572258)
DataMapper: Postgres  11.230000  0.110000   12.210000  (21.524662)

Get 100 records

                      user       system     total      real
Tokyo: B-tree         0.000000   0.000000   0.000000   ( 0.001434)
Tokyo: Rufus          1.730000   0.010000   1.740000   ( 1.771523)
DataMapper: Postgres  0.040000   0.010000   0.050000   ( 0.268635)
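
The slowdown factors quoted earlier depend on which column of the tables you divide. As a rough check, here is a small sketch that recomputes them from the numbers above for both the user and real columns. The `slowdown` helper and the hash layout are just names made up for this illustration, not part of the benchmark script.

```ruby
# Timings copied from the two tables above, in seconds.
insert_times = {
  "Tokyo: B-tree"        => { :user => 0.12,  :real => 0.142824 },
  "Tokyo: Rufus"         => { :user => 0.45,  :real => 0.572258 },
  "DataMapper: Postgres" => { :user => 11.23, :real => 21.524662 },
}
get_times = {
  "Tokyo: B-tree"        => { :user => 0.0,  :real => 0.001434 },
  "Tokyo: Rufus"         => { :user => 1.73, :real => 1.771523 },
  "DataMapper: Postgres" => { :user => 0.04, :real => 0.268635 },
}

# How many times slower `slow` is than `fast` in the given column.
# Returns nil when the faster time was reported as zero, since the
# ratio is undefined in that case.
def slowdown(times, slow, fast, column)
  baseline = times[fast][column]
  return nil if baseline.zero?
  times[slow][column] / baseline
end

comparisons = [
  ["insert", insert_times, "DataMapper: Postgres", "Tokyo: B-tree"],
  ["insert", insert_times, "DataMapper: Postgres", "Tokyo: Rufus"],
  ["get",    get_times,    "Tokyo: Rufus",         "DataMapper: Postgres"],
]

comparisons.each do |label, times, slow, fast|
  ratios = [:user, :real].map do |column|
    ratio = slowdown(times, slow, fast, column)
    "#{column} #{ratio ? ratio.round(1) : 'n/a'}x"
  end
  puts "#{label}: #{slow} vs #{fast}: #{ratios.join(', ')}"
end
```

The user and real columns give quite different ratios, so it is worth stating which one a comparison is based on.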

Code used for benchmarking:

require 'rubygems'  
require 'fastercsv'  
  
require "tokyocabinet"  
require 'rufus/tokyo'  
require 'dm-core'  
  
require 'benchmark'  
  
include TokyoCabinet  
  
# setup for btree: start from an empty database file  
File.delete("btree.bdb") if File.exist?("btree.bdb")  
bdb = BDB::new  
bdb.open("btree.bdb", BDB::OWRITER | BDB::OCREAT)  
  
# setup for hash: start from an empty database file  
File.delete("hash.tdb") if File.exist?("hash.tdb")  
t = Rufus::Tokyo::Table.new('hash.tdb')  
  
# setup for DataMapper  
  
DataMapper.setup(:default, 'postgres://localhost/dm_core_test')  
  
class Cell  
  include DataMapper::Resource  
  
  property :id,         Serial  
  property :x_cord,     Integer  
  property :y_cord,     Integer  
  property :value,      String  
end  
  
Cell.auto_migrate!  
  
x = 0  
y = 0  
  
Benchmark.bm do |bench|  
  bench.report("Tokyo: B-tree") {  
    FasterCSV.foreach("big.csv") do |row|  
      y += 1  
      row.each do |cell|  
        x += 1  
        bdb.put("#{x},#{y}", "#{cell}")  
      end  
      x = 0  
    end  
    y = 0  
  }  
  bench.report("Tokyo: Rufus") {  
    FasterCSV.foreach("big.csv") do |row|  
      y += 1  
      row.each do |cell|  
        x += 1  
        # note: rows are keyed by the cell's value, so duplicate values in big.csv overwrite earlier rows  
        t[cell.to_s] = { 'x' => x.to_s, 'y' => y.to_s }  
      end  
      x = 0  
    end  
    y = 0  
  }  
  bench.report("DataMapper: Postgres") {  
    FasterCSV.foreach("big.csv") do |row|  
      y += 1  
      row.each do |cell|  
        x += 1  
        Cell.create( :x_cord => x, :y_cord => y, :value => cell.to_s)  
      end  
      x = 0  
    end  
    y = 0  
  }  
end  
  
Benchmark.bm do |bench|  
  bench.report("Tokyo: B-tree") {  
    100.times do |x|  
      bdb.get("#{x+1},#{x+1}")  
    end  
  }  
  bench.report("Tokyo: Rufus") {  
    100.times do |x|  
      t.query { |q|  
          q.add 'x', :equals, (x+1).to_s  
          q.add 'y', :equals, (x+1).to_s  
      }  
    end  
  }  
  bench.report("DataMapper: Postgres") {  
    100.times do |x|  
      Cell.first(:x_cord => (x+1), :y_cord => (x+1)).value  
    end  
  }  
end
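
One thing worth keeping in mind when reading the get numbers: the reports above do not all perform the same kind of lookup. The B-tree report fetches by key ("x,y"), while the Rufus report stores rows keyed by cell value and then filters on the 'x' and 'y' columns. The sketch below uses plain Ruby hashes (not Tokyo Cabinet) to show how far apart those two access patterns can be when a column filter has to walk the rows. That it walks the rows is an assumption here; I have not checked how Tokyo Cabinet actually executes a table query. The 100 x 100 grid, the cell values, and the helper names are all made up for illustration.

```ruby
require 'benchmark'

size = 100 # hypothetical grid: 100 x 100 = 10,000 cells

# Pattern 1: key the store by coordinates, like bdb.put("#{x},#{y}", cell).
by_coord = {}
# Pattern 2: key the store by cell value and keep the coordinates as
# columns, like t[cell] = { 'x' => ..., 'y' => ... }.
rows = {}

(1..size).each do |y|
  (1..size).each do |x|
    value = "cell-#{x}-#{y}" # made-up, unique cell values
    by_coord["#{x},#{y}"] = value
    rows[value] = { 'x' => x.to_s, 'y' => y.to_s }
  end
end

# Direct key lookup: a single hash probe.
def get_by_key(store, x, y)
  store["#{x},#{y}"]
end

# Column filter with no index: check rows one by one until one matches.
# Returns the key of the matching row, or nil if there is none.
def find_by_columns(rows, x, y)
  match = rows.find { |_key, cols| cols['x'] == x.to_s && cols['y'] == y.to_s }
  match && match.first
end

Benchmark.bm(16) do |bench|
  bench.report("key lookup") do
    100.times { |i| get_by_key(by_coord, i + 1, i + 1) }
  end
  bench.report("column filter") do
    100.times { |i| find_by_columns(rows, i + 1, i + 1) }
  end
end
```

The exact timings will vary by machine, and this says nothing definitive about Rufus-Tokyo itself; it only shows that a key fetch and a column filter are different amounts of work, so the get comparison is not quite apples to apples.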