Ruby vs. Python Date Parsing, Am I Doing Something Wrong?

Why is DateTime.strptime so slow?

require 'date'

# Parse the same Apache-style log timestamp 10,000 times.
10_000.times do
  DateTime.strptime("27/Nov/2007:15:01:43 -0800", "%d/%b/%Y:%H:%M:%S %z")
end

Timing this on a P4 2.66GHz CPU produces:

garry@ubuntu:~/p/script$ ruby -v
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-linux]
garry@ubuntu:~/p/script$ time ruby test.rb 

real    0m9.719s
user    0m7.520s
sys     0m0.020s
garry@ubuntu:~/p/script$ time ruby test.rb 

real    0m9.130s
user    0m7.520s
sys     0m0.000s
garry@ubuntu:~/p/script$ time ruby test.rb 

real    0m8.980s
user    0m7.480s
sys     0m0.010s
garry@ubuntu:~/p/script$

9.276s on average
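
The same measurement can also be taken from inside Ruby, which leaves interpreter startup out of the number. This is only a sketch: `time_strptime`, `LOG_LINE` and `LOG_FORMAT` are names made up for illustration, and it assumes the standard `benchmark` library is available.

```ruby
require 'benchmark'
require 'date'

# Hypothetical constants holding the timestamp and format from the test above.
LOG_LINE   = "27/Nov/2007:15:01:43 -0800"
LOG_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Returns the wall-clock seconds spent parsing LOG_LINE `iterations` times.
def time_strptime(iterations)
  Benchmark.realtime do
    iterations.times { DateTime.strptime(LOG_LINE, LOG_FORMAT) }
  end
end

elapsed = time_strptime(10_000)
puts format("10000 parses took %.3fs", elapsed)
```

The figure will not match the `time ruby test.rb` output exactly, since `time` also counts loading the interpreter and the date library.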

Same thing in Python:

import time

for i in range(10000):
  time.strptime("27/Nov/2007:15:01:43", "%d/%b/%Y:%H:%M:%S")

Note: %z and “-0800” are left off the end because Python throws an error, something about “z” not being supported on all platforms. I’m not sure whether this affects the performance much.

Anyway, that code produces:

garry@ubuntu:~/p/script$ python
Python 2.4.3 (#2, Oct  6 2006, 07:52:30) 
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
garry@ubuntu:~/p/script$ time python test.py

real    0m0.751s
user    0m0.610s
sys     0m0.020s
garry@ubuntu:~/p/script$ time python test.py

real    0m0.737s
user    0m0.610s
sys     0m0.000s
garry@ubuntu:~/p/script$ time python test.py

real    0m0.727s
user    0m0.620s
sys     0m0.000s
garry@ubuntu:~/p/script$

0.738s on average

The Python version is about 12.6x faster (9.276s vs. 0.738s). Is there a different method I should be using to parse dates in Ruby? Time.parse() will not parse the given string (“27/Nov/2007:15:01:43 -0800”).
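
One alternative I can think of is skipping the general-purpose parser and pulling the fields out with a regular expression, since the log format is fixed. Below is a rough sketch of that idea. `parse_log_time`, `MONTHS` and `LOG_TIME` are names invented here, it only handles this one format, and I have not measured whether it is actually faster.

```ruby
# English month abbreviations in calendar order; index + 1 is the month number.
MONTHS = %w[Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec].freeze

# Matches timestamps shaped like "27/Nov/2007:15:01:43 -0800".
LOG_TIME = %r{\A(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\z}

# Hypothetical helper: parses a fixed-format log timestamp into a UTC Time.
def parse_log_time(str)
  m = LOG_TIME.match(str) or raise ArgumentError, "unrecognized timestamp: #{str}"
  day, mon, year, hour, min, sec, sign, off_h, off_m = m.captures
  month = MONTHS.index(mon) or raise ArgumentError, "unknown month: #{mon}"

  # Offset from UTC in seconds; negative for zones west of Greenwich.
  offset = (off_h.to_i * 3600 + off_m.to_i * 60) * (sign == "-" ? -1 : 1)

  # Treat the wall-clock fields as UTC, then subtract the offset to get real UTC.
  Time.utc(year.to_i, month + 1, day.to_i, hour.to_i, min.to_i, sec.to_i) - offset
end
```

With the test string, `parse_log_time("27/Nov/2007:15:01:43 -0800")` gives 23:01:43 UTC on 27 Nov 2007. The trade-off is that it returns a Time rather than a DateTime and raises on anything that does not match the pattern exactly.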

Charlie Savage does some analysis on Time and DateTime here.

Looking at date.rb and date/format.rb a bit, it seems strptime is implemented all in Ruby. Perhaps the Python version hands it off to libc?

I asked about this in #ruby, but no one seemed to be around. So I open the question to you guys… :)
