Wikitext speed improvements

As twittered earlier, the already-fast Wikitext module has seen some pretty spectacular speed improvements lately.

Before:

                                user     system      total        real
short slab of ASCII text    2.010000   0.020000   2.030000 (  2.133733)
short slab of UTF-8 text    3.990000   0.040000   4.030000 (  4.174043)
longer slab of ASCII text  16.700000   0.120000  16.820000 ( 17.302634)
longer slab of UTF-8 text  50.010000   0.400000  50.410000 ( 54.708712)

[Clickety, click, click, click]:

$ git diff master --stat
 benchmarks/NOTES.txt          |   14 +
 benchmarks/parsing.rb         |  171 ++++++-
 benchmarks/profile_parsing.rb |  160 ++++++
 ext/ary.h                     |    4 -
 ext/parser.c                  | 1107 ++++++++++++++++++++++-------------------
 ext/str.c                     |   49 +--
 ext/str.h                     |   16 +-
 ext/token.h                   |    2 +-
 spec/external_link_spec.rb    |   17 +
 spec/internal_link_spec.rb    |    8 +-
 10 files changed, 969 insertions(+), 579 deletions(-)

After:

                                user     system      total        real
short slab of ASCII text    1.550000   0.010000   1.560000 (  1.572018)
short slab of UTF-8 text    2.310000   0.020000   2.330000 (  2.352641)
longer slab of ASCII text  13.780000   0.100000  13.880000 ( 14.034015)
longer slab of UTF-8 text  23.150000   0.130000  23.280000 ( 23.505007)
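
To put the two sets of numbers side by side, here is a small sketch that divides each "before" time by its "after" time. It assumes the figure in parentheses at the end of each row is elapsed wall-clock seconds, as in Ruby's standard Benchmark output; the constant names are made up for illustration.

```ruby
# Elapsed seconds (the parenthesized column), copied from the tables above.
BEFORE = {
  "short slab of ASCII text"  => 2.133733,
  "short slab of UTF-8 text"  => 4.174043,
  "longer slab of ASCII text" => 17.302634,
  "longer slab of UTF-8 text" => 54.708712
}

AFTER = {
  "short slab of ASCII text"  => 1.572018,
  "short slab of UTF-8 text"  => 2.352641,
  "longer slab of ASCII text" => 14.034015,
  "longer slab of UTF-8 text" => 23.505007
}

# Speedup factor = old elapsed time / new elapsed time.
SPEEDUPS = BEFORE.map { |label, seconds| [label, seconds / AFTER.fetch(label)] }.to_h

SPEEDUPS.each do |label, factor|
  puts format("%-26s %.2fx", label, factor)
end
```

Every case comes out faster, and the longer UTF-8 slab lands at roughly 2.3x.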

The most spectacular gains were seen for the "longer slab of UTF-8 text" case — 2 kilobytes of worst-case UTF-8 input like "ñ€w pärägräph wîthîñ blöckquöt€", translated 100,000 times — which more than doubled in speed.

That’s 196.5 megabytes of the worst possible wikitext markup translated in just 23.5 seconds, or over 8 megabytes per second on this old iMac. The worst case is now almost as fast as the best-case-scenario input (pure ASCII), which clocks in at about 9.6 megabytes per second.
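
For anyone who wants to check the arithmetic, a quick sketch of the throughput calculation follows. The 196.5 megabyte total and the 9.6 megabytes per second ASCII figure are taken as quoted above; the elapsed time is the "longer slab of UTF-8 text" row from the After table, and the names are invented for this example.

```ruby
# Figures quoted in the post.
TOTAL_MEGABYTES      = 196.5      # worst-case UTF-8 input, all iterations combined
UTF8_ELAPSED_SECONDS = 23.505007  # elapsed time, longer slab of UTF-8 text (After)
ASCII_MB_PER_SECOND  = 9.6        # best-case rate as quoted, not recomputed here

# Throughput is simply data volume divided by elapsed time.
def megabytes_per_second(megabytes, seconds)
  megabytes / seconds
end

UTF8_MB_PER_SECOND = megabytes_per_second(TOTAL_MEGABYTES, UTF8_ELAPSED_SECONDS)

# How close the worst case now gets to the best case, as a fraction.
WORST_TO_BEST_RATIO = UTF8_MB_PER_SECOND / ASCII_MB_PER_SECOND

puts format("UTF-8 throughput: %.2f MB/s", UTF8_MB_PER_SECOND)
puts format("fraction of the ASCII rate: %.2f", WORST_TO_BEST_RATIO)
```

The UTF-8 rate works out to a bit under 8.4 megabytes per second, a little under nine tenths of the quoted ASCII rate.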