Here is the data for the compressed histogram hash program for strings. Source code for this program can be found here:
http://mcky.net/bin/hsh/hshs.tar.gz
http://mcky.net/bin/hsh/hshs.zip
Here is the annotated program output for 10 million strings:
strings | 10000000 | Create 10 million 4 to 24 character strings |
characters | 154993373 | About 155 million characters |
count sort time | 5.7200 | Count sort time in seconds |
count sort | ok | Count sort output checked |
unique strings | 9999872 | Removed duplicate strings |
characters in unique strings | 144991653 | About 145 million characters |
data size | 184991141 | Unique string characters plus size of indices in bytes |
hash time | 4.1100 | Time to create hash in seconds |
hash bytes | 147658209 | Size of the hash in bytes |
compression | 0.798191 | Hash size divided by data size |
hash check | ok | More than 19 million accesses |
hash check time | 4.8800 | Hash check time in seconds |
quick sort time | 11.1300 | Quick sort time in seconds |
quick sort | ok | Quick sort results checked |
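The "data size" and "compression" rows follow from the other figures in the output. The sketch below recomputes them in Python. It is not taken from the program's source: the function names are hypothetical, and the 4-byte index per unique string is an assumption inferred from the numbers, not something the output states.

```python
# Figures copied from the 10 million string run above.
unique_strings = 9_999_872
unique_chars = 144_991_653
hash_bytes = 147_658_209

# Assumption: one 4-byte index per unique string. The output only says
# "size of indices in bytes"; 4 is inferred because it reproduces the total.
INDEX_BYTES = 4


def data_size(chars: int, strings: int, index_bytes: int = INDEX_BYTES) -> int:
    """Unique string characters plus the size of the indices, in bytes."""
    return chars + index_bytes * strings


def compression(hash_size: int, data_bytes: int) -> float:
    """Hash size divided by data size."""
    return hash_size / data_bytes


size = data_size(unique_chars, unique_strings)
ratio = compression(hash_bytes, size)

print(size)            # 184991141, matching the "data size" row
print(f"{ratio:.6f}")  # compare with the "compression" row
```

A ratio below 1 means the hash occupies less space than the unique strings and their indices stored directly.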
Here is the collated output of several runs of the string hash program:
String hash results
strings | 100000 | 200000 | 500000 | 1000000 | 2000000 | 5000000 | 10000000 |
characters | 1550458 | 3099449 | 7750695 | 15507989 | 31008965 | 77510230 | 154993373 |
count sort time | 0.0300 | 0.0600 | 0.1600 | 0.3600 | 0.8200 | 2.6600 | 5.9200 |
count sort | ok | ok | ok | ok | ok | ok | ok |
unique strings | 99840 | 199936 | 499968 | 999936 | 1999872 | 4999936 | 9999872 |
characters in unique strings | 1448086 | 2898463 | 7250167 | 14507026 | 29007083 | 72509340 | 144991653 |
data size | 1847446 | 3698207 | 9250039 | 18506770 | 37006571 | 92509084 | 184991141 |
hash time | 0.0200 | 0.0400 | 0.1400 | 0.3300 | 0.7100 | 1.9700 | 4.1700 |
hash bytes | 1584205 | 3141180 | 7792923 | 15549750 | 31049743 | 73098644 | 147658209 |
compression | 0.857511 | 0.849379 | 0.842475 | 0.840220 | 0.839033 | 0.790178 | 0.798191 |
hash check | ok | ok | ok | ok | ok | ok | ok |
hash check time | 0.0500 | 0.0800 | 0.2900 | 0.6600 | 1.3100 | 2.3500 | 4.9000 |
quick sort time | 0.0500 | 0.1200 | 0.3400 | 0.7700 | 1.7300 | 5.0000 | 11.2400 |
quick sort | ok | ok | ok | ok | ok | ok | ok |
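The collated table can be cross-checked in the same way. The following sketch, with hypothetical variable names, recomputes the compression row as hash bytes divided by data size for each column and also reports how much longer quick sort took than count sort in each run. It only rearranges the figures above and makes no claim about how the program itself computes them.

```python
# Columns copied from the collated table above (100,000 to 10,000,000 strings).
strings = [100000, 200000, 500000, 1000000, 2000000, 5000000, 10000000]
data_size = [1847446, 3698207, 9250039, 18506770, 37006571, 92509084, 184991141]
hash_bytes = [1584205, 3141180, 7792923, 15549750, 31049743, 73098644, 147658209]
reported = [0.857511, 0.849379, 0.842475, 0.840220, 0.839033, 0.790178, 0.798191]
count_sort_s = [0.03, 0.06, 0.16, 0.36, 0.82, 2.66, 5.92]
quick_sort_s = [0.05, 0.12, 0.34, 0.77, 1.73, 5.00, 11.24]

# Compression is hash size divided by data size, column by column.
ratios = [h / d for h, d in zip(hash_bytes, data_size)]

# Quick sort time relative to count sort time for the same input.
slowdowns = [q / c for q, c in zip(quick_sort_s, count_sort_s)]

for n, ratio, rep, slow in zip(strings, ratios, reported, slowdowns):
    agrees = abs(ratio - rep) < 1e-6
    print(f"{n:>9}  compression={ratio:.6f}  agrees={agrees}  quick/count={slow:.2f}")
```

The smallest runs take only a few hundredths of a second, so their timing ratios are coarse given the two-decimal resolution of the reported times.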