He posted the project 36 days ago, 35 days ago, and today. The commit history appears to show that he has been working on it, but that's no excuse for this sort of re-posting.
The guy has a pattern of not just posting his own projects but contributing nothing meaningful to discussion. Either he makes a sales pitch when introducing the project or he thanks people for their feedback and fails to engage in any way with substantive feedback or criticism.
I'm not a moderator (thank God) but it's patently clear to me that there is an etiquette, essential for any productive conversation, that this guy isn't following and that his submissions should be ignored.
The fact that even in the test example it took ripgrep 0.115 s with 97% CPU usage vs Krep's 0.106 s with 328% CPU usage is a testament of why ripgrep is still preferable. Such a huge tradeoff in resource consumption for such a small speed gain is hard to justify
There's a few issues with the Horspool implementation.
To obtain non case sensitive searching, you set the shift in the bad char table for both the upper and lower case variants. That's fine. But when you come to do the search, you ignore this and only look up the lower case version if it's non case sensitive. This is unnecessary - the valid shifts for both cases are already in the table, since you put them there. So you can avoid doing a lookup on the lower table.
I think you also only ever shift by one if you get a full match, but this is also unnecessary. You can shift by whatever the last char bad char shift is and you won't miss any patterns, even if they can overlap.
Finally, you have a test to see if the shift is zero, and to set it to one if it is. You should never get a shift of zero from the bad char table (you don't process the last char of the pattern when creating the bad char table, so there is never a distance of zero from the end).
1. Always use the raw bad_char_table[text[i + pattern_len‑1]] shift—no extra lowercase lookup at runtime.
2. After a full match, advance by that same table shift instead of just doing i++, so you never miss overlapping occurrences.
3. Removed the “if shift==0 then shift=1” guard—since we build the table excluding the last pattern byte, zero shifts can’t occur.
Looks better! A couple of other minor comments, since I'm looking at it:
1. You don't need to assign the last char of the pattern inside the search loop every time around the loop. It will never change (line 1143). Just do it once before the search loop starts.
2. tc_last is the text aligned with the end of the pattern window (line 1142). You don't need to get it again when getting the shift (lines 1201, 1208). bad char is the last char which you already have.
I've found a lot of real world performance comes from doing as little as possible inside the main search loop.
UPDATE: It's often the case that a simpler algorithm outperforms a theoretically better one. For example, you are using Boyer Moore Horspool, which is the simpler cousin of the original Boyer Moore. BM can get better shifts than Horspool, but it's often slower in practice.
An implication of doing as little as possible in the loop is you could boost performance by creating variants of the search algorithm for different types of search. Then you won't incur the penalty or constantly testing "if it's this kind of search do this else do that" inside the search loop.
Posted here about a month ago: https://news.ycombinator.com/item?id=43333946
With this comment showing that search was inconsistent and gave incorrect results: https://news.ycombinator.com/item?id=43335277
why the repost?
He posted the project 36 days ago, 35 days ago, and today. The commit history appears to show that he has been working on it, but that's no excuse for this sort of re-posting.
The guy has a pattern of not just posting his own projects but contributing nothing meaningful to discussion. Either he makes a sales pitch when introducing the project or he thanks people for their feedback and fails to engage in any way with substantive feedback or criticism.
Last time he posted this project he couldn't even be bothered to reply to a very generous response by the author of ripgrep: https://news.ycombinator.com/context?id=43334661
I'm not a moderator (thank God) but it's patently clear to me that there is an etiquette, essential for any productive conversation, that this guy isn't following and that his submissions should be ignored.
He also posts threads on reddit and then deletes everything:
https://old.reddit.com/r/programming/comments/1jra9y0/introd...
https://old.reddit.com/r/programming/comments/1jpk8sw/krep_a...
https://old.reddit.com/r/commandline/comments/1jqhssc/krep_b...
I reposted the project because I have released a new version fixing everything that users have encountered.
The fact that even in the test example it took ripgrep 0.115 s with 97% CPU usage vs Krep's 0.106 s with 328% CPU usage is a testament of why ripgrep is still preferable. Such a huge tradeoff in resource consumption for such a small speed gain is hard to justify
ripgrep author here.
Like many grep tools, it also has significant performance cliffs that are very easy to hit:
ripgrep has cliffs too, but they tend to be rarer and/or still give decent perf.There's a few issues with the Horspool implementation.
To obtain non case sensitive searching, you set the shift in the bad char table for both the upper and lower case variants. That's fine. But when you come to do the search, you ignore this and only look up the lower case version if it's non case sensitive. This is unnecessary - the valid shifts for both cases are already in the table, since you put them there. So you can avoid doing a lookup on the lower table.
I think you also only ever shift by one if you get a full match, but this is also unnecessary. You can shift by whatever the last char bad char shift is and you won't miss any patterns, even if they can overlap.
Finally, you have a test to see if the shift is zero, and to set it to one if it is. You should never get a shift of zero from the bad char table (you don't process the last char of the pattern when creating the bad char table, so there is never a distance of zero from the end).
Thank you for the report! Here’s what I changed:
1. Always use the raw bad_char_table[text[i + pattern_len‑1]] shift—no extra lowercase lookup at runtime. 2. After a full match, advance by that same table shift instead of just doing i++, so you never miss overlapping occurrences. 3. Removed the “if shift==0 then shift=1” guard—since we build the table excluding the last pattern byte, zero shifts can’t occur.
https://github.com/davidesantangelo/krep/releases/tag/v1.0.6
Looks better! A couple of other minor comments, since I'm looking at it:
1. You don't need to assign the last char of the pattern inside the search loop every time around the loop. It will never change (line 1143). Just do it once before the search loop starts.
2. tc_last is the text aligned with the end of the pattern window (line 1142). You don't need to get it again when getting the shift (lines 1201, 1208). bad char is the last char which you already have.
I have applied your suggestions. Thank you again.
https://github.com/davidesantangelo/krep/releases/tag/v1.1.1
No problem, hope it was useful.
I've found a lot of real world performance comes from doing as little as possible inside the main search loop.
UPDATE: It's often the case that a simpler algorithm outperforms a theoretically better one. For example, you are using Boyer Moore Horspool, which is the simpler cousin of the original Boyer Moore. BM can get better shifts than Horspool, but it's often slower in practice.
An implication of doing as little as possible in the loop is you could boost performance by creating variants of the search algorithm for different types of search. Then you won't incur the penalty or constantly testing "if it's this kind of search do this else do that" inside the search loop.
I have just released version 1.1.0 which should fix the problems reported by users.
https://github.com/davidesantangelo/krep/releases/tag/v1.1.0