The only problem I have with Go's regexp is that it's not very fast. In the slides you are dealing with the worst-case performance, but I guess the everyday performance suffers a little as a consequence of the Go approach (or perhaps it is just not very optimised yet?).
Yeah, Go's (Thompson's NFA) algorithm's edge comes when trying to match pathological strings, which I also explained in this comment here: https://lobste.rs/c/lz68me
So, the performance would be similar to other implementations when trying to match small or non-pathological regexes.
Cool, thanks, please do post it when available. Also, do you know what the tradeoffs are of choosing this method, I think it implies we can't have negative lookahead and things like that?
That, and also slightly larger footprint, as the machine exists in several states at the same time, and also the algorithm caches the NFA to convert it into a DFA [which I didn't include in my talk, cause it'll confuse beginners, who're the target audience here]
Hi. I am the author of the slides, and the talk.
Let me know in case of any doubts or suggestions :)
The only problem I have with Go's regexp is that it's not very fast. In the slides you are dealing with the worst-case performance, but I guess the everyday performance suffers a little as a consequence of the Go approach (or perhaps it is just not very optimised yet?).
https://awmanoj.github.io/tech/2016/09/08/on-slow-regular-expressions-golang/
Yeah, Go's (Thompson's NFA) algorithm's edge comes when trying to match pathological strings, which I also explained in this comment here: https://lobste.rs/c/lz68me
So, the performance would be similar to other implementations when trying to match small or non-pathological regexes.
Thanks for the slides, is there video of the talk anywhere, it looks interesting?
I really like these Russ Cox articles on this topic, they go into lots of detail: https://swtch.com/~rsc/regexp/
Yeah, Russ's articles are the best, in this domain!
Also, the video should be up in a day or so. Will share here once it's up :)
Cool, thanks, please do post it when available. Also, do you know what the tradeoffs are of choosing this method, I think it implies we can't have negative lookahead and things like that?
That, and also slightly larger footprint, as the machine exists in several states at the same time, and also the algorithm caches the NFA to convert it into a DFA [which I didn't include in my talk, cause it'll confuse beginners, who're the target audience here]