StrSearcher: Implement the complete reverse case for the two way algorithm

Fix quadratic behavior in StrSearcher in reverse search with periodic needles.

This commit adds the missing pieces for the "short period" case in reverse search. The short case will show up when the needle is literally periodic, for example "abababab".

Two way uses a "critical factorization" of the needle: x = u v.

Searching matches v first, if mismatch at character k, skip k forward. Matching u, if mismatch, skip period(x) forward.

To avoid O(mn) behavior after mismatch in u, memorize the already matched prefix.

The short period case requires that |u| < period(x).

For the reverse search we need to compute a different critical factorization x = u' v' where |v'| < period(x), because we are searching for the reversed needle. A short v' also benefits the algorithm in general.

The reverse critical factorization is computed quickly by using the same maximal suffix algorithm, but terminating as soon as we have a location with local period equal to period(x).

This adds extra fields crit_pos_back and memory_back for the reverse case. The new overhead for TwoWaySearcher::new is low, and additionally I think the "short period" case is uncommon in many applications of string search.

The maximal_suffix methods were updated in documentation and the algorithms updated to not use !0 and wrapping add, variable left is now 1 larger, offset 1 smaller.

Use periodicity when computing byteset: in the periodic case, just iterate over one period instead of the whole needle.

Example before (rfind) after (twoway_rfind) benchmark shows the removal of quadratic behavior.

needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100

```
test periodic::rfind        ... bench:   1,926,595 ns/iter (+/- 11,390) = 10 MB/s
test periodic::twoway_rfind ... bench:      51,740 ns/iter (+/- 66) = 386 MB/s
```
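The benchmark input described above can be rebuilt with stable standard library calls. The sketch below only constructs the periodic needle and haystack and checks where the reverse search lands; it does not measure timing, and the helper names `periodic_case` and `count_from_back` are hypothetical, not part of this commit.

```rust
/// Builds the input from the commit message:
/// needle = "ab" * 100, haystack = ("bb" + "ab" * 100) * 100.
fn periodic_case() -> (String, String) {
    let needle = "ab".repeat(100);
    // One unit is "bb" followed by the whole needle (202 bytes).
    let unit = format!("bb{}", needle);
    let haystack = unit.repeat(100);
    (needle, haystack)
}

/// Counts the non-overlapping matches found when searching from the back.
fn count_from_back(haystack: &str, needle: &str) -> usize {
    haystack.rmatches(needle).count()
}
```

Calling `haystack.rfind(needle.as_str())` on this input exercises the reverse search path with a needle whose period (2) is much shorter than its length (200), which is the "short period" case the commit targets.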
1201 lines
43 KiB
Rust
// Copyright 2015 The Rust Project Developers. See the COPYRIGHT
|
|
// file at the top-level directory of this distribution and at
|
|
// http://rust-lang.org/COPYRIGHT.
|
|
//
|
|
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
|
|
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
|
|
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
|
|
// option. This file may not be copied, modified, or distributed
|
|
// except according to those terms.
|
|
|
|
//! The string Pattern API.
|
|
//!
|
|
//! For more details, see the traits `Pattern`, `Searcher`,
|
|
//! `ReverseSearcher` and `DoubleEndedSearcher`.
|
|
|
|
#![unstable(feature = "pattern",
|
|
reason = "API not fully fleshed out and ready to be stabilized",
|
|
issue = "27721")]
|
|
|
|
use prelude::v1::*;
|
|
|
|
use cmp;
|
|
use usize;
|
|
|
|
// Pattern
|
|
|
|
/// A string pattern.
|
|
///
|
|
/// A `Pattern<'a>` expresses that the implementing type
|
|
/// can be used as a string pattern for searching in a `&'a str`.
|
|
///
|
|
/// For example, both `'a'` and `"aa"` are patterns that
|
|
/// would match at index `1` in the string `"baaaab"`.
|
|
///
|
|
/// The trait itself acts as a builder for an associated
|
|
/// `Searcher` type, which does the actual work of finding
|
|
/// occurrences of the pattern in a string.
|
|
pub trait Pattern<'a>: Sized {
|
|
/// Associated searcher for this pattern
|
|
type Searcher: Searcher<'a>;
|
|
|
|
/// Constructs the associated searcher from
|
|
/// `self` and the `haystack` to search in.
|
|
fn into_searcher(self, haystack: &'a str) -> Self::Searcher;
|
|
|
|
/// Checks whether the pattern matches anywhere in the haystack
|
|
#[inline]
|
|
fn is_contained_in(self, haystack: &'a str) -> bool {
|
|
self.into_searcher(haystack).next_match().is_some()
|
|
}
|
|
|
|
/// Checks whether the pattern matches at the front of the haystack
|
|
#[inline]
|
|
fn is_prefix_of(self, haystack: &'a str) -> bool {
|
|
match self.into_searcher(haystack).next() {
|
|
SearchStep::Match(0, _) => true,
|
|
_ => false,
|
|
}
|
|
}
|
|
|
|
/// Checks whether the pattern matches at the back of the haystack
|
|
#[inline]
|
|
fn is_suffix_of(self, haystack: &'a str) -> bool
|
|
where Self::Searcher: ReverseSearcher<'a>
|
|
{
|
|
match self.into_searcher(haystack).next_back() {
|
|
SearchStep::Match(_, j) if haystack.len() == j => true,
|
|
_ => false,
|
|
}
|
|
}
|
|
}
|
|
|
|
// Searcher
|
|
|
|
/// Result of calling `Searcher::next()` or `ReverseSearcher::next_back()`.
|
|
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
|
|
pub enum SearchStep {
|
|
/// Expresses that a match of the pattern has been found at
|
|
/// `haystack[a..b]`.
|
|
Match(usize, usize),
|
|
/// Expresses that `haystack[a..b]` has been rejected as a possible match
|
|
/// of the pattern.
|
|
///
|
|
/// Note that there might be more than one `Reject` between two `Match`es;
|
|
/// there is no requirement for them to be combined into one.
|
|
Reject(usize, usize),
|
|
/// Expresses that every byte of the haystack has been visited, ending
|
|
/// the iteration.
|
|
Done
|
|
}
|
|
|
|
/// A searcher for a string pattern.
|
|
///
|
|
/// This trait provides methods for searching for non-overlapping
|
|
/// matches of a pattern starting from the front (left) of a string.
|
|
///
|
|
/// It will be implemented by associated `Searcher`
|
|
/// types of the `Pattern` trait.
|
|
///
|
|
/// The trait is marked unsafe because the indices returned by the
|
|
/// `next()` methods are required to lie on valid utf8 boundaries in
|
|
/// the haystack. This enables consumers of this trait to
|
|
/// slice the haystack without additional runtime checks.
|
|
pub unsafe trait Searcher<'a> {
|
|
/// Getter for the underlying string to be searched in
|
|
///
|
|
/// Will always return the same `&str`
|
|
fn haystack(&self) -> &'a str;
|
|
|
|
/// Performs the next search step starting from the front.
|
|
///
|
|
/// - Returns `Match(a, b)` if `haystack[a..b]` matches the pattern.
|
|
/// - Returns `Reject(a, b)` if `haystack[a..b]` can not match the
|
|
/// pattern, even partially.
|
|
/// - Returns `Done` if every byte of the haystack has been visited
|
|
///
|
|
/// The stream of `Match` and `Reject` values up to a `Done`
|
|
/// will contain index ranges that are adjacent, non-overlapping,
|
|
/// covering the whole haystack, and lying on utf8 boundaries.
|
|
///
|
|
/// A `Match` result needs to contain the whole matched pattern,
|
|
/// however `Reject` results may be split up into arbitrarily
|
|
/// many adjacent fragments. Both ranges may have zero length.
|
|
///
|
|
/// As an example, the pattern `"aaa"` and the haystack `"cbaaaaab"`
|
|
/// might produce the stream
|
|
/// `[Reject(0, 1), Reject(1, 2), Match(2, 5), Reject(5, 8)]`
|
|
fn next(&mut self) -> SearchStep;
|
|
|
|
/// Find the next `Match` result. See `next()`
|
|
#[inline]
|
|
fn next_match(&mut self) -> Option<(usize, usize)> {
|
|
loop {
|
|
match self.next() {
|
|
SearchStep::Match(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
_ => continue,
|
|
}
|
|
}
|
|
}
|
|
|
|
/// Find the next `Reject` result. See `next()`
|
|
#[inline]
|
|
fn next_reject(&mut self) -> Option<(usize, usize)> {
|
|
loop {
|
|
match self.next() {
|
|
SearchStep::Reject(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
_ => continue,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
/// A reverse searcher for a string pattern.
|
|
///
|
|
/// This trait provides methods for searching for non-overlapping
|
|
/// matches of a pattern starting from the back (right) of a string.
|
|
///
|
|
/// It will be implemented by associated `Searcher`
|
|
/// types of the `Pattern` trait if the pattern supports searching
|
|
/// for it from the back.
|
|
///
|
|
/// The index ranges returned by this trait are not required
|
|
/// to exactly match those of the forward search in reverse.
|
|
///
|
|
/// For the reason why this trait is marked unsafe, see the
|
|
/// parent trait `Searcher`.
|
|
pub unsafe trait ReverseSearcher<'a>: Searcher<'a> {
|
|
/// Performs the next search step starting from the back.
|
|
///
|
|
/// - Returns `Match(a, b)` if `haystack[a..b]` matches the pattern.
|
|
/// - Returns `Reject(a, b)` if `haystack[a..b]` can not match the
|
|
/// pattern, even partially.
|
|
/// - Returns `Done` if every byte of the haystack has been visited
|
|
///
|
|
/// The stream of `Match` and `Reject` values up to a `Done`
|
|
/// will contain index ranges that are adjacent, non-overlapping,
|
|
/// covering the whole haystack, and lying on utf8 boundaries.
|
|
///
|
|
/// A `Match` result needs to contain the whole matched pattern,
|
|
/// however `Reject` results may be split up into arbitrarily
|
|
/// many adjacent fragments. Both ranges may have zero length.
|
|
///
|
|
/// As an example, the pattern `"aaa"` and the haystack `"cbaaaaab"`
|
|
/// might produce the stream
|
|
/// `[Reject(7, 8), Match(4, 7), Reject(1, 4), Reject(0, 1)]`
|
|
fn next_back(&mut self) -> SearchStep;
|
|
|
|
/// Find the next `Match` result. See `next_back()`
|
|
#[inline]
|
|
fn next_match_back(&mut self) -> Option<(usize, usize)>{
|
|
loop {
|
|
match self.next_back() {
|
|
SearchStep::Match(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
_ => continue,
|
|
}
|
|
}
|
|
}
|
|
|
|
/// Find the next `Reject` result. See `next_back()`
|
|
#[inline]
|
|
fn next_reject_back(&mut self) -> Option<(usize, usize)>{
|
|
loop {
|
|
match self.next_back() {
|
|
SearchStep::Reject(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
_ => continue,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
/// A marker trait to express that a `ReverseSearcher`
|
|
/// can be used for a `DoubleEndedIterator` implementation.
|
|
///
|
|
/// For this, the impl of `Searcher` and `ReverseSearcher` need
|
|
/// to follow these conditions:
|
|
///
|
|
/// - All results of `next()` need to be identical
|
|
/// to the results of `next_back()` in reverse order.
|
|
/// - `next()` and `next_back()` need to behave as
|
|
/// the two ends of a range of values, that is they
|
|
/// can not "walk past each other".
|
|
///
|
|
/// # Examples
|
|
///
|
|
/// `char::Searcher` is a `DoubleEndedSearcher` because searching for a
|
|
/// `char` only requires looking at one at a time, which behaves the same
|
|
/// from both ends.
|
|
///
|
|
/// `(&str)::Searcher` is not a `DoubleEndedSearcher` because
|
|
/// the pattern `"aa"` in the haystack `"aaa"` matches as either
|
|
/// `"[aa]a"` or `"a[aa]"`, depending on which side it is searched from.
|
|
pub trait DoubleEndedSearcher<'a>: ReverseSearcher<'a> {}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for a CharEq wrapper
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
#[doc(hidden)]
|
|
trait CharEq {
|
|
fn matches(&mut self, char) -> bool;
|
|
fn only_ascii(&self) -> bool;
|
|
}
|
|
|
|
impl CharEq for char {
|
|
#[inline]
|
|
fn matches(&mut self, c: char) -> bool { *self == c }
|
|
|
|
#[inline]
|
|
fn only_ascii(&self) -> bool { (*self as u32) < 128 }
|
|
}
|
|
|
|
impl<F> CharEq for F where F: FnMut(char) -> bool {
|
|
#[inline]
|
|
fn matches(&mut self, c: char) -> bool { (*self)(c) }
|
|
|
|
#[inline]
|
|
fn only_ascii(&self) -> bool { false }
|
|
}
|
|
|
|
impl<'a> CharEq for &'a [char] {
|
|
#[inline]
|
|
fn matches(&mut self, c: char) -> bool {
|
|
self.iter().any(|&m| { let mut m = m; m.matches(c) })
|
|
}
|
|
|
|
#[inline]
|
|
fn only_ascii(&self) -> bool {
|
|
self.iter().all(|m| m.only_ascii())
|
|
}
|
|
}
|
|
|
|
struct CharEqPattern<C: CharEq>(C);
|
|
|
|
#[derive(Clone)]
|
|
struct CharEqSearcher<'a, C: CharEq> {
|
|
char_eq: C,
|
|
haystack: &'a str,
|
|
char_indices: super::CharIndices<'a>,
|
|
#[allow(dead_code)]
|
|
ascii_only: bool,
|
|
}
|
|
|
|
impl<'a, C: CharEq> Pattern<'a> for CharEqPattern<C> {
|
|
type Searcher = CharEqSearcher<'a, C>;
|
|
|
|
#[inline]
|
|
fn into_searcher(self, haystack: &'a str) -> CharEqSearcher<'a, C> {
|
|
CharEqSearcher {
|
|
ascii_only: self.0.only_ascii(),
|
|
haystack: haystack,
|
|
char_eq: self.0,
|
|
char_indices: haystack.char_indices(),
|
|
}
|
|
}
|
|
}
|
|
|
|
unsafe impl<'a, C: CharEq> Searcher<'a> for CharEqSearcher<'a, C> {
|
|
#[inline]
|
|
fn haystack(&self) -> &'a str {
|
|
self.haystack
|
|
}
|
|
|
|
#[inline]
|
|
fn next(&mut self) -> SearchStep {
|
|
let s = &mut self.char_indices;
|
|
// Compare lengths of the internal byte slice iterator
|
|
// to find length of current char
|
|
let (pre_len, _) = s.iter.iter.size_hint();
|
|
if let Some((i, c)) = s.next() {
|
|
let (len, _) = s.iter.iter.size_hint();
|
|
let char_len = pre_len - len;
|
|
if self.char_eq.matches(c) {
|
|
return SearchStep::Match(i, i + char_len);
|
|
} else {
|
|
return SearchStep::Reject(i, i + char_len);
|
|
}
|
|
}
|
|
SearchStep::Done
|
|
}
|
|
}
|
|
|
|
unsafe impl<'a, C: CharEq> ReverseSearcher<'a> for CharEqSearcher<'a, C> {
|
|
#[inline]
|
|
fn next_back(&mut self) -> SearchStep {
|
|
let s = &mut self.char_indices;
|
|
// Compare lengths of the internal byte slice iterator
|
|
// to find length of current char
|
|
let (pre_len, _) = s.iter.iter.size_hint();
|
|
if let Some((i, c)) = s.next_back() {
|
|
let (len, _) = s.iter.iter.size_hint();
|
|
let char_len = pre_len - len;
|
|
if self.char_eq.matches(c) {
|
|
return SearchStep::Match(i, i + char_len);
|
|
} else {
|
|
return SearchStep::Reject(i, i + char_len);
|
|
}
|
|
}
|
|
SearchStep::Done
|
|
}
|
|
}
|
|
|
|
impl<'a, C: CharEq> DoubleEndedSearcher<'a> for CharEqSearcher<'a, C> {}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
macro_rules! pattern_methods {
|
|
($t:ty, $pmap:expr, $smap:expr) => {
|
|
type Searcher = $t;
|
|
|
|
#[inline]
|
|
fn into_searcher(self, haystack: &'a str) -> $t {
|
|
($smap)(($pmap)(self).into_searcher(haystack))
|
|
}
|
|
|
|
#[inline]
|
|
fn is_contained_in(self, haystack: &'a str) -> bool {
|
|
($pmap)(self).is_contained_in(haystack)
|
|
}
|
|
|
|
#[inline]
|
|
fn is_prefix_of(self, haystack: &'a str) -> bool {
|
|
($pmap)(self).is_prefix_of(haystack)
|
|
}
|
|
|
|
#[inline]
|
|
fn is_suffix_of(self, haystack: &'a str) -> bool
|
|
where $t: ReverseSearcher<'a>
|
|
{
|
|
($pmap)(self).is_suffix_of(haystack)
|
|
}
|
|
}
|
|
}
|
|
|
|
macro_rules! searcher_methods {
|
|
(forward) => {
|
|
#[inline]
|
|
fn haystack(&self) -> &'a str {
|
|
self.0.haystack()
|
|
}
|
|
#[inline]
|
|
fn next(&mut self) -> SearchStep {
|
|
self.0.next()
|
|
}
|
|
#[inline]
|
|
fn next_match(&mut self) -> Option<(usize, usize)> {
|
|
self.0.next_match()
|
|
}
|
|
#[inline]
|
|
fn next_reject(&mut self) -> Option<(usize, usize)> {
|
|
self.0.next_reject()
|
|
}
|
|
};
|
|
(reverse) => {
|
|
#[inline]
|
|
fn next_back(&mut self) -> SearchStep {
|
|
self.0.next_back()
|
|
}
|
|
#[inline]
|
|
fn next_match_back(&mut self) -> Option<(usize, usize)> {
|
|
self.0.next_match_back()
|
|
}
|
|
#[inline]
|
|
fn next_reject_back(&mut self) -> Option<(usize, usize)> {
|
|
self.0.next_reject_back()
|
|
}
|
|
}
|
|
}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for char
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
/// Associated type for `<char as Pattern<'a>>::Searcher`.
|
|
#[derive(Clone)]
|
|
pub struct CharSearcher<'a>(<CharEqPattern<char> as Pattern<'a>>::Searcher);
|
|
|
|
unsafe impl<'a> Searcher<'a> for CharSearcher<'a> {
|
|
searcher_methods!(forward);
|
|
}
|
|
|
|
unsafe impl<'a> ReverseSearcher<'a> for CharSearcher<'a> {
|
|
searcher_methods!(reverse);
|
|
}
|
|
|
|
impl<'a> DoubleEndedSearcher<'a> for CharSearcher<'a> {}
|
|
|
|
/// Searches for chars that are equal to a given char
|
|
impl<'a> Pattern<'a> for char {
|
|
pattern_methods!(CharSearcher<'a>, CharEqPattern, CharSearcher);
|
|
}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for &[char]
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
// Todo: Change / Remove due to ambiguity in meaning.
|
|
|
|
/// Associated type for `<&[char] as Pattern<'a>>::Searcher`.
|
|
#[derive(Clone)]
|
|
pub struct CharSliceSearcher<'a, 'b>(<CharEqPattern<&'b [char]> as Pattern<'a>>::Searcher);
|
|
|
|
unsafe impl<'a, 'b> Searcher<'a> for CharSliceSearcher<'a, 'b> {
|
|
searcher_methods!(forward);
|
|
}
|
|
|
|
unsafe impl<'a, 'b> ReverseSearcher<'a> for CharSliceSearcher<'a, 'b> {
|
|
searcher_methods!(reverse);
|
|
}
|
|
|
|
impl<'a, 'b> DoubleEndedSearcher<'a> for CharSliceSearcher<'a, 'b> {}
|
|
|
|
/// Searches for chars that are equal to any of the chars in the array
|
|
impl<'a, 'b> Pattern<'a> for &'b [char] {
|
|
pattern_methods!(CharSliceSearcher<'a, 'b>, CharEqPattern, CharSliceSearcher);
|
|
}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for F: FnMut(char) -> bool
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
/// Associated type for `<F as Pattern<'a>>::Searcher`.
|
|
#[derive(Clone)]
|
|
pub struct CharPredicateSearcher<'a, F>(<CharEqPattern<F> as Pattern<'a>>::Searcher)
|
|
where F: FnMut(char) -> bool;
|
|
|
|
unsafe impl<'a, F> Searcher<'a> for CharPredicateSearcher<'a, F>
|
|
where F: FnMut(char) -> bool
|
|
{
|
|
searcher_methods!(forward);
|
|
}
|
|
|
|
unsafe impl<'a, F> ReverseSearcher<'a> for CharPredicateSearcher<'a, F>
|
|
where F: FnMut(char) -> bool
|
|
{
|
|
searcher_methods!(reverse);
|
|
}
|
|
|
|
impl<'a, F> DoubleEndedSearcher<'a> for CharPredicateSearcher<'a, F>
|
|
where F: FnMut(char) -> bool {}
|
|
|
|
/// Searches for chars that match the given predicate
|
|
impl<'a, F> Pattern<'a> for F where F: FnMut(char) -> bool {
|
|
pattern_methods!(CharPredicateSearcher<'a, F>, CharEqPattern, CharPredicateSearcher);
|
|
}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for &&str
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
/// Delegates to the `&str` impl.
|
|
impl<'a, 'b> Pattern<'a> for &'b &'b str {
|
|
pattern_methods!(StrSearcher<'a, 'b>, |&s| s, |s| s);
|
|
}
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Impl for &str
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
/// Non-allocating substring search.
|
|
///
|
|
/// Will handle the pattern `""` as returning empty matches at each character
|
|
/// boundary.
|
|
impl<'a, 'b> Pattern<'a> for &'b str {
|
|
type Searcher = StrSearcher<'a, 'b>;
|
|
|
|
#[inline]
|
|
fn into_searcher(self, haystack: &'a str) -> StrSearcher<'a, 'b> {
|
|
StrSearcher::new(haystack, self)
|
|
}
|
|
|
|
/// Checks whether the pattern matches at the front of the haystack
|
|
#[inline]
|
|
fn is_prefix_of(self, haystack: &'a str) -> bool {
|
|
haystack.is_char_boundary(self.len()) &&
|
|
self == &haystack[..self.len()]
|
|
}
|
|
|
|
/// Checks whether the pattern matches at the back of the haystack
|
|
#[inline]
|
|
fn is_suffix_of(self, haystack: &'a str) -> bool {
|
|
self.len() <= haystack.len() &&
|
|
haystack.is_char_boundary(haystack.len() - self.len()) &&
|
|
self == &haystack[haystack.len() - self.len()..]
|
|
}
|
|
}
|
|
|
|
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
// Two Way substring searcher
|
|
/////////////////////////////////////////////////////////////////////////////
|
|
|
|
#[derive(Clone, Debug)]
|
|
/// Associated type for `<&str as Pattern<'a>>::Searcher`.
|
|
pub struct StrSearcher<'a, 'b> {
|
|
haystack: &'a str,
|
|
needle: &'b str,
|
|
|
|
searcher: StrSearcherImpl,
|
|
}
|
|
|
|
#[derive(Clone, Debug)]
|
|
enum StrSearcherImpl {
|
|
Empty(EmptyNeedle),
|
|
TwoWay(TwoWaySearcher),
|
|
}
|
|
|
|
#[derive(Clone, Debug)]
|
|
struct EmptyNeedle {
|
|
position: usize,
|
|
end: usize,
|
|
is_match_fw: bool,
|
|
is_match_bw: bool,
|
|
}
|
|
|
|
impl<'a, 'b> StrSearcher<'a, 'b> {
|
|
fn new(haystack: &'a str, needle: &'b str) -> StrSearcher<'a, 'b> {
|
|
if needle.is_empty() {
|
|
StrSearcher {
|
|
haystack: haystack,
|
|
needle: needle,
|
|
searcher: StrSearcherImpl::Empty(EmptyNeedle {
|
|
position: 0,
|
|
end: haystack.len(),
|
|
is_match_fw: true,
|
|
is_match_bw: true,
|
|
}),
|
|
}
|
|
} else {
|
|
StrSearcher {
|
|
haystack: haystack,
|
|
needle: needle,
|
|
searcher: StrSearcherImpl::TwoWay(
|
|
TwoWaySearcher::new(needle.as_bytes(), haystack.len())
|
|
),
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
unsafe impl<'a, 'b> Searcher<'a> for StrSearcher<'a, 'b> {
|
|
fn haystack(&self) -> &'a str { self.haystack }
|
|
|
|
#[inline]
|
|
fn next(&mut self) -> SearchStep {
|
|
match self.searcher {
|
|
StrSearcherImpl::Empty(ref mut searcher) => {
|
|
// empty needle rejects every char and matches every empty string between them
|
|
let is_match = searcher.is_match_fw;
|
|
searcher.is_match_fw = !searcher.is_match_fw;
|
|
let pos = searcher.position;
|
|
match self.haystack[pos..].chars().next() {
|
|
_ if is_match => SearchStep::Match(pos, pos),
|
|
None => SearchStep::Done,
|
|
Some(ch) => {
|
|
searcher.position += ch.len_utf8();
|
|
SearchStep::Reject(pos, searcher.position)
|
|
}
|
|
}
|
|
}
|
|
StrSearcherImpl::TwoWay(ref mut searcher) => {
|
|
// TwoWaySearcher produces valid *Match* indices that split at char boundaries
|
|
// as long as it does correct matching and that haystack and needle are
|
|
// valid UTF-8
|
|
// *Rejects* from the algorithm can fall on any indices, but we will walk them
|
|
// manually to the next character boundary, so that they are utf-8 safe.
|
|
if searcher.position == self.haystack.len() {
|
|
return SearchStep::Done;
|
|
}
|
|
let is_long = searcher.memory == usize::MAX;
|
|
match searcher.next::<RejectAndMatch>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
is_long)
|
|
{
|
|
SearchStep::Reject(a, mut b) => {
|
|
// skip to next char boundary
|
|
while !self.haystack.is_char_boundary(b) {
|
|
b += 1;
|
|
}
|
|
searcher.position = cmp::max(b, searcher.position);
|
|
SearchStep::Reject(a, b)
|
|
}
|
|
otherwise => otherwise,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
#[inline(always)]
|
|
fn next_match(&mut self) -> Option<(usize, usize)> {
|
|
match self.searcher {
|
|
StrSearcherImpl::Empty(..) => {
|
|
loop {
|
|
match self.next() {
|
|
SearchStep::Match(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
SearchStep::Reject(..) => { }
|
|
}
|
|
}
|
|
}
|
|
StrSearcherImpl::TwoWay(ref mut searcher) => {
|
|
let is_long = searcher.memory == usize::MAX;
|
|
// write out `true` and `false` cases to encourage the compiler
|
|
// to specialize the two cases separately.
|
|
if is_long {
|
|
searcher.next::<MatchOnly>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
true)
|
|
} else {
|
|
searcher.next::<MatchOnly>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
false)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
unsafe impl<'a, 'b> ReverseSearcher<'a> for StrSearcher<'a, 'b> {
|
|
#[inline]
|
|
fn next_back(&mut self) -> SearchStep {
|
|
match self.searcher {
|
|
StrSearcherImpl::Empty(ref mut searcher) => {
|
|
let is_match = searcher.is_match_bw;
|
|
searcher.is_match_bw = !searcher.is_match_bw;
|
|
let end = searcher.end;
|
|
match self.haystack[..end].chars().next_back() {
|
|
_ if is_match => SearchStep::Match(end, end),
|
|
None => SearchStep::Done,
|
|
Some(ch) => {
|
|
searcher.end -= ch.len_utf8();
|
|
SearchStep::Reject(searcher.end, end)
|
|
}
|
|
}
|
|
}
|
|
StrSearcherImpl::TwoWay(ref mut searcher) => {
|
|
if searcher.end == 0 {
|
|
return SearchStep::Done;
|
|
}
|
|
let is_long = searcher.memory == usize::MAX;
|
|
match searcher.next_back::<RejectAndMatch>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
is_long)
|
|
{
|
|
SearchStep::Reject(mut a, b) => {
|
|
// step back to the previous char boundary
|
|
while !self.haystack.is_char_boundary(a) {
|
|
a -= 1;
|
|
}
|
|
searcher.end = cmp::min(a, searcher.end);
|
|
SearchStep::Reject(a, b)
|
|
}
|
|
otherwise => otherwise,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
#[inline]
|
|
fn next_match_back(&mut self) -> Option<(usize, usize)> {
|
|
match self.searcher {
|
|
StrSearcherImpl::Empty(..) => {
|
|
loop {
|
|
match self.next_back() {
|
|
SearchStep::Match(a, b) => return Some((a, b)),
|
|
SearchStep::Done => return None,
|
|
SearchStep::Reject(..) => { }
|
|
}
|
|
}
|
|
}
|
|
StrSearcherImpl::TwoWay(ref mut searcher) => {
|
|
let is_long = searcher.memory == usize::MAX;
|
|
// write out `true` and `false`, like `next_match`
|
|
if is_long {
|
|
searcher.next_back::<MatchOnly>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
true)
|
|
} else {
|
|
searcher.next_back::<MatchOnly>(self.haystack.as_bytes(),
|
|
self.needle.as_bytes(),
|
|
false)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
/// The internal state of the two-way substring search algorithm.
|
|
#[derive(Clone, Debug)]
|
|
struct TwoWaySearcher {
|
|
// constants
|
|
/// critical factorization index
|
|
crit_pos: usize,
|
|
/// critical factorization index for reversed needle
|
|
crit_pos_back: usize,
|
|
period: usize,
|
|
/// `byteset` is an extension (not part of the two way algorithm);
|
|
/// it's a 64-bit "fingerprint" where each set bit `j` corresponds
|
|
/// to a (byte & 63) == j present in the needle.
|
|
byteset: u64,
|
|
|
|
// variables
|
|
position: usize,
|
|
end: usize,
|
|
/// index into needle before which we have already matched
|
|
memory: usize,
|
|
/// index into needle after which we have already matched
|
|
memory_back: usize,
|
|
}
|
|
|
|
/*
|
|
This is the Two-Way search algorithm, which was introduced in the paper:
|
|
Crochemore, M., Perrin, D., 1991, Two-way string-matching, Journal of the ACM 38(3):651-675.
|
|
|
|
Here's some background information.
|
|
|
|
A *word* is a string of symbols. The *length* of a word should be a familiar
|
|
notion, and here we denote it for any word x by |x|.
|
|
(We also allow for the possibility of the *empty word*, a word of length zero).
|
|
|
|
If x is any non-empty word, then an integer p with 0 < p <= |x| is said to be a
|
|
*period* for x iff for all i with 0 <= i <= |x| - p - 1, we have x[i] == x[i+p].
|
|
For example, both 1 and 2 are periods for the string "aa". As another example,
|
|
the only period of the string "abcd" is 4.
|
|
|
|
We denote by period(x) the *smallest* period of x (provided that x is non-empty).
|
|
This is always well-defined since every non-empty word x has at least one period,
|
|
|x|. We sometimes call this *the period* of x.
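The definition of a period can be checked directly with a brute-force sketch. It is only meant to illustrate the definition above; the names `is_period` and `smallest_period` are hypothetical and the searcher in this file does not compute the period this way.

```rust
/// Returns true if `p` is a period of `x`, i.e. 0 < p <= |x| and
/// x[i] == x[i + p] for every i with 0 <= i <= |x| - p - 1.
fn is_period(x: &[u8], p: usize) -> bool {
    p > 0 && p <= x.len() && (0..x.len() - p).all(|i| x[i] == x[i + p])
}

/// Returns period(x), the smallest period of a non-empty word,
/// or None for the empty word.
fn smallest_period(x: &[u8]) -> Option<usize> {
    (1..=x.len()).find(|&p| is_period(x, p))
}
```

With these helpers, both 1 and 2 are periods of "aa", the only period of "abcd" is 4, and the periodic needle "abababab" has smallest period 2.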
|
|
|
|
If u, v and x are words such that x = uv, where uv is the concatenation of u and
|
|
v, then we say that (u, v) is a *factorization* of x.
|
|
|
|
Let (u, v) be a factorization for a word x. Then if w is a non-empty word such
|
|
that both of the following hold
|
|
|
|
- either w is a suffix of u or u is a suffix of w
|
|
- either w is a prefix of v or v is a prefix of w
|
|
|
|
then w is said to be a *repetition* for the factorization (u, v).
|
|
|
|
Just to unpack this, there are four possibilities here. Let w = "abc". Then we
|
|
might have:
|
|
|
|
- w is a suffix of u and w is a prefix of v. ex: ("lolabc", "abcde")
|
|
- w is a suffix of u and v is a prefix of w. ex: ("lolabc", "ab")
|
|
- u is a suffix of w and w is a prefix of v. ex: ("bc", "abchi")
|
|
- u is a suffix of w and v is a prefix of w. ex: ("bc", "a")
|
|
|
|
Note that the word vu is a repetition for any factorization (u,v) of x = uv,
|
|
so every factorization has at least one repetition.
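The repetition condition translates almost word for word into code. This is a small sketch for experimenting with the definition; `is_repetition` is a hypothetical helper, not something the searcher uses.

```rust
/// Returns true if the non-empty word `w` is a repetition for the
/// factorization (u, v):
/// - either w is a suffix of u or u is a suffix of w, and
/// - either w is a prefix of v or v is a prefix of w.
fn is_repetition(w: &str, u: &str, v: &str) -> bool {
    !w.is_empty()
        && (u.ends_with(w) || w.ends_with(u))
        && (v.starts_with(w) || w.starts_with(v))
}
```

All four example factorizations listed above satisfy it with w = "abc", and for (u, v) = ("ab", "cd") the word vu = "cdab" is a repetition as noted.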
|
|
|
|
If x is a string and (u, v) is a factorization for x, then a *local period* for
|
|
(u, v) is an integer r such that there is some word w such that |w| = r and w is
|
|
a repetition for (u, v).
|
|
|
|
We denote by local_period(u, v) the smallest local period of (u, v). We sometimes
|
|
call this *the local period* of (u, v). Provided that x = uv is non-empty, this
|
|
is well-defined (because every factorization has at least one repetition, as
|
|
noted above).
|
|
|
|
It can be proven that the following is an equivalent definition of a local period
|
|
for a factorization (u, v): any positive integer r such that x[i] == x[i+r] for
|
|
all i such that |u| - r <= i <= |u| - 1 and such that both x[i] and x[i+r] are
|
|
defined. (i.e. i >= 0 and i + r < |x|).
|
|
|
|
Using the above reformulation, it is easy to prove that
|
|
|
|
1 <= local_period(u, v) <= period(uv)
|
|
|
|
A factorization (u, v) of x such that local_period(u,v) = period(x) is called a
|
|
*critical factorization*.
|
|
|
|
The algorithm hinges on the following theorem, which is stated without proof:
|
|
|
|
**Critical Factorization Theorem** Any word x has at least one critical
|
|
factorization (u, v) such that |u| < period(x).
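The theorem can be spot-checked on small words with a brute-force sketch built on the index-based reformulation of the local period. This is an assumption-laden illustration: the names `period`, `local_period` and `critical_cuts` are hypothetical, `cut` stands for |u| and is assumed to be at most |x|, and the searcher itself finds its factorization with maximal_suffix rather than by enumeration.

```rust
/// Smallest period of a non-empty word `x`.
fn period(x: &[u8]) -> usize {
    (1..=x.len())
        .find(|&p| (0..x.len() - p).all(|i| x[i] == x[i + p]))
        .expect("x must be non-empty")
}

/// Smallest local period of the factorization (x[..cut], x[cut..]):
/// the smallest r >= 1 with x[i] == x[i + r] for every i in
/// |u| - r ..= |u| - 1 for which both x[i] and x[i + r] are defined.
fn local_period(x: &[u8], cut: usize) -> usize {
    (1..=x.len())
        .find(|&r| {
            // Indices below 0 are undefined, so clamp the lower bound.
            let lo = cut.saturating_sub(r);
            // Positions with i + r past the end are undefined and skipped.
            (lo..cut).all(|i| i + r >= x.len() || x[i] == x[i + r])
        })
        .expect("x must be non-empty")
}

/// All cut positions |u| < period(x) whose factorization is critical,
/// i.e. whose local period equals period(x).
fn critical_cuts(x: &[u8]) -> Vec<usize> {
    let p = period(x);
    (0..p).filter(|&cut| local_period(x, cut) == p).collect()
}
```

For x = "acba" this yields the cuts 1 and 2 with period 3, and for the periodic needle "abababab" it yields only the cut 1 with period 2.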
|
|
|
|
The purpose of maximal_suffix is to find such a critical factorization.
|
|
|
|
If the period is short, compute another factorization x = u' v' to use
|
|
for reverse search, chosen instead so that |v'| < period(x).
|
|
|
|
*/
|
|
impl TwoWaySearcher {
|
|
fn new(needle: &[u8], end: usize) -> TwoWaySearcher {
|
|
let (crit_pos_false, period_false) = TwoWaySearcher::maximal_suffix(needle, false);
|
|
let (crit_pos_true, period_true) = TwoWaySearcher::maximal_suffix(needle, true);
|
|
|
|
let (crit_pos, period) =
|
|
if crit_pos_false > crit_pos_true {
|
|
(crit_pos_false, period_false)
|
|
} else {
|
|
(crit_pos_true, period_true)
|
|
};
|
|
|
|
// A particularly readable explanation of what's going on here can be found
|
|
// in Crochemore and Rytter's book "Text Algorithms", ch 13. Specifically
|
|
// see the code for "Algorithm CP" on p. 323.
|
|
//
|
|
// What's going on is we have some critical factorization (u, v) of the
|
|
// needle, and we want to determine whether u is a suffix of
|
|
// &v[..period]. If it is, we use "Algorithm CP1". Otherwise we use
|
|
// "Algorithm CP2", which is optimized for when the period of the needle
|
|
// is large.
|
|
if &needle[..crit_pos] == &needle[period.. period + crit_pos] {
|
|
// short period case -- the period is exact
|
|
// compute a separate critical factorization for the reversed needle
|
|
// x = u' v' where |v'| < period(x).
|
|
//
|
|
// This is sped up by the period being known already.
|
|
// Note that a case like x = "acba" may be factored exactly forwards
|
|
// (crit_pos = 1, period = 3) while being factored with approximate
|
|
// period in reverse (crit_pos = 2, period = 2). We use the given
|
|
// reverse factorization but keep the exact period.
|
|
let crit_pos_back = needle.len() - cmp::max(
|
|
TwoWaySearcher::reverse_maximal_suffix(needle, period, false),
|
|
TwoWaySearcher::reverse_maximal_suffix(needle, period, true));
|
|
|
|
TwoWaySearcher {
|
|
crit_pos: crit_pos,
|
|
crit_pos_back: crit_pos_back,
|
|
period: period,
|
|
byteset: Self::byteset_create(&needle[..period]),
|
|
|
|
position: 0,
|
|
end: end,
|
|
memory: 0,
|
|
memory_back: needle.len(),
|
|
}
|
|
} else {
|
|
// long period case -- we have an approximation to the actual period,
|
|
// and don't use memorization.
|
|
//
|
|
// Approximate the period by lower bound max(|u|, |v|) + 1.
|
|
// The critical factorization is efficient to use for both forward and
|
|
// reverse search.
|
|
|
|
TwoWaySearcher {
|
|
crit_pos: crit_pos,
|
|
crit_pos_back: crit_pos,
|
|
period: cmp::max(crit_pos, needle.len() - crit_pos) + 1,
|
|
byteset: Self::byteset_create(needle),
|
|
|
|
position: 0,
|
|
end: end,
|
|
memory: usize::MAX, // Dummy value to signify that the period is long
|
|
memory_back: usize::MAX,
|
|
}
|
|
}
|
|
}
|
|
|
|
#[inline]
|
|
fn byteset_create(bytes: &[u8]) -> u64 {
|
|
bytes.iter().fold(0, |a, &b| (1 << (b & 0x3f)) | a)
|
|
}
|
|
|
|
#[inline(always)]
|
|
fn byteset_contains(&self, byte: u8) -> bool {
|
|
(self.byteset >> ((byte & 0x3f) as usize)) & 1 != 0
|
|
}

    // One of the main ideas of Two-Way is that we factorize the needle into
    // two halves, (u, v), and begin trying to find v in the haystack by scanning
    // left to right. If v matches, we try to match u by scanning right to left.
    // How far we can jump when we encounter a mismatch is all based on the fact
    // that (u, v) is a critical factorization for the needle.
    #[inline(always)]
    fn next<S>(&mut self, haystack: &[u8], needle: &[u8], long_period: bool)
        -> S::Output
        where S: TwoWayStrategy
    {
        // `next()` uses `self.position` as its cursor
        let old_pos = self.position;
        let needle_last = needle.len() - 1;
        'search: loop {
            // Check that we have room to search in
            // position + needle_last cannot overflow if we assume slices
            // are bounded by isize's range.
            let tail_byte = match haystack.get(self.position + needle_last) {
                Some(&b) => b,
                None => {
                    self.position = haystack.len();
                    return S::rejecting(old_pos, self.position);
                }
            };

            if S::use_early_reject() && old_pos != self.position {
                return S::rejecting(old_pos, self.position);
            }

            // Quickly skip by large portions unrelated to our substring
            if !self.byteset_contains(tail_byte) {
                self.position += needle.len();
                if !long_period {
                    self.memory = 0;
                }
                continue 'search;
            }

            // See if the right part of the needle matches
            let start = if long_period { self.crit_pos }
                        else { cmp::max(self.crit_pos, self.memory) };
            for i in start..needle.len() {
                if needle[i] != haystack[self.position + i] {
                    self.position += i - self.crit_pos + 1;
                    if !long_period {
                        self.memory = 0;
                    }
                    continue 'search;
                }
            }

            // See if the left part of the needle matches
            let start = if long_period { 0 } else { self.memory };
            for i in (start..self.crit_pos).rev() {
                if needle[i] != haystack[self.position + i] {
                    self.position += self.period;
                    if !long_period {
                        self.memory = needle.len() - self.period;
                    }
                    continue 'search;
                }
            }

            // We have found a match!
            let match_pos = self.position;

            // Note: add self.period instead of needle.len() to have overlapping matches
            self.position += needle.len();
            if !long_period {
                self.memory = 0; // set to needle.len() - self.period for overlapping matches
            }

            return S::matching(match_pos, match_pos + needle.len());
        }
    }

    // Follows the ideas in `next()`.
    //
    // The definitions are symmetrical, with period(x) = period(reverse(x))
    // and local_period(u, v) = local_period(reverse(v), reverse(u)), so if (u, v)
    // is a critical factorization, so is (reverse(v), reverse(u)).
    //
    // For the reverse case we have computed a critical factorization x = u' v'
    // (field `crit_pos_back`). We need |u| < period(x) for the forward case and
    // thus |v'| < period(x) for the reverse.
    //
    // To search in reverse through the haystack, we search forward through
    // a reversed haystack with a reversed needle, matching first u' and then v'.
    #[inline]
    fn next_back<S>(&mut self, haystack: &[u8], needle: &[u8], long_period: bool)
        -> S::Output
        where S: TwoWayStrategy
    {
        // `next_back()` uses `self.end` as its cursor -- so that `next()` and `next_back()`
        // are independent.
        let old_end = self.end;
        'search: loop {
            // Check that we have room to search in
            // end - needle.len() will wrap around when there is no more room,
            // but due to slice length limits it can never wrap all the way back
            // into the length of haystack.
            let front_byte = match haystack.get(self.end.wrapping_sub(needle.len())) {
                Some(&b) => b,
                None => {
                    self.end = 0;
                    return S::rejecting(0, old_end);
                }
            };

            if S::use_early_reject() && old_end != self.end {
                return S::rejecting(self.end, old_end);
            }

            // Quickly skip by large portions unrelated to our substring
            if !self.byteset_contains(front_byte) {
                self.end -= needle.len();
                if !long_period {
                    self.memory_back = needle.len();
                }
                continue 'search;
            }

            // See if the left part of the needle matches
            let crit = if long_period { self.crit_pos_back }
                       else { cmp::min(self.crit_pos_back, self.memory_back) };
            for i in (0..crit).rev() {
                if needle[i] != haystack[self.end - needle.len() + i] {
                    self.end -= self.crit_pos_back - i;
                    if !long_period {
                        self.memory_back = needle.len();
                    }
                    continue 'search;
                }
            }

            // See if the right part of the needle matches
            let needle_end = if long_period { needle.len() }
                             else { self.memory_back };
            for i in self.crit_pos_back..needle_end {
                if needle[i] != haystack[self.end - needle.len() + i] {
                    self.end -= self.period;
                    if !long_period {
                        self.memory_back = self.period;
                    }
                    continue 'search;
                }
            }

            // We have found a match!
            let match_pos = self.end - needle.len();
            // Note: sub self.period instead of needle.len() to have overlapping matches
            self.end -= needle.len();
            if !long_period {
                self.memory_back = needle.len();
            }

            return S::matching(match_pos, match_pos + needle.len());
        }
    }

    // Compute the maximal suffix of `arr`.
    //
    // The maximal suffix is a possible critical factorization (u, v) of `arr`.
    //
    // Returns (`i`, `p`) where `i` is the starting index of v and `p` is the
    // period of v.
    //
    // `order_greater` determines if lexical order is `<` or `>`. Both
    // orders must be computed -- the ordering with the largest `i` gives
    // a critical factorization.
    //
    // For long period cases, the resulting period is not exact (it is too short).
    #[inline]
    fn maximal_suffix(arr: &[u8], order_greater: bool) -> (usize, usize) {
        let mut left = 0; // Corresponds to i in the paper
        let mut right = 1; // Corresponds to j in the paper
        let mut offset = 0; // Corresponds to k in the paper, but starting at 0
                            // to match 0-based indexing.
        let mut period = 1; // Corresponds to p in the paper

        while let Some(&a) = arr.get(right + offset) {
            // `left` will be inbounds when `right` is.
            let b = arr[left + offset];
            if (a < b && !order_greater) || (a > b && order_greater) {
                // Suffix is smaller, period is entire prefix so far.
                right += offset + 1;
                offset = 0;
                period = right - left;
            } else if a == b {
                // Advance through repetition of the current period.
                if offset + 1 == period {
                    right += offset + 1;
                    offset = 0;
                } else {
                    offset += 1;
                }
            } else {
                // Suffix is larger, start over from current location.
                left = right;
                right += 1;
                offset = 0;
                period = 1;
            }
        }
        (left, period)
    }

    // Compute the maximal suffix of the reverse of `arr`.
    //
    // The maximal suffix is a possible critical factorization (u', v') of `arr`.
    //
    // Returns `i` where `i` is the starting index of v', from the back;
    // returns immediately when a period of `known_period` is reached.
    //
    // `order_greater` determines if lexical order is `<` or `>`. Both
    // orders must be computed -- the ordering with the largest `i` gives
    // a critical factorization.
    //
    // For long period cases, the resulting period is not exact (it is too short).
    fn reverse_maximal_suffix(arr: &[u8], known_period: usize,
                              order_greater: bool) -> usize
    {
        let mut left = 0; // Corresponds to i in the paper
        let mut right = 1; // Corresponds to j in the paper
        let mut offset = 0; // Corresponds to k in the paper, but starting at 0
                            // to match 0-based indexing.
        let mut period = 1; // Corresponds to p in the paper
        let n = arr.len();

        while right + offset < n {
            let a = arr[n - (1 + right + offset)];
            let b = arr[n - (1 + left + offset)];
            if (a < b && !order_greater) || (a > b && order_greater) {
                // Suffix is smaller, period is entire prefix so far.
                right += offset + 1;
                offset = 0;
                period = right - left;
            } else if a == b {
                // Advance through repetition of the current period.
                if offset + 1 == period {
                    right += offset + 1;
                    offset = 0;
                } else {
                    offset += 1;
                }
            } else {
                // Suffix is larger, start over from current location.
                left = right;
                right += 1;
                offset = 0;
                period = 1;
            }
            if period == known_period {
                break;
            }
        }
        debug_assert!(period <= known_period);
        left
    }
}
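
// Illustration only: a minimal standalone sketch of how the forward critical
// factorization can be derived from the two maximal suffix computations, so
// that the "acba" example quoted in `new()` can be checked by hand.
// `critical_factorization` is a hypothetical helper name, and its tie-breaking
// rule (prefer the `>` ordering when both split indices are equal) is an
// assumption about the part of `new()` that is not shown here. The bounds
// guard is added only to keep the sketch safe on an empty needle.

```rust
/// Maximal suffix of `arr` under `<` (or `>` when `order_greater` is set).
/// Returns (start index of the suffix, period of the suffix).
fn maximal_suffix(arr: &[u8], order_greater: bool) -> (usize, usize) {
    let (mut left, mut right, mut offset, mut period) = (0, 1, 0, 1);
    while let Some(&a) = arr.get(right + offset) {
        let b = arr[left + offset];
        if (a < b && !order_greater) || (a > b && order_greater) {
            // Candidate suffix is smaller: the period is the whole prefix so far.
            right += offset + 1;
            offset = 0;
            period = right - left;
        } else if a == b {
            // Still inside a repetition of the current period.
            if offset + 1 == period {
                right += offset + 1;
                offset = 0;
            } else {
                offset += 1;
            }
        } else {
            // Candidate suffix is larger: restart from it.
            left = right;
            right += 1;
            offset = 0;
            period = 1;
        }
    }
    (left, period)
}

/// Returns (crit_pos, period, short_period), where `short_period` tells
/// whether u is repeated one period later, i.e. whether the period is exact.
fn critical_factorization(needle: &[u8]) -> (usize, usize, bool) {
    let lt = maximal_suffix(needle, false);
    let gt = maximal_suffix(needle, true);
    // The ordering that yields the larger split index gives the factorization.
    let (crit_pos, period) = if lt.0 > gt.0 { lt } else { gt };
    let short_period = period + crit_pos <= needle.len()
        && needle[..crit_pos] == needle[period..period + crit_pos];
    (crit_pos, period, short_period)
}
```

// For "acba" this yields crit_pos = 1 and period = 3 in the short period case,
// matching the comment in `new()`; "abcd" falls into the long period case.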

// TwoWayStrategy allows the algorithm to either skip non-matches as quickly
// as possible, or to work in a mode where it emits Rejects relatively quickly.
trait TwoWayStrategy {
    type Output;
    fn use_early_reject() -> bool;
    fn rejecting(a: usize, b: usize) -> Self::Output;
    fn matching(a: usize, b: usize) -> Self::Output;
}

/// Skip to match intervals as quickly as possible
enum MatchOnly { }

impl TwoWayStrategy for MatchOnly {
    type Output = Option<(usize, usize)>;

    #[inline]
    fn use_early_reject() -> bool { false }
    #[inline]
    fn rejecting(_a: usize, _b: usize) -> Self::Output { None }
    #[inline]
    fn matching(a: usize, b: usize) -> Self::Output { Some((a, b)) }
}

/// Emit Rejects regularly
enum RejectAndMatch { }

impl TwoWayStrategy for RejectAndMatch {
    type Output = SearchStep;

    #[inline]
    fn use_early_reject() -> bool { true }
    #[inline]
    fn rejecting(a: usize, b: usize) -> Self::Output { SearchStep::Reject(a, b) }
    #[inline]
    fn matching(a: usize, b: usize) -> Self::Output { SearchStep::Match(a, b) }
}
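
// Illustration only: a minimal sketch of how a strategy trait like the one
// above lets one search loop serve two callers. It uses a naive byte-by-byte
// scan rather than Two-Way, assumes a non-empty needle, and all names here
// (`Step`, `Strategy`, `step`) are hypothetical stand-ins for `SearchStep`,
// `TwoWayStrategy` and `next`. The strategy is chosen at compile time, so the
// `use_early_reject()` branch can be resolved statically for each
// instantiation.

```rust
/// Hypothetical stand-in for `SearchStep`.
#[derive(Debug, PartialEq)]
enum Step {
    Match(usize, usize),
    Reject(usize, usize),
}

/// Chooses the result type and whether skipped ranges are reported.
trait Strategy {
    type Output;
    fn use_early_reject() -> bool;
    fn rejecting(a: usize, b: usize) -> Self::Output;
    fn matching(a: usize, b: usize) -> Self::Output;
}

/// Uninhabited marker types: used only as type parameters.
enum MatchOnly {}
enum RejectAndMatch {}

impl Strategy for MatchOnly {
    type Output = Option<(usize, usize)>;
    fn use_early_reject() -> bool { false }
    fn rejecting(_a: usize, _b: usize) -> Self::Output { None }
    fn matching(a: usize, b: usize) -> Self::Output { Some((a, b)) }
}

impl Strategy for RejectAndMatch {
    type Output = Step;
    fn use_early_reject() -> bool { true }
    fn rejecting(a: usize, b: usize) -> Self::Output { Step::Reject(a, b) }
    fn matching(a: usize, b: usize) -> Self::Output { Step::Match(a, b) }
}

/// One search step starting at `*pos`, generic over the strategy.
fn step<S: Strategy>(haystack: &[u8], needle: &[u8], pos: &mut usize) -> S::Output {
    let old = *pos;
    loop {
        // Out of room: reject everything that is left.
        if *pos + needle.len() > haystack.len() {
            *pos = haystack.len();
            return S::rejecting(old, *pos);
        }
        // A rejecting strategy reports the skipped range before looking further.
        if S::use_early_reject() && old != *pos {
            return S::rejecting(old, *pos);
        }
        if &haystack[*pos..*pos + needle.len()] == needle {
            let start = *pos;
            *pos += needle.len();
            return S::matching(start, *pos);
        }
        *pos += 1;
    }
}
```

// With `MatchOnly`, a single call on "xxab" skips straight to the match at
// (2, 4); with `RejectAndMatch`, the same loop first returns the rejected
// ranges (0, 1) and (1, 2) on successive calls.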