user0/rust - Forgejo: Beyond coding. We Forge.

user0/rust

Author	SHA1	Message	Date
Oliver Middleton	9673ace339	Fix a couple of issues in from_utf8 docs	2016-02-14 18:38:37 +00:00
Oliver Middleton	bc8495abd8	Fix doc error for Utf8Error	2016-02-14 17:52:05 +00:00
Carlos E. Garcia	02aa0aff2f	Minor spelling fixes	2016-02-09 11:52:39 -05:00
Kamal Marhubi	129a6239d2	docs: Standardize on 'Errors' header in std docs	2016-02-01 21:41:29 -05:00
Ulrik Sverdrup	ba9a3bc453	core: Use raw pointers to avoid aliasing in str::split_at_mut Introduce private function from_raw_parts_mut for str to factor out the logic. We want to use raw pointers here instead of duplicating a &mut str, to be on safer ground w.r.t rust aliasing rules.	2016-01-21 15:25:49 +01:00
Alex Crichton	9a4f43b9b6	std: Stabilize APIs for the 1.7 release This commit stabilizes and deprecates the FCP (final comment period) APIs for the upcoming 1.7 beta release. The specific APIs which changed were: Stabilized * `Path::strip_prefix` (renamed from `relative_from`) * `path::StripPrefixError` (new error type returned from `strip_prefix`) * `Ipv4Addr::is_loopback` * `Ipv4Addr::is_private` * `Ipv4Addr::is_link_local` * `Ipv4Addr::is_multicast` * `Ipv4Addr::is_broadcast` * `Ipv4Addr::is_documentation` * `Ipv6Addr::is_unspecified` * `Ipv6Addr::is_loopback` * `Ipv6Addr::is_unique_local` * `Ipv6Addr::is_multicast` * `Vec::as_slice` * `Vec::as_mut_slice` * `String::as_str` * `String::as_mut_str` * `<[T]>::clone_from_slice` - the `usize` return value is removed * `<[T]>::sort_by_key` * `i32::checked_rem` (and other signed types) * `i32::checked_neg` (and other signed types) * `i32::checked_shl` (and other signed types) * `i32::checked_shr` (and other signed types) * `i32::saturating_mul` (and other signed types) * `i32::overflowing_add` (and other signed types) * `i32::overflowing_sub` (and other signed types) * `i32::overflowing_mul` (and other signed types) * `i32::overflowing_div` (and other signed types) * `i32::overflowing_rem` (and other signed types) * `i32::overflowing_neg` (and other signed types) * `i32::overflowing_shl` (and other signed types) * `i32::overflowing_shr` (and other signed types) * `u32::checked_rem` (and other unsigned types) * `u32::checked_neg` (and other unsigned types) * `u32::checked_shl` (and other unsigned types) * `u32::saturating_mul` (and other unsigned types) * `u32::overflowing_add` (and other unsigned types) * `u32::overflowing_sub` (and other unsigned types) * `u32::overflowing_mul` (and other unsigned types) * `u32::overflowing_div` (and other unsigned types) * `u32::overflowing_rem` (and other unsigned types) * `u32::overflowing_neg` (and other unsigned types) * `u32::overflowing_shl` (and other unsigned types) * `u32::overflowing_shr` (and other unsigned types) * `ffi::IntoStringError` * `CString::into_string` * `CString::into_bytes` * `CString::into_bytes_with_nul` * `From<CString> for Vec<u8>` * `From<CString> for Vec<u8>` * `IntoStringError::into_cstring` * `IntoStringError::utf8_error` * `Error for IntoStringError` Deprecated * `Path::relative_from` - renamed to `strip_prefix` * `Path::prefix` - use `components().next()` instead * `os::unix::fs` constants - moved to the `libc` crate * `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize * `IntoCow` - conflicts with `Into` and may come back later * `i32::{BITS, BYTES}` (and other integers) - not pulling their weight * `DebugTuple::formatter` - will be removed * `sync::Semaphore` - not used enough and confused with system semaphores Closes #23284 cc #27709 (still lots more methods though) Closes #27712 Closes #27722 Closes #27728 Closes #27735 Closes #27729 Closes #27755 Closes #27782 Closes #27798	2016-01-16 11:03:10 -08:00
Ulrik Sverdrup	cadcd70775	UTF-8 validation: Add missing if conditional for short input We need to guard that `len` is large enough for the fast skip loop.	2016-01-14 14:59:55 +01:00
Ulrik Sverdrup	11e3de39d9	Add fast path for ASCII in UTF-8 validation This speeds up the ascii case (and long stretches of ascii in otherwise mixed UTF-8 data) when checking UTF-8 validity. Benchmark results suggest that on purely ASCII input, we can improve throughput (megabytes verified / second) by a factor of 13 to 14! On xml and mostly english language input (en.wikipedia xml dump), throughput increases by a factor 7. On mostly non-ASCII input, performance increases slightly or is the same. The UTF-8 validation is rewritten to use indexed access; since all access is preceded by a (mandatory for validation) length check, they are statically elided by llvm and this formulation is in fact the best for performance. A previous version had losses due to slice to iterator conversions. A large credit to Björn Steinbrink who improved this patch immensely, writing this second version. Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3. Old code is `regular`, this PR is called `fast`. Datasets: - `ascii` is just ascii (2.5 kB) - `cyr` is cyrillic script with ascii spaces (5 kB) - `dewik10` is 10MB of a de.wikipedia xml dump - `enwik10` is 100MB of an en.wikipedia xml dump - `jawik10` is 10MB of a ja.wikipedia xml dump ``` test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s test from_utf8_cyr_regular ... bench: 12,250 ns/iter (+/- 437) = 418 MB/s test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s ``` Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>	2016-01-12 21:57:04 +01:00
Ori Avtalion	d01ec6ccca	Add missing links to str docs	2015-12-11 22:52:28 +02:00
Alex Crichton	464cdff102	std: Stabilize APIs for the 1.6 release This commit is the standard API stabilization commit for the 1.6 release cycle. The list of issues and APIs below have all been through their cycle-long FCP and the libs team decisions are listed below Stabilized APIs * `Read::read_exact` * `ErrorKind::UnexpectedEof` (renamed from `UnexpectedEOF`) * libcore -- this was a bit of a nuanced stabilization, the crate itself is now marked as `#[stable]` and the methods appearing via traits for primitives like `char` and `str` are now also marked as stable. Note that the extension traits themeselves are marked as unstable as they're imported via the prelude. The `try!` macro was also moved from the standard library into libcore to have the same interface. Otherwise the functions all have copied stability from the standard library now. * The `#![no_std]` attribute * `fs::DirBuilder` * `fs::DirBuilder::new` * `fs::DirBuilder::recursive` * `fs::DirBuilder::create` * `os::unix::fs::DirBuilderExt` * `os::unix::fs::DirBuilderExt::mode` * `vec::Drain` * `vec::Vec::drain` * `string::Drain` * `string::String::drain` * `vec_deque::Drain` * `vec_deque::VecDeque::drain` * `collections::hash_map::Drain` * `collections::hash_map::HashMap::drain` * `collections::hash_set::Drain` * `collections::hash_set::HashSet::drain` * `collections::binary_heap::Drain` * `collections::binary_heap::BinaryHeap::drain` * `Vec::extend_from_slice` (renamed from `push_all`) * `Mutex::get_mut` * `Mutex::into_inner` * `RwLock::get_mut` * `RwLock::into_inner` * `Iterator::min_by_key` (renamed from `min_by`) * `Iterator::max_by_key` (renamed from `max_by`) Deprecated APIs * `ErrorKind::UnexpectedEOF` (renamed to `UnexpectedEof`) * `OsString::from_bytes` * `OsStr::to_cstring` * `OsStr::to_bytes` * `fs::walk_dir` and `fs::WalkDir` * `path::Components::peek` * `slice::bytes::MutableByteVector` * `slice::bytes::copy_memory` * `Vec::push_all` (renamed to `extend_from_slice`) * `Duration::span` * `IpAddr` * `SocketAddr::ip` * `Read::tee` * `io::Tee` * `Write::broadcast` * `io::Broadcast` * `Iterator::min_by` (renamed to `min_by_key`) * `Iterator::max_by` (renamed to `max_by_key`) * `net::lookup_addr` New APIs (still unstable) * `<[T]>::sort_by_key` (added to mirror `min_by_key`) Closes #27585 Closes #27704 Closes #27707 Closes #27710 Closes #27711 Closes #27727 Closes #27740 Closes #27744 Closes #27799 Closes #27801 cc #27801 (doesn't close as `Chars` is still unstable) Closes #28968	2015-12-05 15:09:44 -08:00
bors	040a77f772	Auto merge of #29952 - petrochenkov:depr, r=brson Part of https://github.com/rust-lang/rust/issues/29935 The deprecation lint is still called "deprecated", so people can continue using `#[allow(deprecated)]` and similar things.	2015-11-23 20:08:49 +00:00
Manish Goregaokar	c0f9a39e5c	Mark slice_error_fail as a cold path	2015-11-23 09:34:58 +05:30
Vadim Petrochenkov	a613059e3f	Rename #[deprecated] to #[rustc_deprecated]	2015-11-20 16:11:20 +03:00
Devon Hollowood	da5dd298d4	Add information about str::parse() in FromStr docs	2015-11-20 00:41:10 -08:00
Vadim Petrochenkov	7e2ffc7090	Add missing annotations and some tests	2015-11-18 01:24:21 +03:00
Kevin Butler	82784cb89d	libcore: deny warnings in doctests	2015-11-12 05:16:08 +00:00
Vadim Petrochenkov	2ef07f0519	Remove stability annotations from trait impl items Remove `stable` stability annotations from inherent impls	2015-11-06 00:13:46 +03:00
bors	e02ada6d38	Auto merge of #29254 - alexcrichton:stabilize-1.5, r=brson This commit stabilizes and deprecates library APIs whose FCP has closed in the last cycle, specifically: Stabilized APIs: * `fs::canonicalize` * `Path::{metadata, symlink_metadata, canonicalize, read_link, read_dir, exists, is_file, is_dir}` - all moved to inherent methods from the `PathExt` trait. * `Formatter::fill` * `Formatter::width` * `Formatter::precision` * `Formatter::sign_plus` * `Formatter::sign_minus` * `Formatter::alternate` * `Formatter::sign_aware_zero_pad` * `string::ParseError` * `Utf8Error::valid_up_to` * `Iterator::{cmp, partial_cmp, eq, ne, lt, le, gt, ge}` * `<[T]>::split_{first,last}{,_mut}` * `Condvar::wait_timeout` - note that `wait_timeout_ms` is not yet deprecated but will be once 1.5 is released. * `str::{R,}MatchIndices` * `str::{r,}match_indices` * `char::from_u32_unchecked` * `VecDeque::insert` * `VecDeque::shrink_to_fit` * `VecDeque::as_slices` * `VecDeque::as_mut_slices` * `VecDeque::swap_remove_front` - (renamed from `swap_front_remove`) * `VecDeque::swap_remove_back` - (renamed from `swap_back_remove`) * `Vec::resize` * `str::slice_mut_unchecked` * `FileTypeExt` * `FileTypeExt::{is_block_device, is_char_device, is_fifo, is_socket}` * `BinaryHeap::from` - `from_vec` deprecated in favor of this * `BinaryHeap::into_vec` - plus a `Into` impl * `BinaryHeap::into_sorted_vec` Deprecated APIs * `slice::ref_slice` * `slice::mut_ref_slice` * `iter::{range_inclusive, RangeInclusive}` * `std::dynamic_lib` Closes #27706 Closes #27725 cc #27726 (align not stabilized yet) Closes #27734 Closes #27737 Closes #27742 Closes #27743 Closes #27772 Closes #27774 Closes #27777 Closes #27781 cc #27788 (a few remaining methods though) Closes #27790 Closes #27793 Closes #27796 Closes #27810 cc #28147 (not all parts stabilized)	2015-10-25 16:38:38 +00:00
Alex Crichton	ff49733274	std: Stabilize library APIs for 1.5 This commit stabilizes and deprecates library APIs whose FCP has closed in the last cycle, specifically: Stabilized APIs: * `fs::canonicalize` * `Path::{metadata, symlink_metadata, canonicalize, read_link, read_dir, exists, is_file, is_dir}` - all moved to inherent methods from the `PathExt` trait. * `Formatter::fill` * `Formatter::width` * `Formatter::precision` * `Formatter::sign_plus` * `Formatter::sign_minus` * `Formatter::alternate` * `Formatter::sign_aware_zero_pad` * `string::ParseError` * `Utf8Error::valid_up_to` * `Iterator::{cmp, partial_cmp, eq, ne, lt, le, gt, ge}` * `<[T]>::split_{first,last}{,_mut}` * `Condvar::wait_timeout` - note that `wait_timeout_ms` is not yet deprecated but will be once 1.5 is released. * `str::{R,}MatchIndices` * `str::{r,}match_indices` * `char::from_u32_unchecked` * `VecDeque::insert` * `VecDeque::shrink_to_fit` * `VecDeque::as_slices` * `VecDeque::as_mut_slices` * `VecDeque::swap_remove_front` - (renamed from `swap_front_remove`) * `VecDeque::swap_remove_back` - (renamed from `swap_back_remove`) * `Vec::resize` * `str::slice_mut_unchecked` * `FileTypeExt` * `FileTypeExt::{is_block_device, is_char_device, is_fifo, is_socket}` * `BinaryHeap::from` - `from_vec` deprecated in favor of this * `BinaryHeap::into_vec` - plus a `Into` impl * `BinaryHeap::into_sorted_vec` Deprecated APIs * `slice::ref_slice` * `slice::mut_ref_slice` * `iter::{range_inclusive, RangeInclusive}` * `std::dynamic_lib` Closes #27706 Closes #27725 cc #27726 (align not stabilized yet) Closes #27734 Closes #27737 Closes #27742 Closes #27743 Closes #27772 Closes #27774 Closes #27777 Closes #27781 cc #27788 (a few remaining methods though) Closes #27790 Closes #27793 Closes #27796 Closes #27810 cc #28147 (not all parts stabilized)	2015-10-25 09:36:32 -07:00
Steve Klabnik	b17433292d	Unsafety -> Safety in doc headings Follow https://doc.rust-lang.org/book/documentation.html#special-sections	2015-10-23 11:42:14 -04:00
Andrew Paseltiner	1162b3752c	Correct spelling in docs	2015-10-13 09:44:11 -04:00
Andrew Chin	dce58baff0	Trivial typo fix: from_utrf8 should be from_utf8	2015-10-10 23:33:43 -04:00
Cristi Cobzarenco	4b308b44e1	typos: fix a grabbag of typos all over the place	2015-10-08 19:49:31 +01:00
Steve Klabnik	4d73da92f0	Improve documentation for the from_utf8 family Our docs were very basic for the various versions of from_utf8, so this commit beefs them up. It also improves docs for the &str variant's error, Utf8Error.	2015-10-02 19:42:25 -04:00
Steve Klabnik	80130005da	Add some docs for FromString::from_str @marchelzo pointed out on IRC that this doesn't have docs, so, let's change that.	2015-09-30 17:42:41 -04:00
bors	54f7b1d455	Auto merge of #28632 - alexcrichton:update-match-indices, r=Kimundi This commit updates the `MatchIndices` and `RMatchIndices` iterators to follow the same pattern as the `chars` and `char_indices` iterators. The `matches` iterator currently yield `&str` elements, so the `MatchIndices` iterator now yields the index of the match as well as the `&str` that matched (instead of start/end indexes). cc #27743	2015-09-26 20:06:51 +00:00
Alex Crichton	d5f2d3b177	std: Update MatchIndices to return a subslice This commit updates the `MatchIndices` and `RMatchIndices` iterators to follow the same pattern as the `chars` and `char_indices` iterators. The `matches` iterator currently yield `&str` elements, so the `MatchIndices` iterator now yields the index of the match as well as the `&str` that matched (instead of start/end indexes). cc #27743	2015-09-25 09:29:23 -07:00
Simon Sapin	081278eb7a	Utf8Error::valid_up_to: make documented semantics more precise/useful	2015-09-24 18:54:12 +02:00
bors	cedbd998a4	Auto merge of #28339 - alexcrichton:stabilize-1.4, r=aturon The FCP is coming to a close and 1.4 is coming out soon, so this brings in the libs team decision for all library features this cycle. Stabilized APIs: * `<Box<str>>::into_string` * `Arc::downgrade` * `Arc::get_mut` * `Arc::make_mut` * `Arc::try_unwrap` * `Box::from_raw` * `Box::into_raw` * `CStr::to_str` * `CStr::to_string_lossy` * `CString::from_raw` * `CString::into_raw` * `IntoRawFd::into_raw_fd` * `IntoRawFd` * `IntoRawHandle::into_raw_handle` * `IntoRawHandle` * `IntoRawSocket::into_raw_socket` * `IntoRawSocket` * `Rc::downgrade` * `Rc::get_mut` * `Rc::make_mut` * `Rc::try_unwrap` * `Result::expect` * `String::into_boxed_slice` * `TcpSocket::read_timeout` * `TcpSocket::set_read_timeout` * `TcpSocket::set_write_timeout` * `TcpSocket::write_timeout` * `UdpSocket::read_timeout` * `UdpSocket::set_read_timeout` * `UdpSocket::set_write_timeout` * `UdpSocket::write_timeout` * `Vec::append` * `Vec::split_off` * `VecDeque::append` * `VecDeque::retain` * `VecDeque::split_off` * `rc::Weak::upgrade` * `rc::Weak` * `slice::Iter::as_slice` * `slice::IterMut::into_slice` * `str::CharIndices::as_str` * `str::Chars::as_str` * `str::split_at_mut` * `str::split_at` * `sync::Weak::upgrade` * `sync::Weak` * `thread::park_timeout` * `thread::sleep` Deprecated APIs * `BTreeMap::with_b` * `BTreeSet::with_b` * `Option::as_mut_slice` * `Option::as_slice` * `Result::as_mut_slice` * `Result::as_slice` * `f32::from_str_radix` * `f64::from_str_radix` Closes #27277 Closes #27718 Closes #27736 Closes #27764 Closes #27765 Closes #27766 Closes #27767 Closes #27768 Closes #27769 Closes #27771 Closes #27773 Closes #27775 Closes #27776 Closes #27785 Closes #27792 Closes #27795 Closes #27797	2015-09-13 19:45:15 +00:00
Alex Crichton	f0b1326dc7	std: Stabilize/deprecate features for 1.4 The FCP is coming to a close and 1.4 is coming out soon, so this brings in the libs team decision for all library features this cycle. Stabilized APIs: * `<Box<str>>::into_string` * `Arc::downgrade` * `Arc::get_mut` * `Arc::make_mut` * `Arc::try_unwrap` * `Box::from_raw` * `Box::into_raw` * `CStr::to_str` * `CStr::to_string_lossy` * `CString::from_raw` * `CString::into_raw` * `IntoRawFd::into_raw_fd` * `IntoRawFd` * `IntoRawHandle::into_raw_handle` * `IntoRawHandle` * `IntoRawSocket::into_raw_socket` * `IntoRawSocket` * `Rc::downgrade` * `Rc::get_mut` * `Rc::make_mut` * `Rc::try_unwrap` * `Result::expect` * `String::into_boxed_slice` * `TcpSocket::read_timeout` * `TcpSocket::set_read_timeout` * `TcpSocket::set_write_timeout` * `TcpSocket::write_timeout` * `UdpSocket::read_timeout` * `UdpSocket::set_read_timeout` * `UdpSocket::set_write_timeout` * `UdpSocket::write_timeout` * `Vec::append` * `Vec::split_off` * `VecDeque::append` * `VecDeque::retain` * `VecDeque::split_off` * `rc::Weak::upgrade` * `rc::Weak` * `slice::Iter::as_slice` * `slice::IterMut::into_slice` * `str::CharIndices::as_str` * `str::Chars::as_str` * `str::split_at_mut` * `str::split_at` * `sync::Weak::upgrade` * `sync::Weak` * `thread::park_timeout` * `thread::sleep` Deprecated APIs * `BTreeMap::with_b` * `BTreeSet::with_b` * `Option::as_mut_slice` * `Option::as_slice` * `Result::as_mut_slice` * `Result::as_slice` * `f32::from_str_radix` * `f64::from_str_radix` Closes #27277 Closes #27718 Closes #27736 Closes #27764 Closes #27765 Closes #27766 Closes #27767 Closes #27768 Closes #27769 Closes #27771 Closes #27773 Closes #27775 Closes #27776 Closes #27785 Closes #27792 Closes #27795 Closes #27797	2015-09-11 09:48:48 -07:00
Erick Tryzelaar	fbd91a732b	Optimize string comparison by using memcmp llvm seems to be having some trouble optimizing the iterator-based string comparsion method into some equivalent to memcmp. This explicitly calls out to the memcmp intrinisic in order to allow llvm to generate better code. In some manual benchmarking, this memcmp-based approach is 20 times faster than the iterator approach.	2015-09-10 16:46:59 -07:00
Andre Bogus	9cca96545f	some more clippy-based improvements	2015-09-08 00:36:29 +02:00
bors	94ddfc7707	Auto merge of #28119 - nagisa:bytesderef, r=alexcrichton	2015-09-04 12:34:03 +00:00
Alex Crichton	48615a68fb	std: Account for CRLF in {str, BufRead}::lines This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of the `str::lines` and `BufRead::lines` iterators. Both iterators now account for `\r\n` sequences in addition to `\n`, allowing for less surprising behavior across platforms (especially in the `BufRead` case). Splitting only on the `\n` character can still be achieved with `split('\n')` in both cases. The `str::lines_any` function is also now deprecated as `str::lines` is a drop-in replacement for it. [rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md Closes #28032	2015-09-03 23:01:41 -07:00
Manish Goregaokar	a520568ae7	Elide lifetimes in libcore	2015-09-03 17:46:35 +05:30
Simonas Kazlauskas	25dce09a96	Change explicit BytesDeref impl into Cloned iterator	2015-08-31 14:48:28 +03:00
bors	b0f77ba26a	Auto merge of #28101 - ijks:24214-str-bytes, r=alexcrichton Specifically, `count`, `last`, and `nth` are implemented to use the methods of the underlying slice iterator. Partially closes #24214.	2015-08-31 09:15:55 +00:00
Daan Rijks	dacf2725ec	Add overrides to iterator methods for `str::Bytes` Specifically, `count`, `last`, and `nth` are implemented to use the methods of the underlying slice iterator. Partially closes #24214.	2015-08-30 17:32:50 +02:00
Simon Sapin	e33650c16f	Add .as_str() to str::Chars and str::CharIndices. See #27775 .	2015-08-28 22:44:17 +02:00
bors	de67d62c6b	Auto merge of #27474 - bluss:twoway-reverse, r=brson StrSearcher: Implement the complete reverse case for the two way algorithm Fix quadratic behavior in StrSearcher in reverse search with periodic needles. This commit adds the missing pieces for the "short period" case in reverse search. The short case will show up when the needle is literally periodic, for example "abababab". Two way uses a "critical factorization" of the needle: x = u v. Searching matches v first, if mismatch at character k, skip k forward. Matching u, if mismatch, skip period(x) forward. To avoid O(mn) behavior after mismatch in u, memorize the already matched prefix. The short period case requires that \|u\| < period(x). For the reverse search we need to compute a different critical factorization x = u' v' where \|v'\| < period(x), because we are searching for the reversed needle. A short v' also benefits the algorithm in general. The reverse critical factorization is computed quickly by using the same maximal suffix algorithm, but terminating as soon as we have a location with local period equal to period(x). This adds extra fields crit_pos_back and memory_back for the reverse case. The new overhead for TwoWaySearcher::new is low, and additionally I think the "short period" case is uncommon in many applications of string search. The maximal_suffix methods were updated in documentation and the algorithms updated to not use !0 and wrapping add, variable left is now 1 larger, offset 1 smaller. Use periodicity when computing byteset: in the periodic case, just iterate over one period instead of the whole needle. Example before (rfind) after (twoway_rfind) benchmark shows the removal of quadratic behavior. needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100 ``` test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s ```	2015-08-18 02:02:57 +00:00
Ulrik Sverdrup	01e8812461	StrSearcher: Additional comments and small code moves Break out a separate static method to create the "byteset".	2015-08-16 22:37:18 +02:00
Alex Crichton	b7dcf272d9	core: Fill out issues for unstable features	2015-08-15 18:09:16 -07:00
Niko Matsakis	91b3e9cac0	Fallout in libs -- misc missing bounds uncovered by WF checks.	2015-08-12 17:58:56 -04:00
Tobias Bucher	5309fbb6c9	Make `str::as_bytes_mut` private	2015-08-09 22:05:22 +02:00
Tobias Bucher	22ec5f4af7	Replace many uses of `mem::transmute` with more specific functions The replacements are functions that usually use a single `mem::transmute` in their body and restrict input and output via more concrete types than `T` and `U`. Worth noting are the `transmute` functions for slices and the `from_utf8*` family for mutable slices. Additionally, `mem::transmute` was often used for casting raw pointers, when you can already cast raw pointers just fine with `as`.	2015-08-09 22:05:22 +02:00
Ulrik Sverdrup	2b82c072c7	StrSearcher: Improve inner loop in TwoWaySearcher::next, next_back The innermost loop of TwoWaySearcher checks the boundary of the haystack vs position + needle.len(), and it checks the last byte of the needle against the byteset. If these two steps are combined by using the indexing of the last needle byte's position as bounds check, the algorithm improves its throughput. We improve the innermost loop by reducing the number of instructions used, and elminating the panic case for the checked indexing that was previously used. Selected benchmarks from the external/workspace testsuite. Benchmarks improve across the board. ``` before: test bb_in_aa::twoway_find ... bench: 4,229 ns/iter (+/- 1,305) = 23646 MB/s test bb_in_aa::twoway_rfind ... bench: 3,873 ns/iter (+/- 101) = 25819 MB/s test short_1let_long::twoway_find ... bench: 7,075 ns/iter (+/- 29) = 360 MB/s test short_1let_long::twoway_rfind ... bench: 6,640 ns/iter (+/- 79) = 384 MB/s test short_2let_long::twoway_find ... bench: 3,823 ns/iter (+/- 16) = 667 MB/s test short_2let_long::twoway_rfind ... bench: 3,774 ns/iter (+/- 44) = 675 MB/s test short_3let_long::twoway_find ... bench: 3,582 ns/iter (+/- 47) = 712 MB/s test short_3let_long::twoway_rfind ... bench: 3,616 ns/iter (+/- 34) = 705 MB/s with this commit: test bb_in_aa::twoway_find ... bench: 2,952 ns/iter (+/- 20) = 33875 MB/s test bb_in_aa::twoway_rfind ... bench: 2,939 ns/iter (+/- 99) = 34025 MB/s test short_1let_long::twoway_find ... bench: 4,593 ns/iter (+/- 4) = 555 MB/s test short_1let_long::twoway_rfind ... bench: 4,592 ns/iter (+/- 76) = 555 MB/s test short_2let_long::twoway_find ... bench: 2,804 ns/iter (+/- 3) = 909 MB/s test short_2let_long::twoway_rfind ... bench: 2,807 ns/iter (+/- 40) = 908 MB/s test short_3let_long::twoway_find ... bench: 3,105 ns/iter (+/- 120) = 821 MB/s test short_3let_long::twoway_rfind ... bench: 3,019 ns/iter (+/- 50) = 844 MB/s ``` - `bb_in_aa`: fast skip due to byteset filter loop improves. - 1/2/3let: Searches for 1, 2, or 3 ascii bytes improves.	2015-08-07 13:41:17 +02:00
Alex Crichton	5cccf3cd25	syntax: Implement #![no_core] This commit is an implementation of [RFC 1184][rfc] which tweaks the behavior of the `#![no_std]` attribute and adds a new `#![no_core]` attribute. The `#![no_std]` attribute now injects `extern crate core` at the top of the crate as well as the libcore prelude into all modules (in the same manner as the standard library's prelude). The `#![no_core]` attribute disables both std and core injection. [rfc]: https://github.com/rust-lang/rfcs/pull/1184	2015-08-03 17:23:01 -07:00
Ulrik Sverdrup	7ebae85bb8	StrSearcher: Implement the full two way algorithm in reverse for rfind Fix quadratic behavior in StrSearcher in reverse search with periodic needles. This commit adds the missing pieces for the "short period" case in reverse search. The short case will show up when the needle is literally periodic, for example "abababab". Two way uses a "critical factorization" of the needle: x = u v. Searching matches v first, if mismatch at character k, skip k forward. Matching u, if mismatch, skip period(x) forward. To avoid O(mn) behavior after mismatch in u, memorize the already matched prefix. The short period case requires that \|u\| < period(x). For the reverse search we need to compute a different critical factorization x = u' v' where \|v'\| < period(x), because we are searching for the reversed needle. A short v' also benefits the algorithm in general. The reverse critical factorization is computed quickly by using the same maximal suffix algorithm, but terminating as soon as we have a location with local period equal to period(x). This adds extra fields crit_pos_back and memory_back for the reverse case. The new overhead for TwoWaySearcher::new is low, and additionally I think the "short period" case is uncommon in many applications of string search. The maximal_suffix methods were updated in documentation and the algorithms updated to not use !0 and wrapping add, variable left is now 1 larger, offset 1 smaller. Use periodicity when computing byteset: in the periodic case, just iterate over one period instead of the whole needle. Example before (rfind) after (twoway_rfind) benchmark shows the removal of quadratic behavior. needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100 ``` test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s ```	2015-08-02 20:09:35 +02:00
Alex Crichton	5af6cf9fa4	std: Remove the curious inner module This isn't actually necessary any more with the advent of `$crate` and changes in the compiler to expand macros to `::core::$foo` in the context of a `#![no_std]` crate. The libcore inner module was also trimmed down a bit to the bare bones.	2015-07-29 14:18:24 -07:00
Björn Steinbrink	000e870554	Inline eq_slice_() into eq_slice() eq_slice_() used to be a common implementation for two function that both called it, but of those only eq_slice() is left, so we can as well directly inline the code.	2015-07-21 18:30:18 +02:00

1 2 3 4 5

228 commits