user0/rust - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
ltdk	edd318c313	Add {floor,ceil}_char_boundary methods to str	2022-02-07 13:34:08 -05:00
Thom Chiovoloni	41f821461f	Fix comment grammar for `do_count_chars`	2022-02-05 11:17:10 -08:00
Thom Chiovoloni	ebbccaf6bf	Respond to review feedback, and improve implementation somewhat	2022-02-05 11:15:18 -08:00
Thom Chiovoloni	628b217326	Optimize `core::str::Chars::count`	2022-02-05 11:15:17 -08:00
Frank Steffahn	a957cefda6	Fix a bunch of typos	2021-12-14 16:40:43 +01:00
japm48	0d7b830139	doc: fix typo in comments dereferencable -> dereferenceable	2021-12-12 00:27:27 +01:00
David Tolnay	4b0a9c9bc3	Delete Utf8Lossy::from_str	2021-12-08 22:54:51 -08:00
bors	94bec90702	Auto merge of #91244 - dtolnay:lossy, r=Mark-Simulacrum Eliminate bunch of copies of error codepath from Utf8LossyChunksIter Using a macro to stamp out 7 identical copies of the nontrivial slicing logic to exit this loop didn't seem like a necessary use of a macro. The early return case can be handled by `break` without practically any changes to the logic inside the loop. All this code is from early 2014 (#12062—nearly 8 years ago; pre-1.0) so it's possible there were compiler limitations that forced the macro way at the time. Confirmed that `x.py bench library/alloc --stage 0 --test-args from_utf8_lossy` is unaffected on my machine.	2021-11-30 01:08:56 +00:00
David Tolnay	c6810a569f	Clarify safety comment on using i to index into self.source	2021-11-26 12:57:36 -08:00
David Tolnay	2be9a8349f	Eliminate bunch of copies of error codepath from Utf8LossyChunksIter Using a macro to stamp out 7 identical copies of the nontrivial slicing logic to exit this loop didn't seem like a necessary use of a macro. The early return case can be handled by `break` without practically any changes to the logic inside the loop. All this code is from early 2014 (7.5 years old, pre-1.0) so it's possible there were compiler limitations that forced the macro way at the time. Confirmed that `x.py bench library/alloc --stage 0 --test-args from_utf8_lossy` is unaffected on my machine.	2021-11-25 19:52:45 -08:00
David Tolnay	553a84c445	Saner formatting for UTF8_CHAR_WIDTH table	2021-11-25 18:18:36 -08:00
Eduardo Sánchez Muñoz	23637e20cd	libcore: assume the input of `next_code_point` and `next_code_point_reverse` is UTF-8-like The functions are now `unsafe` and they use `Option::unwrap_unchecked` instead of `unwrap_or_0` `unwrap_or_0` was added in `42357d772b`. I guess `unwrap_unchecked` was not available back then. Given this example: ```rust pub fn first_char(s: &str) -> Option<char> { s.chars().next() } ``` Previously, the following assembly was produced: ```asm _ZN7example10first_char17ha056ddea6bafad1cE: .cfi_startproc test rsi, rsi je .LBB0_1 movzx edx, byte ptr [rdi] test dl, dl js .LBB0_3 mov eax, edx ret .LBB0_1: mov eax, 1114112 ret .LBB0_3: lea r8, [rdi + rsi] xor eax, eax mov r9, r8 cmp rsi, 1 je .LBB0_5 movzx eax, byte ptr [rdi + 1] add rdi, 2 and eax, 63 mov r9, rdi .LBB0_5: mov ecx, edx and ecx, 31 cmp dl, -33 jbe .LBB0_6 cmp r9, r8 je .LBB0_9 movzx esi, byte ptr [r9] add r9, 1 and esi, 63 shl eax, 6 or eax, esi cmp dl, -16 jb .LBB0_12 .LBB0_13: cmp r9, r8 je .LBB0_14 movzx edx, byte ptr [r9] and edx, 63 jmp .LBB0_16 .LBB0_6: shl ecx, 6 or eax, ecx ret .LBB0_9: xor esi, esi mov r9, r8 shl eax, 6 or eax, esi cmp dl, -16 jae .LBB0_13 .LBB0_12: shl ecx, 12 or eax, ecx ret .LBB0_14: xor edx, edx .LBB0_16: and ecx, 7 shl ecx, 18 shl eax, 6 or eax, ecx or eax, edx ret ``` After this change, the assembly is reduced to: ```asm _ZN7example10first_char17h4318683472f884ccE: .cfi_startproc test rsi, rsi je .LBB0_1 movzx ecx, byte ptr [rdi] test cl, cl js .LBB0_3 mov eax, ecx ret .LBB0_1: mov eax, 1114112 ret .LBB0_3: mov eax, ecx and eax, 31 movzx esi, byte ptr [rdi + 1] and esi, 63 cmp cl, -33 jbe .LBB0_4 movzx edx, byte ptr [rdi + 2] shl esi, 6 and edx, 63 or edx, esi cmp cl, -16 jb .LBB0_7 movzx ecx, byte ptr [rdi + 3] and eax, 7 shl eax, 18 shl edx, 6 and ecx, 63 or ecx, edx or eax, ecx ret .LBB0_4: shl eax, 6 or eax, esi ret .LBB0_7: shl eax, 12 or eax, edx ret ```	2021-11-21 17:05:55 +01:00
Maybe Waffle	573a00e3f9	Fill in tracking issues for `const_str_from_utf8` and `const_str_from_utf8_unchecked_mut` features	2021-11-18 14:04:01 +03:00
Maybe Waffle	cf6f64a963	Make slice->str conversion and related functions const This commit makes the following functions from `core::str` `const fn`: - `from_utf8[_mut]` (`feature(const_str_from_utf8)`) - `from_utf8_unchecked_mut` (`feature(const_str_from_utf8_unchecked_mut)`) - `Utf8Error::{valid_up_to,error_len}` (`feature(const_str_from_utf8)`)	2021-11-18 00:50:42 +03:00
bors	c7e4740ec1	Auto merge of #86336 - camsteffen:char-array-pattern, r=joshtriplett impl Pattern for char array Closes #39511 Closes #86329	2021-10-31 15:45:39 +00:00
Matthias Krüger	88e5ae2dd3	Rollup merge of #89786 - jkugelman:must-use-len-and-is_empty, r=joshtriplett Add #[must_use] to len and is_empty Parent issue: #89692 r? `@joshtriplett`	2021-10-31 13:20:05 +01:00
Matthias Krüger	95750ae439	Rollup merge of #89897 - jkugelman:must-use-core, r=joshtriplett Add #[must_use] to remaining core functions I've run out of compelling reasons to group functions together across crates so I'm just going to go module-by-module. This is everything remaining from the `core` crate. Ignored by clippy for reasons unknown: ```rust core::alloc::Layout unsafe fn for_value_raw<T: ?Sized>(t: *const T) -> Self; core::any const fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str; ``` Ignored by clippy because of `mut`: ```rust str fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str); ``` <del> Ignored by clippy presumably because a caller might want `f` called for side effects. That seems like a bad usage of `map` to me. ```rust core::cell::Ref<'b, T> fn map<U: ?Sized, F>(orig: Ref<'b, T>, f: F) -> Ref<'b, T>; core::cell::Ref<'b, T> fn map_split<U: ?Sized, V: ?Sized, F>(orig: Ref<'b, T>, f: F) -> (Ref<'b, U>, Ref<'b, V>); ``` </del> Parent issue: #89692 r? ```@joshtriplett```	2021-10-31 09:20:26 +01:00
Matthias Krüger	a26b1d2259	Rollup merge of #89835 - jkugelman:must-use-expensive-computations, r=joshtriplett Add #[must_use] to expensive computations The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work. Given the grab bag nature of this commit, the messages I used vary quite a bit. I'm open to wording changes. For some reason clippy flagged four `BTreeSet` methods but didn't say boo about equivalent ones on `HashSet`. I stared at them for a while but I can't figure out the difference so I added the `HashSet` ones in. ```rust // Flagged by clippy. alloc::collections::btree_set::BTreeSet<T> fn difference<'a>(&'a self, other: &'a BTreeSet<T>) -> Difference<'a, T>; alloc::collections::btree_set::BTreeSet<T> fn symmetric_difference<'a>(&'a self, other: &'a BTreeSet<T>) -> SymmetricDifference<'a, T> alloc::collections::btree_set::BTreeSet<T> fn intersection<'a>(&'a self, other: &'a BTreeSet<T>) -> Intersection<'a, T>; alloc::collections::btree_set::BTreeSet<T> fn union<'a>(&'a self, other: &'a BTreeSet<T>) -> Union<'a, T>; // Ignored by clippy, but not by me. std::collections::HashSet<T, S> fn difference<'a>(&'a self, other: &'a HashSet<T, S>) -> Difference<'a, T, S>; std::collections::HashSet<T, S> fn symmetric_difference<'a>(&'a self, other: &'a HashSet<T, S>) -> SymmetricDifference<'a, T, S> std::collections::HashSet<T, S> fn intersection<'a>(&'a self, other: &'a HashSet<T, S>) -> Intersection<'a, T, S>; std::collections::HashSet<T, S> fn union<'a>(&'a self, other: &'a HashSet<T, S>) -> Union<'a, T, S>; ``` Parent issue: #89692 r? ```@joshtriplett```	2021-10-31 09:20:24 +01:00
John Kugelman	6745e8da06	Add #[must_use] to len and is_empty	2021-10-30 19:25:12 -04:00
John Kugelman	68b0d86294	Add #[must_use] to remaining core functions	2021-10-30 18:21:29 -04:00
Matthias Krüger	86087f906d	Rollup merge of #90371 - Veykril:patch-2, r=jyn514 Fix incorrect doc link Looks like a copy paste mistake	2021-10-30 14:36:59 +02:00
Lukas Wirth	29a4e4a009	Fix incorrect doc link	2021-10-28 11:51:00 +02:00
Pietro Albini	a5a8bb0125	replace `|` with `||` in string validation Using short-circuiting operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86_64 assembly is equivalent between the old and new versions. Old assembly of that condition: ``` mov rax, qword ptr [rdi + rdx + 8] or rax, qword ptr [rdi + rdx] test rax, r9 je .LBB0_7 ``` New assembly of that condition: ``` mov rax, qword ptr [rdi + rdx] or rax, qword ptr [rdi + rdx + 8] test rax, r8 je .LBB0_7 ```	2021-10-27 17:00:49 +02:00
Noah Lev	865d99f82b	docs: Escape brackets to satisfy the linkchecker My change to use `Type::def_id()` (formerly `Type::def_id_full()`) in more places caused some docs to show up that used to be missed by rustdoc. Those docs contained unescaped square brackets, which triggered linkcheck errors. This commit escapes the square brackets and adds this particular instance to the linkcheck exception list.	2021-10-22 14:08:43 -07:00
Adam Skoufis	4b59b35b76	Add missing word to `FromStr` trait docs	2021-10-14 13:47:54 +11:00
John Kugelman	21f4677744	Add #[must_use] to expensive computations The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work. Given the grab bag nature of this commit, the messages I used vary quite a bit.	2021-10-12 23:27:17 -04:00
the8472	b55a3c5d15	Rollup merge of #89778 - jkugelman:must-use-as_type-conversions, r=joshtriplett Add #[must_use] to as_type conversions Clippy missed these: ```rust alloc::string::String fn as_mut_str(&mut self) -> &mut str; core::mem::NonNull<T> unsafe fn as_uninit_mut<'a>(&mut self) -> &'a MaybeUninit<T>; str unsafe fn as_bytes_mut(&mut self) -> &mut [u8]; str fn as_mut_ptr(&mut self) -> *mut u8; ``` Parent issue: #89692 r? ````@joshtriplett````	2021-10-12 14:53:08 +02:00
John Kugelman	06e625f7d5	Add #[must_use] to as_type conversions	2021-10-11 13:57:38 -04:00
Guillaume Gomez	96ffc74fe3	Rollup merge of #89753 - jkugelman:must-use-from_value-conversions, r=joshtriplett Add #[must_use] to from_value conversions I added two methods to the list myself. Clippy did not flag them because they take `mut` args, but neither modifies their argument. ```rust core::str const unsafe fn from_utf8_unchecked_mut(v: &mut [u8]) -> &mut str; std::ffi::CString unsafe fn from_raw(ptr: *mut c_char) -> CString; ``` I put a custom note on `from_raw`: ```rust #[must_use = "call `drop(from_raw(ptr))` if you intend to drop the `CString`"] pub unsafe fn from_raw(ptr: *mut c_char) -> CString { ``` Parent issue: #89692 r? ``@joshtriplett``	2021-10-11 14:11:45 +02:00
John Kugelman	cf2bcd10ed	Add #[must_use] to from_value conversions	2021-10-10 19:00:33 -04:00
Matthias Krüger	0c04b1fc03	Rollup merge of #89718 - jkugelman:must-use-is_condition-tests, r=joshtriplett Add #[must_use] to is_condition tests There's nothing insightful to say about these so I didn't write any extra explanations. Parent issue: #89692	2021-10-10 18:22:23 +02:00
bors	0c87288f92	Auto merge of #89219 - nickkuk:str_split_once_get_unchecked, r=Mark-Simulacrum Use get_unchecked in str::[r]split_once This PR removes indices checking in `str::split_once` and `str::rsplit_once` methods.	2021-10-10 12:29:48 +00:00
John Kugelman	475e9925a7	Add #[must_use] to is_condition tests There's nothing insightful to say about these so I didn't write any extra explanations.	2021-10-09 21:27:13 -04:00
Matthias Krüger	827b540424	Rollup merge of #89694 - jkugelman:must-use-string-transforms, r=joshtriplett Add #[must_use] to string/char transformation methods These methods could be misconstrued as modifying their arguments instead of returning new values. Where possible I made the note recommend a method that does mutate in place. Parent issue: #89692	2021-10-09 11:56:07 +02:00
Matthias Krüger	36db658796	Rollup merge of #88707 - sylvestre:split_example, r=yaahc String.split_terminator: Add an example when using a slice of chars	2021-10-09 11:55:58 +02:00
John Kugelman	54d807cfc7	Add #[must_use] to string/char transformation methods These methods could be misconstrued as modifying their arguments instead of returning new values. Where possible I made the note recommend a method that does mutate in place.	2021-10-09 01:01:40 -04:00
nickkuk	a35aaa2108	Use get_unchecked in str::[r]split_once	2021-10-05 14:42:08 +05:00
The8472	5e1428e18b	manually inline function	2021-09-11 12:29:34 +02:00
The8472	66195d8bc4	optimization continuation byte validation of strings containing multibyte chars ``` old, -O2, x86-64 test str::str_validate_emoji ... bench: 4,606 ns/iter (+/- 64) new, -O2, x86-64 test str::str_validate_emoji ... bench: 3,837 ns/iter (+/- 60) ```	2021-09-11 00:25:41 +02:00
The8472	b6278664af	optimize utf8_is_cont_byte() to speed up str.chars().count() it shows consistent improvements across several x86_64 feature levels ``` old, -O2, x86-64 test str::str_char_count_emoji ... bench: 1,924 ns/iter (+/- 26) test str::str_char_count_lorem ... bench: 879 ns/iter (+/- 12) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64 test str::str_char_count_emoji ... bench: 1,878 ns/iter (+/- 21) test str::str_char_count_lorem ... bench: 851 ns/iter (+/- 11) test str::str_char_count_lorem_short ... bench: 4 ns/iter (+/- 0) old, -O2, x86-64-v2 test str::str_char_count_emoji ... bench: 1,477 ns/iter (+/- 46) test str::str_char_count_lorem ... bench: 675 ns/iter (+/- 15) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64-v2 test str::str_char_count_emoji ... bench: 1,323 ns/iter (+/- 39) test str::str_char_count_lorem ... bench: 593 ns/iter (+/- 18) test str::str_char_count_lorem_short ... bench: 4 ns/iter (+/- 0) old, -O2, x86-64-v3 test str::str_char_count_emoji ... bench: 748 ns/iter (+/- 7) test str::str_char_count_lorem ... bench: 348 ns/iter (+/- 2) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) new, -O2, x86-64-v3 test str::str_char_count_emoji ... bench: 650 ns/iter (+/- 4) test str::str_char_count_lorem ... bench: 301 ns/iter (+/- 1) test str::str_char_count_lorem_short ... bench: 5 ns/iter (+/- 0) ```	2021-09-11 00:25:41 +02:00
Mark Rousskov	b4e7649d6d	Bump stage0 compiler to 1.56	2021-09-08 20:51:05 -04:00
Sylvestre Ledru	d4031d092d	String.split_terminator: Add an example when using a slice of chars	2021-09-06 23:25:38 +02:00
bors	cc9bb1522e	Auto merge of #83342 - Count-Count:win-console-incomplete-utf8, r=m-ou-se Allow writing of incomplete UTF-8 sequences to the Windows console via stdout/stderr # Problem Writes of just an incomplete UTF-8 byte sequence (e.g. `b"\xC3"` or `b"\xF0\x9F"`) to stdout/stderr with a Windows console attached error with `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` even though further writes could complete the codepoint. This is currently a rare occurrence since the [linewritershim](`2c56ea38b0/library/std/src/io/buffered/linewritershim.rs`) implementation flushes complete lines immediately and buffers up to 1024 bytes for incomplete lines. It can still happen as described in #83258. The problem will become more pronounced once the developer can switch stdout/stderr from line-buffered to block-buffered or immediate when the changes in the "Switchable buffering for Stdout" pull request (#78515) get merged. # Patch description If there is at least one valid UTF-8 codepoint all valid UTF-8 is passed through to the extracted `write_valid_utf8_to_console()` fn. The new code only comes into play if `write()` is being passed a short byte slice comprising an incomplete UTF-8 codepoint. In this case up to three bytes are buffered in the `IncompleteUtf8` struct associated with `Stdout` / `Stderr`. The bytes are accepted one at a time. As soon as an error can be detected `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` is returned. Once a complete UTF-8 codepoint is received it is passed to the `write_valid_utf8_to_console()` and the buffer length is set to zero. Calling `flush()` will neither error nor write anything if an incomplete codepoint is present in the buffer. # Tests Currently there are no Windows-specific tests for console writing code at all. Writing (regression) tests for this problem is a bit challenging since unit tests and UI tests don't run in a console and suddenly popping up another console window might be surprising to developers running the testsuite and it might not work at all in CI builds. To just test the new functionality in unit tests the code would need to be refactored. Some guidance on how to proceed would be appreciated. # Public API changes * `std::str::verifications::utf8_char_width()` would be exposed as `std::str::utf8_char_width()` behind the "str_internals" feature gate. # Related issues * Fixes #83258. * PR #78515 will exacerbate the problem. # Open questions * Add tests? * Squash into one commit with better commit message?	2021-09-02 03:31:17 +00:00
Deadbeef	b5afa6807b	Constified `Default` implementations The libs-api team agrees to allow const_trait_impl to appear in the standard library as long as stable code cannot be broken (they are properly gated) this means if the compiler teams thinks it's okay, then it's okay. My priority on constifying would be: 1. Non-generic impls (e.g. Default) or generic impls with no bounds 2. Generic functions with bounds (that use const impls) 3. Generic impls with bounds 4. Impls for traits with associated types For people opening constification PRs: please cc me and/or oli-obk.	2021-08-17 07:15:54 +00:00
Ali Malik	e43254aad1	Fix may not to appropriate might not or must not	2021-07-29 01:15:20 -04:00
Cameron Steffen	28f7890a29	Add char array without ref Pattern impl	2021-07-28 16:13:46 -05:00
Cameron Steffen	5e4f2128e2	impl Pattern for char array	2021-07-28 16:13:45 -05:00
Frank Steffahn	69dd992f95	Add TrustedRandomAccessNoCoerce supertrait without requirements or guarantees about subtype coercions Update all the TrustedRandomAccess impls to also implement the new supertrait	2021-07-28 14:33:35 +02:00
Jacob Pratt	36f02f3523	Stabilize `const_fn_transmute`	2021-07-27 16:03:09 -04:00
Alexis Bourget	101a146db9	Fix #85462 by adding a marker flag This will not affect ABI since the other variant of the enum is bigger. It may break some code, but that would be very strange: usually people don't continue after the first `Done` (or `None` for a normal iterator).	2021-07-11 17:45:12 +02:00
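
Several commits in this log (b6278664af on `utf8_is_cont_byte()`, 628b217326 on `Chars::count`, and edd318c313 on `{floor,ceil}_char_boundary`) rest on one idea: telling UTF-8 continuation bytes apart from bytes that start a code point. The following is a minimal sketch of that idea in stable Rust, not the code from those commits. The helper names `is_cont_byte`, `count_chars` and `floor_boundary` are hypothetical, and the byte-wise loop here is assumed to be far simpler than the optimized counting the commits describe.

```rust
/// Hypothetical helper: true if `byte` is a UTF-8 continuation byte
/// (bit pattern 0b10xxxxxx, i.e. 0x80..=0xBF).
/// Reinterpreted as `i8`, exactly those bytes fall in -128..=-65.
fn is_cont_byte(byte: u8) -> bool {
    (byte as i8) < -64
}

/// Hypothetical helper: counts chars in a `&str` by counting every byte
/// that is NOT a continuation byte. Each code point has exactly one such
/// leading byte, so no decoding is needed.
fn count_chars(s: &str) -> usize {
    s.bytes().filter(|&b| !is_cont_byte(b)).count()
}

/// Hypothetical helper: the largest char boundary that is <= `index`,
/// clamped to `s.len()`. Built only on the stable `str::is_char_boundary`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}
```

For a valid `&str`, `count_chars(s)` should agree with `s.chars().count()`, and `floor_boundary(s, i)` always returns an index at which slicing `s` will not panic.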

147 commits