Commit graph

214 commits

Author SHA1 Message Date
vallentin
116aacc2a9
Updated lines doc to include trailing carriage return note 2023-07-09 21:44:37 +02:00
Tshepang Mbambo
6bab85a456
str docs: remove "Basic usage" text where not useful
Not "useful" in that there is only one example given
2023-06-26 22:11:13 +02:00
Urgau
7f99c7d3e6 Add invalid_from_utf8 analogous to invalid_from_utf8_unchecked 2023-05-27 00:18:28 +02:00
Urgau
7f8846a9ef Uplift clippy::invalid_utf8_in_unchecked as invalid_from_utf8_unchecked 2023-05-27 00:16:47 +02:00
Scott McMurray
47232ade61 Fix some misleading and copy-pasted Pattern examples
These examples were listed twice and also were confusable with doing a substring match instead of a any-of-set match.
2023-05-14 23:03:50 -07:00
Scott McMurray
370d31b93d Constify [u8]::is_ascii (unstably)
UTF-8 checking in `const fn`-stabilized back in 1.63, but apparently somehow ASCII checking was never const-ified, despite being simpler.
2023-05-04 14:30:20 -07:00
Scott McMurray
8c781b0906 Add the basic ascii::Char type 2023-05-03 22:09:33 -07:00
Deadbeef
63e0ddbf1d core is now compilable 2023-04-16 07:20:26 +00:00
Deadbeef
76dbe29104 rm const traits in libcore 2023-04-16 06:49:27 +00:00
DaniPopes
a0daf22b95
Fix typos in library 2023-04-10 21:07:29 +02:00
Lukas Markeffsky
bdb0c7f427 add comment to impl !Error for &str 2023-03-30 22:19:28 +02:00
Dylan DPC
d694f47baa
Rollup merge of #100311 - xfix:lines-fix-handling-of-bare-cr, r=ChrisDenton
Fix handling of trailing bare CR in str::lines

Continuing from #91191.

Fixes #94435.
2023-03-23 00:00:30 +05:30
bors
7820b62d20 Auto merge of #105117 - pitaj:debug_asserts, r=the8472
Add more debug assertions to unsafe functions

related to #51713
2023-03-05 19:35:44 +00:00
Peter Jaszkowiak
cd35794d5e Comment for why char boundaries aren't checked 2023-03-04 15:11:24 -07:00
Konrad Borowski
bc3f6542f3 Remove manual implementation of str::ne 2023-03-02 16:32:04 +01:00
Peter Jaszkowiak
09f8885b3b debug assertions for slice::split_at_unchecked, str::get_unchecked 2023-01-21 12:50:03 -07:00
Lukas Markeffsky
76e216f29b Use associated items of char instead of freestanding items in core::char 2023-01-14 11:58:41 +01:00
bors
b435960c4c Auto merge of #95644 - WaffleLapkin:str_split_as_str_refactor_take2, r=Amanieu
`Split*::as_str` refactor

I've made this patch almost a year ago, so the rename and the behavior change are in one commit, sorry 😅

This fixes #84974, as it's required to make other changes work.

This PR
- Renames `as_str` method of string `Split*` iterators to `remainder` (it seems like the `as_str` name was confusing to users)
- Makes `remainder` return `Option<&str>`, to distinguish between "the iterator is exhausted" and "the tail is empty", this was [required on the tracking issue](https://github.com/rust-lang/rust/issues/77998#issuecomment-832696619)

r? `@m-ou-se`
2023-01-03 11:06:08 +00:00
jonathanCogan
db47071df2 Replace libstd, libcore, liballoc in line comments. 2022-12-30 14:00:42 +01:00
Stefano Zacchiroli
7f9438910c
str.lines() docstring: clarify that line endings are not returned
Previously, the str.lines() docstring stated that lines are split at line
endings, but not whether those were returned or not.  This new version of the
docstring states this explicitly, avoiding the need of getting to doctests to
get an answer to this FAQ.
2022-12-17 12:20:56 +01:00
Maybe Waffle
b458a49f26 Replace Split*::as_str with remainder
This commit
- Renames `Split*::{as_str -> remainder}` as it seems less confusing
- Makes `remainder` return Option<&str> to distinguish between
  "iterator is exhausted" and "the tail is empty"
2022-12-16 13:04:22 +00:00
Maybe Waffle
ca4989eac2 SplitInternal: always set finished in get_end 2022-12-16 12:57:22 +00:00
Eduardo Sánchez Muñoz
00e7b54d46 Make some trivial functions #[inline(always)] 2022-12-07 17:11:17 +01:00
Rageking8
58110572fb fix dupe word typos 2022-12-05 16:42:36 +08:00
Matthias Krüger
e4d1fe7b15
Rollup merge of #104436 - ismailmaj:add-slice-to-stack-allocated-string-comment, r=Mark-Simulacrum
Add slice to the stack allocated string comment

Precise that the "stack allocated string" is not a string but a string slice.

``@rustbot`` label +A-docs
2022-11-29 22:43:16 +01:00
Vojtech Kral
07ccf67f59 Document split{_ascii,}_whitespace() for empty strings 2022-11-24 15:22:24 +01:00
The 8472
3ed8fccff5 fix OOB access in SIMD impl of str.contains() 2022-11-22 20:59:19 +01:00
ismailmaj
005c6dfde6 type annotate &str when stack allocating a string 2022-11-21 10:38:04 +01:00
The 8472
a2b2010891 - convert from core::arch to core::simd
- bump simd compare to 32bytes
- import small slice compare code from memmem crate
- try a few different probe bytes to avoid degenerate cases
  - but special-case 2-byte needles
2022-11-15 18:30:31 +01:00
The 8472
3d4a8482b9 x86_64 SSE2 fast-path for str.contains(&str) and short needles
Based on Wojciech Muła's "SIMD-friendly algorithms for substring searching"[0]

The two-way algorithm is Big-O efficient but it needs to preprocess the needle
to find a "criticla factorization" of it. This additional work is significant
for short needles. Additionally it mostly advances needle.len() bytes at a time.

The SIMD-based approach used here on the other hand can advance based on its
vector width, which can exceed the needle length. Except for pathological cases,
but due to being limited to small needles the worst case blowup is also small.

benchmarks taken on a Zen2:

```
16CGU, OLD:
test str::bench_contains_short_short                     ... bench:          27 ns/iter (+/- 1)
test str::bench_contains_short_long                      ... bench:         667 ns/iter (+/- 29)
test str::bench_contains_bad_naive                       ... bench:         131 ns/iter (+/- 2)
test str::bench_contains_bad_simd                        ... bench:         130 ns/iter (+/- 2)
test str::bench_contains_equal                           ... bench:         148 ns/iter (+/- 4)


16CGU, NEW:
test str::bench_contains_short_short                     ... bench:           8 ns/iter (+/- 0)
test str::bench_contains_short_long                      ... bench:         135 ns/iter (+/- 4)
test str::bench_contains_bad_naive                       ... bench:         130 ns/iter (+/- 2)
test str::bench_contains_bad_simd                        ... bench:         292 ns/iter (+/- 1)
test str::bench_contains_equal                           ... bench:           3 ns/iter (+/- 0)


1CGU, OLD:
test str::bench_contains_short_short                     ... bench:          30 ns/iter (+/- 0)
test str::bench_contains_short_long                      ... bench:         713 ns/iter (+/- 17)
test str::bench_contains_bad_naive                       ... bench:         131 ns/iter (+/- 3)
test str::bench_contains_bad_simd                        ... bench:         130 ns/iter (+/- 3)
test str::bench_contains_equal                           ... bench:         148 ns/iter (+/- 6)

1CGU, NEW:
test str::bench_contains_short_short                     ... bench:          10 ns/iter (+/- 0)
test str::bench_contains_short_long                      ... bench:         111 ns/iter (+/- 0)
test str::bench_contains_bad_naive                       ... bench:         135 ns/iter (+/- 3)
test str::bench_contains_bad_simd                        ... bench:         274 ns/iter (+/- 2)
test str::bench_contains_equal                           ... bench:           4 ns/iter (+/- 0)
```


[0] http://0x80.pl/articles/simd-strfind.html#sse-avx2
2022-11-14 23:03:16 +01:00
Sky
9a7e527e28
Fix typo in ReverseSearcher docs 2022-10-17 13:14:15 -04:00
Konrad Borowski
cef81dcd0a Fix handling of trailing bare CR in str::lines
Previously "bare\r" was split into ["bare"] even though the
documentation said that only LF and CRLF count as newlines.

This fix is a behavioural change, even though it brings the behaviour
into line with the documentation, and into line with that of
`std::io::BufRead::lines()`.

This is an alternative to #91051, which proposes to document rather
than fix the behaviour.

Fixes #94435.

Co-authored-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
2022-10-06 16:05:38 +00:00
Konrad Borowski
0d8a0c56fe Rename LinesAnyMap to LinesMap
lines_any method was replaced with lines method, so it
makes sense to rename this structure to match new name.

Co-authored-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
2022-10-06 16:04:48 +00:00
Eduardo Sánchez Muñoz
9c7c232a50 Improve FromStr example
The `from_str` implementation from the example had an `unwrap` that would make it panic on invalid input strings. Instead of panicking, it nows returns an error to better reflect the intented behavior of the `FromStr` trait.
2022-10-02 11:32:56 +02:00
Pietro Albini
3975d55d98
remove cfg(bootstrap) 2022-09-26 10:14:45 +02:00
Guillaume Gomez
b4fdc5861d Add missing documentation for bool::from_str 2022-09-21 14:17:11 +02:00
Deadbeef
075084f772 Make const_eval_select a real intrinsic 2022-09-04 20:35:23 +08:00
Jane Losare-Lusby
bf7611d55e Move error trait into core 2022-08-22 13:28:25 -07:00
Matthias Krüger
a45f69f27d
Rollup merge of #100822 - WaffleLapkin:no_offset_question_mark, r=scottmcm
Replace most uses of `pointer::offset` with `add` and `sub`

As PR title says, it replaces `pointer::offset` in compiler and standard library with `pointer::add` and `pointer::sub`. This generally makes code cleaner, easier to grasp and removes (or, well, hides) integer casts.

This is generally trivially correct, `.offset(-constant)` is just `.sub(constant)`, `.offset(usized as isize)` is just `.add(usized)`, etc. However in some cases we need to be careful with signs of things.

r? ````@scottmcm````

_split off from #100746_
2022-08-21 16:54:07 +02:00
Maybe Waffle
e4720e1cf2 Replace most uses of pointer::offset with add and sub 2022-08-21 02:21:41 +04:00
Matthias Krüger
d49906519b
Rollup merge of #99544 - dylni:expose-utf8lossy, r=Mark-Simulacrum
Expose `Utf8Lossy` as `Utf8Chunks`

This PR changes the feature for `Utf8Lossy` from `str_internals` to `utf8_lossy` and improves the API. This is done to eventually expose the API as stable.

Proposal: rust-lang/libs-team#54
Tracking Issue: #99543
2022-08-20 19:32:07 +02:00
dylni
e8ee0b7b2b Expose Utf8Lossy as Utf8Chunks 2022-08-20 12:49:20 -04:00
Xavier Noria
64d1c91a31
ascii -> ASCII in code comment 2022-08-05 18:45:42 +02:00
David Herberth
c1c1abc08a Use split_once in FromStr docs 2022-07-18 08:57:43 +02:00
bors
5fb8a39266 Auto merge of #97367 - WaffleLapkin:stabilize_checked_slice_to_str_conv, r=dtolnay
Stabilize checked slice->str conversion functions

This PR stabilizes the following APIs as `const` functions in Rust 1.63:
```rust
// core::str

pub const fn from_utf8(v: &[u8]) -> Result<&str, Utf8Error>;

impl Utf8Error {
    pub const fn valid_up_to(&self) -> usize;
    pub const fn error_len(&self) -> Option<usize>;
}
```

Note that the `from_utf8_mut` function is not stabilized as unique references (`&mut _`) are [unstable in const context].

FCP: https://github.com/rust-lang/rust/issues/91006#issuecomment-1134593095

[unstable in const context]: https://github.com/rust-lang/rust/issues/57349
2022-06-19 05:51:42 +00:00
Imbolc
acda8866cc
Document an edge case of str::split_once 2022-06-13 13:35:49 +03:00
Maybe Waffle
89295352ee Allow some internal instability 2022-05-26 13:00:14 +04:00
Maybe Waffle
138aa99503 Stabilize checked slice->str conversion functions 2022-05-24 22:58:28 +04:00
Guillaume Gomez
1210b50dbb
Rollup merge of #97190 - SylvainDe:master, r=Dylan-DPC
Add implicit call to from_str via parse in documentation

The documentation mentions "FromStr’s from_str method is often used implicitly,
through str’s parse method. See parse’s documentation for examples.".

It may be nicer to show that in the code example as well.
2022-05-21 11:39:48 +02:00
Dan Gohman
b836cf6fb8 Say "last" instead of "rightmost" in the documentation for std::str::rfind.
In the documentation comment for `std::str::rfind`, say "last" instead
of "rightmost" to describe the match that `rfind` finds. This follows the
spirit of #30459, for which `trim_left` and `trim_right` were replaced by
`trim_start` and `trim_end` to be more clear about how they work on
text which is displayed right-to-left.
2022-05-19 15:31:17 -07:00