rust/library
Matthias Krüger aa65c31c18
Rollup merge of #147932 - thaliaarchi:utf8-osstring, r=tgross35
Create UTF-8 version of `OsStr`/`OsString`

Implement a UTF-8 version of `OsStr`/`OsString`, in addition to the existing bytes and WTF-8 platform-dependent encodings.

This is applicable to several platforms, but so far I've only implemented it for Motor OS:

- WASI uses Unicode paths, but currently reexports the Unix bytes-assuming `OsStrExt`/`OsStringExt` traits.
  - [wasi:filesystem](https://wa.dev/wasi:filesystem) APIs:
    > Paths are passed as interface-type `strings`, meaning they must consist of a sequence of Unicode Scalar Values (USVs). Some filesystems may contain paths which are not accessible by this API.
  - In [wasi-filesystem#17](https://github.com/WebAssembly/wasi-filesystem/issues/17#issuecomment-1430639353), it was decided that applications can use any Unicode transformation format, so we're free to use UTF-8 (and probably already do). This was chosen over requiring UTF-8 specifically and over an ad hoc encoding that preserves paths not representable in UTF-8.
    > The current API uses strings for filesystem paths, which contains sequences of Unicode scalar values (USVs), which applications can work with using strings encoded in UTF-8, UTF-16, or other Unicode encodings.
    >
    > This does mean that the API is unable to open files which do not have well-formed Unicode encodings, which may want separate APIs for handling such paths or may want something like the arf-strings proposal, but if we need that we should file a new issue for it.
- As of Redox OS [0.7.0](https://www.redox-os.org/news/release-0.7.0/), "All paths are now required to be UTF-8, and the kernel enforces this". This appears to have been implemented in commit [d331f72f](d331f72f2a) (Use UTF-8 for all paths, 2021-02-14). Redox does not have `OsStrExt`/`OsStringExt`.
- Motor OS guarantees that its OS strings are UTF-8 in its [current `OsStrExt`/`OsStringExt` traits](a828ffcf5f/library/std/src/os/motor/ffi.rs), but they're still internally bytes like Unix.

This is an alternate approach to https://github.com/rust-lang/rust/pull/147797, which reuses the existing bytes `OsString` and relies on the safety properties of `from_encoded_bytes_unchecked`. Compared to that approach, this one also gains efficiency by propagating the UTF-8 invariant to the whole type, so it never needs to test for UTF-8 validity.
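
To illustrate the efficiency argument, here is a minimal sketch of the two representations. The type names `BytesOsString` and `Utf8OsString` and all of their methods are hypothetical and are not the identifiers used in this PR; the sketch only assumes that a bytes-backed string has to validate on conversion to `&str`, while a `String`-backed one can hand out `&str` directly.

```rust
use std::str::Utf8Error;
use std::string::FromUtf8Error;

/// Hypothetical bytes-backed OS string (Unix-style representation).
/// The type itself makes no promise about the encoding of `inner`.
struct BytesOsString {
    inner: Vec<u8>,
}

impl BytesOsString {
    fn from_string(s: String) -> Self {
        Self { inner: s.into_bytes() }
    }

    /// Has to scan the bytes on every call, because validity is not
    /// recorded in the type.
    fn to_str(&self) -> Result<&str, Utf8Error> {
        std::str::from_utf8(&self.inner)
    }
}

/// Hypothetical UTF-8-backed OS string. The invariant lives in the
/// inner `String`, so it holds for the whole type.
struct Utf8OsString {
    inner: String,
}

impl Utf8OsString {
    fn from_string(s: String) -> Self {
        Self { inner: s }
    }

    /// Validation happens once, at the boundary where raw bytes enter.
    fn from_bytes(bytes: Vec<u8>) -> Result<Self, FromUtf8Error> {
        String::from_utf8(bytes).map(|inner| Self { inner })
    }

    /// Infallible and free of any validity scan.
    fn as_str(&self) -> &str {
        &self.inner
    }
}

fn main() {
    let bytes_backed = BytesOsString::from_string("motor/ö.txt".to_owned());
    let utf8_backed = Utf8OsString::from_string("motor/ö.txt".to_owned());

    // Same contents, but only the bytes-backed type needs a fallible call.
    assert_eq!(bytes_backed.to_str().unwrap(), utf8_backed.as_str());

    // Invalid UTF-8 is rejected when constructing the UTF-8-backed type.
    assert!(Utf8OsString::from_bytes(vec![0xff, 0xfe]).is_err());
}
```

In this sketch the cost of validation moves from every read to the single point where untrusted bytes are converted, which is the trade-off the paragraph above describes.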

Note that Motor OS currently does not build until https://github.com/rust-lang/rust/pull/147930 merges.

cc `@tgross35` (for earlier review)
cc `@alexcrichton`, `@rylev`, `@loganek` (for WASI)
cc `@lasiotus` (for Motor OS)
cc `@jackpot51` (for Redox OS)
2025-10-22 07:12:11 +02:00
..
alloc Rollup merge of #141445 - yotamofek:pr/library/from-iter-char-string, r=the8472,joshtriplett 2025-10-22 07:12:08 +02:00
alloctests Rollup merge of #145113 - petrochenkov:lessfinalize, r=lcnr 2025-09-26 18:11:08 +02:00
backtrace@b65ab935fb Update the backtrace submodule 2025-06-16 07:00:13 +00:00
compiler-builtins Mark float intrinsics with no preconditions as safe 2025-09-21 20:37:51 -04:00
core Rollup merge of #147788 - clarfonthey:const-cell, r=oli-obk 2025-10-22 07:12:11 +02:00
coretests Rollup merge of #147788 - clarfonthey:const-cell, r=oli-obk 2025-10-22 07:12:11 +02:00
panic_abort Use core via rustc-std-workspace-core in library/panic* 2025-07-31 22:47:24 +00:00
panic_unwind Migrate panic_unwind to use cfg_select! 2025-08-20 16:45:24 -07:00
portable-simd docs(std): add missing closing code block fences in doc comments 2025-09-02 22:11:29 +02:00
proc_macro Rollup merge of #147497 - cyrgani:proc-macro-cleanups-3, r=petrochenkov 2025-10-14 16:30:58 +11:00
profiler_builtins Fix profiler_builtins build script to handle full path to profiler lib 2025-04-11 16:57:38 +02:00
rtstartup Update cfg(bootstrap) 2025-07-01 10:55:49 -07:00
rustc-std-workspace-alloc Disable unit tests for stdlib packages that don't contain any 2025-07-24 09:15:28 +00:00
rustc-std-workspace-core Use core via rustc-std-workspace-core in library/panic* 2025-07-31 22:47:24 +00:00
rustc-std-workspace-std Disable unit tests for stdlib packages that don't contain any 2025-07-24 09:15:28 +00:00
std motor: Use UTF-8 guarantee for OS strings 2025-10-21 16:36:10 -06:00
std_detect std_detect Darwin AArch64: synchronize features 2025-09-13 23:29:55 -07:00
stdarch Auto merge of #146683 - clarfonthey:safe-intrinsics, r=RalfJung,Amanieu 2025-09-22 14:35:46 +00:00
sysroot Add panic=immediate-abort 2025-09-21 13:12:18 -04:00
test Rollup merge of #142807 - sourcefrog:failfast, r=dtolnay 2025-09-17 14:56:41 +10:00
unwind Indent some code inside cfg_select! 2025-08-16 16:01:08 -07:00
windows_targets Rollup merge of #144399 - bjorn3:stdlib_tests_separate_packages, r=Mark-Simulacrum 2025-07-28 08:36:53 +02:00
Cargo.lock Rollup merge of #147000 - moturus:motor-os_stdlib_pr, r=tgross35 2025-10-16 19:35:23 +02:00
Cargo.toml Remove the std workspace patch for compiler-builtins 2025-08-19 18:56:35 +00:00