# Rationale
Many string functions deal with either a `char` (a Unicode code point) or a
byte (a unit of the UTF-8 encoding). Method names are inconsistent about
whether they contain "byte", "char", or neither. There are also open issues
proposing to rename *all* methods to reflect that they operate on UTF-8
encodings or bytes (e.g. utf8_len() or byte_len()).
The current state of String seems to be largely what is desired, so this PR
proposes the following rationale for methods dealing with bytes or characters:
> When constructing a string, the input encoding *must* be mentioned (e.g.
> from_utf8). This makes it clear what exactly the input type is expected to be
> in terms of encoding.
>
> When a method operates on anything related to an *index* within the string
> such as length, capacity, position, etc, the method *implicitly* operates on
> bytes. It is an understood fact that String is a utf-8 encoded string, and
> burdening all methods with "bytes" would be redundant.
>
> When a method operates on the *contents* of a string, such as push() or pop(),
> then "char" is the default type. A String can loosely be thought of as being a
> collection of unicode codepoints, but not all collection-related operations
> make sense because some can be woefully inefficient.
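
The three rules above can be illustrated with a small sketch. It is written against the `String` API as it exists in current stable Rust, which is an assumption: the signatures at the time of this PR may differ. The helper names (`byte_len_and_char_count`, `push_then_pop`, `string_from_utf8_bytes`) are hypothetical and exist only for this illustration.

```rust
// Sketch against current stable Rust; the API at the time of this PR may differ.
// All function names below are hypothetical helpers for illustration.

/// Index-like quantities such as length are measured in bytes,
/// while iterating over the contents yields `char`s.
fn byte_len_and_char_count(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Content operations such as push() and pop() work on whole `char`s,
/// even when a char occupies more than one byte.
fn push_then_pop(s: &mut String, c: char) -> Option<char> {
    s.push(c);
    s.pop()
}

/// Constructors name their input encoding (here: from_utf8).
/// Returns None when the bytes are not valid UTF-8.
fn string_from_utf8_bytes(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}
```

For a string such as "héllo", the first helper reports a byte length larger than the character count, because `é` takes two bytes in UTF-8.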
# Method stabilization
The following methods have been marked #[stable]
* The String type itself
* String::new
* String::with_capacity
* String::from_utf16_lossy
* String::into_bytes
* String::as_bytes
* String::len
* String::clear
* String::as_slice
The following methods have been marked #[unstable]
* String::from_utf8 - The error type in the returned `Result` may change to
provide a nicer message when it's `unwrap()`'d
* String::from_utf8_lossy - The returned `MaybeOwned` type still needs
stabilization
* String::from_utf16 - The return type may change to become a `Result` which
includes more contextual information like where the error
occurred.
* String::from_chars - This is equivalent to iter().collect(), but currently not
as ergonomic.
* String::from_char - This method is the equivalent of Vec::from_elem, and has
been marked #[unstable] because it can be seen as a
duplicate of iterator-based functionality and may
possibly be renamed.
* String::push_str - This *can* be emulated with .extend(foo.chars()), but is
less efficient because of decoding/encoding. Due to the
desire to minimize API surface this may be able to be
removed in the future for something possibly generic with
no loss in performance.
* String::grow - This is a duplicate of iterator-based functionality, which may
become more ergonomic in the future.
* String::capacity - This function was just added.
* String::push - This function was just added.
* String::pop - This function was just added.
* String::truncate - The failure conventions around String methods and byte
indices aren't totally clear at this time, so the failure
semantics and return value of this method are subject to
change.
* String::as_mut_vec - the naming of this method may change.
* string::raw::* - these functions are all waiting on [an RFC][2]
[2]: rust-lang/rfcs#240
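
Several of the #[unstable] notes above describe a method as a duplicate of iterator-based functionality. The following sketch shows what those iterator-based equivalents can look like. It assumes the current stable Rust API rather than the one in this PR, and the helper names are hypothetical.

```rust
use std::iter;

// Sketch against current stable Rust; helper names are hypothetical.

/// Appends `src` directly as UTF-8 bytes, with no per-char decoding.
fn append_with_push_str(dst: &mut String, src: &str) {
    dst.push_str(src);
}

/// Produces the same result through the iterator route: each char is
/// decoded from `src` and re-encoded into `dst`.
fn append_with_extend(dst: &mut String, src: &str) {
    dst.extend(src.chars());
}

/// Iterator-based counterpart of from_chars: collect chars into a String.
fn collect_chars(chars: &[char]) -> String {
    chars.iter().collect()
}

/// Iterator-based counterpart of from_char: repeat one char `n` times.
fn repeat_char(c: char, n: usize) -> String {
    iter::repeat(c).take(n).collect()
}
```

Both append helpers leave `dst` with identical contents; the difference the PR points at is only the decoding and encoding work done along the way.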
The following methods have been marked #[experimental]
* String::from_str - This function only exists as it's more efficient than
to_string(), but having a less ergonomic function for
performance reasons isn't the greatest reason to keep it
around. Like Vec::push_all, this has been marked
experimental for now.
The following methods have been #[deprecated]
* String::append - This method has been deprecated to remain consistent with the
deprecation of Vec::append. While convenient, it is one of
the only functional-style APIs on String, and requires more
thought as to whether it belongs as a first-class method or
not (and how it relates to other collections).
* String::from_byte - This is fairly rare functionality and can be emulated with
str::from_utf8 plus an assert plus a call to to_string().
Additionally, String::from_char could possibly be used.
* String::byte_capacity - Renamed to String::capacity due to the rationale
above.
* String::push_char - Renamed to String::push due to the rationale above.
* String::pop_char - Renamed to String::pop due to the rationale above.
* String::push_bytes - There are a number of `unsafe` functions on the `String`
type which allow bypassing utf-8 checks. These have all
been deprecated in favor of calling `.as_mut_vec()` and
then operating directly on the vector returned. These
methods were deprecated because naming them with relation
to other methods was difficult to rationalize and it's
arguably more composable to call .as_mut_vec().
* String::as_mut_bytes - See push_bytes
* String::push_byte - See push_bytes
* String::pop_byte - See push_bytes
* String::shift_byte - See push_bytes
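
The replacements suggested for `from_byte` and the byte-level methods can be sketched as follows. This assumes the current stable Rust API, where `as_mut_vec` is an `unsafe` method, and it restricts itself to ASCII bytes so that the UTF-8 invariant is easy to uphold. Both helper names are hypothetical.

```rust
use std::str;

// Sketch against current stable Rust; helper names are hypothetical.

/// Emulates from_byte with str::from_utf8 plus an assert plus to_string().
fn string_from_byte(b: u8) -> String {
    // Only ASCII bytes are valid one-byte UTF-8 sequences.
    assert!(b.is_ascii(), "byte must be ASCII");
    let buf = [b];
    str::from_utf8(&buf).expect("ASCII is valid UTF-8").to_string()
}

/// Emulates push_byte by operating directly on the underlying vector.
fn push_raw_ascii_byte(s: &mut String, b: u8) {
    assert!(b.is_ascii(), "byte must be ASCII");
    // SAFETY: appending an ASCII byte to valid UTF-8 keeps it valid UTF-8.
    unsafe {
        s.as_mut_vec().push(b);
    }
}
```

Going through the vector keeps the `unsafe` at the call site, where the caller has to justify that the bytes remain valid UTF-8.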
# Reservation methods
This commit does not yet touch the methods for reserving bytes. The methods on
Vec have also not yet been modified. These methods are discussed in the upcoming
[Collections reform RFC][1].
[1]: https://github.com/aturon/rfcs/blob/collections-conventions/active/0000-collections-conventions.md#implicit-growth
# The Rust Programming Language
This is a compiler for Rust, including standard libraries, tools and documentation.
## Quick Start
- Download a binary installer for your platform.
- Read the guide.
- Enjoy!
Note: Windows users can read the detailed getting started notes on the wiki.
## Building from Source

- Make sure you have installed the dependencies:
    - `g++` 4.7 or `clang++` 3.x
    - `python` 2.6 or later (but not 3.x)
    - `perl` 5.0 or later
    - GNU `make` 3.81 or later
    - `curl`
    - `git`
- Download and build Rust:

    You can either download a tarball or build directly from the repo.

    To build from the tarball do:

        $ curl -O https://static.rust-lang.org/dist/rust-nightly.tar.gz
        $ tar -xzf rust-nightly.tar.gz
        $ cd rust-nightly

    Or to build from the repo do:

        $ git clone https://github.com/rust-lang/rust.git
        $ cd rust

    Now that you have Rust's source code, you can configure and build it:

        $ ./configure
        $ make && make install

    Note: You may need to use `sudo make install` if you do not normally have permission to modify the destination directory. The install locations can be adjusted by passing a `--prefix` argument to `configure`. Various other options are also supported, pass `--help` for more information on them.

    When complete, `make install` will place several programs into `/usr/local/bin`: `rustc`, the Rust compiler, and `rustdoc`, the API-documentation tool.
- Read the guide.
- Enjoy!

## Building on Windows

To easily build on Windows we can use MSYS2:

- Grab the latest MSYS2 installer and go through the installer.
- Now from the MSYS2 terminal we want to install the mingw64 toolchain and the other tools we need.

        $ pacman -S mingw-w64-i686-toolchain
        $ pacman -S base-devel

- With that now start `mingw32_shell.bat` from where you installed MSYS2 (i.e. `C:\msys`).
- From there just navigate to where you have Rust's source code, configure and build it:

        $ ./configure
        $ make && make install

## Notes
Since the Rust compiler is written in Rust, it must be built by a precompiled "snapshot" version of itself (made in an earlier state of development). As such, source builds require a connection to the Internet, to fetch snapshots, and an OS that can execute the available snapshot binaries.
Snapshot binaries are currently built and tested on several platforms:
- Windows (7, 8, Server 2008 R2), x86 only
- Linux (2.6.18 or later, various distributions), x86 and x86-64
- OSX 10.7 (Lion) or greater, x86 and x86-64
You may find that other platforms work, but these are our officially supported build environments that are most likely to work.
Rust currently needs about 1.5 GiB of RAM to build without swapping; if it hits swap, it will take a very long time to build.
There is a lot more documentation in the wiki.
## Getting help and getting involved
The Rust community congregates in a few places:
- StackOverflow - Get help here.
- /r/rust - General discussion.
- discuss.rust-lang.org - For development of the Rust language itself.
## License
Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.