user0/rust

Author	SHA1	Message	Date
bors	149e76f12c	Auto merge of #38018 - sourcefrog:doc, r=alexcrichton Document that Process::command will search the PATH	2016-12-01 11:35:19 +00:00
Jeremy Soller	729442206c	Cleanup env	2016-11-30 21:50:17 -07:00
bors	070fad1701	Auto merge of #37573 - ruuda:faster-cursor, r=alexcrichton Add small-copy optimization for copy_from_slice ## Summary During benchmarking, I found that one of my programs spent between 5 and 10 percent of the time doing memmoves. Ultimately I tracked these down to single-byte slices being copied with a memcopy. Doing a manual copy if the slice contains only one element can speed things up significantly. For my program, this reduced the running time by 20%. ## Background I am optimizing a program that relies heavily on reading a single byte at a time. To avoid IO overhead, I read all data into a vector once, and then I use a `Cursor` around that vector to read from. During profiling, I noticed that `__memmove_avx_unaligned_erms` was hot, taking up 7.3% of the running time. It turns out that these were caused by calls to `Cursor::read()`, which calls `<&[u8] as Read>::read()`, which calls `&[T]::copy_from_slice()`, which calls `ptr::copy_nonoverlapping()`. This one is implemented as a memcopy. Copying a single byte with a memcopy is very wasteful, because (at least on my platform) it involves calling `memcpy` in libc. This is an indirect call when libc is linked dynamically, and furthermore `memcpy` is optimized for copying large amounts of data at the cost of a bit of overhead for small copies. ## Benchmarks Before I made this change, `perf` reported the following for my program. I only included the relevant functions, and how they rank. (This is on a different machine than where I ran the original benchmarks. It has an older CPU, so `__memmove_sse2_unaligned_erms` is called instead of `__memmove_avx_unaligned_erms`.) ``` #3 5.47% bench_decode libc-2.24.so [.] __memmove_sse2_unaligned_erms #5 1.67% bench_decode libc-2.24.so [.] memcpy@GLIBC_2.2.5 #6 1.51% bench_decode bench_decode [.] memcpy@plt ``` `memcpy` is eating up 8.65% of the total running time, and the overhead of dispatching to a specialized fast copy function (`memcpy@GLIBC` showing up) is clearly visible. The price of dynamic linking (`memcpy@plt` showing up) is visible too. After this change, this is what `perf` reports: ``` #5 0.33% bench_decode libc-2.24.so [.] __memmove_sse2_unaligned_erms #14 0.01% bench_decode libc-2.24.so [.] memcpy@GLIBC_2.2.5 ``` Now only 0.34% of the running time is spent on memcopies. The dynamic linking overhead is not significant at all any more. To add some more data, my program generates timing results for the operation in its main loop. These are the timings before and after the change: \| Time before \| Time after \| After/Before \| \|---------------\|---------------\|--------------\| \| 29.8 ± 0.8 ns \| 23.6 ± 0.5 ns \| 0.79 ± 0.03 \| The time is basically the total running time divided by a constant; the actual numbers are not important. This change reduced the total running time by 21% (much more than the original 9% spent on memmoves, likely because the CPU is stalling a lot less because data dependencies are more transparent). Of course YMMV and for most programs this will not matter at all. But when it does, the gains can be significant! ## Alternatives * At first I implemented this in `io::Cursor`. I moved it to `&[T]::copy_from_slice()` instead, but this might be too intrusive, especially because it applies to all `T`, not just `u8`. To restrict this to `io::Read`, `<&[u8] as Read>::read()` is probably the best place. * I tried copying bytes in a loop up to 64 or 8 bytes before calling `Read::read`, but both resulted in about a 20% slowdown instead of speedup.	2016-12-01 02:52:09 +00:00
Ted Mielczarek	e6975e9748	just add one method named creation_flags, fix the tidy error	2016-11-30 21:31:47 -05:00
Martin Pool	db93677360	Document that Process::command will search the PATH	2016-11-30 17:10:32 -08:00
Ted Mielczarek	8b1c4cbbaf	Add std::os::windows::process::CommandExt, with set_creation_flags and add_creation_flags methods. Fixes #37827 This adds a CommandExt trait for Windows along with an implementation of it for std::process::Command with methods to set the process creation flags that are passed to CreateProcess.	2016-11-30 19:44:07 -05:00
Theodore DeRego	8d9d07a1ca	Removed Option<ExitStatus> member from fuchsia Process struct. Destroy launchpads and close handles in Drop impls rather than manually	2016-11-30 14:20:44 -08:00
Alex Crichton	2186660b51	Update the bootstrap compiler Now that we've got a beta build, let's use it!	2016-11-30 10:38:08 -08:00
Ruud van Asseldonk	3be2c3b309	Move small-copy optimization into <&[u8] as Read> Based on the discussion in https://github.com/rust-lang/rust/pull/37573, it is likely better to keep this limited to std::io, instead of modifying a function which users expect to be a memcpy.	2016-11-30 11:09:29 +01:00
Ruud van Asseldonk	341805288e	Move small-copy optimization into copy_from_slice Ultimately copy_from_slice is being a bottleneck, not io::Cursor::read. It might be worthwhile to move the check here, so more places can benefit from it.	2016-11-30 11:09:29 +01:00
Ruud van Asseldonk	cd7fade0a9	Add small-copy optimization for io::Cursor During benchmarking, I found that one of my programs spent between 5 and 10 percent of the time doing memmoves. Ultimately I tracked these down to single-byte slices being copied with a memcopy in io::Cursor::read(). Doing a manual copy if only one byte is requested can speed things up significantly. For my program, this reduced the running time by 20%. Why special-case only a single byte, and not a "small" slice in general? I tried doing this for slices of at most 64 bytes and of at most 8 bytes. In both cases my test program was significantly slower.	2016-11-30 11:09:29 +01:00
Corey Farwell	274777a158	Rename 'librustc_unicode' crate to 'libstd_unicode'. Fixes #26554.	2016-11-30 01:24:01 -05:00
Guillaume Gomez	336e5dd33d	Add missing examples for IpAddr enum	2016-11-29 19:44:53 -08:00
Jeremy Soller	e68393397a	Commit to fix make tidy	2016-11-28 21:07:26 -07:00
Jeremy Soller	6378c77716	Remove file path from std::fs::File	2016-11-28 20:21:19 -07:00
Jeremy Soller	1d0bba8224	Move stdout/err flush into sys	2016-11-28 18:25:47 -07:00
Jeremy Soller	2ec21327f2	Switch to using Prefix::Verbatim	2016-11-28 18:19:17 -07:00
Jeremy Soller	746222fd9d	Switch to using syscall crate directly - without import	2016-11-28 18:07:19 -07:00
Alex Crichton	ecc60106c9	std: Fix partial writes in LineWriter Previously the `LineWriter` could successfully write some bytes but then fail to report that it has done so. Additionally, an erroneous flush after a successful write was permanently ignored. This commit fixes these two issues by (a) maintaining a `need_flush` flag to indicate whether a flush should be the first operation in `LineWriter::write` and (b) avoiding returning an error once some bytes have been successfully written. Closes #37807	2016-11-28 15:05:04 -08:00
bors	c7ddb8946b	Auto merge of #38019 - sourcefrog:doc-separator, r=frewsxcv Clearer description of std::path::MAIN_SEPARATOR.	2016-11-27 20:22:44 -06:00
bors	03bdaade2a	Auto merge of #38022 - arthurprs:micro-opt-hm, r=bluss Use displacement instead of initial bucket in HashMap code Use displacement instead of initial bucket in HashMap code. It makes the code a bit cleaner and also saves a few instructions (handy since it'll be using some to do some sort of adaptive behavior soon).	2016-11-27 17:06:58 -06:00
arthurprs	178e29df7d	Use displacement instead of initial bucket in HashMap code	2016-11-27 21:38:46 +01:00
bors	2008732975	Auto merge of #37983 - GuillaumeGomez:tcp_listener_doc, r=frewsxcv Add examples for TcpListener struct r? @frewsxcv	2016-11-27 10:39:41 -06:00
Guillaume Gomez	f216f1fc53	Add examples for TcpListener struct	2016-11-27 13:00:31 +01:00
bors	9a8657925b	Auto merge of #38004 - GuillaumeGomez:tcp_stream_doc, r=frewsxcv Add missing urls and examples to TcpStream r? @frewsxcv	2016-11-26 15:37:34 -06:00
Guillaume Gomez	ebcc6d2571	Add part of missing UdpSocket's urls and examples	2016-11-26 21:35:41 +01:00
Martin Pool	591c134456	Clearer description of std::path::MAIN_SEPARATOR.	2016-11-26 09:24:48 -08:00
Seo Sanghyeon	44b926a6bb	Rollup merge of #38010 - frewsxcv:lock-creations, r=GuillaumeGomez Document how lock 'guard' structures are created.	2016-11-26 22:02:15 +09:00
Seo Sanghyeon	f9f92e12c7	Rollup merge of #38001 - vickenty:patch-1, r=steveklabnik Follow our own recommendations in the examples Remove exclamation marks from the example error descriptions: > The description [...] should not contain newlines or sentence-ending punctuation	2016-11-26 22:02:14 +09:00
Seo Sanghyeon	18f4006e09	Rollup merge of #37985 - frewsxcv:completed-fixme, r=petrochenkov Remove completed FIXME. https://github.com/rust-lang/rust/issues/30530	2016-11-26 22:02:14 +09:00
Seo Sanghyeon	eeac361f52	Rollup merge of #37978 - fkjogu:master, r=sfackler Define `bound` argument in std::sync::mpsc::sync_channel in the documentation The `bound` argument in `std::sync::mpsc::sync_channel(bound: usize)` was not defined in the documentation.	2016-11-26 22:02:14 +09:00
Seo Sanghyeon	a809749fdf	Rollup merge of #37962 - GuillaumeGomez:socket-v6, r=frewsxcv Add missing examples to SocketAddrV6 r? @steveklabnik cc @frewsxcv	2016-11-26 22:02:13 +09:00
Jeremy Soller	d73d32f58d	Fix canonicalize	2016-11-25 19:53:21 -07:00
Jeremy Soller	3a1bb2ba26	Use O_DIRECTORY	2016-11-25 18:23:19 -07:00
Corey Farwell	6075af4ac0	Document how the `MutexGuard` structure is created. Also, end sentence with a period.	2016-11-25 19:08:26 -05:00
Corey Farwell	6b4de8bf91	Document how the `RwLockWriteGuard` structure is created.	2016-11-25 18:57:11 -05:00
Corey Farwell	276d91d8cb	Document how the `RwLockReadGuard` structure is created.	2016-11-25 18:57:09 -05:00
Guillaume Gomez	56529cd286	Add missing urls and examples to TcpStream	2016-11-25 23:45:43 +01:00
Vickenty Fesunov	a3ce39898c	Follow our own recommendations in the examples Remove exclamation marks from the example error descriptions: > The description [...] should not contain newlines or sentence-ending punctuation	2016-11-25 17:59:04 +01:00
Corey Farwell	e1269ff688	Remove completed FIXME. https://github.com/rust-lang/rust/issues/30530	2016-11-24 16:26:21 -05:00
fkjogu	a3e03e42e1	Define `bound` argument in std::sync::mpsc::sync_channel The `bound` argument in `std::sync::mpsc::sync_channel(bound: usize)` was not defined in the documentation.	2016-11-24 09:49:30 +01:00
Jorge Aparicio	ba07a1b58d	std: make compilation of libpanic_unwind optional via a Cargo feature. With this feature disabled, you can (Cargo) compile std with "panic=abort". rustbuild will build std with this feature enabled, to maintain the status quo. Fixes #37252	2016-11-23 21:49:54 -05:00
Theodore DeRego	5c1c48532f	Separated fuchsia-specific process stuff into 'process_fuchsia.rs' and refactored out some now-duplicated code into a 'process_common.rs'	2016-11-23 13:58:13 -08:00
Jeremy Soller	6733074c84	Allow setting nonblock on sockets	2016-11-23 14:22:39 -07:00
Guillaume Gomez	559141c827	Add missing examples to SocketAddrV6	2016-11-23 17:14:41 +01:00
Jeremy Soller	4a0bc71bb7	Add File set_permissions	2016-11-23 08:24:49 -07:00
Jeremy Soller	b3c91dfb6a	Merge branch 'master' into redox	2016-11-23 08:21:15 -07:00
Guillaume Gomez	a5049f7bba	Add ::1 example in IPv6 to IPv4 conversion	2016-11-23 12:24:04 +01:00
Guillaume Gomez	cfc7fce2f0	Rollup merge of #37925 - jtdowney:env-args-doc-links, r=steveklabnik Add some internal docs links for Args/ArgsOs In many places the docs link to other sections and I noticed it was lacking here. Not sure if there is a standard for if inter-linking is appropriate.	2016-11-23 12:18:10 +01:00
Guillaume Gomez	881115c896	Rollup merge of #37913 - GuillaumeGomez:socket-v4, r=frewsxcv Add missing examples for SocketAddrV4 r? @steveklabnik cc @frewsxcv	2016-11-23 12:18:10 +01:00
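
The small-copy optimization described in the #37573 entries above (special-casing a one-byte read so it does not go through a bulk copy) can be sketched roughly as follows. This is an illustrative sketch based only on the commit descriptions, not the code that was merged; the free function `read_small_copy` is a hypothetical name introduced here, standing in for the body of `<&[u8] as Read>::read()`.

```rust
/// Hypothetical helper (name assumed for illustration): copies as many
/// bytes as fit from `src` into `buf`, advances `src` past them, and
/// returns the number of bytes copied.
fn read_small_copy(src: &mut &[u8], buf: &mut [u8]) -> usize {
    let amt = std::cmp::min(buf.len(), src.len());
    let (head, tail) = src.split_at(amt);

    if amt == 1 {
        // Exactly one byte: a plain indexed store avoids the bulk-copy
        // path, which the commit message reports as costly for tiny copies.
        buf[0] = head[0];
    } else {
        // Zero bytes or more than one byte: use the regular bulk copy.
        buf[..amt].copy_from_slice(head);
    }

    // Advance the source slice past the bytes just consumed.
    *src = tail;
    amt
}
```

Only the single-byte case is special-cased here because the commit messages report that extending the manual copy to slices of up to 8 or 64 bytes made the author's test program slower; whether that holds for other workloads is not established by this log.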

10095 commits