Rollup merge of #46050 - sunfishcode:read_to_end, r=sfackler

Optimize `read_to_end`.

This patch makes `read_to_end` use Vec's memory-growth pattern rather
than using a custom pattern.

This has some interesting effects:

 - If memory is reserved up front, `read_to_end` can be faster, as it
   starts reading at the buffer size, rather than always starting at 32
   bytes. This speeds up file reading by 2x in one of my use cases.

 - It can reduce the number of syscalls when reading large files.
   Previously, `read_to_end` would settle into a sequence of 8192-byte
   reads. With this patch, the read size follows Vec's allocation
   pattern. For example, on a 16MiB file, it can do 21 read syscalls
   instead of 2057. In simple benchmarks of large files though, overall
   speed is still dominated by the actual I/O.

 - A downside is that Read implementations that don't implement
   `initializer()` may see increased memory zeroing overhead.

I benchmarked this on a variety of data sizes, with and without
preallocated buffers. Most benchmarks see no difference, but reading
a small/medium file with a pre-allocated buffer is faster.
This commit is contained in:
kennytm 2017-11-22 01:12:58 +08:00 committed by GitHub
commit 9b090a0261

View file

@ -366,16 +366,13 @@ fn append_to_string<F>(buf: &mut String, f: F) -> Result<usize>
fn read_to_end<R: Read + ?Sized>(r: &mut R, buf: &mut Vec<u8>) -> Result<usize> {
let start_len = buf.len();
let mut g = Guard { len: buf.len(), buf: buf };
let mut new_write_size = 16;
let ret;
loop {
if g.len == g.buf.len() {
if new_write_size < DEFAULT_BUF_SIZE {
new_write_size *= 2;
}
unsafe {
g.buf.reserve(new_write_size);
g.buf.set_len(g.len + new_write_size);
g.buf.reserve(32);
let capacity = g.buf.capacity();
g.buf.set_len(capacity);
r.initializer().initialize(&mut g.buf[g.len..]);
}
}