Windows in particular can experience slow encoding speeds on highly
fragmented disks. Use setvbuf to increase the size of the buffer to
10Meg.
This is probably not needed on Linux/Unix, but is very unlikely
to cause any negative effects.
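A minimal sketch of the idea, assuming a stream opened with fopen. The
helper name open_with_big_buffer and the exact byte count are illustrative
assumptions, not taken from the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed reading of "10Meg" as 10 * 1024 * 1024 bytes. */
#define BIG_BUFFER_SIZE (10u * 1024u * 1024u)

/* Hypothetical helper: open a file for writing and request a large,
 * fully buffered stdio buffer for it. */
static FILE *open_with_big_buffer(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "wb");
    if (f == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream.
     * A NULL buffer asks the C library to allocate the buffer itself;
     * how strictly the size is honored is implementation-specific.
     * A nonzero return means the request failed and the stream keeps
     * its default buffering, so the failure is not fatal. */
    if (setvbuf(f, NULL, _IOFBF, BIG_BUFFER_SIZE) != 0)
        fprintf(stderr, "warning: could not enlarge the I/O buffer\n");

    return f;
}
```

Fewer, larger writes reach the file system this way, which is the effect
the change is after on fragmented disks.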
Patch-from: Janne Hyvärinen <cse@sci.fi>
* Changes flac_snprintf (in src/share/grabbag/snprintf.c) and its copy
local_snprintf (src/libFLAC/metadata_iterators.c) to be almost sane.
* Adds flac_vsnprintf (src/share/grabbag/snprintf.c) and its copy
local_vsnprintf (src/share/win_utf8_io/win_utf8_io.c).
* Changes stats_print_info in src/flac/utils.c so it uses flac_vsnprintf
instead of vsnprintf. This makes return value checking unnecessary.
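The behaviour a "sane" wrapper is expected to have can be sketched as
follows. This is not the actual flac_vsnprintf; the names safe_vsnprintf
and safe_snprintf are hypothetical, and the sketch assumes a C99
vsnprintf:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: the result is always NUL-terminated and the
 * return value is the number of characters actually stored, so callers
 * do not need to check for errors or truncation. */
static int safe_vsnprintf(char *str, size_t size, const char *fmt, va_list va)
{
    int rc;

    assert(str != NULL && fmt != NULL && size > 0);
    rc = vsnprintf(str, size, fmt, va);
    if (rc < 0) {
        /* Formatting error: hand back an empty string. */
        str[0] = '\0';
        rc = 0;
    } else if ((size_t)rc >= size) {
        /* Truncated: report what fits instead of what was wanted. */
        rc = (int)(size - 1);
        str[rc] = '\0';
    }
    return rc;
}

static int safe_snprintf(char *str, size_t size, const char *fmt, ...)
{
    va_list va;
    int rc;

    va_start(va, fmt);
    rc = safe_vsnprintf(str, size, fmt, va);
    va_end(va);
    return rc;
}
```

With a return value that never exceeds size - 1, code that advances a
write position by the returned length cannot run past the buffer.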
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Adds two new apodization functions that seem to perform better than
the apodization functions currently in the codebase and fixes three
existing windows as well.
It's important to note that this patch only affects the encoder stage
that evaluates various possible predictors. Audio encoded with these
new windows will still decode with existing legacy decoders.
= Theory =
These functions are used to window the audio data at the predictor
stage. These new functions enable the use of only part of the signal
to generate a predictor. This helps because short transients can
introduce noise into the predictor. The predictor becomes very good
at predicting one part of the signal, instead of mediocre for the
whole block.
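To illustrate what "using only part of the signal" means, here is a
sketch of a window that is zero outside a chosen fraction of the block.
It is not one of the functions added by this patch: the name
partial_window_sketch and its parameters are hypothetical, and a linear
ramp stands in for a cosine taper to keep the example free of libm:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partial window: samples outside [start, end) of the block
 * (given as fractions of the block length) get weight 0. Inside, the
 * weight ramps linearly up and down over `taper` (a fraction of the
 * non-zero part, split over both edges) and is 1 in between. */
static void partial_window_sketch(float *window, size_t len,
                                  double start, double end, double taper)
{
    size_t first, last, n, i;
    double edge;

    assert(window != NULL && len > 0);
    assert(0.0 <= start && start < end && end <= 1.0);
    assert(0.0 <= taper && taper <= 1.0);

    first = (size_t)(start * (double)len);
    last = (size_t)(end * (double)len);
    n = last - first;               /* number of samples with weight > 0 */
    edge = 0.5 * taper * (double)n; /* length of each ramp in samples */

    for (i = 0; i < len; i++)
        window[i] = 0.0f;

    for (i = 0; i < n; i++) {
        double from_start = (double)i + 0.5;
        double from_end = (double)n - (double)i - 0.5;
        double w = 1.0;

        if (from_start < edge)
            w = from_start / edge;
        else if (from_end < edge)
            w = from_end / edge;
        window[first + i] = (float)w;
    }
}
```

Multiplying a block by such a window before analysis means a transient
outside the selected region contributes nothing to the predictor derived
from it.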
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
* Removes FLAC__lpc_restore_signal_asm_ppc_altivec_16*
from lpc.h and stream_decoder.c
* Removes PPC-specific code from cpu.c and cpu.h
* Removes PPC stuff from libFLAC/Makefile.lite and build/*.mk
* Removes as/gas/PPC-specific stuff from configure.ac and
libFLAC/Makefile.am*
* Removes libFLAC/ppc folder and removes "src/libFLAC/ppc*/Makefile"
lines from configure.ac
Patch-from: lvqcl <lvqcl.mail@gmail.com>
According to the patch author, GCC can optimize expressions like
"(a<<8)|(a>>8)", but has problems with "(a<<8)+(a>>8)".
Patch-from: lvqcl <lvqcl.mail@gmail.com>
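For the GCC optimization note above, a sketch of the two forms on a
16-bit value. The function name swap16 is hypothetical; the claim about
which form GCC optimizes is the patch author's, the sketch only shows
that both forms compute the same result:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit byte swap. The OR form is the one the compiler is
 * reported to recognise; the assert documents that the ADD form yields
 * the same value, because the two shifted halves never overlap. */
static uint16_t swap16(uint16_t a)
{
    uint16_t r = (uint16_t)((a << 8) | (a >> 8));

    assert(r == (uint16_t)((a << 8) + (a >> 8)));
    return r;
}
```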
Accelerate FLAC__lpc_compute_autocorrelation_intrin_sse_lag_NN routines for
AMD and newer Intel CPUs (i.e. Core i, aka Nehalem, and newer). Unfortunately
it's slower on older Intel CPUs.
According to tests at HA:
<http://www.hydrogenaud.io/forums/index.php?s=&showtopic=101082&view=findpost&p=870753>
CPU             flac -5    flac -8
Athlon XP       +5 %       +2.4 %
Athlon 64 X2    +9 %       +4 %
Core i          +7 %       +1 % ... +2.7 %
Core 2          ?          -3.5 %
According to Steam HW survey <http://store.steampowered.com/hwsurvey/>
69% of Steam users have SSE4.2 which means that the new code is faster for
them. There are also AMD users that don't have SSE4.2, so 75% of Steam users
should benefit from this patch.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
The HTML documentation was diffed against the current website pages
and all differences were merged into the pages where they weren't yet
incorporated. This includes lots of small fixes and
improvements.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
When using the -p switch during encoding, the encoder should try
different qlp predictor precision steps. However, some faulty code
was too severely restricting the possible steps. This patch lifts
the restriction to match a restriction coded a little further in
the process. This doesn't make using -p worth your while, but at
least it doesn't create larger files now.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Simplify the code that tries to detect whether the OS supports SSE instructions.
a) Linux: "old" vs "new" sigaction
OBSOLETE_SIGCONTEXT_FLAVOR was disabled in Mar 2007 in commit 1ca3a445f.
According to <http://unixhelp.ed.ac.uk/CGI/man-cgi?sigaction>: "Support for
SA_SIGINFO was added in Linux 2.2" (released in Jan 1999). If no one wants to
use FLAC with Linux kernel 2.0 then it's safe to delete this code.
b) MSVC: try/catch vs. sigill_handler
TRY_CATCH_FLAVOR was enabled in Jan 2009 in commit a832ef32. According to the
comment in cpu.c, "sigill_handler flavor resulted in several crash reports on
win32". Also this sigill_handler flavor is not thread-safe.
c) MinGW: fxsave/fxrestore vs. sigill_handler
The code was added Mar 2014 in commit 99d5154f. It's better to use FXSR flavor
instead of sigill_handler flavor. The reasons are the same as for MSVC.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Add new function:
FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse41()
and rewrite function:
FLAC__lpc_compute_residual_from_qlp_coefficients_16_intrin_sse2()
Testing shows noticeable speed increase on Intel Core i3/5/7 (up to 30%
for -8 mode), AMD Athlon64, Phenom, Bulldozer/Piledriver, but no increase
or even very small speed decrease (~2% for -8 mode) on Intel Core2.
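For reference, the per-sample computation such residual routines perform
can be sketched in plain C. This is a scalar illustration, not the SSE
code from this patch; the function name, the argument layout (warm-up
samples at the start of the array) and the 64-bit accumulator are
assumptions made for the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: for every sample after the first
 * `order` warm-up samples, predict it from the previous `order` samples
 * and store the difference. qlp_coeff[j] weights signal[i - j - 1].
 * `residual` must have room for len - order values. */
static void compute_residual_sketch(const int32_t *signal, size_t len,
                                    const int32_t *qlp_coeff, unsigned order,
                                    int lp_quantization, int32_t *residual)
{
    size_t i;
    unsigned j;

    assert(signal != NULL && qlp_coeff != NULL && residual != NULL);
    assert(order > 0 && order <= len);
    assert(lp_quantization >= 0 && lp_quantization < 32);

    for (i = order; i < len; i++) {
        int64_t sum = 0;

        for (j = 0; j < order; j++)
            sum += (int64_t)qlp_coeff[j] * signal[i - j - 1];

        /* Right-shifting a negative value is implementation-defined in C;
         * this assumes the usual arithmetic shift. */
        residual[i - order] = signal[i] - (int32_t)(sum >> lp_quantization);
    }
}
```

The inner multiply-accumulate loop is the part a SIMD version processes
several samples at a time.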
Patch-from: lvqcl <lvqcl.mail@gmail.com>
In FLAC 1.2.0, a new field 'FLAC__CPUInfo cpu_info' was added to the
FLAC__BitReader struct. It became useless in 1.3.0 because of various
bitreader optimizations.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Previously CFLAGS had a -O3 at the start and a -O2 at the end. According
to the GCC docs:
https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Optimize-Options.html
"If you use multiple -O options, with or without level numbers,
the last such option is the one that is effective", which means that
-O2 was in effect and GCC didn't try to use SIMD to vectorize the code, etc.
Windows (MSVC, MinGW) versions of setlocale don't care about LC_*
environment variables. For example, flac cannot pass the test for the
--until and --skip options: the script calls it with --skip=0:01.1001
but flac expects a decimal comma (--skip=0:01,1001) on some locales.
Solve this (on Windows) by calling setlocale(LC_ALL, "") if some
LC_* variable is set to "C".
Patch-from: lvqcl <lvqcl.mail@gmail.com>
The previous version of get_console_width() could return 0, which results in
a division by 0 in stats_print_name():
console_width = get_console_width();
len = strlen_console(name)+2;
console_chars_left = console_width - (len % console_width);
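One way to make the computation above safe is to never let the width be
zero. The sketch below is a stand-in, not the patched code: the function
name and the 80-column fallback are assumptions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the computation in stats_print_name().
 * A width of 0 is replaced by an assumed default of 80 columns before
 * it is used as a divisor. */
static size_t console_chars_left_sketch(size_t console_width, size_t name_len)
{
    size_t len = name_len + 2;
    size_t left;

    if (console_width == 0)
        console_width = 80; /* avoids len % 0 */

    left = console_width - (len % console_width);
    assert(left >= 1 && left <= console_width);
    return left;
}
```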
Bug-report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=739613
Patch-from: lvqcl <lvqcl.mail@gmail.com>
This function was unused because it showed no speed improvement over the
C version. As a result the bitreader_read_from_client_() function can be
made static again.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
These functions offer no speedup over the C versions and were
commented out after the release of 1.3.0. We will now drop them completely.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
In the precompute_partition_info_sums_ function, instead of selecting
a 64-bit accumulator when the signal bps is larger than 16, revert to the
original approach based on partition size, but make room for a few extra
bits so as not to overflow with unusual signals where the average residual
magnitude may be larger than bps.
It slightly improves the performance with standard encoding levels and
16-bit files as the 17-bit side channel can still be processed with the
32-bit accumulator and correctly selects the 64-bit accumulator with
very large 16-bit partitions.
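The selection rule described above can be sketched as a bit-budget check.
The names and the headroom of 4 extra bits are assumptions made for the
sketch, not values quoted from this message:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed headroom for residuals larger than the signal bps. */
#define MAX_EXTRA_RESIDUAL_BPS_SKETCH 4u

/* Integer log2, rounded down. */
static unsigned ilog2_sketch(unsigned v)
{
    unsigned l = 0;

    assert(v > 0);
    while (v >>= 1)
        l++;
    return l;
}

/* Hypothetical check: summing partition_samples magnitudes of up to
 * (bps + headroom) bits each needs about log2(partition_samples) more
 * bits. If the total stays below 32, a 32-bit accumulator is enough;
 * otherwise the 64-bit one is selected. */
static bool fits_32bit_accumulator(unsigned partition_samples, unsigned bps)
{
    return ilog2_sketch(partition_samples) + bps
           + MAX_EXTRA_RESIDUAL_BPS_SKETCH < 32;
}
```

Under these assumptions a 17-bit side channel with 256-sample partitions
still fits in 32 bits, while 16-bit audio with 4096-sample partitions
does not.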
This is related to commits 6f7ec60c and 187e596e.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>