According to Agner Fog in optimizing_assembly.pdf:
"... write to a partial register may result in false dependencies
between instructions, so it is better to avoid it."
Patch-from: lvqcl <lvqcl.mail@gmail.com>
GCC generates slow ia32 code for FLAC__lpc_restore_signal_wide() and
FLAC__lpc_compute_residual_from_qlp_coefficients_wide() so 24-bit
encoding/decoding is slower for GCC compile than for MSVS or ICC
compile. This patch adds ia32 asm versions of these functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
According to Agner Fog, "...you must make sure that all calls
are matched with returns. Never jump out of a subroutine without
a return and never use a return as an indirect jump."
(see paragraph 3.15 in microarchitecture.pdf and
examples 3.5a and 3.5b in optimizing_assembly.pdf)
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Most non-static functions have FLAC__ prefix, but they were missing
from the precompute_partition_info_sums_* functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Besides SPE (FSL e500v? cores) there are other powerpc processors
that don't support altivec instructions so only enable them when it's
100% sure that the target has it.
Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Autoconf detects the Clang compiler as GNU GCC (clang sets defines like
__GNUC__ etc) but Clang is *not* completely compatible. If we detect
Clang we set ac_vc_c_compiler_gnu to 'no'.
Restrict works very poorly in Visual Studio (much slower than without)
so defined flac_restrict in share/compat.h and use that in:
lpc_compute_residual...()
lpc_restore_signal...()
As a result, FLAC__lpc_compute_residual_from_qlp_coefficients_wide_intrin_sse41()
offers no advantage for 64-bit compiles and was removed from x86-64 part
of stream_encoder.c
Patch-from: lvqcl <lvqcl.mail@gmail.com>
rplaces
OutputDirectory="..\..\..\..\objs\debug\bin"
with
OutputDirectory="$(SolutionDir)objs\$(ConfigurationName)\bin
and so on.
Rmoves
OutputFile="..\..\objs\debug\lib\$(ProjectName).lib
when possible.
Also, in the current version "Whole program optimization" compiler option
is set, but the corresponding linker option isn't. From MSDN:
"If you do not explicitly specify /LTCG when you pass /GL or MSIL modules
to the linker, the linker eventually detects this and restarts the link
by using /LTCG. Explicitly specify /LTCG when you pass /GL and MSIL modules
to the linker for the fastest possible build performance."
So /LTCG option was added too.
Debug build now uses libogg_static.lib from .\objs\debug\lib folder.
(the dependency for both release and debug is
objs\$(ConfigurationName)\lib\libogg_static.lib)
Patch-from: lvqcl <lvqcl.mail@gmail.com>
* Splits lpc_x86intrin.c to lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
(useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
disables precompute_partition_info_sums_32bit_asm_ia32_().
SSE2 version uses 4 SSE2 instructions instead of 1 SSSE3 instruction
PABSD so it is slightly slower.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
For some reason all documentation lists the max rice partition
order to be 16, while the maximum is 15. This fixes flac -H, the
man page and the HTML source code documentation
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Before this patch it was possible to set or get data.ia32.sse3 value
from x86-64 code, etc which is a potential source of errors.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
This patch removes all content that is better viewed online (i.e.
downloads, links etc.) and not necessary for development.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
While developing FLAC Frontend, there where several occasions where
I felt little restrained because the FLAC logo is available only in
gif. I made an SVG version and rendered a new GIF version from it.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
A preprocessor macro FLAC__ALIGN_MALLOC_DATA is defined in the Makefiles
but absent in *.vcproj files. This patch adds it to libFLAC_static.vcproj
and libFLAC_dynamic.vcproj.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
For the 32 bit x86 ASM functions there were already versions of this
function for lags (N = 4, 8, 12). They require lpc_order less than N.
The best compression preset (flac -8) uses lpc_order up to 12; it
means that during encoding FLAC also uses unaccelerated C function.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>