

c++ Programming Glossary: speedup

Any reason to overload global new and delete?

http://stackoverflow.com/questions/1152511/any-reason-to-overload-global-new-and-delete

allocators for individual types too; in many cases the speedup or capabilities you can get by providing custom allocators for .. use of an STL data structure far exceed the general speedup you can get from the global overloads. Take a look at some of ..

Why does changing `const ull` to `const ull&` in function parameter result in performance gain?

http://stackoverflow.com/questions/14805641/why-does-changing-const-ull-to-const-ull-in-function-parameter-result-in-pe

`typedef unsigned long long ull` results in a roughly 25% speedup when compiled with gcc 4.7.2 and the flags `-O3 -std=c++11 -g`, and I can't .. question: I was able to reproduce your observation of the speedup; it was even more noticeable for me (1.75x faster). The problem ..

Fastest way in C to determine if an integer is between two integers (inclusive) with known sets of values

http://stackoverflow.com/questions/17095324/fastest-way-in-c-to-determine-if-an-integer-is-between-two-integers-inclusive

trying the accepted answer I got an order-of-magnitude speedup on the one line of code over doing it the normal way, `x >= start && x <= end` ..

How to analyze program running time

http://stackoverflow.com/questions/18194577/how-to-analyze-program-running-time

and so on. You could easily find that you get an enormous speedup this way. Some people say this is exactly what profilers .. 25%, and C taking 12.5%. If you fix them all you get an 8x speedup. If you only miss A you get 1.6x. If you only miss B you get a 2.67x speedup. If you only miss C you get 4x, which is still twice as slow ..

modular arithmetics and NTT (finite field DFT) optimizations

http://stackoverflow.com/questions/18577076/modular-arithmetics-and-ntt-finite-field-dft-optimizations

a 1x loop, so it's not very precise (error ~10%), but the speedup is noticeable even now; normally I loop it 1000x and more, but .. and eliminating unnecessary calls. The resulting speedup is stunning: more than 40x. Now NTT multiplication is faster .. further questions: Does anyone see any other option to speed up the NTT? Are my optimizations of modular arithmetic safe (results ..

How to speed up my sparse matrix solver?

http://stackoverflow.com/questions/2388196/how-to-speed-up-my-sparse-matrix-solver

as bad as a Jacobi iteration; it's hard to say whether the speedup offsets it. Use SOR. It's simple, doesn't add much computation ..

How can I quickly enumerate directories on Win32?

http://stackoverflow.com/questions/2511672/how-can-i-quickly-enumerate-directories-on-win32

I quickly enumerate directories on Win32? I'm trying to speed up directory enumeration in C, where I'm recursing into subdirectories .. you can use FindFirstFileEx with FindExInfoBasic, the main speedup being omitting the short file name on NTFS file systems, where ..

What can I use to profile C++ code in Linux?

http://stackoverflow.com/questions/375913/what-can-i-use-to-profile-c-code-in-linux

cause of performance problems and the opportunity to get a speedup. Added: It might not be obvious, but the stack sampling technique ..

OpenMP: What is the benefit of nesting parallelizations?

http://stackoverflow.com/questions/4317551/openmp-what-is-the-benefit-of-nesting-parallelizations

threads is greater than the number of cores, which may degrade the speedup. In an extreme case where nested parallelism is called recursively .. the case where N = # of CPUs. Yes, right: in this case the speedup would be limited by N, and enabling nested parallelism will definitely ..

Porting optimized Sieve of Eratosthenes from Python to C++

http://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c

in about 415 ms on the same machine as above. That's a 3x speedup, better than I expected. `#include <vector>` `#include <boost/dynamic_bitset.hpp>` ..

Why is CUDA pinned memory so fast?

http://stackoverflow.com/questions/5736968/why-is-cuda-pinned-memory-so-fast

Why is CUDA pinned memory so fast? I observe substantial speedups in data transfer when I use pinned memory for CUDA data transfers .. pages could've been swapped out, yet I still observed the speedup. Can anyone explain what's really going on here? Any insight ..

Are there alternatives to polymorphism in C++?

http://stackoverflow.com/questions/584544/are-there-alternatives-to-polymorphism-in-c

speed-wise by virtual function calls, as hinted at here. A speedup of even 2.5x would be fantastic. The classes in question are ..

How does Intel TBB's scalable_allocator work?

http://stackoverflow.com/questions/657783/how-does-intel-tbbs-scalable-allocator-work

Just using TBB 3.0 for the first time, and I've seen my best speedup from scalable_allocator yet. Changing a single `vector<int>` to ..