
The 7 Steps to Greater Performance

Jurassic edited this page May 17, 2011 · 6 revisions
  1. Determine bottlenecks!
    This is a crucial part of optimizing any pipelined application. The results will also be perfect candidates for the Performance Review part of the thesis.
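A minimal sketch of how per-stage timings could be gathered (the class and names here are ours, not from the project): charge the wall-clock time of each filter's body to a named stage, then compare totals to spot the bottleneck.

```cpp
#include <chrono>
#include <map>
#include <string>

// Accumulate wall-clock time spent in each pipeline stage so the
// slowest filter stands out after a profiling run.
class StageTimer {
public:
    // Run `work` and charge its elapsed time to the named stage.
    template <typename F>
    void time(const std::string& stage, F&& work) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto end = std::chrono::steady_clock::now();
        totals_[stage] += std::chrono::duration<double>(end - start).count();
    }

    // Total seconds charged to a stage so far (0 if never timed).
    double seconds(const std::string& stage) const {
        auto it = totals_.find(stage);
        return it == totals_.end() ? 0.0 : it->second;
    }

private:
    std::map<std::string, double> totals_;  // stage name -> accumulated seconds
};
```

For a real measurement a sampling profiler (perf, gprof, VTune) would give the same answer without touching the code; this wrapper is just the cheapest possible start.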
  2. Disable Quex asserts.
    Quex is very persistent in reminding me that it is running slower than usual because its debugging asserts are turned on. Since Quex performs a great many short operations, it is plausible that the sheer number of asserts hurts its performance.
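Per the Quex documentation, the asserts are disabled with a preprocessor define; a build fragment might look like the following (verify the macro name against the Quex version in use):

```makefile
# Disable Quex's internal asserts (and the standard assert() macro)
# for release/benchmark builds.
CXXFLAGS += -DQUEX_OPTION_ASSERTS_DISABLED -DNDEBUG
```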
  3. Use tbb::scalable_allocator.
    Threading Building Blocks supplies its own C++ allocators, which should exploit insights about the way computers really work in multithreaded contexts (per-thread memory pools that avoid contention on a global heap lock) to increase performance. We'll see about that... hopefully.
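The switch is a drop-in allocator swap on the containers. The sketch below uses `std::allocator` as a stand-in so it compiles without TBB installed; with TBB available, the alias would point at `tbb::scalable_allocator` instead and the container code stays identical. (`Token` and `TokenVec` are illustrative names, not from the project.)

```cpp
#include <memory>
#include <string>
#include <vector>

// With TBB available this would be:
//   #include <tbb/scalable_allocator.h>
//   template <typename T> using PoolAlloc = tbb::scalable_allocator<T>;
// std::allocator stands in here so the pattern compiles without TBB.
template <typename T>
using PoolAlloc = std::allocator<T>;  // stand-in for tbb::scalable_allocator<T>

struct Token {
    std::string text;
    int id;
};

// Every container that churns through tokens uses the pooled allocator.
using TokenVec = std::vector<Token, PoolAlloc<Token>>;
```

Because the allocator is a template parameter, the rest of the pipeline never needs to know which allocator is in play, so the experiment is cheap to run both ways.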
  4. Cut down on dynamic allocations in the pipeline.
    In the current implementation, two new chunks of memory are allocated for every chunk of tokens processed during tokenization. These chunks could be recycled so that almost no allocation takes place at runtime.
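The recycling idea can be sketched as a small free list of buffers (names are ours): finished chunks go back on the list and are handed out again, so the steady state runs allocation-free.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hand out fixed-size buffers, reusing returned ones before allocating new.
class ChunkPool {
public:
    explicit ChunkPool(std::size_t chunk_size) : chunk_size_(chunk_size) {}

    // Reuse a returned buffer if one is available, otherwise allocate.
    std::unique_ptr<std::vector<char>> acquire() {
        if (!free_.empty()) {
            auto chunk = std::move(free_.back());
            free_.pop_back();
            return chunk;
        }
        return std::make_unique<std::vector<char>>(chunk_size_);
    }

    // Hand a finished buffer back for reuse.
    void release(std::unique_ptr<std::vector<char>> chunk) {
        free_.push_back(std::move(chunk));
    }

    std::size_t idle_chunks() const { return free_.size(); }

private:
    std::size_t chunk_size_;
    std::vector<std::unique_ptr<std::vector<char>>> free_;
};
```

In the actual pipeline the acquire/release calls happen on different filter threads, so the free list would need a mutex or a `tbb::concurrent_queue`; the single-threaded version above only shows the recycling itself.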
  5. Use tries instead of std::multimap<string,int> for enumeration properties.
    This would most probably be more efficient, but I believe that the FeatureExtractor, which is responsible for testing properties, will not be the bottleneck of this application on multithreaded machines, because it is the only filter in the pipeline that can run simultaneously on several threads. A boost::unordered_multimap<string, int> might be just as effective and easier to deploy, because it would replace the numerous string comparisons with a single hash computation.
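For concreteness, a minimal trie replacement for the `std::multimap<string, int>` could look like this (the class and names are ours): one node per character, so a lookup costs O(key length) with no repeated full-string comparisons, and multiple values per key preserve the multimap semantics.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Maps strings to all ints stored under them, multimap-style.
class PropertyTrie {
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        std::vector<int> values;  // property ids attached to this exact key
    };
    Node root_;

public:
    void insert(const std::string& key, int value) {
        Node* n = &root_;
        for (char c : key) {
            auto& child = n->children[c];
            if (!child) child = std::make_unique<Node>();
            n = child.get();
        }
        n->values.push_back(value);
    }

    // Every value stored under `key`, in insertion order (empty if absent);
    // mirrors multimap::equal_range.
    std::vector<int> find(const std::string& key) const {
        const Node* n = &root_;
        for (char c : key) {
            auto it = n->children.find(c);
            if (it == n->children.end()) return {};
            n = it->second.get();
        }
        return n->values;
    }
};
```

The hash-map alternative mentioned above needs no new code at all, which is the "easier to deploy" part: the lookup interface of `unordered_multimap` matches `multimap` exactly.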
  6. Decrease the likelihood of the working TBB threads stalling on I/O.
    The TBB documentation warns that the pipeline is not suitable for use with threads that depend on I/O operations, due to its scheduling algorithm. This was anticipated in the design of the system, so the input and output files are managed by the TextCleaner and Encoder classes, whose code runs on separate threads. It might be beneficial to set up the pipestream connecting these I/O threads with the working threads so that its buffer is large or even unbounded, reducing the probability of a worker thread idly waiting for an I/O thread to fill or drain the buffer. The pipestream constructor has an easy option that lets us use infinite dynamic buffers, but this might pose a problem when processing really large files on a machine with limited memory. It might also be possible to modify the pipestreambuf code to allow for a larger buffer when using the limited-capacity option.
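The project's pipestream class is not reproduced here, but the sizing trade-off can be sketched generically (all names below are ours): a condition-variable buffer where a larger capacity means producers rarely block, and capacity 0 models the "infinite buffer" option, trading memory for the guarantee that a push never waits.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// FIFO buffer between an I/O thread and the workers; capacity 0 = unbounded.
template <typename T>
class InterThreadBuffer {
public:
    explicit InterThreadBuffer(std::size_t capacity) : capacity_(capacity) {}

    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        // Unbounded buffers never wait for space.
        not_full_.wait(lock,
                       [&] { return capacity_ == 0 || q_.size() < capacity_; });
        q_.push_back(std::move(item));
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [&] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop_front();
        not_full_.notify_one();
        return item;
    }

private:
    std::size_t capacity_;  // 0 = unlimited
    std::deque<T> q_;
    std::mutex m_;
    std::condition_variable not_empty_, not_full_;
};
```

The memory concern from the text shows up directly: with capacity 0, a fast reader and a slow pipeline let `q_` grow without bound, which is exactly the large-file risk.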
  7. Process multiple files in parallel.
    This parallelization suggests itself if we want to scale the application up for use on a machine with many CPUs. It would, however, require multiple instances of the pipeline, a lot of locking, and making sure that the shared objects' methods are thread-safe.
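If each file gets its own fully independent pipeline instance, the extra locking largely disappears; a sketch with `std::async` (where the hypothetical `process_file` stands in for a complete pipeline run over one file's contents):

```cpp
#include <cctype>
#include <future>
#include <string>
#include <vector>

// Stand-in for running the whole pipeline over one file's contents.
std::string process_file(const std::string& contents) {
    std::string out;
    for (char c : contents)
        out += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return out;
}

// One independent pipeline per file; no state shared between tasks,
// so no locking is needed beyond what std::future provides.
std::vector<std::string> process_all(const std::vector<std::string>& files) {
    std::vector<std::future<std::string>> jobs;
    for (const auto& f : files)
        jobs.push_back(std::async(std::launch::async, process_file, f));
    std::vector<std::string> results;
    for (auto& j : jobs)
        results.push_back(j.get());  // preserves input order
    return results;
}
```

The locking concern from the text returns as soon as the pipelines share anything (a common dictionary, a single output stream); per-file isolation is what keeps this variant simple.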