In an earlier post, Improving Angular Web App Performance Example, I described in detail how one can profile and improve an AngularJS application's performance. That post grew very long and includes lots of charts and source code snippets. In this post I summarize the main lessons learnt.
Micro-benchmarks do not matter
Synthetic benchmarks require a lot of effort and lead to optimizing the wrong things.
Learn how to accurately profile your code
Getting accurate information from the live application is essential to avoid chasing phantom problems. I recommend using code snippets that work very nicely with AngularJS.
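As a rough illustration of measuring instead of guessing, here is a minimal timing sketch. The helper name `timeIt` and the sample workload are hypothetical and are not one of the code snippets mentioned above; the sketch assumes the code being measured is synchronous.

```javascript
// Use the high-resolution clock when available, otherwise fall back to Date.now().
const now = (typeof performance !== 'undefined' && typeof performance.now === 'function')
  ? function () { return performance.now(); }
  : function () { return Date.now(); };

// Hypothetical helper: run a synchronous function and report how long it took.
function timeIt(label, fn) {
  const started = now();
  const result = fn();
  const elapsedMs = now() - started;
  console.log(label + ': ' + elapsedMs.toFixed(2) + ' ms');
  return { result: result, elapsedMs: elapsedMs };
}

// Sample workload standing in for real application code.
const timed = timeIt('sum to one million', function () {
  let total = 0;
  for (let i = 1; i <= 1000000; i += 1) {
    total += i;
  }
  return total;
});
```

In a running application the same wrapper could be placed around the suspect operation, for example a call that triggers a digest cycle, so that the numbers come from the live app and not from a synthetic test.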
Upgrade the framework
There is a huge (factor of 3) speedup in my primes experiments between AngularJS versions 1.2 and 1.3.
Optimize top bottleneck first
Fix the place where most of the execution time is spent first; optimizing anything else gives a smaller payoff for the same effort.
Minimize number of watchers
Every watch expression is evaluated on every digest cycle. Use the count watchers snippet to see how many you have, and bindonce to reduce them.
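To make the idea of counting watchers concrete, here is a minimal sketch. It is not the count watchers snippet referred to above. It assumes the AngularJS 1.x scope internals `$$watchers`, `$$childHead` and `$$nextSibling`, which are private properties and may change between versions. The function name `countWatchers` and the mock scope tree are hypothetical, so the example runs without the framework.

```javascript
// Walk a scope tree and add up the watchers registered on every scope.
// Assumption: scopes expose $$watchers (array or null), $$childHead and $$nextSibling.
function countWatchers(rootScope) {
  let total = 0;
  const pending = [rootScope];
  while (pending.length > 0) {
    const scope = pending.pop();
    if (scope.$$watchers) {
      total += scope.$$watchers.length;
    }
    // Child scopes form a linked list starting at $$childHead.
    let child = scope.$$childHead;
    while (child) {
      pending.push(child);
      child = child.$$nextSibling;
    }
  }
  return total;
}

// Mock scope tree that mimics the assumed shape.
const grandchild = { $$watchers: null, $$childHead: null, $$nextSibling: null };
const childB = { $$watchers: [{}, {}, {}], $$childHead: null, $$nextSibling: null };
const childA = { $$watchers: [{}], $$childHead: grandchild, $$nextSibling: childB };
const mockRoot = { $$watchers: [{}, {}], $$childHead: childA, $$nextSibling: null };

const watcherCount = countWatchers(mockRoot);
console.log('watchers:', watcherCount);
```

Against a real application the argument would be the root scope; comparing the count before and after a change shows whether the change actually removed watchers.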
Simplify watch expressions - precompute data
If the data does not change, do not use two-way binding and do not use filters. Instead compute the data once.
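A small sketch of what precomputing can look like, assuming a list of products whose prices never change after loading. The names `formatPrice`, `precomputeLabels` and the sample data are hypothetical.

```javascript
// Format a price given in cents; stands in for work a template filter would
// otherwise repeat on every digest cycle.
function formatPrice(cents) {
  return '$' + (cents / 100).toFixed(2);
}

// Compute the display strings once, when the data arrives.
function precomputeLabels(products) {
  return products.map(function (product) {
    return { name: product.name, priceLabel: formatPrice(product.cents) };
  });
}

const rows = precomputeLabels([
  { name: 'widget', cents: 1250 },
  { name: 'gadget', cents: 99 }
]);
console.log(rows);
```

The template can then display `priceLabel` directly, so the formatting cost is paid once per item and not once per item per digest.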
Work in batches - use web workers
The application's JavaScript, the AngularJS library, and the browser's layout and painting all run on a single thread. Split long-running work into small batches, or offload parts of it to separate threads using web workers.
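The batching half of this advice can be sketched as follows. The function `processInBatches` and its `schedule` parameter are hypothetical; by default the sketch yields with `setTimeout(fn, 0)` so the browser can lay out and paint between batches. The demo passes a synchronous scheduler only to keep its output deterministic.

```javascript
// Process items in small batches, yielding control between batches.
function processInBatches(items, batchSize, handleBatch, schedule) {
  if (!(batchSize >= 1)) {
    throw new Error('batchSize must be at least 1');
  }
  // Default scheduler: give the event loop a turn before the next batch.
  const defer = schedule || function (fn) { setTimeout(fn, 0); };
  let index = 0;

  function runNext() {
    const batch = items.slice(index, index + batchSize);
    index += batch.length;
    handleBatch(batch);
    if (index < items.length) {
      defer(runNext);
    }
  }

  if (items.length > 0) {
    runNext();
  }
}

// Demo with a synchronous scheduler so the result is available immediately.
const batchSizes = [];
processInBatches([1, 2, 3, 4, 5], 2, function (batch) {
  batchSizes.push(batch.length);
}, function (fn) { fn(); });
console.log(batchSizes.join(','));
```

Inside an AngularJS application the scheduler would more likely be the `$timeout` service, so that a digest runs after each batch and the view updates progressively.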
Work on demand
Avoid precomputing things that might not be needed. Instead compute data right before the user demands it, for example using ng-infinite-scroll.
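Here is a minimal sketch of computing on demand. The `createLazyList` factory and its `computeItem` callback are hypothetical; an infinite-scroll directive would be one possible caller of `loadMore` when the user nears the end of the list.

```javascript
// Keep a list that only computes items when more of them are requested.
function createLazyList(totalCount, pageSize, computeItem) {
  const visible = [];
  return {
    visible: visible,
    hasMore: function () {
      return visible.length < totalCount;
    },
    // Compute and append the next page; returns how many items were added.
    loadMore: function () {
      const end = Math.min(visible.length + pageSize, totalCount);
      let added = 0;
      for (let i = visible.length; i < end; i += 1) {
        visible.push(computeItem(i));
        added += 1;
      }
      return added;
    }
  };
}

// Count how often the (pretend expensive) computation actually runs.
let computeCalls = 0;
const lazy = createLazyList(5, 2, function (i) {
  computeCalls += 1;
  return i * i;
});
lazy.loadMore();
lazy.loadMore();
console.log(lazy.visible, 'computed', computeCalls, 'of 5');
```

Only the four requested items have been computed at this point; the fifth is never touched unless the user scrolls far enough to ask for it.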
Limit DOM elements to visible data
Do not place lots of nodes into the DOM, even if the data is available. Remember that lots of DOM elements slow down the browser's layout and paint operations. Use third-party plugins that maintain only the visible elements, for example angular-vs-repeat.
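The core calculation behind rendering only the visible rows can be sketched like this. It is not how angular-vs-repeat is implemented; the function `visibleRange` and its parameters are hypothetical, and the sketch assumes every row has the same fixed height.

```javascript
// Work out which rows intersect the viewport, plus a few extra rows on each
// side (overscan) so fast scrolling does not show blank space.
// Returns a half-open range: start is inclusive, end is exclusive.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, overscan) {
  const extra = overscan || 0;
  const start = Math.max(0, Math.floor(scrollTop / rowHeight) - extra);
  const end = Math.min(
    totalRows,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) + extra
  );
  return { start: start, end: end };
}

// 1000 rows of 20px each, a 100px viewport scrolled down by 250px.
const range = visibleRange(250, 100, 20, 1000, 2);
console.log(range, 'rows in DOM:', range.end - range.start);
```

With these numbers only 10 of the 1000 rows need DOM nodes; the rest can be represented by empty space above and below the rendered slice.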
AngularJS does NOT have a performance problem
Reread Improving Angular Web App Performance Example: I moved step by step from a simple prototype that anyone with minimal programming experience could have written to a real-time, 60fps, multi-threaded app. I could swap individual parts, precompute data, split computation and DOM updates into batches, and finally change the way the visible elements are treated, making all these changes while staying mostly within the Angular ecosystem. The framework just works from the start, and its individual parts can easily be supercharged to run much faster in a specific use case.