Efficiently profiling an infrequently called function


Say I have the following code:

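#include <stddef.h>

// 2^32 ints, i.e. 16 GiB of zero-initialized static storage.
int data[4294967296];
int index = 0;

// Called rarely: reads and increments one element, then advances index.
int lightweight_function() {
  return data[index++]++;
}

int main() {
  int count = 0;
  // 1ULL avoids the undefined behavior of shifting a 32-bit int by 32.
  for (size_t i = 0; i < (1ULL << 32); ++i) {
    ++data[i];
    // Taken once every 2^27 iterations, so only 32 times in total.
    if (i % (1 << 27) == 0) {
      count += lightweight_function();
    }
  }
  return 0;
}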

Profiling the code above with Intel VTune's hotspots analysis

vtune -collect hotspots ./a.out

almost always produces a profile that does not mention lightweight_function. If the performance of lightweight_function is the primary concern, what is the best way to measure it? The frequently executed code around lightweight_function strongly affects the function's performance, because it modifies the data the function accesses and changes which parts of that data are in the caches. Running the function by itself in a tight loop is therefore not an option. Calling it more often is not an option either, because that changes the overall behavior with respect to shared CPU resources.

Are there tools that can query performance counters at very high frequency only between the entry and exit of a function? What is the standard way to measure the performance of this kind of function? Ideally there would be a way to collect data only inside lightweight_function, which would highlight its frequent cache misses without the noise from the more frequently executed blocks.
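To make concrete what I mean by counting only between entry and exit, here is a minimal C++ sketch, assuming Linux and the perf_event_open system call. The names CacheMissCounter, Stats and ScopedMeasure are hypothetical helpers I made up for illustration, not part of VTune or any library. I suspect the system calls used to enable and read the counter cost far more than the function itself and disturb the caches, so this shows the idea rather than a solution.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper: one hardware cache-miss counter for the calling thread.
struct CacheMissCounter {
  int fd = -1;

  CacheMissCounter() {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CACHE_MISSES;
    attr.disabled = 1;        // stays stopped until a scope enables it
    attr.exclude_kernel = 1;  // count user-space events only
    attr.exclude_hv = 1;
    // pid = 0, cpu = -1: measure this thread on whichever CPU it runs.
    fd = static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0));
  }
  CacheMissCounter(const CacheMissCounter&) = delete;
  CacheMissCounter& operator=(const CacheMissCounter&) = delete;
  ~CacheMissCounter() { if (fd >= 0) close(fd); }

  // False when the kernel refuses the counter (permissions, VM, container).
  bool available() const { return fd >= 0; }

  // Current counter value, or 0 if the counter could not be opened or read.
  std::uint64_t read_value() const {
    std::uint64_t value = 0;
    if (fd < 0 || read(fd, &value, sizeof(value)) != sizeof(value)) return 0;
    return value;
  }
};

// Totals accumulated over every measured call.
struct Stats {
  std::uint64_t calls = 0;
  std::uint64_t misses = 0;
};

// RAII scope: the counter runs only while an instance is alive.
class ScopedMeasure {
  CacheMissCounter& counter_;
  Stats& stats_;
  std::uint64_t begin_ = 0;

 public:
  ScopedMeasure(CacheMissCounter& counter, Stats& stats)
      : counter_(counter), stats_(stats) {
    if (counter_.available()) ioctl(counter_.fd, PERF_EVENT_IOC_ENABLE, 0);
    begin_ = counter_.read_value();
  }
  ~ScopedMeasure() {
    const std::uint64_t end = counter_.read_value();
    if (counter_.available()) ioctl(counter_.fd, PERF_EVENT_IOC_DISABLE, 0);
    stats_.misses += end - begin_;
    ++stats_.calls;
  }
};
```

The intended use would be to create one CacheMissCounter and one Stats object up front and place a ScopedMeasure as the first statement of lightweight_function, so that stats.misses divided by stats.calls approximates the cache misses per call. If the counter cannot be opened, the misses total simply stays at zero.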
