my musings 


cache locality optimization extensions in gcc

Tonight I thought I'd make a quick post about a compiler optimization I ran across the other day. GCC has two function attributes that let you take advantage of cache locality. For those of you who are not familiar with cache locality, consider a toy machine with 1024 bytes of memory, a page size of 64 bytes, and room for only one page to be resident at a time. Suppose you have one function at offset 0, another at offset 8, and a third at offset 512. If you call the function at offset 0 and then the one at offset 8, nothing new has to be fetched between the two calls, since 0 and 8 are in the same page. If instead you call the function at offset 0 and then the one at offset 512, a new page has to be loaded first: a page fault. That takes longer, so it is desirable to keep functions that are used often close to one another. The same reasoning applies on real hardware to instruction cache lines and TLB entries, just at different sizes.
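To make the arithmetic in that toy model concrete, here is a minimal sketch in C. The names `TOY_PAGE_SIZE`, `toy_page_of`, and `toy_same_page` are made up for this post; they only model the 64-byte pages described above, not how a real MMU or cache behaves.

```c
#include <stdbool.h>
#include <stddef.h>

/* Page size of the toy machine described in the post (64 bytes). */
#define TOY_PAGE_SIZE 64u

/* Index of the page that contains a given byte offset. */
static size_t toy_page_of(size_t offset)
{
    return offset / TOY_PAGE_SIZE;
}

/* True when two offsets fall in the same page, so moving from one
   to the other needs no new page to be loaded. */
static bool toy_same_page(size_t a, size_t b)
{
    return toy_page_of(a) == toy_page_of(b);
}
```

With these helpers, offsets 0 and 8 both land in page 0, while offset 512 lands in page 8 (512 / 64), so `toy_same_page(0, 8)` is true and `toy_same_page(0, 512)` is false.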

Now GCC has two extensions which allow you to take advantage of this: __attribute__((hot)) and __attribute__((cold)). These attributes let you mark a function as being called often (hot) or rarely (cold). The compiler can then place hot functions near each other in memory, and keep cold ones out of their way, so that the frequently executed code causes fewer page faults and cache misses. Both attributes are ignored if you use -fprofile-use, which lets you feed profile data from an earlier instrumented run back into GCC so it can make these locality decisions automatically.
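Here is a rough sketch of what using the two attributes might look like. The functions `report_bad_input`, `sum_bytes`, and `checksum` are hypothetical names invented for illustration; only the `hot`, `cold`, and `noinline` attributes themselves come from GCC. I added `noinline` to the cold function so it stays a separate function whose placement you can inspect.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error path: marked cold because it should rarely run. */
__attribute__((cold, noinline))
static long report_bad_input(const char *what)
{
    fprintf(stderr, "bad input: %s\n", what);
    return -1;
}

/* Hypothetical inner-loop routine: marked hot because it runs often. */
__attribute__((hot))
static long sum_bytes(const unsigned char *buf, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* Entry point: the common case goes straight to the hot function,
   the rare failure case goes to the cold one. */
long checksum(const unsigned char *buf, size_t len)
{
    if (buf == NULL)
        return report_bad_input("null buffer");
    return sum_bytes(buf, len);
}
```

If you compile this with optimization turned on (for example gcc -O2 -c) and look at the object file's sections, on many targets you should find the cold function in a subsection such as .text.unlikely, away from the rest of the code. The exact section names and placement depend on the target and GCC version, so treat that as something to check rather than a guarantee.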

If you are interested in this, check out the GCC documentation for more information. There are some other good attributes on that page as well. Have a great Labor Day weekend everyone!
