What is the difference between a page and a paragraph?
In the real world, we all know that pages contain paragraphs and that paragraphs are full of sentences created from words.
In the world of Microsoft Windows Operating Systems, it is somewhat different – paragraphs contain pages!
In this article, I’m going to explain what a page is, what a paragraph is and how they relate to each other. I’ll also explain why this information can help you identify and resolve certain memory-related bugs.
Virtual Memory Pages
A virtual memory page is the smallest unit of memory that can be mapped by the CPU. In the case of 32-bit x86 processors such as the Intel Pentium and AMD Athlon, a page is 4KB. When you call VirtualProtect() or VirtualQuery() you are setting or querying the memory protection for regions whose sizes are multiples of a page.
The size of a page may vary from CPU type to CPU type. For example, 64-bit x86 (x64) processors also use 4KB pages, whereas Intel Itanium (IA-64) processors use 8KB pages.
You can determine the size of a page by calling GetSystemInfo() and reading the SYSTEM_INFO.dwPageSize value.
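As a minimal sketch of the call described above, the following reads the page size and uses it to round addresses and sizes to page boundaries. The helper names page_size(), page_base() and round_up_to_pages() are my own for illustration, and the non-Windows branch is an assumption added only so that the sketch builds on other platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns the size of a virtual memory page in bytes.
std::size_t page_size()
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return si.dwPageSize;
#else
    // Assumption: fallback so the sketch builds off Windows; 4KB matches x86.
    return 4096;
#endif
}

// Rounds an address down to the start of the page that contains it.
std::uintptr_t page_base(std::uintptr_t address, std::size_t pageSize)
{
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0); // power of two
    return address & ~(static_cast<std::uintptr_t>(pageSize) - 1);
}

// Rounds a byte count up to a whole number of pages.
std::size_t round_up_to_pages(std::size_t bytes, std::size_t pageSize)
{
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0); // power of two
    return (bytes + pageSize - 1) & ~(pageSize - 1);
}
```

With a 4KB page, page_base(0x12345, 4096) gives 0x12000, which is the start of the page whose protection a VirtualProtect() call on that address would affect.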
Virtual Memory Paragraphs
A virtual memory paragraph is the minimum amount of address space that can be reserved using the VirtualAlloc() call: every reservation starts on a paragraph boundary, while memory inside a reservation is committed in pages. On 32-bit x86 CPUs this value is 64KB (0x00010000). If you have ever used the debugger and looked at the load addresses of DLLs in the Modules list you may have noticed that DLLs always load on 64KB boundaries. This is the reason: the address range a DLL is loaded into is reserved at this granularity before the DLL is loaded.
You can determine the size of a paragraph by calling GetSystemInfo() and reading the SYSTEM_INFO.dwAllocationGranularity value.
Given these values, you can see that (on 32-bit x86 systems) a virtual memory paragraph is composed of 16 virtual memory pages.
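The arithmetic above can be checked with a short sketch. The VmSizes struct and the two function names are hypothetical, and the values returned by the non-Windows branch are assumed 32-bit x86 figures included only so the code builds everywhere.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Page and paragraph (allocation granularity) sizes, in bytes.
struct VmSizes
{
    std::size_t page;
    std::size_t paragraph;
};

// Reads both sizes from the operating system.
VmSizes query_vm_sizes()
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return { si.dwPageSize, si.dwAllocationGranularity };
#else
    // Assumption: typical 32-bit x86 Windows values, used off Windows only.
    return { 4096, 65536 };
#endif
}

// Number of pages that make up one paragraph.
std::size_t pages_per_paragraph(const VmSizes& sizes)
{
    assert(sizes.page != 0);
    return sizes.paragraph / sizes.page;
}
```

For a 4KB page and a 64KB paragraph, pages_per_paragraph() returns 16.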
How can I use this information?
If you are using VirtualAlloc() it is important to know the granularity at which the allocations will be returned. This is the size of a paragraph. This information is fundamental in deciding how you would implement a custom heap. You know there are fixed boundaries at which your data can exist. You can enumerate the list of possible paragraph locations very quickly (there are 32,768 possible locations in a 2GB space, as opposed to 2 billion locations if the paragraph could start anywhere).
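To make the enumeration concrete, here is a small sketch of the paragraph arithmetic. The function names are illustrative only; the paragraph size would normally come from SYSTEM_INFO.dwAllocationGranularity rather than being passed in as a constant.

```cpp
#include <cassert>
#include <cstdint>

// Number of paragraph-sized slots in an address space of the given size.
std::uint64_t paragraph_count(std::uint64_t spaceBytes, std::uint64_t paragraph)
{
    assert(paragraph != 0);
    return spaceBytes / paragraph;
}

// Index of the paragraph that contains the given address.
std::uint64_t paragraph_index(std::uint64_t address, std::uint64_t paragraph)
{
    assert(paragraph != 0);
    return address / paragraph;
}

// Base address of the paragraph with the given index.
std::uint64_t paragraph_base(std::uint64_t index, std::uint64_t paragraph)
{
    return index * paragraph;
}
```

With a 64KB paragraph, paragraph_count(0x80000000, 0x10000) gives the 32,768 locations quoted above, so a table with one entry per paragraph is small enough to keep in memory and index directly.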
Custom heaps
If you are writing a custom heap, two key indicators to keep track of are memory fragmentation and memory utilisation. Knowing your paragraph and page sizes, you can inspect how each page and each paragraph of memory is used by the application and the custom heap to determine whether there is wastage, how much there is and what form it takes. This information could lead you to modify your heap algorithm to use pages differently to reduce memory fragmentation.
See Delete memory 5 times faster for one simple technique. It uses HeapAlloc(), but the same principles apply here.
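One form of wastage is easy to quantify. Because each separate reservation starts on a paragraph boundary, a reservation that does not fill its last paragraph leaves a tail of address space that no other reservation can start in. The sketch below estimates that tail; the function names are my own, and it assumes power-of-two page and paragraph sizes.

```cpp
#include <cassert>
#include <cstdint>

// Rounds value up to a multiple of unit (unit must be a power of two).
std::uint64_t round_up(std::uint64_t value, std::uint64_t unit)
{
    assert(unit != 0 && (unit & (unit - 1)) == 0);
    return (value + unit - 1) & ~(unit - 1);
}

// Address space effectively consumed by one reservation of 'bytes'.
std::uint64_t reservation_footprint(std::uint64_t bytes, std::uint64_t paragraph)
{
    return round_up(bytes, paragraph);
}

// Address space left unusable after the reservation's last page.
std::uint64_t unusable_tail(std::uint64_t bytes, std::uint64_t page,
                            std::uint64_t paragraph)
{
    return reservation_footprint(bytes, paragraph) - round_up(bytes, page);
}
```

For example, reserving a single 4KB page on its own leaves 60KB (61,440 bytes) of the paragraph unusable, which is one reason a custom heap would reserve whole paragraphs and hand out pages from within them.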
Loading large data files
Another use for this information is finding out why a certain large file will not load into memory despite Task Manager saying that you have 2GB of free memory.
It is not uncommon to find a forum posting from someone who has a large image file (a satellite photo, MRI scan, etc.) that is about 1GB in size. They wish to load it into memory, do in-memory processing on it, save the results, discard the memory and then repeat the process, often for numerous images. Typically on the third attempt to load a large file, the file will not load and the forum poster is left very confused.
The typical implementation is to allocate space for the large file using a call such as malloc() or operator new(). Both of these use the C runtime heap to allocate the memory.
The principle seems fine, but the problem is caused by memory fragmentation. The total amount of free space may still be large, but it is less usable because it is split into many smaller regions, most of which are smaller than any forthcoming large allocation required by the application. Without information about where pages and paragraphs are situated, how big they are and what their status is, identifying the cause of this failure could be very time-consuming. Once you know the cause, you can think about allocating and managing your memory differently and prevent the bug from happening in the first place.
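The difference between total free space and usable free space can be sketched as follows. The FreeRegion and FreeSummary types and all function names are hypothetical. The Windows-only scan_free_regions() walks the address space with VirtualQuery() and is just one possible way to gather the regions; the summary logic itself is platform independent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

struct FreeRegion  { std::uint64_t base;  std::uint64_t size; };
struct FreeSummary { std::uint64_t total; std::uint64_t largest; };

// Adds up all free space and records the largest single free region.
FreeSummary summarise_free(const std::vector<FreeRegion>& regions)
{
    FreeSummary summary = { 0, 0 };
    for (const FreeRegion& region : regions)
    {
        summary.total += region.size;
        if (region.size > summary.largest)
            summary.largest = region.size;
    }
    return summary;
}

// A contiguous request can only succeed if one free region is big enough.
bool can_allocate(const FreeSummary& summary, std::uint64_t request)
{
    return request <= summary.largest;
}

#ifdef _WIN32
// Collects every free region in this process's address space.
std::vector<FreeRegion> scan_free_regions()
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    std::vector<FreeRegion> regions;
    const char* p   = static_cast<const char*>(si.lpMinimumApplicationAddress);
    const char* end = static_cast<const char*>(si.lpMaximumApplicationAddress);
    MEMORY_BASIC_INFORMATION mbi;
    while (p < end && VirtualQuery(p, &mbi, sizeof(mbi)) != 0)
    {
        if (mbi.State == MEM_FREE)
            regions.push_back({ reinterpret_cast<std::uintptr_t>(mbi.BaseAddress),
                                mbi.RegionSize });
        p = static_cast<const char*>(mbi.BaseAddress) + mbi.RegionSize;
    }
    return regions;
}
#endif
```

If the free space is split into regions of 700MB, 600MB and 650MB, the total is 1,950MB but can_allocate() reports that a 1GB request cannot be satisfied, which matches the behaviour the forum posters see.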
For situations like these, using HeapAlloc() with a dedicated heap (created using HeapCreate()) or even just directly using VirtualAlloc() will most likely give better results than using the C runtime heap.
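As a rough sketch of the VirtualAlloc() option, the pair of helpers below reserves and commits a buffer in one call and later releases the whole region. The names large_alloc() and large_free() are hypothetical, and the malloc() branch is an assumption included only so the sketch builds on non-Windows platforms; it is not a substitute for the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef _WIN32
#include <windows.h>
#endif

// Allocates a large buffer directly from virtual memory.
// Returns nullptr if no contiguous range of that size is available.
void* large_alloc(std::size_t bytes)
{
    assert(bytes != 0);
#ifdef _WIN32
    return VirtualAlloc(nullptr, bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    // Assumption: stand-in so the sketch builds off Windows.
    return std::malloc(bytes);
#endif
}

// Releases a buffer obtained from large_alloc().
void large_free(void* buffer)
{
#ifdef _WIN32
    if (buffer != nullptr)
        VirtualFree(buffer, 0, MEM_RELEASE); // size must be 0 with MEM_RELEASE
#else
    std::free(buffer);
#endif
}
```

Releasing with MEM_RELEASE returns the entire range to the free address space in one step, so the next large file has a chance of finding the same contiguous range again.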
Tools
The first step in understanding such bugs is to be able to visualize the memory and to also inspect the various page and paragraph information.
To aid in these tasks we have just added a VM Pages view and VM Paragraphs view to VM Validator to make identifying such issues easier. VM Validator is a free download.
Memory Validator will also be updated with a VM Paragraphs view in the next release (Memory Validator already has a more detailed VM Pages view).
Thank you to Blake Miller of Invensys for suggesting an alternative wording for one paragraph of this article.