John Chamberlain |
Developer Diary |
Developer Diary · You Heard It Here First · Friday 20 February 2004 |
Principles of Programming: Allocate Only as Much Space as Needed |
Sometimes a principle is pretty obvious like "allocate only as much space as needed". Nevertheless, it pays to state because even though it is obvious not everyone follows it. This is because following it is not always easy. You need the principle to emphasize the importance of the idea so that you devote the creative energies the problem deserves to its solution.
In this case the difficulty is that often you do not know ahead of time how much memory will be needed so their is a temptation to allocate excess memory repetitively as you go along which in most cases is a mistake. For example, imagine collecting all the members of a list that satisfy a criteria into a second list. The correct way to do this is first scan the list and count the members that qualify, then allocate the memory for the new array and finally populate the new array with the exact same algorithm. If you suspect that this is inefficient because you must scan the array twice, do a test to find out for sure (using collections for example). What you will find is that doing multiple memory allocations is far more expensive than doing the scan twice (this a different principle). Memory allocation is so expensive that repeating any straightforward calculation twice is always justified.
One way to reduce the amount of code in your program and and guarantee that the algorithm is the same for both passes is to use the same loop for doing both operations. Here is an example: boolean zCounting = true; int ctIncludedMembers = 0; int xSubList = 0; while( true ){ for( int xList = 1; xList < ctListMembers; xList++ ) if( zInclude( aList[xList] ) ) if( zCounting ) ctIncludedMembers++; else aIncludedList[++xSubList] = aList[xList]; if( zCounting ){ aIncludedList = new List[ctIncludedMembers + 1]; zCounting = false; } else break; }The above algorithm makes two passes through the list, one to count the members to be included in the sublist and second to populate the sublist. Note that the algorithm assumes you are using one-based lists (that is another principle). What if you truly cannot determine the number of items that will be needed ahead of time? When you have an external or live source of data you will not be able to determine the number of things of a given type that might be in that data so it is impossible to pre-allocate the exact amount of memory needed for storage. In this case the rule is to pre-allocate the median (or typical) amount of storage required according to your best knowledge and then double the allocation every time the allocation amount is exceeded. If you follow these rules you will be allocating memory in the most efficient way in the long term. |
return to John Chamberlain's home · diary index |
Developer Diary · about · info@johnchamberlain.com · bio · Revised 20 February 2004 · Pure Content |