Table of Contents

Cycle Counting

Field Assembly language programming
Went Obsolete 1998
Made Obsolete By Multi-stage cpu pipelines, > 500mhz cpu cores, advanced caches,graphics co-processors, multi-tasking operating systems, high level languages such as java, perl and php
Knowledge Assumed A detailed understanding of the exact timings (“machine cycles”) for the execution of each processor instruction used in the program, including the effects of ram/rom accesses and other factors such as addressing modes.
When useful Processor constrained systems where it is critical to the application (usually, for some real time need) to get a job done within a definite amount of time, such as graphical applications or signal processing.

Far from obsolete. Might be obsolete in applications where cycle-exact instructions were always impossible to implement, but is still relevant where it ever was. It's like listing writing as obsolete, because some opt to use speech dictation.

Cycle Counting is an optimization technique, and was used to determine how long some section of code would take to execute, such as a subroutine or other critical section of code. Time on a processor is expressed as a number of 'machine cycles' or 'clock cycles', which is an exact measurement based on it's input frequency that all other hardware in the computer (usually) received timing from as well. If necessary, the 'Cycle Count' could be translated into real time by determing the processor speed (as in 'mhz' - millions of cycles per second, or 'ghz' - billions of cycles per second) by dividing the processor's cycle per second by the number of cycles used. This skill was used to great effect by programmers of early video game systems and home computers such as the Atari 400/800, Commodore 64/Vic-20, the Atari VCS and many arcade games, but it has a long history stretching all the way back to the very first computers and programmers. Cycle counting is a Matrix skill: If you can do it well, it means you truly understand the machine for what it is, and for you…there is no spoon. The idea is that you are programming in assembly language, the very lowest level of the machine possible, and all instructions are not created equal in terms of how long each one actually takes to complete. The processor instruction set manual or other manufacturer documentation would list the cycle times for each instruction and it's variations. And as with most things (perl comes to mind), there also are usually several different ways to get the same net result done, usually with various trade offs that may or may not be important to you. For example, the 6502 processor has a special optimization whereby memory references to 'page zero' (the first 256 bytes of memory, which is one 'addressing mode' of many) took less cycles to complete than memory references to anywhere else within it's 64k address space (References to 'page zero' also resulted in smaller instructions - two bytes instead of three, because you only needed 1 byte for the processor opcode and 1 byte to say what index you wanted. This in turn meant the processor had to spend less time loading the instruction to examine it, and less time setting up the address bus since it only had to manipulate 8 address lines instead of 16). There are lots of these types of cases on every processor ever made. Some applications like graphics and games demanded that certain things happen within a certain amount of time, usually in response to realtime events. For example, because early graphics systems had limited capabilities, it was often necessary to optimize the time spent drawing objects to ensure they could be done before the next frame began to display. The penalty for taking too long was undesirable affects such as incomplete displays, flashing, artifacts, or other 'display glitches'. And sometimes, it was possible to push the hardware to display more colors or objects on screen than was normally allowed if you were able to reprogram the display hardware at exactly the right time. So knowing exactly, in terms of clock cycles, how long some operation takes, was crucial to determining weather you will make your deadline. Optimizing software to run faster, really comes down to recoding it to take less clock cycles to complete. The techniques used include substituting one type of instruction with another (usually, one that did the same operation just with a different addressing mode like targeting a processor register instead of memory), using pre-computed tables (instead of multiplying by some constant every time thru the loop, why not have a table of answers that you can look up with a simple index?), unwinding loops (duplicate the inner code of a loop and reduce the number of compare/branch ops, or in some cases, simply do away completely with the loop and just write the operation N times in a row, which eliminates the branch/compare step entirely and may lend itself to more precomputed results). , and many other ingenious forms of chicanery that are far too numerous to mention here. Having an accurate cycle count is the only way to effectively evaluate whether optimization is needed and what effect various optimization techniques has to your bottom line. This type of optimization has become largely irrelevent due to huge advances in processors along with the mainstream use of high level languages instead of assembly for applications, and multitasking operating systems and multi-threaded applications. As our applications development tools have advanced, the need - and indeed, the opportunity - to gain low level famillarity with the machine, has diminished greatly. Peformance issues are now seen as simply 'get a faster computer'. In this author's experience however, most applications are coded very poorly and don't even try to be more effecient. Such as java and stringified variables, or passing by value instead of reference, or stepping thru a table serially instead of using a hash, or one of millions of other woefully ineffecient programming techniques that 'get the job done', but at the price of peformance.

 
skills/cyclecounting.txt · Last modified: 2012/02/13 17:37 by jammi
 
Recent changes RSS feed Creative Commons License Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki