In early computer architectures, processor operation was simple and strictly sequential. For each instruction, the processor first used the program counter (PC) to send the address of the next instruction to memory. Potentially several clock cycles later, memory returned the instruction, which was then decoded. Decoding produced the list of source and destination operands that the instruction operates on, along with the specific operation to be performed. Next, the source operands were read and delivered to the arithmetic-logic unit (ALU). The ALU performed the specified operation and delivered a result, which was written back to the decoded destination. Finally, the PC was updated to point to the next instruction to be executed, and the whole process started over for that instruction.
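The sequence of steps described above can be sketched as a small simulation. This is only an illustrative model, not any real machine: the `Instruction` format, the three-entry `ALU_OPS` table, and the `run_sequential` function are all hypothetical names chosen for this example, and memory latency is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    """Hypothetical three-operand instruction format (an assumption)."""
    op: str    # operation name, e.g. "add"
    dst: int   # destination register index
    src1: int  # first source register index
    src2: int  # second source register index


# Hypothetical ALU operations, for illustration only.
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run_sequential(program, registers):
    """Execute one instruction at a time, one step after another."""
    pc = 0
    executed = 0
    while pc < len(program):
        instr = program[pc]            # 1. fetch: PC selects the instruction
        operation = ALU_OPS[instr.op]  # 2. decode: find operation and operands
        a = registers[instr.src1]      # 3. read the source operands
        b = registers[instr.src2]
        result = operation(a, b)       # 4. the ALU performs the operation
        registers[instr.dst] = result  # 5. write the result to the destination
        pc += 1                        # 6. advance the PC to the next instruction
        executed += 1
    return registers, executed


program = [
    Instruction("add", dst=2, src1=0, src2=1),
    Instruction("mul", dst=3, src1=2, src2=2),
    Instruction("sub", dst=0, src1=3, src2=1),
]
final_registers, executed = run_sequential(program, [3, 4, 0, 0])
print(final_registers, executed)
```

Note that no step of an instruction begins until the previous step has finished, and no instruction begins until the previous one has completed. That is the serialization the text refers to.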
It is easy to see that this type of design serializes many processor operations unnecessarily and leaves large portions of the processor idle most of the time. For example, the ALU is busy only while it performs the operation on the source operands and sits idle for the rest of the time it takes to execute the instruction. It is not uncommon for an instruction to take on the order of 10 clock cycles to execute. Processor architects often quote processor performance in instructions executed per clock cycle (IPC); a design that needs about 10 cycles per instruction therefore achieves an IPC of only about 0.1.
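To make the idle-time argument concrete, here is a rough back-of-the-envelope calculation. The per-step cycle counts in `STAGE_CYCLES` are invented for illustration; they are chosen only so that they add up to the "on the order of 10 cycles" figure above and do not describe any particular processor.

```python
# Hypothetical cycle counts per step (assumed values that sum to 10).
STAGE_CYCLES = {
    "fetch": 4,         # memory may take several cycles to return the instruction
    "decode": 1,
    "operand_read": 1,
    "execute": 1,       # the only step in which the ALU is busy
    "write_back": 2,
    "pc_update": 1,
}


def cycles_per_instruction(stage_cycles):
    """Total cycles when every step runs strictly after the previous one."""
    return sum(stage_cycles.values())


def ipc(stage_cycles):
    """Instructions per cycle for a strictly sequential design."""
    return 1 / cycles_per_instruction(stage_cycles)


def alu_utilization(stage_cycles):
    """Fraction of cycles in which the ALU is doing useful work."""
    return stage_cycles["execute"] / cycles_per_instruction(stage_cycles)


print(f"cycles per instruction: {cycles_per_instruction(STAGE_CYCLES)}")
print(f"IPC: {ipc(STAGE_CYCLES):.2f}")
print(f"ALU utilization: {alu_utilization(STAGE_CYCLES):.0%}")
```

Under these assumed numbers the ALU is busy in only one cycle out of ten, which is the kind of idle hardware the paragraph describes.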