The start-up time for a load is the time to get the first word from memory into a register. Unlike simpler functional units, a load/store unit does not necessarily achieve an initiation rate of one word per clock cycle, because memory bank stalls can reduce effective throughput. Start-up penalties for load/store units are also typically higher than those for arithmetic units.
To maintain an initiation rate of one word fetched or stored per clock, the memory system must be capable of producing or accepting this much data. Spreading accesses across multiple independent memory banks usually delivers the desired rate. A large number of banks is also useful for vector loads or stores that access rows or columns of data, since such accesses step through memory with a stride that can otherwise land repeatedly on the same bank.
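The effect of bank stalls on the initiation rate can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not from the text: words are assigned to banks by `address % n_banks`, at most one access is issued per clock, and a bank stays busy for `bank_busy` clocks after it starts an access. The function name and parameters are hypothetical.

```python
def vector_load_cycles(n_elements, stride, n_banks, bank_busy):
    """Return the number of clocks needed to issue n_elements word accesses.

    Assumed model (illustrative only): element i lives at word address
    i * stride, which maps to bank (i * stride) % n_banks. One access can
    be issued per clock, and a bank cannot start another access until
    bank_busy clocks after its previous one started.
    """
    free_at = [0] * n_banks   # first clock at which each bank is idle again
    clock = 0                 # earliest clock at which the next access may issue
    for i in range(n_elements):
        bank = (i * stride) % n_banks
        start = max(clock, free_at[bank])  # stall while the target bank is busy
        free_at[bank] = start + bank_busy
        clock = start + 1                  # the next access issues at least one clock later
    return clock


# Unit stride over 8 banks: each bank is revisited every 8 clocks,
# which is longer than the 6-clock busy time, so there are no stalls.
print(vector_load_cycles(64, stride=1, n_banks=8, bank_busy=6))  # 64

# Stride 8 over 8 banks: every access hits bank 0, so each one
# waits out the full busy time of the previous access.
print(vector_load_cycles(64, stride=8, n_banks=8, bank_busy=6))  # 379
```

In this model the unit-stride load sustains one word per clock, while the stride-8 load (for example, walking down a column of an 8-word-wide row-major array) degrades to one word every `bank_busy` clocks.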
Most vector processors use memory banks, which allow multiple independent accesses, rather than simple memory interleaving, for three reasons:
- Many vector computers support multiple loads or stores per clock, and the memory bank cycle time is usually several times larger than the processor cycle time. To support simultaneous accesses from multiple loads or stores, the memory system needs multiple banks and must be able to control the addresses to the banks independently.
- Most vector processors support the ability to load or store data words that are not sequential. In such cases, independent bank addressing, rather than interleaving, is required.
- Most vector computers support multiple processors sharing the same memory system, so each processor will be generating its own independent stream of addresses.
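The three reasons above combine into a rough lower bound on the number of banks: every access issued in a clock needs a bank that is not still busy with an earlier access. The sketch below assumes the simplest case, where accesses are spread perfectly evenly across banks; the function name and the numbers in the usage example are hypothetical and do not describe any particular machine.

```python
import math


def min_banks(processors, accesses_per_clock, bank_cycle_ns, cpu_cycle_ns):
    """Estimate the minimum number of banks needed to avoid stalls.

    Assumes (illustratively) that accesses are distributed evenly, so the
    only requirement is one idle bank per access issued while earlier
    accesses are still occupying their banks.
    """
    # Number of processor clocks for which one access keeps a bank busy.
    busy_clocks = math.ceil(bank_cycle_ns / cpu_cycle_ns)
    # Total accesses the whole machine can issue in one clock.
    peak_accesses = processors * accesses_per_clock
    return peak_accesses * busy_clocks


# Hypothetical machine: 4 processors, each issuing 2 loads and 1 store
# per clock, a 2 ns processor cycle, and a 15 ns bank cycle.
print(min_banks(processors=4, accesses_per_clock=3,
                bank_cycle_ns=15.0, cpu_cycle_ns=2.0))  # 96
```

Here each bank is busy for `ceil(15 / 2) = 8` clocks and the machine can issue 12 accesses per clock, so at least 96 banks are needed. Real access patterns are not perfectly even, so this figure is a floor rather than a guarantee.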
Memory is divided into banks that can be accessed independently; the banks share address and data buses. Because the buses are shared, the above architecture can start and complete only one bank access per cycle. It can nevertheless sustain 16 accesses in parallel, provided they go to different banks.