FPGAs unleash potential of Flash memory for enterprise applications
David McIntyre, Altera Corporation
3/16/2012 10:00 AM EDT
Enterprise storage subsystems today are undergoing an essential transformation. The sheer volume of enterprise data and transactions is increasing by as much as 50-60% per year. The rapid proliferation of cloud computing and virtualization as a means to more efficiently manage these burgeoning data workloads has spawned explosive growth in the number and size of data centers. Along with the exponential growth in enterprise storage comes an imperative to improve memory subsystem performance, capacity and value.
System administrators are finding that conventional storage architectures, which rely heavily on hard disk media, lack the performance to meet the demands of today’s workloads. Application architects are responding by adopting a holistic approach to memory architecture that combines conventional storage media with a new entrant in the enterprise space, flash memory. Long a preferred memory medium for consumer devices, NAND flash memory offers 10-100X performance improvement over that of hard disk drives (HDDs) for enterprise applications. Flash is also the most cost-effective non-volatile storage medium for frequently used data and applications. By using flash memory arrays, enterprises can dramatically reduce storage footprint, CPU and software licenses, and consequently, data center power, space and operation cost.
At the core of this new high performance memory subsystem is a PLD-based Programmable State Machine (PSM). The PSM supports RAID algorithms (essential to ensure data integrity), memory control and high-speed I/O functionality. Programmable logic devices are particularly advantageous for these state machines owing to their inherent design flexibility, embedded processors, hardened memory controllers and high-speed serial I/O blocks. A flash memory architecture supported by a PLD-based PSM uniquely enables data center administrators to respond to the growing demands on their storage resources by balancing performance needs with data integrity, system scalability and serviceability.
Leveraging the storage hierarchy
The various memory types and the role they play in a given storage subsystem can be viewed as a hierarchy, ordered in terms of performance, cost and capacity as illustrated in Figure 1. At the high end of the performance spectrum are embedded processor memory and L1/L2 cache. At the opposite end of the spectrum is tape back-up, which offers very high capacity storage at very low cost, but very slow speed. In between are the primary workhorses of storage subsystems: relatively fast and expensive DRAM, somewhat slower but less expensive and more capacious Flash RAM, and HDDs - traditionally the dominant storage medium in memory systems.
Figure 1: Computer memory type hierarchy
Today’s storage architectures employ a hybrid of memory types from throughout the hierarchy. Some systems might contain the full spectrum of memory types, others only a couple. The mix and weighting of memory types in a given system depend on data volumes, processing workload and other factors. For data- and I/O-intensive applications such as enterprise data centers, a large number of memory types would be leveraged. HDD memory has been and will continue to be used for the majority of data storage and access operations. However, for mission-critical applications that require the highest level of performance, flash is displacing spinning media because of its superior performance and competitive price point.
Flash for enterprise applications
The introduction of flash into enterprise storage systems is not without its challenges. Reliability concerns have historically been an impediment to the use of flash in enterprise applications. These concerns have been at least somewhat allayed by the highly successful and ubiquitous use of flash in consumer products, from cell phones to PCs. Despite the consumer proof point, the nature of many enterprise applications, such as financial transaction processing, demands safeguards to ensure data integrity. By applying RAID algorithms to flash-based system design, designers can assure data integrity for even the most sensitive applications.
Systems architects must also contend with multiple types of flash memory. There are currently two common varieties of flash: multi-level cell (MLC) and single-level cell (SLC). An MLC cell can hold more than two possible states, while an SLC cell holds only two. MLC is also less costly than SLC. Because the voltage states can drift in MLC flash, developers of many enterprise applications opt for the more reliable, more costly SLC flash. Increasingly, though, developers are finding ways to use more economical MLC flash in enterprise applications through software. RAID algorithms and enhanced ECC protection can be deployed to mitigate MLC reliability risks.
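To make the state count concrete: a cell that stores n bits has to distinguish 2^n charge levels, so the spacing between adjacent levels shrinks as bits are added, which is one way to picture why drift matters more for MLC. The sketch below is illustrative only. It assumes MLC means two bits per cell (the typical case), spreads the levels evenly, and uses a made-up 3.0 V window rather than a figure from any datasheet.

```python
def levels_per_cell(bits_per_cell: int) -> int:
    """Number of distinct charge states a cell must hold to store the given bits."""
    return 2 ** bits_per_cell


def level_spacing(window_volts: float, bits_per_cell: int) -> float:
    """Gap between adjacent levels if they are spread evenly across the window."""
    return window_volts / (levels_per_cell(bits_per_cell) - 1)


# Hypothetical threshold-voltage window, chosen only for illustration.
ASSUMED_WINDOW_VOLTS = 3.0

for name, bits in (("SLC", 1), ("MLC", 2)):
    print(name, levels_per_cell(bits), level_spacing(ASSUMED_WINDOW_VOLTS, bits))
```

Under these assumptions the MLC levels sit three times closer together than the SLC levels, so the same amount of drift consumes a larger share of the margin.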
Beyond flash, new memory types are evolving. Phase change memory (PCM), which promises faster switching times and better scalability than flash, is in its infancy. PCM technology currently lacks the capacity of flash, but eventually could be a viable option for deployment in the enterprise memory mix. In order to manage multiple memory types and take advantage of emerging technologies, the flexibility and adaptability of programmable logic devices for memory control and management can prove invaluable in memory array subsystem design.
Flash memory access speed is another consideration. Often, I/O and access speed can be a limiting factor in memory system performance. Support for standards such as PCI Express (PCIe), which provides an efficient conduit between server farm and system, is critical. Such standards are evolving as rapidly as the memory types with which they interact, however, so any flash-based subsystem needs to take this potential variability into account. PLDs, particularly those that offer PCIe cores and other optimized interfaces, offer the agility to keep up with and respond to fluctuating standards.
FPGA technology
When seeking a means to implement memory control functions and I/O interfaces for flash memory arrays, developers must consider the alternatives carefully. Conventional options, namely ASSP and ASIC devices, lack the flexibility required for the rapidly changing flash market. ASIC is also prohibitively expensive for the vast majority of applications. Instead, storage subsystem developers seeking to tap into the advantages of flash memory need the agility to rapidly respond and adapt to emerging memory types and evolving standards.
Programmable ICs such as FPGAs, by contrast, are ideally suited for flash-based enterprise applications. The inherently fast, low-cost, low-risk FPGA development cycle makes it possible to quickly adapt to changing requirements or to take advantage of new developments in memory technology and interface standards such as PCIe. With the high capacity of programmable devices today, it is also possible to build in control and interfaces for multiple memory types such that a subsystem might support both MLC and SLC flash, for example. Critical fault-tolerant RAID algorithms are also easily implemented in FPGA logic.
FPGAs offer significant performance and customization advantages previously considered the exclusive purview of ASICs. PLDs today are produced on the most advanced silicon process nodes, and so offer the highest performance that current semiconductor technology allows. For example, 28-nm PLD interfaces can now transfer data across high-speed data traffic hubs at transmission rates up to 28 Gbps, meeting performance demands of even the highest speed protocols such as PCIe, SAS/SATA and Fibre Channel. FPGAs also now feature soft and hardened IP cores, such as memory controllers, embedded processors and transceiver blocks, that further enhance performance, enrich functionality and improve efficiency. Finally, advances in PLD packaging accommodate a generous number of high-speed I/O ports as well as general-purpose I/O pins.
Programmable state machine for flash cache
Memory array maker Violin Memory, Inc. outlines the following high-level attributes of a memory array that scales cost-effectively and addresses the needs of next-generation, 24x7 enterprise data centers:
- Performance: Seeks an order-of-magnitude improvement in latency and I/O operations per second (IOPS) over that attained by HDD, i.e. sub-millisecond latencies, and >200K IOPS per shelf, to better match processors.
- Cost: Deploys solid state memory, but at significantly reduced cost in terms of both cost per GB and cost per I/O.
- Reliability: Ensures no enterprise data are lost (via RAID algorithms) and that systems can be serviced without downtime.
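The performance targets in the list above imply a certain amount of parallelism. As a rough sketch, Little's law (average requests in flight = throughput × latency) relates the two. Applying it here is an illustration rather than something the listed attributes state, and the 1 ms value is simply the upper bound of "sub-millisecond".

```python
def outstanding_requests(iops: int, latency_us: int) -> float:
    """Little's law: average requests in flight = throughput * latency.

    Latency is given in microseconds so the arithmetic stays in integers
    until the final division.
    """
    return iops * latency_us / 1_000_000


TARGET_IOPS = 200_000       # ">200K IOPS per shelf" from the list above
LATENCY_BOUND_US = 1_000    # 1 ms, the ceiling of "sub-millisecond"

print(outstanding_requests(TARGET_IOPS, LATENCY_BOUND_US))
```

Read this way, sustaining the IOPS target at the latency ceiling means keeping on the order of a couple of hundred requests in flight at once. A lower latency reduces that number proportionally.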
These attributes can be realized in a new breed of storage array based on low-cost-per-GB NAND flash memory. The architecture features two levels of flash functionality: flash control (vFLASH) and flash RAID (vRAID). The flash controller manages flash read, write and erase operations and error conditions at the bit, block, plane and chip level in a flash translation layer. vFLASH functionality includes log-structured data layout and flash management “garbage collection” to keep space freed up. The RAID controller for flash memory should go beyond traditional RAID-1 and RAID-5 algorithms to address the unique characteristics of flash. For example, a 4+1 parity model is much more efficient and has lower latency than traditional algorithms, and also can cope more effectively with failures without requiring module replacement.
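A minimal sketch of the parity idea behind a 4+1 layout: four data chunks plus one XOR parity chunk, so any single lost chunk can be rebuilt from the other four. This shows generic XOR parity only. It is not Violin Memory's vRAID algorithm, and the function names are hypothetical.

```python
from __future__ import annotations

from functools import reduce


def xor_chunks(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def make_stripe(data_chunks: list[bytes]) -> list[bytes]:
    """Return the four data chunks followed by one parity chunk (4+1)."""
    if len(data_chunks) != 4 or len({len(c) for c in data_chunks}) != 1:
        raise ValueError("expected four equal-length data chunks")
    return list(data_chunks) + [xor_chunks(data_chunks)]


def rebuild(stripe: list[bytes | None]) -> list[bytes]:
    """Rebuild a stripe in which exactly one chunk (data or parity) is missing."""
    missing = [i for i, chunk in enumerate(stripe) if chunk is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can recover exactly one missing chunk")
    survivors = [chunk for chunk in stripe if chunk is not None]
    repaired = list(stripe)
    # XOR of the four surviving chunks equals the missing one.
    repaired[missing[0]] = xor_chunks(survivors)
    return repaired


# Usage: lose one chunk (marked None) and recover it from the rest.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC", b"DDDD"])
damaged = list(stripe)
damaged[2] = None
assert rebuild(damaged) == stripe
```

Because the parity chunk is just another member of the XOR set, the same `rebuild` call recovers a lost parity chunk as well as a lost data chunk.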
Both flash and RAID control can optimally be implemented in FPGA technology as illustrated in Figure 2. By implementing key algorithms in silicon-based state machines rather than the traditional microprocessor/software approach, significantly lower latency can be achieved. And, as mentioned previously, FPGA-based implementation results in a very flexible design that accommodates the rapid evolution of flash and associated features. A new design can be brought to market, and new opportunities be explored, very rapidly, at very low cost.
Figure 2: Block diagram of a memory subsystem
Also, by leveraging FPGA features such as memory controllers, transceiver blocks and high-speed interfaces to memory and PCIe cards, a highly optimized system can be brought to market in a matter of days or weeks, much faster than traditional approaches.
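As a software analogy for the table-driven control logic that a programmable state machine provides, the sketch below steps a hypothetical flash command sequencer through its states. The state and event names are assumptions made for illustration and are not taken from the architecture described above. A real design would express this in a hardware description language rather than Python.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    BUSY = auto()
    VERIFY = auto()
    ERROR = auto()


class Event(Enum):
    COMMAND = auto()
    DONE = auto()
    ECC_OK = auto()
    ECC_FAIL = auto()
    RESET = auto()


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, Event.COMMAND): State.BUSY,
    (State.BUSY, Event.DONE): State.VERIFY,
    (State.VERIFY, Event.ECC_OK): State.IDLE,
    (State.VERIFY, Event.ECC_FAIL): State.ERROR,
    (State.ERROR, Event.RESET): State.IDLE,
}


def step(state: State, event: Event) -> State:
    """Return the next state; pairs not in the table leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, start: State = State.IDLE) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state


# Usage: a command that completes and passes its ECC check returns to IDLE.
assert run([Event.COMMAND, Event.DONE, Event.ECC_OK]) is State.IDLE
```

Keeping the behavior in a table means it can be changed by editing data rather than the stepping logic, which loosely mirrors how reprogrammable logic lets a design track new memory types and standards.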
A new paradigm for enterprise storage
Enterprise storage systems today require the performance and cost advantages of flash memory to be competitive. The unique design challenges associated with flash, such as ensuring data integrity and dealing with emerging memory types and evolving standards, are readily addressed via deployment of FPGA-based PSM for memory management and I/O. The combination of FPGA technology and flash memory gives storage system architects a powerful means to attain the performance, while ensuring the system integrity, scalability and adaptability, required of even the most demanding workloads.
About the author
David McIntyre manages the Computer and Storage Business Unit at Altera Corporation. His responsibilities include driving top tier customer growth with initiatives and solutions.
With 20 years of experience at leading semiconductor and systems companies, David has held various engineering and marketing management positions including a director of strategic marketing post for the IBM Storage Systems Division.
If you found this article to be of interest, visit Programmable Logic Designline where you will find the latest and greatest design, technology, product, and news articles with regard to programmable logic devices of every flavor and size (FPGAs, CPLDs, CSSPs, PSoCs...).
Dr DSP
3/19/2012 12:05 PM EDT
Are there issues with having a centralized state machine over a more distributed approach from a reliability standpoint? If the state machine fails doesn't this single point of failure go against the RAID approach?
Steve Gabriel
3/22/2012 2:01 PM EDT
We agree that a distributed approach does provide additional reliability, however, if the RAID mechanism spans across a series of FPGAs, then system level reliability requirements can be achieved.