
FPGAs Supercharge Computational Performance


Key Takeaways

  • FPGAs can meet the performance demands of artificial intelligence and big data, both of which are seeing exponential global growth
  • FPGAs increase processing speeds and reduce hardware costs by running massive numbers of processes concurrently and optimally managing data flow
  • Developments such as Amazon’s F1 instances are removing barriers of cost and usability, making FPGAs accessible to all businesses
  • FPGAs are ideal for any project needing low-latency processing of vast amounts of data, including inference and deep neural networks
  • As FPGA programming becomes simpler and cloud-based, using more accessible languages such as Go, adoption will rise rapidly
     

Data creation and consumption are in overdrive, due in part to the escalation of artificial intelligence (AI) and deep learning technologies. McKinsey’s Nicolaus Henke estimates that 90% of the data accessible today simply didn’t exist two years ago, while IDC predicts global investment in big data will exceed $203 billion by 2020, all of which is placing exponential performance demands on the next generation of computing.

Demands for increased performance have traditionally been met by shrinking the silicon process, but simply shrinking transistor size has long since stopped delivering exponential speed gains, due to the breakdown of Dennard scaling.

Multi-core scaling and virtualization have helped mitigate this, but that approach is now running out of steam because of practical power limits.

As a result, Intel is projecting only marginal performance gains for its processors over the next few generations – the exponential demand for computing power can no longer be met by ‘business as usual’.

With advances in CPUs delivering only marginal wins, businesses must look for alternatives to supercharge computational performance.

One of the most viable solutions comes in the form of Field Programmable Gate Arrays (FPGAs), originally designed for developing new hardware.

So how can FPGAs be used to supercharge computational performance? In use since the mid-eighties, FPGAs are reprogrammable circuitry on a chip.

Originally used to simulate chips and ensure that their design was working, they are now of increasing interest to software engineers due to their ability to efficiently process large amounts of data.

FPGAs are programmable like GPUs or CPUs, but are aimed at parallel, low-latency, high-throughput problems such as inference and deep neural networks.


What makes FPGAs remarkable?

FPGAs have a number of benefits that make them appealing to software engineers, the most notable of which is speed. While FPGAs run at a slow clock speed relative to modern CPUs, they are fundamentally concurrent: rather than running streams of sequential instructions, they let data flow optimally between many concurrent operations, resulting in a dramatic net increase in performance. Applications have the potential to run up to 100 times faster than the same code running on traditional CPUs.

FPGAs contain millions of reprogrammable logic blocks that can be used to perform many actions at the same time, delivering the benefits of parallelism and concurrency. When writing code, engineers can take advantage of this parallel architecture by breaking problems down into well-structured, self-contained processes that can run concurrently.

For example, when an image is processed non-concurrently, a single worker processes the whole image pixel by pixel. When the same image is processed concurrently, it is broken down into pieces that are processed simultaneously by different workers and then pieced back together. This makes the process more complex but far quicker: the incoming data must be split apart in an optimal way, distributed efficiently to the workers, and the processed results collected and reassembled, ideally without blocking the pipeline of work.
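As a purely software-side sketch of this split, process and reassemble pattern, the following Go program divides a grayscale image into rows that a pool of workers processes concurrently. The image dimensions, the invert operation and the row-based chunking are illustrative assumptions, not details from the article:

```go
package main

import (
	"fmt"
	"sync"
)

// invertRow is the per-pixel work applied to one row of the image.
func invertRow(row []uint8) {
	for i := range row {
		row[i] = 255 - row[i]
	}
}

func main() {
	const width, height = 1024, 768

	// A grayscale "image": one slice of pixels per row.
	img := make([][]uint8, height)
	for y := range img {
		img[y] = make([]uint8, width)
	}

	const workers = 8
	var wg sync.WaitGroup
	rows := make(chan int)

	// Fan out: each worker pulls row indices and processes them independently.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for y := range rows {
				invertRow(img[y])
			}
		}()
	}

	// Distribute the work, then wait for every worker to finish (fan in).
	for y := 0; y < height; y++ {
		rows <- y
	}
	close(rows)
	wg.Wait()

	fmt.Println("processed", height, "rows with", workers, "concurrent workers")
}
```

On a CPU, the speed-up from this decomposition is bounded by the number of cores; the appeal of an FPGA is that the same decomposition can be spread across a far larger number of concurrent logic blocks.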

In a conventional CPU, this involves data being pushed to and pulled from memory, plus costly protocols for processes to agree on the current state of memory. Even the largest Intel CPUs have only 18 cores. In an FPGA, by contrast, the data flow can be engineered so that data never leaves the chip: tens of thousands of concurrent processes can run, with the timing of the processing optimised so that throughput is always maximal.
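A rough software analogy for this on-chip dataflow is a pipeline of Go channel stages, where each goroutine stands in for a block of logic and each channel for a wire between blocks. The stage names and toy arithmetic below are my own illustrative assumptions:

```go
package main

import "fmt"

// Each stage reads from an input channel and writes to an output channel,
// so values stream from stage to stage without a shared-memory round trip:
// a rough software analogy for dataflow between on-chip logic blocks.

func produce(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; i < n; i++ {
			out <- i
		}
	}()
	return out
}

func scale(in <-chan int, factor int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * factor
		}
	}()
	return out
}

func offset(in <-chan int, delta int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v + delta
		}
	}()
	return out
}

func main() {
	// Values flow produce -> scale -> offset, with all stages running concurrently.
	sum := 0
	for v := range offset(scale(produce(10), 3), 1) {
		sum += v
	}
	fmt.Println("sum of pipeline output:", sum)
}
```

In software the stages still share a machine and its memory hierarchy, but the shape of the program is the same: values stream from stage to stage rather than round-tripping through a central store.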

The processing speeds attainable through FPGAs are responsible for their second major benefit: cost. Significant cost savings are achievable because a single FPGA can execute the tasks of many servers, increasing speed while decreasing hardware requirements.

Finally, FPGAs have powerful next-generation interconnectivity and enhanced flexibility as they can be reprogrammed in the field to take advantage of the latest technological developments. Once they are up and running, FPGAs can be altered at any time to meet ever-changing business requirements.

Barriers to FPGA adoption

Despite the many benefits of FPGAs, they also introduce a number of challenges. The high initial cost of FPGA acquisition, the total cost of ongoing ownership, and issues with usability have traditionally prevented FPGA usage becoming mainstream.

Until recently, programming FPGAs required hardware engineers, with programming and reprogramming carried out in complex, low-level hardware description languages such as Verilog.

Hardware engineering is a highly specialized skill that takes years of experience to master, and universities produce very few graduates each year with this specific expertise. The specialist chip design skillset required for this hardware configuration kept FPGA costs high, and meant innovation was limited and vertical-specific.

Historically, businesses that wanted to use FPGAs needed to acquire custom hardware, build a specific delivery team, and integrate it all into their existing solutions. Only very high-value applications – such as military projects and hedge funds – have so far had the resources to use FPGAs for compute.

The future of FPGAs

While the adoption of FPGAs for compute remains relatively slow, recent innovations are steadily breaking down the barriers. For instance, the response to the release of Amazon Web Services' FPGA-equipped F1 instances – aimed at building custom accelerators for compute-intensive problems – was exceptionally positive.

FPGA manufacturers and platform providers are also enabling programming in higher-level languages such as OpenCL, and FPGA development is gradually becoming more attainable for software engineers, with the potential to program FPGAs from within cloud-based environments. This can be done using more accessible languages such as Go, which are easier and more productive for developers coming from different backgrounds. Microsoft has hinted that its Azure cloud service will also give developers access to FPGAs in the future.
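To give a flavour of what a stream-processing kernel might look like when expressed in ordinary Go, here is a four-sample moving-average filter written in the channel style described earlier. This is a generic illustration of the programming model, not the API of Reconfigure.io, AWS or any other vendor's toolchain, and the window size and sample values are assumptions made for the example:

```go
package main

import "fmt"

// movingAverage is a stream kernel: it consumes samples from in and emits a
// 4-sample moving average on out. Written as ordinary Go with channels, this
// is the general style of code a Go-based FPGA toolchain might target;
// the function, window size and samples are illustrative, not a vendor API.
func movingAverage(in <-chan int, out chan<- int) {
	const window = 4
	var buf [window]int
	sum, idx, seen := 0, 0, 0
	for sample := range in {
		sum += sample - buf[idx] // slide the window in constant time
		buf[idx] = sample
		idx = (idx + 1) % window
		if seen < window {
			seen++
		}
		if seen == window {
			out <- sum / window
		}
	}
	close(out)
}

func main() {
	in := make(chan int)
	out := make(chan int)
	go movingAverage(in, out)

	// Feed a simple ramp of samples into the kernel.
	go func() {
		for i := 1; i <= 8; i++ {
			in <- i
		}
		close(in)
	}()

	for avg := range out {
		fmt.Println("average:", avg)
	}
}
```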


These developments will bypass the need for hard-to-find hardware expertise or costly development budgets. Although FPGAs are still expensive, the costs associated with acquisition and ownership of FPGAs are less of a barrier than they used to be because FPGAs can now be hired by the hour. 

In the past, FPGAs were only used to tackle large amounts of data where the value return was high enough to justify a deep investment, or the problems were exceptionally complex and challenging – for instance in the military or financial sectors. But as FPGAs become more accessible, the technology can be used for any project where speed and cost are important factors.

The parallel computing enabled by FPGAs speeds up the processing and analysis of vast amounts of data by running numerous computational processes concurrently on a single server, meaning FPGAs can be used for image and video processing, online speech recognition, real-time data analytics, advertising technologies, and software-defined networking (SDN). 

Cloud FPGAs are being explored by security businesses for cryptographic algorithm acceleration, telecoms companies for networking and security, aerospace companies to process satellite data and apply machine-learning algorithms, and by financial services for hardware acceleration and to determine the credit risk of derivatives portfolios. 

While these use cases are exciting, they are only the tip of the iceberg in FPGA capabilities, and realising the full potential of this technology will take time. Although new products are regularly launched in the hardware world, innovations in those products tend to be incremental. But as platforms emerge to enable concurrent design and innovation in hardware development, FPGA usage will become cheaper and more realistic for all businesses – large or small – and adoption will increase.

The industry is still at the foothills of adoption but as their usage increases, FPGAs will allow every business – from single-person start-ups to established multinational companies – to take advantage of high-powered parallel computing that will drive continued technological innovation.

About the Author

Rob Taylor is a globally recognised technology leader who has been at the forefront of software innovation and open source technology since the 90s. Having founded two multi-million GBP software houses, including Codethink, which has delivered software infrastructure solutions to Nokia as well as leading automobile manufacturers and financial services providers, Rob was eager to build a product business focused on arming today's developers with tools for a post-Moore’s world. In 2015 Reconfigure.io was born, servicing its global customers from offices located in the UK and North America. To date, the company has raised total seed investment of $500,000 from angels in the UK, US, Switzerland, Japan and Holland. In April 2017, Reconfigure.io was announced as an early partner in Amazon Web Services' industry-first F1 Instances.    

 
