
Seamless Integration#

To provide seamless acceleration, we have integrated the Manager and the Accelerator cores with a high-level interface. This interface makes it possible to accelerate applications executed through open-source frameworks for big data analytics (like Apache Spark) without a single change to the original application code.

In essence, this project pairs those frameworks with the FPGA Manager. Upon task execution, the framework communicates with the FPGA Manager, which then takes care of all communication with the FPGA device and returns the computed data to the requesting client.

This way we provide a complete acceleration solution. From your perspective nothing changes, except that your applications now run much faster.
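
For illustration, here is a minimal sketch of submitting an existing Spark application; the master URL, class name, application JAR and input path are hypothetical placeholders, not part of InAccel's setup. The point is that the submission itself is unchanged:

# submit an unmodified Spark application; acceleration is handled transparently behind the scenes
$SPARK_HOME/bin/spark-submit \
    --class com.example.MyMLJob \
    --master spark://master:7077 \
    my-ml-job.jar hdfs:///data/input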


Setup - Install Steps#


Single node Setup (Amazon AWS)#

  • Launch InAccel's AMI. You will find InAccel's FPGA-Accelerated ML Suite on AWS Marketplace.

  • Acquire Apache Spark from a public mirror and export the SPARK_HOME variable.
    Example:

# download Spark 2.3.2 (pre-built for Hadoop 2.7) from an Apache mirror and extract it
wget -qO - "https://www.apache.org/dyn/closer.lua?action=download&filename=spark/spark-2.3.2/spark-2.3.2-bin-hadoop2.7.tgz" | tar xz
# point SPARK_HOME at the extracted directory
export SPARK_HOME=/home/ec2-user/spark-2.3.2-bin-hadoop2.7
  • Install InAccel by cloning the InAccel repository on your node:
git clone https://bitbucket.org/inaccel/release.git inaccel && source inaccel/setup.sh
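
As a quick sanity check (an optional step, not part of InAccel's setup script), you can verify the Spark installation and persist SPARK_HOME for future shells:

# confirm the Spark binaries are usable
$SPARK_HOME/bin/spark-submit --version

# persist SPARK_HOME across login sessions
echo 'export SPARK_HOME=/home/ec2-user/spark-2.3.2-bin-hadoop2.7' >> ~/.bashrc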

Cluster Setup (Amazon AWS)#

The following steps can be performed on any machine with a Linux distribution (inside or outside Amazon EC2).

  • Set your AWS credentials. Assuming you already have an AWS account set up, one option is to export the following environment variables:
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
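
Another option, if you prefer not to keep the keys in environment variables, is the standard AWS credentials file, which boto-based tools such as Flintrock also read; the values below are the same example placeholders as above:

# write the example keys to the default AWS credentials file
mkdir -p ~/.aws
cat > ~/.aws/credentials <<EOF
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
EOF
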
  • Install Flintrock

Flintrock is a command-line tool for launching Apache Spark clusters. Flintrock requires Python 3.4 or newer, unless you use one of the standalone packages.

Recommended Way

  • Install pip for Python 3 (if required): sudo apt update && sudo apt install python3-pip
  • To get the latest release of Flintrock, simply run pip:
sudo pip3 install flintrock
flintrock --version
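
If you would rather avoid a system-wide install, a per-user installation is a reasonable alternative (a minor variation, not required by this guide):

# install Flintrock for the current user only
pip3 install --user flintrock
# make sure the user-level scripts directory is on PATH
export PATH="$HOME/.local/bin:$PATH"
flintrock --version
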
  • Configure Flintrock
    Flintrock lets you persist your desired configuration to a YAML file so that you don't have to keep typing out the same options over and over at the command line.
    To set up and edit the default config file, run the following command: flintrock configure

Sample config.yaml

services:
  hdfs:
    version: 2.8.5
  spark:
    version: 2.3.2

provider: ec2

providers:
  ec2:
    key-name: key_name # change accordingly
    identity-file: /path/to/key.pem # change accordingly
    instance-type: f1.2xlarge
    region: us-east-1
    ami: ami-0735b15809ec21331
    user: ec2-user
    min-root-ebs-size-gb: 35 # feel free to change
    ebs-optimized: yes
    instance-initiated-shutdown-behavior: stop

launch:
  num-slaves: 4 # feel free to change
  spark-executor-instances: 8
  install-hdfs: True
  install-spark: True

debug: false
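
Most config.yaml values can also be overridden per launch on the command line; the flag names below follow Flintrock's usual --ec2-* convention, but check flintrock launch --help for the exact spelling in your version:

# one-off launch with a smaller slave count, leaving config.yaml untouched
flintrock launch inaccel-demo-cluster \
    --num-slaves 2 \
    --ec2-instance-type f1.2xlarge
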
  • Create a new cluster
    With a config file in place, you can now launch a cluster: flintrock launch inaccel-demo-cluster

Warning

Since AWS performance is highly variable, the exact launch time cannot be predicted. A typical launch of a medium-sized cluster takes around 10 minutes.

After the launch has finished, log in to the master node: flintrock login inaccel-demo-cluster
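
Flintrock also ships a few subcommands that are handy for managing the cluster after launch (standard Flintrock functionality, independent of InAccel):

# print the master and slave addresses of the cluster
flintrock describe inaccel-demo-cluster

# stop the EC2 instances while the cluster is idle, and start them again later
flintrock stop inaccel-demo-cluster
flintrock start inaccel-demo-cluster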

  • Install InAccel by cloning the InAccel repository on your node:
git clone https://bitbucket.org/inaccel/release.git inaccel && source inaccel/setup.sh
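
If the same setup also needs to run on the worker nodes (an assumption here; check InAccel's documentation for your version), Flintrock's run-command can broadcast the step to every node in the cluster:

# run the clone-and-setup step on all nodes (assumes the workers need it too)
flintrock run-command inaccel-demo-cluster \
    'git clone https://bitbucket.org/inaccel/release.git inaccel && source inaccel/setup.sh'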

Destroy the cluster (Amazon AWS)#

Once you're done using a cluster, don't forget to destroy it with:

flintrock destroy inaccel-demo-cluster