
Running a Molecule on Amazon Web Services

Blog Post created by jplater Employee on Oct 13, 2016

After learning about Amazon’s Elastic File System (EFS), I set out to explore how a Dell Boomi Molecule could be configured to use EFS as its shared file system. In this blog post we will walk through the steps I followed to install a Molecule on Amazon Web Services (AWS).

 

Editor's Note: This post was updated in May 2018 to reflect configuration changes and options now available in AWS and Boomi.

 

**DISCLAIMER** Before we jump into it, let me quickly say that there are many factors that go into deciding how to set up and size your Molecule runtime environment. The specific system resources (CPU, memory, hard disk, etc.) required for your implementation will be based on your expected workload and processing requirements. This post makes no attempt to provide sizing recommendations and is intended to serve as a starting point for the AWS configuration. Now with that out of the way, let's get started!

 

 

 

Prerequisites

Before you can install a Molecule on AWS, you need to set up a few things on the AWS side. First, complete tasks 1-4 in Setting Up with Amazon EC2. We will be setting up Security Groups later so you can skip that for now. Second, decide into which region and Availability Zone you are going to install your Molecule. Be sure to pick a region where EFS is available.  Amazon EFS service availability by region can be found on the Regional Products and Services page. I am located on the East Coast, so I chose to use U.S. East (N. Virginia), us-east-1, and Availability Zone 'd'.

 

Create Security Groups

If you aren't familiar with security groups, they are very similar to firewalls. When you create a security group you specify what inbound and outbound traffic is allowed. The Molecule we are going to install will require three security groups (the fourth one listed below is the default security group):

 

SG1.png

 

  1. SSH
    • This group will be used to allow you to SSH to your Molecule EC2 instance. The inbound rule should allow SSH traffic from your IP address.
      • Inbound rules:
        SSH-Inbound.png
      • Outbound rules:
        SSH-Outbound.png
  2. AWS MOL
    • This group will allow the nodes within the Molecule to communicate with each other. This one is a little tricky to set up. Start by creating a new security group with any single rule — it doesn’t matter what it is, the rule is just needed in order to create a security group ID. Once created, edit the inbound rule to limit traffic to itself.

      • Inbound rules:
        AWSMOL-Inbound.png
      • Outbound rules:
        AWSMOL-Outbound.png
  3. EFS MOL
    • This group will allow the nodes within the Molecule to access your EFS.
      • Inbound rules:
        EFSMOL-Inbound.png
      • Outbound rules:
        EFMOL-Outbound.png
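
If you prefer the command line, the self-referencing rule for the AWS MOL group can be sketched with the AWS CLI. This is only a sketch: the `aws_cmd` wrapper, the `self_ref_rule` helper, and the group ID are illustrative names I made up, and by default the script only prints the command instead of running it. As in the console, the group has to exist first (for example via `aws ec2 create-security-group`) so that it has an ID to reference.

```shell
#!/bin/sh
# Sketch only. aws_cmd prints the AWS CLI call when DRY_RUN=1 (the default)
# and executes it when DRY_RUN=0.
aws_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'aws %s\n' "$*"
  else
    aws "$@"
  fi
}

# self_ref_rule GROUP_ID
# Adds an inbound rule that allows all traffic whose source is the group itself.
self_ref_rule() {
  aws_cmd ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$1}]"
}

# Hypothetical group ID; use the ID of your own AWS MOL group.
self_ref_rule sg-0123456789abcdef0
```

Run it once as-is to review the printed command, then again with `DRY_RUN=0` if it looks right for your account.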

 

Create a Shared File System (Amazon Elastic File System)

The primary requirement for running a Molecule is that all nodes that are part of it must be using the same installation directory. Up until recently, this meant spinning up and maintaining an NFS server on an EC2 instance. Amazon EFS makes this much easier!

Amazon EFS file systems can automatically scale from gigabytes to petabytes of data without needing to provision storage. Tens, hundreds, or even thousands of Amazon EC2 instances can access an Amazon EFS file system at the same time, and Amazon EFS provides consistent performance to each Amazon EC2 instance. Amazon EFS is designed to be highly durable and highly available. With Amazon EFS, there is no minimum fee or setup costs, and you pay only for the storage you use.

 

Amazon Elastic File System (Amazon EFS)

Setting up a new file system on EFS is pretty simple:

  1. Go to https://console.aws.amazon.com/efs/home and choose Create file system.
  2. Select the VPC you are planning to use for your Molecule.
  3. Under the Create mount targets section, select one or more Availability Zones where your Molecule will reside and associate the EFS MOL security group with each mount target. (I chose to limit my Molecule to one zone, 'us-east-1d'; however, EFS best practice is to create a mount target in each AZ.)
    EFS1.png
  4. Add any tags you want and select General Purpose (default) as the performance mode. Tags have no bearing on the Molecule setup; they are just there to help you track your AWS resources.
    EFS2.png
  5. Review your settings and click Create File System.
    EFS3.png
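
The same setup can be sketched with the AWS CLI, which is handy if you want one mount target per Availability Zone. Treat this as a rough outline rather than a tested procedure: the `aws_cmd` and `mount_targets` helpers, the creation token, and every ID below are placeholders I invented, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. aws_cmd prints the AWS CLI call when DRY_RUN=1 (the default)
# and executes it when DRY_RUN=0.
aws_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'aws %s\n' "$*"
  else
    aws "$@"
  fi
}

# mount_targets FS_ID SECURITY_GROUP_ID SUBNET_ID...
# Creates one mount target per subnet (one subnet per Availability Zone),
# each associated with the EFS MOL security group.
mount_targets() {
  fs_id="$1"
  sg_id="$2"
  shift 2
  for subnet in "$@"; do
    aws_cmd efs create-mount-target \
      --file-system-id "$fs_id" \
      --subnet-id "$subnet" \
      --security-groups "$sg_id"
  done
}

# General Purpose performance mode, as selected in the console steps.
aws_cmd efs create-file-system --creation-token molecule-efs --performance-mode generalPurpose

# Hypothetical IDs; substitute the values from your own account.
mount_targets fs-12345678 sg-0123456789abcdef0 subnet-aaaa1111 subnet-bbbb2222
```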

 

Create and Configure an EC2 Instance

Now that we have our security groups and storage ready to go, it is time to start spinning up an EC2 instance. To start, create a single node and mount the EFS to it:

  1. Go to https://console.aws.amazon.com/ec2/v2/home and choose Launch Instance.
  2. Select the Amazon Linux AMI (or some other Linux flavor, if you want).
  3. Choose an instance type (I started with the t2.micro because it was free). If you are planning to use the Molecule for real work, make sure you pick an instance that is large enough to handle your workload. Once you've selected an instance type, continue on to Configure Instance Details.
  4. Verify that the correct VPC is selected and continue on to Add Storage.
  5. I chose to leave the defaults here and just continued on to Tag Instance.
  6. I opted not to add tags and just continued on to Configure Security Group.
  7. Opt to Select an existing security group and check the boxes next to AWS MOL and SSH. Move on to the final screen, Review and Launch.
  8. Assuming everything looks good (my example is below), go ahead and click Launch.
    EC2-Review.png
  9. Choose to use an existing key pair and select the key pair that you created as part of the prerequisites.

 

For a complete overview of all the steps involved in launching an instance, see Launching an Instance.

 

Once the instance starts up, connect to it and continue setting it up:

  1. SSH to your instance using the key pair you associated with the new instance.

    ssh -i <YOUR .pem FILE> ec2-user@<IPADDRESS>

  2. Update packages.

    sudo yum update

  3. Set up NFS mount.

    sudo mkdir /mnt/data

    sudo sh -c 'echo "$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone).<file-system-id>.efs.<aws-region>.amazonaws.com:/    /mnt/data    nfs4   defaults,vers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2  0  0" >> /etc/fstab'

    sudo mount -a

  4. Add a user for the Molecule to run as. I chose to create a user named "boomi".

    sudo useradd -u 510 boomi

  5. Create a base directory on the NFS mount where you will install the Molecule and make the "boomi" user the owner.

    sudo mkdir /mnt/data/molecule

    sudo chown boomi:boomi  /mnt/data/molecule

  6. Create a local work directory, symlink the Molecule install directory (just to make it easier to find), and make the "boomi" user the owner of the work directory.

    sudo mkdir -p /usr/local/boomi/work

    sudo ln -s /mnt/data/molecule /usr/local/boomi/molecule

    sudo chown boomi:boomi  /usr/local/boomi/work

  7. Download 'jdk-8u66-linux-x64.rpm' from the Java SE Development Kit 8u66 Download site and copy it over to the EC2 instance.

    # after downloading 'jdk-8u66-linux-x64.rpm' to my local computer I ran this scp command

    scp -i <YOUR .pem FILE> jdk-8u66-linux-x64.rpm ec2-user@<IPADDRESS>:/tmp/.

  8. Install Java 8.

    sudo rpm -i /tmp/jdk-8u66-linux-x64.rpm

    rm /tmp/jdk-8u66-linux-x64.rpm
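
One thing to watch in the NFS mount step above: running the `echo ... >> /etc/fstab` command a second time appends a duplicate entry. Here is a small sketch of an idempotent variant that builds the same fstab line and only appends it if it is not already there. The helper names (`efs_fstab_line`, `add_fstab_entry`) and the file system ID are made up for illustration, and the demo writes to a scratch file rather than the real `/etc/fstab`.

```shell
#!/bin/sh
# efs_fstab_line AZ FILE_SYSTEM_ID REGION MOUNT_POINT
# Prints an fstab entry with the same DNS name format and NFS options
# used in the mount step of this post.
efs_fstab_line() {
  printf '%s.%s.efs.%s.amazonaws.com:/ %s nfs4 defaults,vers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 0 0\n' \
    "$1" "$2" "$3" "$4"
}

# add_fstab_entry LINE FILE
# Appends LINE to FILE unless an identical line is already present.
add_fstab_entry() {
  grep -qxF -e "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demo against a scratch file; fs-12345678 is a hypothetical file system ID.
scratch=$(mktemp)
line=$(efs_fstab_line us-east-1d fs-12345678 us-east-1 /mnt/data)
add_fstab_entry "$line" "$scratch"
add_fstab_entry "$line" "$scratch"   # second call changes nothing
cat "$scratch"
```

On a real instance you would point `add_fstab_entry` at `/etc/fstab` (as root) and then run `sudo mount -a` as before.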

 

At this point, the EC2 instance is all configured. I recommend creating a private Amazon Machine Image (AMI) of this instance so that it is easier to spin up additional Molecule nodes later. To do this:

  1. First stop the instance (right-click on the instance you just finished configuring and select Instance State > Stop).
  2. Right-click on the instance again and select Image > Create Image.
    CreateImage1.png
  3. Provide an Image name and Image description and click Create Image.
    CreateImage2.png
  4. Once your AMI has been created (you can monitor it from the AMIs dashboard), start your instance back up (right-click on the instance you just finished configuring and select Instance State > Start).
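
The stop, image, and restart sequence above could also be scripted with the AWS CLI. This is a sketch under assumptions: the `aws_cmd` and `make_ami` helpers, the instance ID, and the image name are placeholders of mine, and nothing is executed unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch only. aws_cmd prints the AWS CLI call when DRY_RUN=1 (the default)
# and executes it when DRY_RUN=0.
aws_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'aws %s\n' "$*"
  else
    aws "$@"
  fi
}

# make_ami INSTANCE_ID IMAGE_NAME
# Stops the instance, creates an image from it, waits until the image is
# available, and then starts the instance again.
make_ami() {
  aws_cmd ec2 stop-instances --instance-ids "$1"
  aws_cmd ec2 wait instance-stopped --instance-ids "$1"
  aws_cmd ec2 create-image --instance-id "$1" --name "$2"
  aws_cmd ec2 wait image-available --owners self --filters "Name=name,Values=$2"
  aws_cmd ec2 start-instances --instance-ids "$1"
}

# Hypothetical instance ID and image name.
make_ami i-0123456789abcdef0 aws-mol-node
```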

 

Install and Configure a Molecule

Now that our EC2 instance is up, it is time to install the Molecule. Let's jump back onto the EC2 instance and get to work:

  1. SSH to your instance using the key pair.

    ssh -i <YOUR .pem FILE> ec2-user@<IPADDRESS>

  2. Switch to the "boomi" user you created earlier.

    sudo su boomi

  3. Download the 64-bit Molecule installer.

    cd /tmp

    wget https://platform.boomi.com/atom/molecule_install64.sh

    chmod u+x molecule_install64.sh

  4. Run the installer.

    [boomi@ip-XXX-XX-XX-XXX tmp]$ ./molecule_install64.sh 

    Starting Installer ...
    This will install Molecule on your computer.
    OK [o, Enter], Cancel [c]

    <ENTER>

    Enter User Name, Password, and Molecule Name
    Use the email address that you use to sign into your Boomi AtomSphere
    account in the User Name input below.
    User Name
    []
    <YOUR ATOMSPHERE USERNAME>
    Password

    <YOUR ATOMSPHERE PASSWORD>

    Molecule Name
    [ip-XXX-XX-XX-XXX.ec2.internal]
    AWS-EFS-Demo
    The entries below are required if the installation machine requires a proxy
    server to open HTTP connections outside of your network.
    Use Proxy Settings?
    Yes [y], No [n, Enter]

    <ENTER>

    Logging into Boomi AtomSphere
    Authenticating credentials
    Where should the Molecule be installed?
    [/home/boomi/Boomi_AtomSphere/Molecule/Molecule_AWS_EFS_Demo]
    /usr/local/boomi/molecule/Molecule_AWS_EFS_Demo
    Setup requires at least 5436 KB of free space to install, but the selected drive only has -9007199254740992 KB available.

     

    Do you want to continue anyway?
    Yes [y, Enter], No [n]

    <ENTER>

    Please select the location of the local and local temp directories. Both locations must be on the local disk and must have sufficient disk space for the Molecule's operation.
    Local Directory
    []
    /usr/local/boomi/work
    This setting sets the Working Data Local Storage Directory property
    (com.boomi.container.localDir in the container.properties file). For more
    information, see the reference guide.

     

    Local Temp Directory
    [/tmp]

    <ENTER>

    This setting overrides the default Java local temp directory (java.io.tmpdir
    in the atom.vmoptions file).
    Create symlinks?
    Yes [y, Enter], No [n]
    n
    Please read the following important information before continuing.

     

    The following will be installed when you choose to continue:

     

    Molecule - AWS-EFS-Demo
    Installation Directory - /mnt/data/molecule/Molecule_AWS_EFS_Demo
    Local Directory - /usr/local/boomi/work
    Local Temp Directory - /tmp
    Program Group - Boomi AtomSphere\Molecule - AWS-EFS-Demo

     

    [Enter]

    <ENTER>

    Retrieving Build Number
    Setup requires at least 5436 KB of free space to install, but the selected drive only has -9007199254740992 KB available.

     

    Do you want to continue anyway?
    Yes [y, Enter], No [n]

    <ENTER>
    Extracting files...

    Downloading Atom Files
    Downloading Molecule Files
    Retrieving Container Id
    Retrieving account keystore.
    Retrieving account trustore.
    Configuring Atom.
    The Molecule has been installed on your computer.
    Molecule name: AWS-EFS-Demo
    Preferred JVM: /usr/java/latest
    Finishing installation...
    [boomi@ip-XXX-XX-XX-XXX tmp]$

 

Once the installer completes, you can log into your AtomSphere account and view the newly created Molecule (Manage > Atom Management):

 

 

The final configuration change that needs to be made is to update the Molecule's network configuration. By default, a Molecule's nodes use a multicast (UDP)-based protocol to manage clustering. Since Amazon VPC doesn't support multicast (https://aws.amazon.com/vpc/faqs/), you need to update the Molecule to use unicast for cluster communication. The steps to do this are outlined in the Setting up unicast support help topic. Here is an example of what the Molecule Properties will look like after making the update:

 

 

Be sure to update the placeholder in "Initial Hosts for Unicast" with the private IP address of your EC2 instance. Then check Restart on Save? and click Save.
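
As the Molecule grows, the host list gets tedious to type by hand. The snippet below is a small sketch that joins private IP addresses into a comma-separated list. It assumes the value uses a `host[port]` format with port 7800; I am going from memory here, so confirm both against the Setting up unicast support help topic before relying on it. The `initial_hosts` helper and the IP addresses are made up for illustration.

```shell
#!/bin/sh
# initial_hosts PORT IP...
# Joins the given IP addresses as ip[port],ip[port],...
# The host[port] format is an assumption; check the Boomi help topic.
initial_hosts() {
  port="$1"
  shift
  out=""
  for ip in "$@"; do
    out="${out:+$out,}${ip}[${port}]"
  done
  printf '%s\n' "$out"
}

# Hypothetical private IPs. On an EC2 instance, its own private IP is available
# from http://169.254.169.254/latest/meta-data/local-ipv4
initial_hosts 7800 10.0.0.11 10.0.0.12
# prints: 10.0.0.11[7800],10.0.0.12[7800]
```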

 

At this point you have a fully functioning single-node Molecule. Time to add more nodes!

 

Add Additional Nodes

  1. Go to https://console.aws.amazon.com/ec2/v2/home and choose Launch Instance.
  2. Select the My AMIs tab and select the AWS MOL Image we created earlier.
  3. Choose an instance type (I started with the t2.micro because it was free). If you are planning to use the Molecule for real work, make sure you pick an instance that is large enough to handle your workload. Once you've selected an instance type, continue on to Configure Instance Details.
  4. Verify that the correct VPC is selected. Expand the Advanced Details section and add the following user data so that the Molecule node is started when the instance is started.
    #!/bin/bash
    su boomi -c "/usr/local/boomi/molecule/Molecule_AWS_EFS_Demo/bin/atom start"
    Once that is done, continue on to Add Storage.
  5. I chose to leave the defaults here and just continued on to Tag Instance.
  6. I opted not to add tags and just continued on to Configure Security Group.
  7. Opt to Select an existing security group and check the boxes next to AWS MOL and SSH. Move on to the final screen, Review and Launch.
  8. Assuming everything looks good, go ahead and click Launch.
  9. Choose to use an existing key pair and select the key pair that you created as part of the prerequisites.
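
The user data in step 4 starts the node as soon as the script runs. If the EFS mount is slow to come up at boot, the install directory may not be there yet. A more defensive variant, sketched below, retries a check before starting the node. The `wait_for` helper is my own illustration and not part of the Boomi tooling; the retry count and delay are arbitrary.

```shell
#!/bin/bash
# wait_for TRIES DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds, at most TRIES times, sleeping DELAY_SECONDS
# between attempts. Returns 0 on success and 1 if every attempt failed.
wait_for() {
  tries="$1"
  delay="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# In user data this might be used as follows (assumes the mount point and
# install path from this post):
#   wait_for 30 2 mountpoint -q /mnt/data &&
#     su boomi -c "/usr/local/boomi/molecule/Molecule_AWS_EFS_Demo/bin/atom start"

# Harmless demo: "true" succeeds on the first attempt.
wait_for 3 0 true && echo "ready"
```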

 

Once your new EC2 instance is up and running, log into AtomSphere, navigate to the Cluster Status panel, and verify that the new node is part of your Molecule.

 

 

So there you have it: you now have a Molecule running on AWS using EFS! Remember, this blog post is just meant to be a guide. When setting up your Molecule you will need to determine the specific system resources (CPU, memory, hard disk, etc.) required to handle your workload.

 

Are you running a Molecule on AWS? If so, I'd love to hear about it.
