April 6, 2018

Analytic DevOps in AWS - Part 2

In our previous post, we discussed the importance of developing a standard analytics DevOps process. To be successful, this process needs to be supported by an Analytic Platform technology architecture that streamlines the deployment of analytic capabilities from development to production. In this post, we will discuss Analytic DevOps architectures that have been successful. While this post focuses on architecting within Amazon Web Services (AWS), similar approaches can be applied to on-premises and other cloud environments.

Analytics DevOps Architecture

The goals of the Analytics DevOps architecture are to isolate the various stakeholder groups in the environment from one another while allowing autonomy, scalability, and reliability. The account structure within AWS has an enormous impact on these goals.

Monolithic Approach

Monolithic accounts, in which multiple applications or services are contained within a single account, grow in complexity and become more difficult to manage as resources, users, and additional services are added. Common administrative tasks become burdensome, changes to the infrastructure can negatively affect multiple applications or services, and administrators of the account make fewer changes for fear of unknown ramifications. The image below depicts a monolithic account in AWS.

Establishing this type of account architecture presents the following challenges:

Increased Complexity

Operational complexity increases because resources are overloaded within a single account. With this shared model, there are no clear boundaries between the different services of the Analytics Platform (AP). The AP services share a centralized networking and permission model even though they have drastically different requirements. This complexity obscures service interaction and resource utilization, which leads to avoidable errors.

Permission Management

AWS uses its Identity and Access Management (IAM) service to securely control access to AWS resources. At the core of IAM are policies that attach to identities such as users, groups, and roles. Assume that the administration of the account above will be accomplished by four different team members, one for each AP service. This task would require four separate roles, one for each team member. However, the Data Ingest, EMR, and DSS / Tableau services all require EC2 instances. In order to grant access to administer the instances, there are two options:

  1. Grant the 3 team members access to all of EC2.
  2. Restrict the 3 team members to their specific EC2 instances.

The first option is the easiest route and would allow the team members to administer their resources; however, it would violate most InfoSec policies on separation of duties. The second option would require IAM policies that restrict each team member to their specific instances by instance ID. Because AWS is such a fluid environment and instances change frequently, a tremendous amount of overhead would be spent on managing the policies.
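To make the trade-off concrete, the two options can be sketched as IAM policy documents built in Python. This is a minimal sketch and not a recommended policy: the account ID, region, and instance IDs are hypothetical placeholders, and the three EC2 actions are just one plausible set for basic instance administration.

```python
import json

# Hypothetical placeholders, for illustration only.
ACCOUNT_ID = "111122223333"
REGION = "us-east-1"


def broad_ec2_policy():
    """Option 1: full access to every EC2 resource in the account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "ec2:*", "Resource": "*"}
        ],
    }


def instance_scoped_policy(instance_ids):
    """Option 2: allow instance administration only on the listed instances."""
    arns = [
        f"arn:aws:ec2:{REGION}:{ACCOUNT_ID}:instance/{instance_id}"
        for instance_id in instance_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                    "ec2:RebootInstances",
                ],
                "Resource": arns,
            }
        ],
    }


# Hypothetical instance IDs owned by the EMR administrator.
emr_policy = instance_scoped_policy(
    ["i-0aaa1111bbbb2222c", "i-0ddd3333eeee4444f"]
)
print(json.dumps(emr_policy, indent=2))
```

Each time an instance is replaced, the list passed to `instance_scoped_policy` changes and the policy has to be regenerated and reattached, which is where the management overhead comes from.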

Account Agility and Risk

The flexibility and agility of an account are greatly affected by the monolithic approach. When several different services or applications share a central network layer, any changes to that network layer can have drastic effects. A simple change in the route table or security groups of the account could, in our case, shut down both development and production of the entire Analytics Platform. Due to this type of risk, the administrators of the account will be less likely to make changes to improve the environment, and any changes that are made will take longer than necessary and carry greater risk.

Service-Oriented Approach

A service-oriented approach isolates accounts based on Analytics Platform services and their environment. Isolation based on service minimizes the impact of critical events, allows environment-specific permissions, and provides clear boundaries between development and production. This level of isolation also reduces the complexity of each account and gives account administrators more autonomy and control.

Benefits of this approach include:

Reduced Complexity

In the service-oriented approach, each AP service is moved into its own account for each environment. Each account hosts its own networking layer, compute resources, IAM policies, etc. This gives team members clarity into the purpose of each service, reduces the complexity of resource interaction, and lowers the likelihood of avoidable mistakes.
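As a rough illustration of the resulting layout, the sketch below enumerates one account per service and environment pair. The account naming scheme is hypothetical, and only the three services named earlier in this post are listed; the fourth AP service is not named there, so it is left out.

```python
from itertools import product

# Services named earlier in this post; the fourth AP service is not named
# there, so it is omitted from this sketch.
SERVICES = ["data-ingest", "emr", "dss-tableau"]
ENVIRONMENTS = ["development", "production"]


def account_layout(services, environments):
    """Return one hypothetical account name per (service, environment) pair."""
    return [f"ap-{svc}-{env}" for svc, env in product(services, environments)]


accounts = account_layout(SERVICES, ENVIRONMENTS)
for name in accounts:
    print(name)
```

Three services across two environments yield six accounts, each with its own networking layer and IAM policies.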

Permissions

Administrators have a substantial level of control over user permissions in the service-oriented approach.  In most cases, a developer should have wide-ranging privileges in development and restricted permissions in production.

Using the same example as before, assume that we want four team members to manage the platform. Because each account contains its own resources, there is no longer an issue with granting a user EC2 privileges. The EMR admin can be given permissions to EC2 inside of the EMR account and have no access or visibility into the EC2 instances inside any of the other AP services.

Additionally, a developer working on the ingest service can be given extensive permissions in the development account. However, in the production environment, the developer would be limited to a role that only allows the build and deployment process.
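One way such a restricted production role could look is sketched below as two IAM policy documents: a trust policy that lets principals from another account assume the role, and a permissions policy limited to deployment actions. Everything specific here is an assumption made for illustration: the account ID is a placeholder, developer identities are assumed to live in a separate account, and the CodeDeploy actions stand in for whatever build and deployment tooling is actually used.

```python
import json

# Hypothetical ID of the account where developer identities live.
IDENTITY_ACCOUNT_ID = "111122223333"


def deploy_role_trust_policy(trusted_account_id):
    """Trust policy letting principals in another account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{trusted_account_id}:root"
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


def deploy_only_permissions():
    """Permissions limited to starting and checking deployments.

    The CodeDeploy actions are an assumed stand-in for the real
    build and deployment tooling.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "codedeploy:CreateDeployment",
                    "codedeploy:GetDeployment",
                ],
                "Resource": "*",
            }
        ],
    }


trust = deploy_role_trust_policy(IDENTITY_ACCOUNT_ID)
permissions = deploy_only_permissions()
print(json.dumps(trust, indent=2))
print(json.dumps(permissions, indent=2))
```

Because the role exists only in the production account, the same developer can hold far broader permissions in the development account without either policy needing to reference the other.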

The service-oriented approach alleviates the access control issues that arise in a shared environment.

Agility and Risk

With accounts separated, the blast radius of a critical event is reduced. For example, an issue with the development ingestion service will not inhibit current operations in production DSS. In the unlikely event of a breach, the malicious actor would be contained to one particular service of the Analytics Platform and would not have control over the entire platform.

Development and experimentation can happen more freely within each development account. Developers no longer have to consider whether their actions will affect other services of the Analytics Platform, which reduces the need for inter-team coordination during development and allows for greater team autonomy and increased efficiency.

Summary

When considering implementing an Analytics DevOps framework, one must consider not only the Analytics DevOps process but also the technology architecture and account setup required to support the process.  As analytics groups leverage AWS and other cloud services as PaaS rather than IaaS, the traditional DevOps process must be adapted to accommodate this new architecture.
