
Prerequisites

Before setting up the Amazon S3 integration, complete the AWS setup below: an S3 bucket, an IAM role that Masivo can assume, and the required S3 permissions.

AWS Setup Requirements

1. Create S3 Bucket

  1. Access AWS Console
     Log in to the AWS Management Console and navigate to S3.
  2. Create Bucket
     Create a new S3 bucket. Bucket names must be globally unique.
  3. Configure Permissions
     Set up bucket policies that grant the data access the integration needs.
  4. Enable Versioning
     Optionally, enable versioning for data backup and recovery.

2. Create IAM Role

  1. Access IAM Console
     Navigate to the IAM service in your AWS console.
  2. Create Role
     Create a new IAM role for cross-account access.
  3. Configure Trust Policy
     Set up a trust relationship with Masivo’s AWS account.
  4. Attach Policies
     Attach S3 read/write policies to the role.

3. Required IAM Permissions

S3 Permissions

  • s3:GetBucketLocation
  • s3:ListBucket
  • s3:GetBucketVersioning
  • s3:PutObject (for all folder paths)
  • s3:PutObjectAcl
  • s3:GetObject (for data retrieval)
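
The permissions listed above could be written as a policy document attached to the role. The following is a minimal sketch, not an official Masivo policy: the function name `build_role_policy` is hypothetical, and the split into two statements is an assumption based on how IAM scopes these actions (bucket-level actions apply to the bucket ARN, object-level actions to `bucket/*`).

```python
import json

# Actions from the list above, grouped by the resource type IAM expects.
BUCKET_ACTIONS = ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetBucketVersioning"]
OBJECT_ACTIONS = ["s3:PutObject", "s3:PutObjectAcl", "s3:GetObject"]


def build_role_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting the listed S3 permissions."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Bucket-level actions target the bucket itself.
            {"Effect": "Allow", "Action": BUCKET_ACTIONS, "Resource": bucket_arn},
            # Object-level actions target every key ("all folder paths").
            {"Effect": "Allow", "Action": OBJECT_ACTIONS, "Resource": f"{bucket_arn}/*"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_role_policy("YOUR_BUCKET_NAME"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy once the bucket name is filled in.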

Configuration Steps

Step 1: Access Integration Settings

  1. Navigate to CDP
     Go to your Masivo dashboard and select the CDP section.
  2. Open Integrations
     Click “Integrations” in the CDP menu.
  3. Add New Integration
     Click “Add Integration” and select “Amazon S3” from the catalog.

Step 2: Configure Basic Settings

Required Fields

  • Region: Select the AWS region where your S3 bucket is located (e.g., us-east-1, eu-west-1).
  • Bucket Name: Enter the exact name of your S3 bucket.
  • Role ARN: Enter the ARN of the IAM role that Masivo will use to access your bucket.
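
Before submitting the form, it can help to sanity-check the three values locally. The sketch below is illustrative only: `validate_settings` is a hypothetical helper, and the regular expressions approximate AWS naming rules (bucket names of 3 to 63 lowercase characters, 12-digit account IDs in role ARNs) rather than reproducing them exhaustively.

```python
import re
from typing import List

# Approximate patterns; they catch common typos, not every invalid value.
REGION_RE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d$")
BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
ROLE_ARN_RE = re.compile(r"^arn:aws(-[a-z]+)*:iam::\d{12}:role/[\w+=,.@/-]+$")


def validate_settings(region: str, bucket: str, role_arn: str) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if not REGION_RE.match(region):
        problems.append(f"region looks malformed: {region!r}")
    if not BUCKET_RE.match(bucket):
        problems.append(f"bucket name looks malformed: {bucket!r}")
    if not ROLE_ARN_RE.match(role_arn):
        problems.append(f"role ARN looks malformed: {role_arn!r}")
    return problems


if __name__ == "__main__":
    for problem in validate_settings(
        "us-east-1", "my-export-bucket", "arn:aws:iam::123456789012:role/MASIVO_S3_ROLE"
    ):
        print(problem)
```

Passing these checks only means the values are well formed; it does not confirm that the bucket or role exists.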

Step 3: Configure Export Settings

File Format Options

  • JSON: Human-readable format; larger files; easy to process with any tool.
  • Parquet: Columnar format; smaller files; optimized for analytics and big data processing.

AWS Configuration Details

IAM Role Trust Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::MASIVO_ACCOUNT_ID:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "MASIVO_INTEGRATION_ID"
        }
      }
    }
  ]
}
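
If you generate the trust policy from a script rather than editing the JSON by hand, something like the following could work. This is a sketch under assumptions: `build_trust_policy` is a hypothetical helper, and the actual account ID and external ID must come from Masivo; the values in the usage line are dummies.

```python
import json
import re


def build_trust_policy(masivo_account_id: str, external_id: str) -> dict:
    """Fill the trust policy template with the values supplied by Masivo."""
    if not re.fullmatch(r"\d{12}", masivo_account_id):
        raise ValueError("AWS account IDs are 12 digits")
    if not external_id:
        raise ValueError("external ID must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{masivo_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID condition restricts who can assume the role.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Dummy values for illustration only.
    print(json.dumps(build_trust_policy("123456789012", "example-integration-id"), indent=2))
```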

S3 Bucket Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::YOUR_ACCOUNT_ID:role/MASIVO_S3_ROLE"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
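
A common mistake is applying a policy with a template placeholder still in it. The sketch below checks a policy file for the placeholders used in the two templates above. `unfilled_placeholders` is a hypothetical helper, and `MASIVO_S3_ROLE` is deliberately not flagged, on the assumption that it may be a real role name.

```python
import json
from typing import List

# Placeholder tokens that appear in the policy templates above.
PLACEHOLDERS = (
    "MASIVO_ACCOUNT_ID",
    "MASIVO_INTEGRATION_ID",
    "YOUR_ACCOUNT_ID",
    "YOUR_BUCKET_NAME",
)


def unfilled_placeholders(policy_text: str) -> List[str]:
    """Return the template placeholders still present in a policy document."""
    # json.loads raises json.JSONDecodeError if the text is not valid JSON.
    json.loads(policy_text)
    return [name for name in PLACEHOLDERS if name in policy_text]


if __name__ == "__main__":
    sample = '{"Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"}'
    print(unfilled_placeholders(sample))
```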

Data Export Process

  1. Data Collection
     Masivo collects customer data, events, and transactions.
  2. Batch Processing
     Data is processed in batches for efficient export.
  3. Compression
     Each batch is compressed with gzip for storage efficiency.
  4. S3 Upload
     The compressed data is uploaded to your S3 bucket.
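
The batching and compression steps can be illustrated with a short sketch. It is not Masivo's implementation: the batch size, the newline-delimited JSON serialization, and the function names are all assumptions, and the upload step is omitted because it depends on an AWS SDK.

```python
import gzip
import json
from typing import Iterable, Iterator, List


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most `size` records."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def compress_batch(batch: List[dict]) -> bytes:
    """Serialize a batch as newline-delimited JSON (assumed) and gzip it."""
    payload = "\n".join(json.dumps(record) for record in batch).encode("utf-8")
    return gzip.compress(payload)


def decompress_batch(blob: bytes) -> List[dict]:
    """Reverse compress_batch; useful when reading exported files back."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


if __name__ == "__main__":
    records = [{"id": i} for i in range(5)]
    for chunk in batches(records, size=2):
        print(len(chunk), "records ->", len(compress_batch(chunk)), "bytes")
```

`decompress_batch` also shows how a downstream consumer might read a `.json.gz` export, provided the file really is newline-delimited JSON.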

File Naming Convention

File Structure

{DATA_TYPE}_{DATE}_batch_{TIMESTAMP}.{FORMAT}.gz

Examples:
  • CUSTOMER_2024-01-15_batch_1705123456789.json.gz
  • EVENT_2024-01-15_batch_1705123456789.parquet.gz
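
Downstream jobs often need to pull the data type and date back out of a file name. The parser below is a minimal sketch based on the pattern and examples above. `parse_export_filename` and `ExportFile` are hypothetical names, the uppercase data type and the `json`/`parquet` format values are inferred from the examples, and the timestamp is kept as a plain integer because its unit is not stated.

```python
import re
from datetime import date
from typing import NamedTuple

FILENAME_RE = re.compile(
    r"^(?P<data_type>[A-Z][A-Z_]*?)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_batch_"
    r"(?P<timestamp>\d+)\."
    r"(?P<format>json|parquet)\.gz$"
)


class ExportFile(NamedTuple):
    data_type: str
    export_date: date
    timestamp: int
    file_format: str


def parse_export_filename(name: str) -> ExportFile:
    """Split an export file name into its components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an export file name: {name!r}")
    return ExportFile(
        data_type=match["data_type"],
        export_date=date.fromisoformat(match["date"]),
        timestamp=int(match["timestamp"]),
        file_format=match["format"],
    )


if __name__ == "__main__":
    print(parse_export_filename("CUSTOMER_2024-01-15_batch_1705123456789.json.gz"))
```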

Monitoring and Troubleshooting

Health Monitoring

  1. S3 Access
     Monitor S3 bucket access and upload success rates.
  2. Data Quality
     Check for data validation errors and missing fields.
  3. Export Performance
     Monitor export speed and batch processing status.
  4. Error Handling
     Review error logs and failed export attempts.
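
One simple way to monitor uploads from your side is to look for days with no exported file. The sketch below assumes at least one file per data type per day is expected, which this guide does not state, and it takes a plain list of file names (for example, from a bucket listing). `missing_export_dates` is a hypothetical helper.

```python
import re
from datetime import date, timedelta
from typing import Iterable, List

# Matches the file naming convention: {DATA_TYPE}_{DATE}_batch_{TIMESTAMP}.{FORMAT}.gz
NAME_RE = re.compile(r"^([A-Z][A-Z_]*?)_(\d{4}-\d{2}-\d{2})_batch_\d+\.\w+\.gz$")


def missing_export_dates(
    filenames: Iterable[str], data_type: str, start: date, end: date
) -> List[date]:
    """Return dates in [start, end] with no export file for `data_type`."""
    seen = set()
    for name in filenames:
        match = NAME_RE.match(name)
        if match and match.group(1) == data_type:
            seen.add(date.fromisoformat(match.group(2)))
    total_days = (end - start).days + 1
    all_days = (start + timedelta(days=offset) for offset in range(total_days))
    return [day for day in all_days if day not in seen]


if __name__ == "__main__":
    names = [
        "CUSTOMER_2024-01-15_batch_1705123456789.json.gz",
        "CUSTOMER_2024-01-17_batch_1705123456789.json.gz",
    ]
    print(missing_export_dates(names, "CUSTOMER", date(2024, 1, 15), date(2024, 1, 17)))
```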

Common Issues

  1. Check Role ARN
     Verify that the IAM role ARN is correct.
  2. Trust Policy
     Ensure the trust policy allows Masivo access.
  3. Permissions
     Verify that the S3 permissions are properly configured.

  1. Bucket Name
     Verify that the bucket name is correct and the bucket is accessible.
  2. Region
     Ensure the region matches your bucket's location.
  3. Bucket Policy
     Check that the bucket policy allows access for the role.