Prerequisites
Before setting up the Amazon S3 integration, complete the AWS setup requirements below.
AWS Setup Requirements
1. Create S3 Bucket
1. Access AWS Console: Log in to your AWS Management Console and navigate to S3.
2. Create Bucket: Create a new S3 bucket with a globally unique name.
3. Configure Permissions: Set up appropriate bucket policies for data access.
4. Enable Versioning: Optionally, enable versioning for data backup and recovery.
2. Create IAM Role
1. Access IAM Console: Navigate to the IAM service in your AWS console.
2. Create Role: Create a new IAM role for cross-account access.
3. Configure Trust Policy: Set up a trust relationship with Masivo’s AWS account.
4. Attach Policies: Attach S3 read/write policies to the role.
3. Required IAM Permissions
S3 Permissions
Bucket Access
- s3:GetBucketLocation
- s3:ListBucket
- s3:GetBucketVersioning
Object Operations
- s3:PutObject
- s3:PutObjectAcl
- s3:GetObject
Folder Operations
- s3:PutObject for all folder paths
- s3:GetObject for data retrieval
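The permissions above can be collected into a single IAM policy document. The following is a minimal sketch, not Masivo's official policy: the function name `build_role_policy` and the bucket name `my-masivo-exports` are hypothetical. It relies on the standard S3 rule that bucket-level actions apply to the bucket ARN while object-level actions apply to the keys under it.

```python
import json

# Actions listed in this section, grouped by the resource type they apply to.
BUCKET_ACTIONS = ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetBucketVersioning"]
OBJECT_ACTIONS = ["s3:PutObject", "s3:PutObjectAcl", "s3:GetObject"]


def build_role_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting the permissions listed above."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BucketAccess",
                "Effect": "Allow",
                "Action": BUCKET_ACTIONS,
                "Resource": bucket_arn,  # bucket-level actions target the bucket
            },
            {
                "Sid": "ObjectOperations",
                "Effect": "Allow",
                "Action": OBJECT_ACTIONS,
                "Resource": f"{bucket_arn}/*",  # object-level actions target keys
            },
        ],
    }


if __name__ == "__main__":
    # "my-masivo-exports" is a placeholder bucket name.
    print(json.dumps(build_role_policy("my-masivo-exports"), indent=2))
```

The printed JSON can be pasted into the IAM console when attaching an inline policy to the role.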
Configuration Steps
Step 1: Access Integration Settings
1. Navigate to CDP: Go to your Masivo dashboard and select the CDP section.
2. Open Integrations: Click “Integrations” in the CDP menu.
3. Add New Integration: Click “Add Integration” and select “Amazon S3” from the catalog.
Step 2: Configure Basic Settings
Required Fields
AWS Region
Select the AWS region where your S3 bucket is located (e.g., us-east-1, eu-west-1)
S3 Bucket Name
Enter the name of your S3 bucket (must be globally unique)
IAM Role ARN
Enter the ARN of the IAM role that Masivo will use to access your bucket
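Before entering these three values, it can help to sanity-check their format locally. The sketch below is an illustration only: the function `validate_settings` is hypothetical, and the regular expressions approximate AWS naming rules rather than reproduce them exactly. It cannot confirm that the bucket or role actually exists.

```python
import re

# Approximate patterns; AWS remains the authority on what is valid.
REGION_RE = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")               # e.g. us-east-1
BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")      # 3-63 chars, lowercase
ROLE_ARN_RE = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")


def validate_settings(region: str, bucket: str, role_arn: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    if not REGION_RE.fullmatch(region):
        errors.append(f"Region does not look like an AWS region code: {region!r}")
    if not BUCKET_RE.fullmatch(bucket) or ".." in bucket:
        errors.append(f"Bucket name does not follow S3 naming rules: {bucket!r}")
    if not ROLE_ARN_RE.fullmatch(role_arn):
        errors.append(f"Value does not look like an IAM role ARN: {role_arn!r}")
    return errors


if __name__ == "__main__":
    # All three values below are placeholders.
    problems = validate_settings(
        "us-east-1",
        "my-masivo-exports",
        "arn:aws:iam::111122223333:role/masivo-s3-export",
    )
    print(problems or "Settings look well-formed")
```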
Step 3: Configure Export Settings
File Format Options
JSON
Human-readable format, larger files, easy to process with any tool
Parquet
Columnar format, smaller files, optimized for analytics and big data processing
AWS Configuration Details
IAM Role Trust Policy
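As a rough illustration of what a cross-account trust policy looks like, the sketch below builds one in Python. Everything specific here is an assumption: the account ID `111122223333` is a placeholder (use the account ID Masivo provides), the helper `build_trust_policy` is hypothetical, and the optional external ID condition is a common AWS cross-account pattern that Masivo may or may not require.

```python
import json
from typing import Optional


def build_trust_policy(masivo_account_id: str, external_id: Optional[str] = None) -> dict:
    """Return a trust policy allowing the given AWS account to assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{masivo_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # Optional: restrict role assumption to callers presenting this external ID.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # "111122223333" is a placeholder, not Masivo's real account ID.
    print(json.dumps(build_trust_policy("111122223333"), indent=2))
```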
S3 Bucket Policy
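A bucket policy that grants the IAM role access might look like the output of the sketch below. This is an assumption-laden example, not Masivo's documented policy: the helper `build_bucket_policy`, the bucket name, and the role ARN are all placeholders, and the action lists simply mirror the permissions listed under Required IAM Permissions.

```python
import json


def build_bucket_policy(bucket_name: str, role_arn: str) -> dict:
    """Return a bucket policy that lets the given role list, read, and write."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    principal = {"AWS": role_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowRoleBucketAccess",
                "Effect": "Allow",
                "Principal": principal,
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetBucketVersioning",
                ],
                "Resource": bucket_arn,
            },
            {
                "Sid": "AllowRoleObjectAccess",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:PutObject", "s3:PutObjectAcl", "s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Both arguments are placeholders.
    policy = build_bucket_policy(
        "my-masivo-exports",
        "arn:aws:iam::111122223333:role/masivo-s3-export",
    )
    print(json.dumps(policy, indent=2))
```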
Data Export Process
1. Data Collection: Masivo collects customer data, events, and transactions.
2. Batch Processing: Data is processed in batches for efficient export.
3. Compression: Data is compressed using gzip for storage efficiency.
4. S3 Upload: Compressed data is uploaded to your S3 bucket.
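Because exports are gzip-compressed, a consumer has to decompress each file before parsing it. The sketch below shows one way to do that for JSON exports using only the Python standard library. The internal layout of the files is not specified here, so the hypothetical helper `read_json_export` assumes the payload is either a single JSON document or newline-delimited JSON and handles both.

```python
import gzip
import json


def read_json_export(raw: bytes) -> list:
    """Decompress a gzip payload and return its records as a list.

    Assumption: the payload is either one JSON document (object or array)
    or newline-delimited JSON with one record per line.
    """
    text = gzip.decompress(raw).decode("utf-8")
    try:
        parsed = json.loads(text)
        return parsed if isinstance(parsed, list) else [parsed]
    except json.JSONDecodeError:
        # Fall back to newline-delimited JSON, skipping blank lines.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Simulated export content; real files would be downloaded from the bucket.
    sample = gzip.compress(b'{"id": 1}\n{"id": 2}\n')
    print(read_json_export(sample))
```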
File Naming Convention
File Structure
File Names
{DATA_TYPE}_{DATE}_batch_{TIMESTAMP}.{FORMAT}.gz
Examples
- CUSTOMER_2024-01-15_batch_1705123456789.json.gz
- EVENT_2024-01-15_batch_1705123456789.parquet.gz
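Downstream jobs often need to split these file names back into their parts. A minimal sketch follows, based only on the pattern and examples above: the names `ExportFile` and `parse_export_name` are hypothetical, and the batch timestamp is kept as a plain integer because its unit is not stated here.

```python
import re
from dataclasses import dataclass
from datetime import date

# Mirrors {DATA_TYPE}_{DATE}_batch_{TIMESTAMP}.{FORMAT}.gz
FILE_NAME_RE = re.compile(
    r"(?P<data_type>[A-Z][A-Z0-9_]*?)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_batch_"
    r"(?P<timestamp>\d+)\."
    r"(?P<format>json|parquet)\.gz"
)


@dataclass(frozen=True)
class ExportFile:
    data_type: str
    export_date: date
    batch_timestamp: int  # unit not specified in the naming convention
    file_format: str


def parse_export_name(name: str) -> ExportFile:
    """Parse an export file name, raising ValueError if it does not match."""
    match = FILE_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"Unrecognized export file name: {name!r}")
    return ExportFile(
        data_type=match["data_type"],
        export_date=date.fromisoformat(match["date"]),
        batch_timestamp=int(match["timestamp"]),
        file_format=match["format"],
    )


if __name__ == "__main__":
    print(parse_export_name("CUSTOMER_2024-01-15_batch_1705123456789.json.gz"))
```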
Monitoring and Troubleshooting
Health Monitoring
1. S3 Access: Monitor S3 bucket access and upload success rates.
2. Data Quality: Check for data validation errors and missing fields.
3. Export Performance: Monitor export speed and batch processing status.
4. Error Handling: Review error logs and failed export attempts.
Common Issues
IAM Role Issues
1. Check Role ARN: Verify the IAM role ARN is correct.
2. Trust Policy: Ensure the trust policy allows Masivo access.
3. Permissions: Verify S3 permissions are properly configured.
S3 Access Issues
1. Bucket Name: Verify the bucket name is correct and accessible.
2. Region: Ensure the region matches your bucket location.
3. Bucket Policy: Check that the bucket policy allows role access.