Amazon S3 - Infinitely Scalable Object Storage

S3 from the basics to advanced usage, organized around hands-on project experience

Posted Nov 27, 2024

By kimmin1kk

28 min read

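What is Amazon S3?

Amazon S3 (Simple Storage Service) is an object storage service that offers industry-leading scalability, data availability, security, and performance.

Files are stored as objects and kept in containers called buckets. Capacity is effectively unlimited, and the service is designed for 99.999999999% (11 nines) durability.

Why use S3?

Problems with traditional file storage

On-premises file server:

┌─────────────────────────────────────────────────┐
│  File Server (NAS/SAN)                          │
│  - Upfront cost: tens of millions of KRW        │
│  - Capacity cap: 10 TB (expansion costs extra)  │
│  - Requires dedicated admin staff               │
│  - Disaster recovery built separately           │
│  - Single point of failure risk                 │
└─────────────────────────────────────────────────┘

Problems:

Fixed capacity: capacity must be sized up front, and it is hard to expand quickly when it runs out
High upfront cost: hardware purchase, installation, network setup
Operational burden: hardware maintenance, firmware updates, monitoring
Limited scalability: bounded by physical space and budget
Complex DR: data backup and disaster recovery must be built separately

How S3 solves them

┌─────────────────────────────────────────────────┐
│  Amazon S3                                      │
│  - Upfront cost: $0                             │
│  - Capacity: unlimited (pay for what you use)   │
│  - Management: handled by AWS                   │
│  - DR: automatic (replicated across 3+ AZs)     │
│  - 99.999999999% durability                     │
└─────────────────────────────────────────────────┘

Benefits:

Unlimited scalability: scales from a single byte to petabytes without provisioning
Pay-as-you-go pricing: pay only for the storage, requests, and data transfer you use
Zero management: no hardware or software to maintain
High durability: an expected annual loss rate of 0.000000001% of objects
Global access: reachable from anywhere over HTTP/HTTPS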
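
To put the 11-nines figure in concrete terms, the sketch below converts the durability target into an expected number of lost objects per year. The function name and the 10-million-object example are illustrative assumptions, and the result is a statistical expectation based on the design target, not a guarantee.

```javascript
// 100% - 99.999999999% expressed as a probability per object per year.
const ANNUAL_LOSS_PROBABILITY = 1e-11;

// Hypothetical helper: expected number of objects lost per year.
function expectedAnnualLoss(objectCount, lossProbability = ANNUAL_LOSS_PROBABILITY) {
  return objectCount * lossProbability;
}

const stored = 10_000_000; // assumed example: 10 million objects
const lossPerYear = expectedAnnualLoss(stored);
const yearsPerLostObject = 1 / lossPerYear;

console.log(`Expected objects lost per year: ${lossPerYear}`);
console.log(`Roughly one object every ${Math.round(yearsPerLostObject)} years`);
```

In other words, with 10 million objects stored, the design target works out to losing about one object every ten thousand years on average.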

Core S3 concepts

1. Bucket

The top-level container that holds objects in S3.

Bucket characteristics:

Globally unique name: must be unique across all of AWS
Region-bound: created in a specific AWS Region
Flat structure: hierarchy inside a bucket is purely logical (expressed through key prefixes)

  
// Create a bucket (AWS SDK v3)
import { S3Client, CreateBucketCommand } from '@aws-sdk/client-s3';

const s3Client = new S3Client({ region: 'ap-northeast-2' });

await s3Client.send(new CreateBucketCommand({
  Bucket: 'my-unique-bucket-name-20250109',
  CreateBucketConfiguration: {
    LocationConstraint: 'ap-northeast-2'  // required for every Region except us-east-1
  }
}));

Bucket naming rules:

Only lowercase letters, digits, hyphens (-), and periods (.)
3-63 characters long
Must not be formatted like an IP address (e.g. 192.168.1.1)
Must not start with 'xn--'
Must not end with '-s3alias'
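
The rules above can be checked before calling the API. The following is a minimal sketch, not the complete AWS rule set: `isValidBucketName` is a hypothetical helper, and besides the rules listed above it also assumes two further rules (names must begin and end with a letter or digit, and must not contain two adjacent periods).

```javascript
// Hypothetical validator for the bucket naming rules listed above.
function isValidBucketName(name) {
  if (typeof name !== 'string') return false;
  // 3-63 characters long.
  if (name.length < 3 || name.length > 63) return false;
  // Lowercase letters, digits, hyphens, and periods only;
  // the first and last character must be a letter or digit.
  if (!/^[a-z0-9][a-z0-9.-]*[a-z0-9]$/.test(name)) return false;
  // No two adjacent periods.
  if (name.includes('..')) return false;
  // Must not be formatted like an IPv4 address.
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(name)) return false;
  // Reserved prefix and suffix.
  if (name.startsWith('xn--')) return false;
  if (name.endsWith('-s3alias')) return false;
  return true;
}

console.log(isValidBucketName('my-unique-bucket-name-20250109'));
console.log(isValidBucketName('192.168.1.1'));
```

A local check like this only catches malformed names early; whether a name is still available globally can only be determined by S3 itself.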

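2. Object

The basic unit stored in S3, consisting of the data itself plus metadata.

Object components:

Key: the object's unique identifier within a bucket (looks like a file path but is really a single string)
Value: the actual data (0 bytes to 5 TB)
Version ID: a unique ID for each version when versioning is enabled
Metadata: information about the object (Content-Type, custom metadata, etc.)
Access Control: object-level permissions

Example object keys:

images/2025/01/profile.jpg
logs/year=2025/month=01/day=09/server.log
data/users/user-123/settings.json

These look like folder paths, but each one is a single flat key.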
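
To illustrate how "folders" fall out of flat keys, here is a small in-memory sketch of delimiter-based grouping, similar in spirit to listing a bucket with a prefix and a `/` delimiter. `listLevel` is a hypothetical helper and `readme.txt` is an extra key added for the example; no S3 call is made.

```javascript
// Hypothetical helper: derive one "folder level" from a list of flat keys.
// Keys with another delimiter after the prefix are rolled up into a common
// prefix; keys without one are reported as objects at this level.
function listLevel(keys, prefix = '', delimiter = '/') {
  const commonPrefixes = new Set();
  const objects = [];
  for (const key of keys) {
    if (!key.startsWith(prefix)) continue;
    const rest = key.slice(prefix.length);
    const idx = rest.indexOf(delimiter);
    if (idx === -1) {
      objects.push(key);
    } else {
      commonPrefixes.add(prefix + rest.slice(0, idx + delimiter.length));
    }
  }
  return { commonPrefixes: [...commonPrefixes].sort(), objects };
}

const keys = [
  'images/2025/01/profile.jpg',
  'logs/year=2025/month=01/day=09/server.log',
  'data/users/user-123/settings.json',
  'readme.txt', // assumed extra key to show a top-level object
];

const top = listLevel(keys);
const images = listLevel(keys, 'images/');
console.log(top);
console.log(images);
```

The "folder" `images/2025/` never exists as a stored entity; it is only a shared prefix computed from the keys.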

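3. Storage classes

S3 offers several storage classes for different use cases and access patterns. Prices below are approximate us-east-1 list prices and vary by Region.

Class	Use case	Availability	Minimum storage duration	Price (USD per GB-month)
S3 Standard	Frequently accessed data	99.99%	None	$0.023
S3 Intelligent-Tiering	Unknown access patterns	99.9%	None	$0.023 + monitoring
S3 Standard-IA	Infrequent access, needed immediately	99.9%	30 days	$0.0125
S3 One Zone-IA	Re-creatable data	99.5%	30 days	$0.01
S3 Glacier Instant Retrieval	Accessed about once a quarter	99.9%	90 days	$0.004
S3 Glacier Flexible Retrieval	Accessed once or twice a year	99.99%	90 days	$0.0036
S3 Glacier Deep Archive	Long-term retention (7-10 years)	99.99%	180 days	$0.00099

Example use cases:

Web application:
- User profile images → S3 Standard
- Logs older than 30 days → S3 Standard-IA
- Backups older than 90 days → S3 Glacier Flexible Retrieval
- Documents under legal retention requirements → S3 Glacier Deep Archive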

Real-world project use cases

Case 1: Image processing pipeline

Architecture:

User upload → S3 Presigned URL (uploads/)
              → EventBridge event emitted
              → Lambda processes the image
              → Saved to S3 (thumbnails/, resized/)

Generating a presigned URL (secure upload):

  
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const s3Client = new S3Client({});

async function generateUploadUrl(fileName, contentType) {
  const key = `uploads/${Date.now()}-${fileName}`;

  const command = new PutObjectCommand({
    Bucket: 'my-image-bucket',
    Key: key,
    ContentType: contentType,
    Metadata: {
      uploadedBy: 'user-123',
      originalFileName: fileName
    }
  });

  // Create an upload URL that is valid for 5 minutes
  const uploadUrl = await getSignedUrl(s3Client, command, {
    expiresIn: 300
  });

  return { uploadUrl, key };
}

// The client uploads directly with the presigned URL
// (the Content-Type header must match the ContentType that was signed)
// fetch(uploadUrl, {
//   method: 'PUT',
//   headers: { 'Content-Type': 'image/jpeg' },
//   body: imageFile
// });

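Benefits:

Security: AWS credentials are never exposed to the client
Direct upload: files go straight to S3 without passing through your server (saves bandwidth)
Time-limited: the expiry bounds how long a leaked URL can be abused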
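
One detail worth guarding when building the object key is the client-supplied file name, since it ends up inside the key verbatim. The sketch below is one possible approach under stated assumptions: `buildUploadKey` is a hypothetical helper, and the allowed character set and 100-character cap are arbitrary choices, not S3 requirements.

```javascript
// Hypothetical helper: build an uploads/ key from an untrusted file name.
function buildUploadKey(fileName, now = Date.now()) {
  // Keep only the last path segment so nested paths or "../" are dropped.
  const base = String(fileName).split(/[\\/]/).pop() || 'file';
  // Replace anything outside a conservative character set and cap the length.
  const safe = base.replace(/[^A-Za-z0-9._-]/g, '_').slice(0, 100);
  return `uploads/${now}-${safe}`;
}

console.log(buildUploadKey('my photo (1).jpg', 1704772800));
console.log(buildUploadKey('../../etc/passwd', 1704772800));
```

Keeping every generated key under `uploads/` also matters for the event filter used later in the pipeline, which only processes keys with that prefix.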

Folder structure:

my-image-bucket/
├── uploads/          # original images
│   └── 1704772800-photo.jpg
├── thumbnails/       # 200x200 thumbnails
│   └── 1704772800-photo.jpg
└── resized/          # resized to 800x600
    └── 1704772800-photo.jpg

S3 EventBridge integration:

  
// Lambda receives the S3 event
export const handler = async (event) => {
  // S3 event delivered through EventBridge
  const bucket = event.detail.bucket.name;
  const key = event.detail.object.key;

  // Only process files under uploads/
  if (!key.startsWith('uploads/')) {
    return;
  }

  // Download and process the image...
};
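
As a sketch of the step between receiving the event and writing results, the helper below maps an `uploads/` key to the output keys used in the folder layout above. `outputKeysFor` is a hypothetical name, and the sample event is a trimmed-down, made-up example containing only the fields the handler reads.

```javascript
// Hypothetical helper: map an uploads/ key to its thumbnail and resized keys.
// Returns null for keys outside uploads/, mirroring the handler's filter.
function outputKeysFor(key) {
  const prefix = 'uploads/';
  if (!key.startsWith(prefix)) return null;
  const name = key.slice(prefix.length);
  return {
    thumbnail: `thumbnails/${name}`,
    resized: `resized/${name}`,
  };
}

// Trimmed-down sample of an "Object Created" event (values are made up).
const sampleEvent = {
  source: 'aws.s3',
  'detail-type': 'Object Created',
  detail: {
    bucket: { name: 'my-image-bucket' },
    object: { key: 'uploads/1704772800-photo.jpg' },
  },
};

const outputs = outputKeysFor(sampleEvent.detail.object.key);
console.log(outputs);
```

Because the outputs land under `thumbnails/` and `resized/`, the objects the function writes do not pass its own `uploads/` filter, which avoids the function re-triggering itself in a loop.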

Case 2: CSV file processing pipeline

Cost optimization with lifecycle policies:

  
# Serverless Framework configuration
resources:
  Resources:
    InputBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: csv-pipeline-input
        LifecycleConfiguration:
          Rules:
            # Delete processed CSVs after 30 days
            - Id: DeleteProcessedFiles
              Status: Enabled
              Prefix: processed/
              ExpirationInDays: 30

    ErrorBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: csv-pipeline-error
        LifecycleConfiguration:
          Rules:
            # Move error files to Glacier after 90 days
            - Id: ArchiveErrorFiles
              Status: Enabled
              Transitions:
                - TransitionInDays: 90
                  StorageClass: GLACIER

Streaming CSV processing (memory-efficient):

  
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { parse } from 'csv-parse';
import { Readable } from 'stream';

const s3Client = new S3Client({});

async function processCsvFromS3(bucket, key) {
  const command = new GetObjectCommand({ Bucket: bucket, Key: key });
  const { Body } = await s3Client.send(command);

  // Convert the S3 Body into a Node.js Readable stream
  const stream = Body instanceof Readable ? Body : Readable.from(Body);

  const records = [];

  // Streaming parse (the file is never loaded into memory in one piece)
  const parser = stream.pipe(parse({
    columns: true,
    skip_empty_lines: true
  }));

  for await (const record of parser) {
    // Handle one record at a time; validateRecord is application-specific
    // and defined elsewhere
    const validated = validateRecord(record);
    if (validated) {
      records.push(validated);
    }
  }

  // Note: valid records are still accumulated here; write them out in
  // batches instead if the result set itself may not fit in memory
  return records;
}
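
The `validateRecord` function used above is not shown in the original. The following is only a guess at what such a function might look like, assuming the columns that appear in the save step (`id`, `date`, `product`, `quantity`, `price`, `totalAmount`) and assuming that `totalAmount` is derived as quantity × price; the real validation rules are application-specific.

```javascript
// Parse a CSV field as a number; empty or non-string values become NaN.
function toNumber(value) {
  return typeof value === 'string' && value.trim() !== '' ? Number(value) : NaN;
}

// Hypothetical validateRecord: returns a normalized record, or null if invalid.
function validateRecord(record) {
  if (!record.id || !record.product) return null;
  if (Number.isNaN(Date.parse(record.date))) return null;

  const quantity = toNumber(record.quantity);
  const price = toNumber(record.price);
  if (!Number.isInteger(quantity) || quantity <= 0) return null;
  if (!Number.isFinite(price) || price < 0) return null;

  return {
    id: record.id,
    date: record.date,
    product: record.product,
    quantity,
    price,
    // Round to cents to avoid floating-point noise in the output file.
    totalAmount: Math.round(quantity * price * 100) / 100,
  };
}

const ok = validateRecord({ id: '1', date: '2025-01-09', product: 'Widget', quantity: '3', price: '9.99' });
const bad = validateRecord({ id: '2', date: '2025-01-09', product: 'Widget', quantity: 'abc', price: '9.99' });
console.log(ok, bad);
```

Returning `null` for invalid rows matches how the loop above skips records that fail validation.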

Saving the processed file:

  
import { PutObjectCommand } from '@aws-sdk/client-s3';
import { stringify } from 'csv-stringify/sync';

async function saveProcessedCsv(bucket, key, records) {
  const csvContent = stringify(records, {
    header: true,
    columns: ['id', 'date', 'product', 'quantity', 'price', 'totalAmount']
  });

  await s3Client.send(new PutObjectCommand({
    Bucket: bucket,
    Key: `processed/${key}`,
    Body: csvContent,
    ContentType: 'text/csv',
    Metadata: {
      processedAt: new Date().toISOString(),
      recordCount: String(records.length)
    }
  }));
}

Case 3: Building a data lake

Hive-style partitioning:

s3://data-lake-bucket/
└── raw-logs/
    └── year=2025/
        └── month=01/
            └── day=09/
                └── logs-2025-01-09.json

Partition pruning in an Athena query:
SELECT * FROM logs
WHERE year=2025 AND month=1 AND day=9;
-- → scans only the year=2025/month=01/day=09/ prefix (lower cost)
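
A rough calculation shows why pruning matters when queries are billed by data scanned. The numbers here are assumptions for illustration only: a price of $5 per TB scanned and 2 GB of logs per day; per-query minimums and compression are ignored.

```javascript
const PRICE_PER_TB_SCANNED_USD = 5; // assumed list price, check current pricing
const GB = 1024 ** 3;
const TB = 1024 ** 4;

// Cost of a query that scans the given number of bytes.
function scanCostUsd(bytesScanned) {
  return (bytesScanned / TB) * PRICE_PER_TB_SCANNED_USD;
}

const dailyLogBytes = 2 * GB; // assumed daily log volume

const fullYearScan = scanCostUsd(365 * dailyLogBytes); // no partition filter
const oneDayScan = scanCostUsd(dailyLogBytes);         // year/month/day filter

console.log(`Full year: $${fullYearScan.toFixed(4)}`);
console.log(`One day:   $${oneDayScan.toFixed(4)}`);
```

The ratio between the two is simply the number of partitions skipped, so the saving grows with the amount of history kept in the table.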

Uploading data:

  
async function uploadLogFile(date, data) {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, '0');

  const key = `raw-logs/year=${year}/month=${month}/day=${day}/logs-${year}-${month}-${day}.json`;

  await s3Client.send(new PutObjectCommand({
    Bucket: 'data-lake-bucket',
    Key: key,
    Body: JSON.stringify(data),
    ContentType: 'application/json'
  }));
}
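
The upload function above derives the partition from the local time zone of the machine it runs on. If logs should be partitioned by UTC day regardless of where the code runs, a variant like the one below may be preferable. Both helper names are hypothetical; `parsePartitions` simply reads the `name=value` segments back out of a key.

```javascript
// Hypothetical variant of the key builder that always uses UTC.
function buildLogKey(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const day = String(date.getUTCDate()).padStart(2, '0');
  return `raw-logs/year=${year}/month=${month}/day=${day}/logs-${year}-${month}-${day}.json`;
}

// Hypothetical helper: extract Hive-style partition values from a key.
function parsePartitions(key) {
  const partitions = {};
  for (const segment of key.split('/')) {
    const eq = segment.indexOf('=');
    if (eq > 0) {
      partitions[segment.slice(0, eq)] = segment.slice(eq + 1);
    }
  }
  return partitions;
}

const key = buildLogKey(new Date(Date.UTC(2025, 0, 9, 23, 30)));
const partitions = parsePartitions(key);
console.log(key);
console.log(partitions);
```

Zero-padded values such as `month=01` keep keys sorted lexicographically, while an `INT` partition column still lets queries filter with `month = 1`.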

Enabling versioning:

  
import { PutBucketVersioningCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutBucketVersioningCommand({
  Bucket: 'data-lake-bucket',
  VersioningConfiguration: {
    Status: 'Enabled'
  }
}));

// Uploading to the same key creates a new version
// A delete only adds a delete marker (the data is not actually removed)

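Advanced S3 features

1. S3 event notifications

S3 → EventBridge → Lambda pattern (recommended); EventBridge delivery must be enabled in the bucket's notification settings: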

  
# Serverless Framework
provider:
  eventBridge:
    useCloudFormation: true

functions:
  processImage:
    handler: handler.process
    events:
      - eventBridge:
          pattern:
            source:
              - aws.s3
            detail-type:
              - Object Created
            detail:
              bucket:
                name:
                  - my-image-bucket
              object:
                key:
                  - prefix: uploads/

Benefits:

The same event can be delivered to multiple targets
Powerful event filtering
Configurable retry policies
Supports many targets, including SQS, SNS, and Lambda

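2. S3 Transfer Acceleration

Speeds up transfers of large files over long distances by routing them through CloudFront edge locations. Acceleration must be enabled on the bucket first.

  
// Use the Transfer Acceleration endpoint
// (S3Client is imported from '@aws-sdk/client-s3' as in the earlier examples)
const s3Client = new S3Client({
  region: 'ap-northeast-2',
  useAccelerateEndpoint: true  // requests go to <bucket>.s3-accelerate.amazonaws.com
});

// Can be 50-500% faster than a regular upload (for cross-region transfers)

Typical scenarios:

Users around the world uploading to a bucket in the Seoul Region
Transfers of GB-scale files
High-latency network environments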

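3. S3 Multipart Upload

Multipart upload is recommended for objects of roughly 100 MB and up, and required above 5 GB, the limit for a single PUT. Every part except the last must be at least 5 MB.

  
import {
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
  AbortMultipartUploadCommand
} from '@aws-sdk/client-s3';

async function multipartUpload(bucket, key, fileBuffer) {
  // 1. Start the multipart upload
  const { UploadId } = await s3Client.send(new CreateMultipartUploadCommand({
    Bucket: bucket,
    Key: key
  }));

  const partSize = 5 * 1024 * 1024; // 5 MB (minimum part size)
  const parts = [];

  try {
    // 2. Upload each part (sequentially here; parts can also be sent in parallel)
    for (let i = 0; i < fileBuffer.length; i += partSize) {
      const partNumber = Math.floor(i / partSize) + 1;
      const chunk = fileBuffer.slice(i, i + partSize);

      const { ETag } = await s3Client.send(new UploadPartCommand({
        Bucket: bucket,
        Key: key,
        UploadId,
        PartNumber: partNumber,
        Body: chunk
      }));

      parts.push({ ETag, PartNumber: partNumber });
    }

    // 3. Complete the multipart upload
    await s3Client.send(new CompleteMultipartUploadCommand({
      Bucket: bucket,
      Key: key,
      UploadId,
      MultipartUpload: { Parts: parts }
    }));
  } catch (err) {
    // Abort so that already-uploaded parts do not keep accruing storage charges
    await s3Client.send(new AbortMultipartUploadCommand({
      Bucket: bucket,
      Key: key,
      UploadId
    }));
    throw err;
  }
}

Benefits:

Faster transfers when parts are uploaded in parallel
Only the failed parts need to be re-uploaded after a network error
Uploads can be paused and resumed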
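
Before uploading, it can help to plan the part boundaries up front. The sketch below is one way to do that without touching S3: `planParts` is a hypothetical helper, the 8 MiB default part size is an arbitrary choice, and the limits used are the 5 MiB minimum part size and the 10,000-part maximum per upload (the 5 GiB maximum part size is not checked).

```javascript
const MIB = 1024 * 1024;
const MIN_PART_SIZE = 5 * MIB; // minimum for every part except the last
const MAX_PARTS = 10_000;      // maximum number of parts per upload

// Hypothetical helper: compute byte ranges for each part of an upload.
function planParts(totalSize, preferredPartSize = 8 * MIB) {
  if (!Number.isInteger(totalSize) || totalSize <= 0) {
    throw new RangeError('totalSize must be a positive integer');
  }
  // Grow the part size if the preferred size would need too many parts.
  const partSize = Math.max(
    MIN_PART_SIZE,
    preferredPartSize,
    Math.ceil(totalSize / MAX_PARTS)
  );
  const parts = [];
  for (let start = 0, partNumber = 1; start < totalSize; start += partSize, partNumber++) {
    const end = Math.min(start + partSize, totalSize); // exclusive
    parts.push({ partNumber, start, end });
  }
  return { partSize, parts };
}

const small = planParts(12 * MIB, 5 * MIB);
const large = planParts(100 * 1024 * MIB, 5 * MIB); // 100 GiB
console.log(small.parts);
console.log(large.partSize, large.parts.length);
```

Each `{ start, end }` range can be passed to `fileBuffer.slice(start, end)`, and independent ranges are what make parallel or resumed uploads possible.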

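4. S3 Select (server-side filtering)

Filters an object's contents on the server so that only the needed data is transferred. Note that AWS no longer offers S3 Select to new customers; existing users can continue to use it.

  
import { SelectObjectContentCommand } from '@aws-sdk/client-s3';

const command = new SelectObjectContentCommand({
  Bucket: 'my-bucket',
  Key: 'data.csv',
  ExpressionType: 'SQL',
  // CSV fields are read as strings, so cast before comparing numerically
  Expression: `
    SELECT s.product, s.quantity, s.price
    FROM S3Object s
    WHERE CAST(s.quantity AS INT) > 100
  `,
  InputSerialization: {
    CSV: { FileHeaderInfo: 'USE', RecordDelimiter: '\n', FieldDelimiter: ',' }
  },
  OutputSerialization: {
    JSON: { RecordDelimiter: '\n' }
  }
});

const response = await s3Client.send(command);
const decoder = new TextDecoder('utf-8');

// Process the streamed results
for await (const event of response.Payload) {
  if (event.Records) {
    // Payload is a Uint8Array, so decode it rather than calling toString()
    const records = decoder.decode(event.Records.Payload, { stream: true });
    console.log(records);
  }
}

Cost savings:

When only 10 MB is needed out of a 1 GB file
S3 Select: cost of scanning 1 GB + returning 10 MB
Regular download: cost of transferring the full 1 GB
Roughly 80% lower cost in this scenario (an estimate; actual savings depend on scan, return, and transfer pricing)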

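S3 security and access control

1. Bucket policy

JSON-based permissions defined at the bucket level. The example below allows reads of objects under public/ only from a specific IP range.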

  
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-public-bucket/public/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}

Real-world example: allow access only from CloudFront:

  
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EDFDVBD6EXAMPLE"
        }
      }
    }
  ]
}

2. IAM policy

Permissions attached to users and roles.

  
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket/users/${aws:username}/*"
      ]
    }
  ]
}

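Each user can access only their own users/<username>/ prefix.

3. S3 Block Public Access

A safeguard against accidentally making a bucket public.

  
import { PutPublicAccessBlockCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutPublicAccessBlockCommand({
  Bucket: 'my-bucket',
  PublicAccessBlockConfiguration: {
    BlockPublicAcls: true,        // reject new public ACLs
    IgnorePublicAcls: true,        // ignore existing public ACLs
    BlockPublicPolicy: true,       // reject new public bucket policies
    RestrictPublicBuckets: true    // restrict access even if a public policy exists
  }
}));

Recommendation: enable it on every bucket (serve public content through CloudFront instead)

4. Object encryption

Encryption options (server-side, plus client-side for comparison):

Method	Description	Key management	Cost
SSE-S3	Encrypted with S3-managed keys	Automatic (AWS)	Free
SSE-KMS	Encrypted with AWS KMS keys	User-controllable	KMS API request charges
SSE-C	Encrypted with customer-provided keys	Fully user-controlled	Free (you own key management)
Client-side	Encrypted before upload	Fully user-controlled	Free (you own key management)

Default encryption settings: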

  
import { PutBucketEncryptionCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutBucketEncryptionCommand({
  Bucket: 'my-bucket',
  ServerSideEncryptionConfiguration: {
    Rules: [{
      ApplyServerSideEncryptionByDefault: {
        SSEAlgorithm: 'AES256'  // SSE-S3
      },
      BucketKeyEnabled: true  // only takes effect with SSE-KMS ('aws:kms'), where it reduces KMS calls and cost
    }]
  }
}));

S3 performance optimization

1. Key naming strategy (historical guidance, no longer needed)

Before July 2018:

# Bad (creates a hot partition)
logs/2025-01-09-001.log
logs/2025-01-09-002.log
logs/2025-01-09-003.log

# Good (random prefixes spread the load)
a3b2/logs/2025-01-09-001.log
f8c1/logs/2025-01-09-002.log
d5e9/logs/2025-01-09-003.log

Since July 2018: S3 partitions automatically, so sequential keys are fine and random prefixes are unnecessary.

2. Request rate limits

S3 request rate limits (as of 2020):

3,500 PUT/COPY/POST/DELETE requests per second (per prefix)
5,500 GET/HEAD requests per second (per prefix)

Spreading load across prefixes raises aggregate throughput:

# Single prefix
uploads/  → 3,500 PUT/s

# 10 prefixes
uploads/0/  → 3,500 PUT/s
uploads/1/  → 3,500 PUT/s
...
uploads/9/  → 3,500 PUT/s
Up to 35,000 PUT/s in total
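
One way to spread writes across the ten prefixes above is to derive the shard from a hash of the file name, so the same name always maps to the same prefix. This is a minimal sketch under assumptions: `shardedKey` is a hypothetical helper, and the inline FNV-1a hash (applied to UTF-16 code units) is just a dependency-free stand-in for any stable hash.

```javascript
// FNV-1a style 32-bit hash over UTF-16 code units, implemented inline so the
// sketch has no dependencies. Any stable hash function would do.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical helper: place a file under uploads/<shard>/ deterministically.
function shardedKey(fileName, shardCount = 10) {
  const shard = fnv1a(fileName) % shardCount;
  return `uploads/${shard}/${fileName}`;
}

console.log(shardedKey('file-0'));
console.log(shardedKey('file-1'));
```

The trade-off is that listing all uploads for a given day now requires querying every shard prefix, so sharding like this is only worth it when a single prefix is actually hitting the request limits.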

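3. CloudFront integration

Serve static content through CloudFront. The policy below grants read access to a legacy Origin Access Identity (OAI); for new distributions AWS recommends Origin Access Control (OAC), as in the CloudFront-only bucket policy earlier in this post.

  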
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity EXAMPLE"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}

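Benefits:

Cached at edge locations around the world
Fewer requests reach S3 (lower cost)
HTTPS out of the box
Custom domain support

S3 cost optimization

1. Automatic storage class transitions

Using Intelligent-Tiering: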

  
await s3Client.send(new PutObjectCommand({
  Bucket: 'my-bucket',
  Key: 'document.pdf',
  Body: fileContent,
  StorageClass: 'INTELLIGENT_TIERING'
}));

// Moved to Infrequent Access automatically after 30 days without access
// Moved to Archive Instant Access after 90 days without access
// Extra cost: monitoring fee of $0.0025 per 1,000 objects per month
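
The monitoring fee scales with object count rather than data size, so it is worth estimating for buckets with many small objects. The sketch below assumes the published rate of $0.0025 per 1,000 monitored objects per month; the constant and function names are illustrative, and current pricing should be checked before relying on the numbers.

```javascript
// Assumed list price: USD per 1,000 monitored objects per month.
const MONITORING_USD_PER_1000_OBJECTS = 0.0025;

// Hypothetical helper: monthly Intelligent-Tiering monitoring fee.
function monthlyMonitoringFee(objectCount) {
  return (objectCount / 1000) * MONITORING_USD_PER_1000_OBJECTS;
}

const feeForOneMillion = monthlyMonitoringFee(1_000_000);
console.log(`Monitoring fee for 1M objects: $${feeForOneMillion.toFixed(2)} per month`);
```

For one million objects the fee comes to a few dollars per month, which is usually small next to the storage savings for large objects but can dominate when objects are tiny.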

Lifecycle policy:

  
resources:
  Resources:
    MyBucket:
      Type: AWS::S3::Bucket
      Properties:
        LifecycleConfiguration:
          Rules:
            - Id: TransitionOldLogs
              Status: Enabled
              Prefix: logs/
              Transitions:
                - TransitionInDays: 30
                  StorageClass: STANDARD_IA
                - TransitionInDays: 90
                  StorageClass: GLACIER_IR
                - TransitionInDays: 365
                  StorageClass: DEEP_ARCHIVE
              ExpirationInDays: 2555  # delete after 7 years

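Cost savings example (1 TB ≈ 1,000 GB, cost of one year in each class):

S3 Standard: $276
↓
Standard-IA (after the 30-day transition): $150 (about 46% lower)
↓
Glacier Instant Retrieval (after the 90-day transition): $48 (about 83% lower)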
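
The same arithmetic can be written out so that other sizes and schedules are easy to try. This is a simplified estimate under stated assumptions: prices are the per-GB-month figures from the storage class table, months are treated as equal length, and transition requests, retrieval fees, and minimum storage durations are ignored. The function names and the 1/2/9-month plan are illustrative.

```javascript
// Per-GB-month prices taken from the storage class table above.
const PRICE_PER_GB_MONTH = {
  STANDARD: 0.023,
  STANDARD_IA: 0.0125,
  GLACIER_IR: 0.004,
};

// Cost of keeping data in a single class for a full year.
function flatYearCost(gb, storageClass) {
  return gb * PRICE_PER_GB_MONTH[storageClass] * 12;
}

// Cost of a plan that spends a number of months in each class in turn.
function tieredYearCost(gb, plan) {
  return plan.reduce(
    (sum, { storageClass, months }) => sum + gb * PRICE_PER_GB_MONTH[storageClass] * months,
    0
  );
}

const gb = 1000; // roughly 1 TB
const standardOnly = flatYearCost(gb, 'STANDARD');
const tiered = tieredYearCost(gb, [
  { storageClass: 'STANDARD', months: 1 },    // first 30 days
  { storageClass: 'STANDARD_IA', months: 2 }, // days 30-90
  { storageClass: 'GLACIER_IR', months: 9 },  // rest of the year
]);

console.log(`Standard only: $${standardOnly.toFixed(2)}`);
console.log(`With lifecycle transitions: $${tiered.toFixed(2)}`);
```

Under these assumptions a single year that actually moves through all three classes costs well under a third of keeping everything in S3 Standard.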

2. Cleaning up incomplete multipart uploads

Parts left behind by failed uploads keep incurring storage charges.

  
LifecycleConfiguration:
  Rules:
    - Id: CleanupIncompleteUploads
      Status: Enabled
      AbortIncompleteMultipartUpload:
        DaysAfterInitiation: 7  # abort incomplete uploads after 7 days

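3. Request cost optimization

Bulk operations with S3 Batch Operations:

Individual API calls:
1 million objects × PUT requests = $5.00

S3 Batch Operations:
1 million objects in one job = $0.25 per job + $1.00 per million objects, plus the underlying request charges

The gain is mainly operational (one managed job with retries and a completion report) rather than a lower per-request price.

Integrating S3 with other AWS services

1. S3 + Lambda (serverless processing)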

  
functions:
  processFile:
    handler: handler.process
    events:
      - s3:
          bucket: my-bucket
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
            - suffix: .jpg

2. S3 + Athena (SQL queries)

  
-- Query S3 data with SQL
CREATE EXTERNAL TABLE logs (
  `timestamp` STRING,
  level STRING,
  message STRING
)
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://my-bucket/logs/';

-- Load the partitions
MSCK REPAIR TABLE logs;

-- Run a query (billed by the amount of data scanned)
SELECT level, COUNT(*) as count
FROM logs
WHERE year=2025 AND month=1
GROUP BY level;

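3. S3 + Glue (ETL)

  
import { GlueClient, StartCrawlerCommand } from '@aws-sdk/client-glue';

// Start a Glue crawler that discovers the schema automatically
// (the crawler 'my-s3-crawler' must already exist)
const glueClient = new GlueClient({});

await glueClient.send(new StartCrawlerCommand({
  Name: 'my-s3-crawler'
}));

// The crawler creates tables in the Glue Data Catalog
// They can then be used from Athena, EMR, and Redshift Spectrum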

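4. S3 + SageMaker (ML training data)

  
import sagemaker

# Load training data from S3
s3_input_train = sagemaker.inputs.TrainingInput(
    s3_data='s3://my-bucket/training-data/',
    content_type='text/csv'
)

# Store the training output (model artifacts) in S3
# image_uri, role, and the instance settings are placeholders to replace
estimator = sagemaker.estimator.Estimator(
    image_uri='<training-image-uri>',
    role='<sagemaker-execution-role-arn>',
    instance_count=1,
    instance_type='ml.m5.large',
    output_path='s3://my-bucket/models/'
)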

Practical tips and best practices

1. CORS configuration (web applications)

  
import { PutBucketCorsCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutBucketCorsCommand({
  Bucket: 'my-bucket',
  CORSConfiguration: {
    CORSRules: [
      {
        AllowedHeaders: ['*'],
        AllowedMethods: ['GET', 'PUT', 'POST'],
        AllowedOrigins: ['https://myapp.com'],
        ExposeHeaders: ['ETag'],
        MaxAgeSeconds: 3000
      }
    ]
  }
}));

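2. Classifying objects and tracking costs with tags

  
import { PutObjectTaggingCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutObjectTaggingCommand({
  Bucket: 'my-bucket',
  Key: 'document.pdf',
  Tagging: {
    TagSet: [
      { Key: 'Project', Value: 'Marketing' },
      { Key: 'Department', Value: 'Sales' },
      { Key: 'Sensitivity', Value: 'Confidential' }
    ]
  }
}));

// Object tags can drive lifecycle rules, access policies, and analytics filters
// For per-department cost tracking, cost allocation tags are applied at the bucket level

3. Managing large numbers of objects with S3 Inventory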

  
import { PutBucketInventoryConfigurationCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutBucketInventoryConfigurationCommand({
  Bucket: 'source-bucket',
  Id: 'DailyInventory',
  InventoryConfiguration: {
    Destination: {
      S3BucketDestination: {
        Bucket: 'arn:aws:s3:::inventory-bucket',
        Format: 'CSV'
      }
    },
    IsEnabled: true,
    Schedule: {
      Frequency: 'Daily'
    },
    IncludedObjectVersions: 'Current',
    OptionalFields: [
      'Size', 'LastModifiedDate', 'StorageClass', 'ETag'
    ]
  }
}));

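// Produces a CSV listing of the objects every day
// The listing can be analyzed with Athena

4. Bulk delete operations

  
import { ListObjectsV2Command, DeleteObjectsCommand } from '@aws-sdk/client-s3';

// Delete in batches of up to 1,000 keys (the DeleteObjects limit)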
async function deleteAllObjects(bucket, prefix) {
  let continuationToken;

  do {
    const listResponse = await s3Client.send(new ListObjectsV2Command({
      Bucket: bucket,
      Prefix: prefix,
      MaxKeys: 1000,
      ContinuationToken: continuationToken
    }));

    if (listResponse.Contents && listResponse.Contents.length > 0) {
      await s3Client.send(new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: {
          Objects: listResponse.Contents.map(obj => ({ Key: obj.Key }))
        }
      }));

      console.log(`Deleted ${listResponse.Contents.length} objects`);
    }

    continuationToken = listResponse.NextContinuationToken;
  } while (continuationToken);
}
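
When the keys to delete come from somewhere other than a listing (for example an inventory report), they need to be split into batches of at most 1,000 before calling DeleteObjects. The sketch below only builds the request payloads and makes no S3 call; `chunk` and `toDeleteRequests` are hypothetical helpers.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size = 1000) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: build one DeleteObjects input per batch of keys.
function toDeleteRequests(bucket, keys) {
  return chunk(keys, 1000).map((batch) => ({
    Bucket: bucket,
    Delete: {
      Objects: batch.map((Key) => ({ Key })),
      Quiet: true, // only report failures in the response
    },
  }));
}

const sampleKeys = Array.from({ length: 2500 }, (_, i) => `logs/${i}.log`);
const requests = toDeleteRequests('my-bucket', sampleKeys);
console.log(requests.map((r) => r.Delete.Objects.length));
```

Each element of `requests` can be passed to `new DeleteObjectsCommand(...)`; on versioned buckets this adds delete markers rather than removing old versions.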

5. Audit trails with S3 access logs

  
import { PutBucketLoggingCommand } from '@aws-sdk/client-s3';

await s3Client.send(new PutBucketLoggingCommand({
  Bucket: 'my-bucket',
  BucketLoggingStatus: {
    LoggingEnabled: {
      TargetBucket: 'my-log-bucket',
      TargetPrefix: 'access-logs/'
    }
  }
}));

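// Requests are logged on a best-effort basis (delivery can be delayed or incomplete)
// Analyze the logs with Athena to detect abnormal access

S3 vs other storage services

S3 vs EBS (Elastic Block Store)

Attribute	S3	EBS
Type	Object storage	Block storage
Capacity	Unlimited	Up to 64 TB per volume (io2 Block Express; 16 TB for gp3)
Access	HTTP API	Mounted as a file system
Price	$0.023/GB-month	$0.08/GB-month (gp3)
Use cases	Files, backups, data lakes	Databases, OS disks
Sharing	Concurrent access from many instances	Single instance (Multi-Attach only on io1/io2)

S3 vs EFS (Elastic File System)

Attribute	S3	EFS
Type	Object storage	File storage (NFS)
Protocol	HTTP/HTTPS	NFS v4.1
Price	$0.023/GB-month	$0.30/GB-month
Performance	High throughput	Low latency
Use cases	Large files, static content	Shared file systems

S3 vs Glacier

Glacier is one of S3's storage classes.

Storage class	Retrieval time	Price per GB-month
S3 Standard	Immediate	$0.023
Glacier Instant Retrieval	Milliseconds	$0.004
Glacier Flexible Retrieval	1-5 minutes (expedited), 3-5 hours (standard)	$0.0036
Glacier Deep Archive	Within 12 hours (standard)	$0.00099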
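Wrapping up

Amazon S3 is more than simple file storage; it is a core building block of cloud architecture.

When to choose S3:

Static file hosting (images, video, documents)
Data backup and archiving
Building a data lake
Distributing large files
Storing logs and analytics data
Storing application assets

Key benefits:

Unlimited scalability: from a single byte to exabytes
11 nines of durability: data loss is extremely unlikely
Low cost: pay only for what you use
Rich integrations: Lambda, Athena, Glue, CloudFront, and more
Strong security: encryption, access control, audit logging

In real projects, S3 can be combined with Lambda (processing), DynamoDB (metadata), and EventBridge (event routing) to build serverless data pipelines.

Lifecycle policies, storage class optimization, and CloudFront integration can cut costs substantially while maintaining high performance and availability.

Hope this was helpful! 😀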

AWS, Storage, S3

This post is licensed under CC BY 4.0 by the author.
