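Handling Distributed Transactions with the Saga Pattern: A Payment Service Case Study

Introduction

When business logic in a microservices architecture spans several services, distributed transactions are usually the hardest part to get right.

With a single database, ACID transactions work well. In a microservices environment, however, each service owns its own database, so distributed transaction protocols such as 2PC (Two-Phase Commit) are hard to use.

This post shares our experience handling distributed transactions with the Saga pattern in a real project.

1. The Problem: Distributed Transactions in the Payment Service

Scenario: the payment approval process

Approving a payment in the payment service takes three steps:

1. PG (Payment Gateway) approval
   ↓
2. External service call (create an Order or issue Credit)
   ↓
3. Mark the payment as completed

The problem:

What if Step 1 (PG approval) succeeds but Step 2 (Order creation) fails?
What if Step 2 succeeds but Step 3 fails?
Each step calls a different service or an external API, so a traditional transaction rollback is not possible.

Payment Failed After Money Withdrawn

The money has left the customer's account, but no order was placed. From the customer's point of view, the complaint "you took my money and gave me nothing" is entirely justified. The Saga pattern exists to prevent this situation.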

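Limits of the traditional approach

// ❌ Don't do this: it cannot work in a distributed environment
@Transactional
fun approvePayment(...) {
    pgService.approve()       // external API
    orderService.create()     // another service
    paymentRepository.save()  // local DB
    // Roll everything back if any call fails? Not possible:
    // @Transactional only covers the local DB.
}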

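Why is it impossible?

The PG service is an external API that we do not control.
The Order service is a separate microservice with its own database.
Each step has already been committed by the time a later step fails, so it cannot be rolled back.

2. Introducing the Saga Pattern

What is the Saga pattern?

The Saga pattern splits one long transaction into several smaller local transactions. If one of them fails, compensating transactions are run to undo the steps that have already completed.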
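
To make the definition concrete, here is a minimal sketch of the idea in plain Kotlin. SagaStep and runSaga are hypothetical names used only for illustration; they are not part of the project code, and the sketch ignores the case where a compensation itself fails (covered later in this post).

```kotlin
// Hypothetical names: SagaStep and runSaga are illustrative only.
class SagaStep(
    val name: String,
    val action: () -> Unit,
    val compensation: () -> Unit,
)

/**
 * Runs the steps in order. If a step fails, the compensations of the steps
 * that already succeeded are run in reverse order, and the error is rethrown.
 */
fun runSaga(steps: List<SagaStep>) {
    val completed = ArrayDeque<SagaStep>() // most recently completed step first
    for (step in steps) {
        try {
            step.action()
            completed.addFirst(step)
        } catch (e: Exception) {
            completed.forEach { it.compensation() }
            throw e
        }
    }
}
```

The failed step's own compensation is not called, because that step never completed; only the earlier steps are undone.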

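Two implementation styles

1. Choreography (no central orchestrator)

Each service publishes events, and other services subscribe to them and perform the next step.

Payment Service → [publishes "PG approved" event]
Order Service → [subscribes to the event] → creates the Order

Pros:

No central orchestrator is needed.
Services are loosely coupled.

Cons:

The overall flow is hard to see.
Debugging is complicated.
Compensating transactions are hard to manage.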
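
As a rough illustration of choreography, the sketch below wires the two services together through an in-memory event bus. EventBus, PgApproved, OrderCreated, and OrderSubscriber are hypothetical names; a real system would publish through a message broker rather than an in-process list of handlers.

```kotlin
// Hypothetical in-memory event bus; a real system would use a message broker.
sealed interface PaymentEvent
data class PgApproved(val paymentId: String) : PaymentEvent
data class OrderCreated(val paymentId: String, val orderId: String) : PaymentEvent

class EventBus {
    private val handlers = mutableListOf<(PaymentEvent) -> Unit>()

    fun subscribe(handler: (PaymentEvent) -> Unit) {
        handlers += handler
    }

    // Iterates over a snapshot so handlers may subscribe or publish while running.
    fun publish(event: PaymentEvent) {
        handlers.toList().forEach { it(event) }
    }
}

// The order side reacts to PgApproved; the payment side never calls it directly.
class OrderSubscriber(private val bus: EventBus) {
    init {
        bus.subscribe { event ->
            if (event is PgApproved) {
                bus.publish(OrderCreated(event.paymentId, orderId = "order-${event.paymentId}"))
            }
        }
    }
}
```

Nothing in this code describes the whole flow in one place, which is exactly the drawback listed above: the sequence only emerges from who subscribes to what.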

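2. Orchestration (central orchestrator)

A central orchestrator manages the whole flow and calls each service in sequence.

Orchestrator
  → Step 1: PG approval
  → Step 2: Order creation
  → Step 3: Completion
  (runs compensating transactions on failure)

Pros:

The overall flow is explicit.
Compensating transactions are easy to manage.
Debugging is easier.

Cons:

A central orchestrator is required.
The orchestrator can become a single point of failure.

Our project chose Orchestration, because compensating transactions are easier to manage and the flow is easier to debug.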

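3. Applying It in a Real Project: Payment Service

Architecture overview

Flow:

Step 1: PG approval (NicePay API)
Step 2: Order creation (Product Service)
Step 3: Mark the payment as completed
On failure: run the compensating transaction

State machine design

A payment moves through several states, and every state transition is strictly controlled:

enum class ProductPaymentStatus {
    PENDING,              // initial state
    PG_APPROVED,          // PG approval succeeded
    COMPLETED,            // fully completed
    PG_APPROVE_FAILED,    // PG approval failed
    CREATE_ORDER_FAILED,  // Order creation failed
    COMPENSATION_NEEDED,  // compensation required (manual handling)
    CANCELLED,            // cancelled
    PG_CANCEL_FAILED,     // PG cancellation failed
    ORDER_CANCEL_FAILED,  // Order cancellation failed
}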

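State transition rules:

PENDING → PG_APPROVED → COMPLETED (success path)
PENDING → PG_APPROVE_FAILED (PG approval failed)
PG_APPROVED → CREATE_ORDER_FAILED (Order creation failed, PG approval cancelled)
PG_APPROVED → COMPENSATION_NEEDED (Order creation failed and the compensation also failed)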
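
The rules above can also be written down as data. The following is a minimal sketch, not the project's code: PaymentStatus is a reduced stand-in for ProductPaymentStatus, allowedTransitions and canTransition are hypothetical names, and only the approval-path transitions listed above are included.

```kotlin
// Reduced stand-in for ProductPaymentStatus (approval path only).
enum class PaymentStatus {
    PENDING, PG_APPROVED, COMPLETED,
    PG_APPROVE_FAILED, CREATE_ORDER_FAILED, COMPENSATION_NEEDED,
}

// Each key maps to the states it may move to. States without an entry are terminal here.
val allowedTransitions: Map<PaymentStatus, Set<PaymentStatus>> = mapOf(
    PaymentStatus.PENDING to setOf(
        PaymentStatus.PG_APPROVED,
        PaymentStatus.PG_APPROVE_FAILED,
    ),
    PaymentStatus.PG_APPROVED to setOf(
        PaymentStatus.COMPLETED,
        PaymentStatus.CREATE_ORDER_FAILED,
        PaymentStatus.COMPENSATION_NEEDED,
    ),
)

fun canTransition(from: PaymentStatus, to: PaymentStatus): Boolean =
    to in allowedTransitions[from].orEmpty()
```

A table like this makes it easy to review the state machine in one place, while the actual checks can still live in the domain methods.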

Domain model: state transition methods

The state transition logic lives in the domain model, which is immutable: every transition validates the current state and returns a new copy.

import java.time.Instant

data class ProductPayment(
    val id: String,
    val userId: String,
    val amount: Int,
    val status: ProductPaymentStatus,
    val pgTid: String? = null,
    val referenceId: String? = null,
    val updatedAt: Instant = Instant.now(),  // needed by the copy() calls below
    // ...
) {
    fun canApprove(): Boolean {
        return status == ProductPaymentStatus.PENDING
    }

    // PENDING -> PG_APPROVED; records the PG transaction ID
    fun markPgApproved(tid: String?): ProductPayment {
        require(status == ProductPaymentStatus.PENDING) {
            "Cannot change to PG_APPROVED from status: $status"
        }
        return this.copy(
            status = ProductPaymentStatus.PG_APPROVED,
            pgTid = tid,
            updatedAt = Instant.now(),
        )
    }

    // PG_APPROVED -> COMPLETED; records the ID of the created order
    fun markAsCompleted(referenceId: String?): ProductPayment {
        require(status == ProductPaymentStatus.PG_APPROVED) {
            "Cannot change to COMPLETED from status: $status"
        }
        return this.copy(
            status = ProductPaymentStatus.COMPLETED,
            referenceId = referenceId,
            updatedAt = Instant.now(),
        )
    }
}

Key points:

State transitions are validated inside the domain model.
State changes produce a new immutable object (via copy).
An invalid state transition throws an exception.
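
A small usage sketch shows what these points mean in practice. Payment and Status below are trimmed-down stand-ins for ProductPayment and ProductPaymentStatus, kept only so the example is self-contained; the behavior mirrors the methods above.

```kotlin
enum class Status { PENDING, PG_APPROVED, COMPLETED }

// Trimmed-down stand-in for ProductPayment with only the fields this example needs.
data class Payment(val id: String, val status: Status, val pgTid: String? = null) {
    fun markPgApproved(tid: String?): Payment {
        require(status == Status.PENDING) { "Cannot change to PG_APPROVED from status: $status" }
        return copy(status = Status.PG_APPROVED, pgTid = tid)
    }

    fun markAsCompleted(): Payment {
        require(status == Status.PG_APPROVED) { "Cannot change to COMPLETED from status: $status" }
        return copy(status = Status.COMPLETED)
    }
}
```

Calling markPgApproved on a PENDING payment returns a new object and leaves the original untouched, while calling markAsCompleted on a PENDING payment fails with an IllegalArgumentException from require.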

4. Implementing the Saga: the Payment Approval Process

Overall flow

@Service
@Transactional
class ProductPaymentService(
    private val paymentRepository: PaymentRepository,
    private val productServiceClient: ProductServiceClient,
    private val nicePayService: NicePayService,
    private val productPaymentStatusService: ProductPaymentStatusService,
    private val notificationService: NotificationService,
) : ProductPaymentUseCase {

    // The logger declaration was not shown in the original; SLF4J is assumed here.
    private val logger = LoggerFactory.getLogger(javaClass)

    override fun approvePayment(
        orderId: String,
        tid: String?,
        authToken: String?,
        amount: String?,
    ): ProductPayment {
        val payment = paymentRepository.findProductPaymentByPgOrderId(orderId)
            ?: throw IllegalArgumentException("Payment not found: $orderId")

        require(payment.canApprove()) {
            "Payment has already been processed: ${payment.status}"
        }

        // Tracks the latest persisted state so handleFailure knows how far we got
        var currentPayment = payment

        try {
            // Step 1: PG approval
            if (payment.pgType == PgType.NICEPAY && tid != null) {
                nicePayService.approvePayment(
                    tid = tid,
                    amount = payment.amount,
                )
                logger.info("PG approval succeeded: paymentId=${payment.id}")

                // Save the intermediate state (separate transaction)
                currentPayment = productPaymentStatusService.markAsPgApproved(payment, tid)
            }

            // Step 2: Order creation
            val order = productServiceClient.createOrder(
                productType = currentPayment.productType,
                productId = currentPayment.productId,
                userId = currentPayment.userId,
                paymentId = currentPayment.id,
            )
            logger.info("Order created: paymentId=${currentPayment.id}, orderId=${order.id}")

            // Step 3: Final completion
            val updated = currentPayment.markAsCompleted(order.id)
            val savedPayment = paymentRepository.saveProductPayment(updated)

            notificationService.sendPaymentSuccessNotification(savedPayment)

            return savedPayment
        } catch (e: Exception) {
            // Saga: handle the failure according to the current state and compensate
            handleFailure(currentPayment, e, tid)
            throw e
        }
    }
}

The core: failure handling and compensating transactions

private fun handleFailure(
    currentPayment: ProductPayment,
    e: Exception,
    tid: String?,
) {
    logger.error("Payment approval failed: paymentId=${currentPayment.id}, error=${e.message}", e)

    notificationService.sendPaymentFailureNotification(currentPayment, e)

    // Saga: the last saved status tells us which step failed
    when (currentPayment.status) {
        ProductPaymentStatus.PENDING -> {
            // Step 1 failed: PG approval failed
            // No compensation needed (PG approval never succeeded)
            productPaymentStatusService.markAsPgApproveFailed(currentPayment, tid)
        }
        ProductPaymentStatus.PG_APPROVED -> {
            // Step 2 failed: Order creation failed (PG approval already succeeded)
            // Compensation required: cancel the PG approval
            try {
                compensatePgApproval(currentPayment, tid)
                productPaymentStatusService.markAsCreateOrderFailed(currentPayment)
            } catch (compensationException: Exception) {
                logger.error(
                    "Compensating transaction failed: paymentId=${currentPayment.id}, " +
                        "error=${compensationException.message}",
                    compensationException,
                )
                // The compensation failed too -> manual handling required
                productPaymentStatusService.markAsCompensationNeeded(
                    currentPayment,
                    e.message,
                    compensationException.message,
                )
            }
        }
        else -> {
            logger.warn(
                "Failure in an unexpected state: paymentId=${currentPayment.id}, " +
                    "status=${currentPayment.status}",
            )
        }
    }
}

Implementing the compensating transaction

/**
 * Compensating transaction: cancel the PG approval.
 */
private fun compensatePgApproval(payment: ProductPayment, tid: String?) {
    if (tid != null && payment.pgType == PgType.NICEPAY) {
        logger.info("Starting PG approval cancellation: paymentId=${payment.id}")
        try {
            nicePayService.cancelPayment(
                tid = tid,
                amount = payment.amount,
                reason = "Automatic cancellation: Order creation failed",
            )
            logger.info("PG approval cancelled: paymentId=${payment.id}")
        } catch (e: Exception) {
            logger.error("PG approval cancellation failed: paymentId=${payment.id}, error=${e.message}", e)
            throw e  // propagate compensation failures to the caller
        }
    }
}

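5. Saving Intermediate State: a Separate Transaction

Why save intermediate state?

As soon as a step succeeds, its state must be saved. Only then can we tell which step failed and run the right compensating transaction.

Using Propagation.REQUIRES_NEW

abstract class BasePaymentStatusService<T : Payment>(
    protected val paymentRepository: PaymentRepository,
) {
    // Note: Spring applies @Transactional through a proxy, so the annotation only
    // takes effect when the call comes in through the proxy. If a subclass method
    // such as markAsPgApproved calls markStatus on `this`, that public entry point
    // needs the REQUIRES_NEW annotation as well.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    protected open fun markStatus(
        payment: T,
        statusName: String,
        changeStatus: (T) -> T,
    ): T {
        logger.info("Saving status $statusName: paymentId=${payment.id}")
        val updated = changeStatus(payment)
        return savePayment(updated)
    }

    protected abstract fun savePayment(payment: T): T
}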

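Why REQUIRES_NEW:

The current state is committed before the next external service call.
If that external call fails and the outer transaction rolls back, the state that was already saved remains.
The compensating transaction can rely on accurate state information.

Flow:

1. Call the PG approval API
2. Success → markAsPgApproved() (REQUIRES_NEW transaction)
   → PG_APPROVED status saved to the DB
3. Call the Order creation API
4. Failure → run the compensating transaction
   → cancel the PG approval based on the saved PG_APPROVED status

6. Extending to Multiple Payment Types

Using the strategy pattern

The project has three payment types:

ProductPayment: product purchase (creates an Order)
CreditPayment: credit top-up (issues Credit)
QuotaPayment: quota purchase (issues Quota)

Step 2 differs for each type, but the overall flow is the same:

// ProductPayment
Step 1: PG approval
Step 2: Order creation (Product Service)
Step 3: Completion

// CreditPayment
Step 1: PG approval
Step 2: Credit issuance (Supporters Service)
Step 3: Completion

// QuotaPayment
Step 1: PG approval
Step 2: Quota issuance (Book Plus Service)
Step 3: Completion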
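
One way to express "same flow, different Step 2" is to pass the type-specific step in as a strategy. The sketch below is an assumption about how that could look, not the project's actual code: FulfillmentStep, CreateOrderStep, IssueCreditStep, and approve are hypothetical names, and the PG calls are reduced to plain lambdas.

```kotlin
// Hypothetical abstraction over the type-specific Step 2.
interface FulfillmentStep {
    /** Performs Step 2 and returns a reference ID (order, credit, or quota). */
    fun execute(paymentId: String): String
}

class CreateOrderStep : FulfillmentStep {
    override fun execute(paymentId: String) = "order-$paymentId"
}

class IssueCreditStep : FulfillmentStep {
    override fun execute(paymentId: String) = "credit-$paymentId"
}

/** Shared flow: PG approval, then the type-specific step, with PG cancel on failure. */
fun approve(
    paymentId: String,
    approvePg: () -> Unit,
    cancelPg: () -> Unit,
    step: FulfillmentStep,
): String {
    approvePg() // Step 1
    return try {
        step.execute(paymentId) // Step 2, supplied by the payment type
    } catch (e: Exception) {
        cancelPg() // compensate Step 1
        throw e
    }
}
```

Adding a new payment type then means adding one more FulfillmentStep implementation, while the approval and compensation logic stays in a single place.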

Shared abstraction

abstract class BasePaymentStatusService<T : Payment>(...) {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    protected open fun markStatus(...) { ... }
}

// One implementation per payment type
class ProductPaymentStatusService(...) : BasePaymentStatusService<ProductPayment>(...)
class CreditPaymentStatusService(...) : BasePaymentStatusService<CreditPayment>(...)
class QuotaPaymentStatusService(...) : BasePaymentStatusService<QuotaPayment>(...)

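7. Handling Each Failure Scenario

Scenario 1: PG approval fails

Step 1: PG approval → failed

Handling:

No compensating transaction is needed, because the PG approval did not go through and there is nothing to undo.
The status is changed to PG_APPROVE_FAILED.

ProductPaymentStatus.PENDING -> {
    productPaymentStatusService.markAsPgApproveFailed(currentPayment, tid)
}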

Scenario 2: Order creation fails (PG approval succeeded)

Step 1: PG approval → succeeded ✅
Step 2: Order creation → failed ❌

Handling:

The compensating transaction runs: cancel the PG approval.
The status is changed to CREATE_ORDER_FAILED.

ProductPaymentStatus.PG_APPROVED -> {
    try {
        compensatePgApproval(currentPayment, tid)  // cancel the PG approval
        productPaymentStatusService.markAsCreateOrderFailed(currentPayment)
    } catch (compensationException: Exception) {
        // The compensating transaction failed as well
        productPaymentStatusService.markAsCompensationNeeded(...)
    }
}

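Scenario 3: the compensating transaction fails

Step 1: PG approval → succeeded ✅
Step 2: Order creation → failed ❌
Compensation: PG cancellation → failed ❌

Handling:

The status is changed to COMPENSATION_NEEDED.
Manual handling is required (an administrator is notified).
Both error messages are stored in the payment's metadata.

productPaymentStatusService.markAsCompensationNeeded(
    currentPayment,
    e.message,  // original error
    compensationException.message,  // compensation error
)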

8. The Cancellation Process

Cancellation is a Saga too

Cancelling a payment also takes several steps, so the Saga pattern applies here as well:

override fun cancelPayment(paymentId: String): ProductPayment {
    val payment = paymentRepository.findProductPaymentById(paymentId)
        ?: throw IllegalArgumentException("Payment not found: $paymentId")

    require(payment.canCancel()) { "Payment cannot be cancelled in status: ${payment.status}" }

    // Step 1: PG cancellation
    if (payment.pgType == PgType.NICEPAY && payment.pgTid != null) {
        try {
            nicePayService.cancelPayment(
                tid = payment.pgTid,
                amount = payment.amount,
                reason = "Payment cancelled",
            )
            logger.info("PG cancellation succeeded: paymentId=${payment.id}")
        } catch (e: Exception) {
            logger.error("PG cancellation failed: paymentId=${payment.id}, error=${e.message}", e)
            productPaymentStatusService.markAsPgCancelFailed(payment, e.message)
            throw IllegalStateException("Cannot proceed with cancellation: PG cancellation failed.", e)
        }
    }

    // Step 2: Order cancellation
    if (payment.status == ProductPaymentStatus.COMPLETED && payment.referenceId != null) {
        try {
            productServiceClient.cancelOrder(
                productType = payment.productType,
                productId = payment.productId,
                userId = payment.userId,
                paymentId = payment.id,
            )
            logger.info("Order cancelled: paymentId=${payment.id}")
        } catch (e: Exception) {
            logger.error("Order cancellation failed: paymentId=${payment.id}, error=${e.message}", e)
            productPaymentStatusService.markAsOrderCancelFailed(payment, e.message)
            throw IllegalStateException(
                "PG cancellation succeeded but Order cancellation failed. " +
                    "Manual handling required: paymentId=${payment.id}",
                e,
            )
        }
    }

    // Step 3: Finalize the cancellation
    val cancelled = payment.cancel()
    return paymentRepository.saveProductPayment(cancelled)
}

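9. Monitoring and Alerts

Failure alerts

Every failure scenario sends an alert. The catch block delegates to handleFailure, which logs the error and sends the failure notification before running the compensating transaction:

catch (e: Exception) {
    // handleFailure logs the error, sends the failure notification,
    // and runs the compensating transaction for the current status.
    // Sending the notification here as well would notify twice.
    handleFailure(currentPayment, e, tid)
    throw e
}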

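States that require manual handling

Payments in the COMPENSATION_NEEDED status require manual handling, so we:

Show them on the administrator dashboard.
Plan to retry them periodically with a scheduler (not implemented yet).
Send an alert.

10. Pros and Cons of the Saga Pattern

Pros

Makes distributed transactions workable

Transactions that span several services can be handled.
Each service stays independent.

Handles partial failure

It is clear which step failed.
Compensating transactions restore consistency.

Extensibility

Adding a new step is easy.
Each service can evolve independently.

Availability

No locks are held across services, so a failure in one service does not block the others.
A single failing step does not take the whole system down.

Cons

Added complexity

Compensating transaction logic has to be written.
State management becomes more complex.

Eventual consistency

Consistency is not guaranteed until every step has finished.
A read during an intermediate state may see inconsistent data.

Handling compensation failures

Compensating transactions can fail too.
Some situations will require manual handling.

11. Caveats and Best Practices

1. Guarantee idempotency

Every step and every compensating transaction must be idempotent:

// ✅ Idempotent
fun approvePayment(orderId: String, ...): ProductPayment {
    val payment = paymentRepository.findProductPaymentByPgOrderId(orderId)
        ?: throw IllegalArgumentException("Payment not found: $orderId")

    // Already processed: return the stored result instead of processing again
    if (payment.status == ProductPaymentStatus.COMPLETED) {
        return payment
    }

    // ...
}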

2. Handle timeouts

Set timeouts on calls to external services:

@FeignClient(name = "product-service")
interface ProductServiceClient {
    @PostMapping("/orders")
    fun createOrder(@RequestBody request: CreateOrderRequest): OrderResponse
    // Timeouts are managed in the Feign configuration
}

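3. Retry strategy

Retry on transient failures:

@Retryable(
    value = [RetryableException::class],
    maxAttempts = 3,
    backoff = Backoff(delay = 1000, multiplier = 2.0)  // multiplier is a Double
)
fun createOrder(...): OrderResponse {
    // Retried calls must be idempotent on the receiving side
    // ...
}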

4. Persist state

Store the result of each step so the flow can be traced:

data class ProductPayment(
    val status: ProductPaymentStatus,
    val pgTid: String?,  // PG approval result
    val referenceId: String?,  // Order ID
    val metadata: Map<String, String>,  // additional information
    // ...
)

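5. Logging

Detailed logs make debugging easier. Include the payment ID in every message:

logger.info("PG approval succeeded: paymentId=${payment.id}")
logger.error("Payment approval failed: paymentId=${payment.id}, error=${e.message}", e)
logger.warn("Compensating transaction failed: paymentId=${payment.id}")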

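12. Experience in Production

Problems we ran into

Problem 1: compensating transaction failure

Situation:

PG approval succeeded
Order creation failed
The PG cancellation API call failed (network error)

Resolution:

We saved the payment with the COMPENSATION_NEEDED status.
We sent an alert to the administrator dashboard.
We cancelled the PG approval manually.
We plan to add an automatic retry scheduler.

Problem 2: duplicate processing

Situation:

The client retried after a timeout
A payment that was already being processed was processed again

Resolution:

We added a status check to prevent duplicate processing.
We made the operation idempotent.

Improvements

Automatic retry scheduler
- Periodically retry payments in the COMPENSATION_NEEDED status.
Status monitoring
- Monitor the number of payments in each status.
- Set up alerts for abnormal statuses.
Stronger compensating transactions
- Add retry logic to compensating transactions as well.
- Apply exponential backoff.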

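13. Next Actions: What Could Be Improved

The current implementation covers the core of the Saga pattern, but the following improvements are worth considering:

1. Implement an automatic retry scheduler

A scheduler that periodically retries payments in the COMPENSATION_NEEDED status would reduce the manual handling burden.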
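
A single pass of such a scheduler could look like the sketch below. Everything here is hypothetical: StuckPayment, SweepStatus, and sweepCompensations are illustrative names, and moving a payment to CREATE_ORDER_FAILED after a successful retry is our assumption based on how the normal compensation path ends. In a Spring application the pass could be triggered periodically, for example from a method annotated with @Scheduled.

```kotlin
enum class SweepStatus { COMPENSATION_NEEDED, CREATE_ORDER_FAILED }

// Minimal stand-in for a payment that is waiting for its PG cancellation.
data class StuckPayment(val id: String, val pgTid: String, val status: SweepStatus)

/**
 * One sweep: retries the PG cancellation for every COMPENSATION_NEEDED payment.
 * A success moves the payment to CREATE_ORDER_FAILED (assumed target status);
 * a failure leaves it unchanged so the next sweep can try again.
 */
fun sweepCompensations(
    payments: List<StuckPayment>,
    cancelPg: (tid: String) -> Unit,
): List<StuckPayment> = payments.map { payment ->
    if (payment.status != SweepStatus.COMPENSATION_NEEDED) return@map payment
    try {
        cancelPg(payment.pgTid)
        payment.copy(status = SweepStatus.CREATE_ORDER_FAILED)
    } catch (e: Exception) {
        payment // still needs compensation
    }
}
```

Because the same cancellation may be attempted several times, the PG cancel call has to be idempotent for this to be safe.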

2. Strengthen retry logic for compensating transactions

A retry mechanism with exponential backoff would let compensations recover automatically from transient network errors.
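
As a sketch of what that could look like without any framework, the helper below retries a block with exponentially growing delays. retryWithBackoff and its parameters are hypothetical names; the defaults mirror the values used in the @Retryable example earlier (3 attempts, 1000 ms initial delay, multiplier 2).

```kotlin
/**
 * Calls [block] up to [maxAttempts] times, sleeping between failed attempts.
 * The delay starts at [initialDelayMs] and is multiplied by [multiplier] each time.
 * [sleep] is injectable so the helper can be tested without real waiting.
 */
fun <T> retryWithBackoff(
    maxAttempts: Int = 3,
    initialDelayMs: Long = 1_000,
    multiplier: Double = 2.0,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    block: () -> T,
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMs = initialDelayMs
    var lastError: Exception? = null
    for (attempt in 1..maxAttempts) {
        try {
            return block()
        } catch (e: Exception) {
            lastError = e
            if (attempt < maxAttempts) { // no sleep after the final attempt
                sleep(delayMs)
                delayMs = (delayMs * multiplier).toLong()
            }
        }
    }
    throw checkNotNull(lastError)
}
```

A compensation such as the PG cancellation could be wrapped as retryWithBackoff { cancel() }, falling back to COMPENSATION_NEEDED only after the last attempt fails.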

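3. Strengthen monitoring and alerting

Prometheus + Grafana can track the number of payments in each status and detect abnormal statuses early.

4. Strengthen idempotency keys

Introducing explicit idempotency keys would prevent duplicate processing when requests are retried over the network.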
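
The idea behind an idempotency key can be sketched in a few lines. IdempotencyStore is a hypothetical, in-memory illustration only: a real service would need a shared, persistent store, and because computeIfAbsent runs the operation while holding an internal lock on the map, long-running calls would need a different mechanism.

```kotlin
import java.util.concurrent.ConcurrentHashMap

/**
 * Remembers the result of each idempotency key so that a retried request
 * returns the first result instead of running the operation again.
 * In-memory only; results are lost on restart and not shared across instances.
 */
class IdempotencyStore<T : Any> {
    private val results = ConcurrentHashMap<String, T>()

    // Runs [operation] only if no result is stored for [key] yet.
    // If the operation throws, nothing is stored and a retry runs it again.
    fun execute(key: String, operation: () -> T): T =
        results.computeIfAbsent(key) { operation() }
}
```

The client would generate the key once per logical request and send the same key on every retry, so a timeout followed by a retry cannot approve the same payment twice.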

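5. Strengthen distributed tracing

Introducing Zipkin or Jaeger would visualize the entire Saga flow and shorten debugging time.

6. Adopt a Saga orchestrator framework (long term)

A framework such as Temporal or Eventuate could centralize the Saga logic and manage automatic retries and compensating transactions.

Conclusion

The Saga pattern is a powerful way to handle distributed transactions in a microservices environment.

Key points:

Split a long transaction into small transactions
Persist the state of each step explicitly
Run compensating transactions on failure
Have a manual handling process for failed compensations

Things to consider when applying it:

State machine design
Compensating transaction implementation
Idempotency
Monitoring and alerting
Manual handling process

Final code structure:

// Orchestrator
@Service
class ProductPaymentService {
    fun approvePayment(...) {
        try {
            step1()  // PG approval
            step2()  // Order creation
            step3()  // completion
        } catch (e: Exception) {
            handleFailure()  // compensating transaction
            throw e
        }
    }
}

// State management
abstract class BasePaymentStatusService {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    protected open fun markStatus(...) { ... }
}

// Domain model
data class ProductPayment(/* ... */) {
    fun markPgApproved(...) { ... }
    fun markAsCompleted(...) { ... }
}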

Applied correctly, the Saga pattern makes reliable transaction handling possible even in a distributed environment.
