
Added Payment Service design #25

Open
ganeshkumarm1 wants to merge 2 commits into main from payment-service

Conversation

@ganeshkumarm1
Collaborator

No description provided.

@github-actions github-actions bot left a comment

Found 3 blunder(s) and 18 issue(s). Please address the blunders before merging. (5 thing(s) done really well!)

— Automated review by SweetCodey Design Reviewer

@@ -0,0 +1,522 @@
# Payment System

The title should be ALL CAPS per the style guide (e.g., # PAYMENT SYSTEM).

---

## Requirements
### Functional Requirements

BLUNDER: Functional requirements are using bullet points instead of the HTML table format with Requirement and Description columns. Either format is acceptable per the guide, but the non-functional requirements below also use bullet points — so at least it's consistent. However, I think a key functional requirement is missing: Refunds. You mention 'refunded' as a transaction state in the Manage Payment Transaction requirement, but there's no explicit requirement for initiating refunds. For a payment system, this feels like a significant gap.

* **Payment Authorization** - The system should send payment requests to the appropriate payment networks or banks and receive authorization or rejection responses.
* **Idempotent Payment Processing** - The system should prevent duplicate charges when the same payment request is retried due to network failures or client retries.

### Non-Functional Requirements

Non-functional requirements should use HTML tables per the style guide. Currently using bullet points.

### Non-Functional Requirements
* **Consistency** - The system must maintain correct and consistent transaction states. A payment should never result in conflicting states such as both success and failure, and duplicate charges must be avoided.
* **Security** - The payment system handles sensitive financial data. Payment information must be protected using strong encryption both in transit and at rest.
* **Availability** – The system should remain highly available (e.g., 99.99% uptime) so that merchants can continue accepting payments even during peak traffic

nit: Missing period at the end of the Availability requirement sentence.

* **Security** - The payment system handles sensitive financial data. Payment information must be protected using strong encryption both in transit and at rest.
* **Availability** – The system should remain highly available (e.g., 99.99% uptime) so that merchants can continue accepting payments even during peak traffic
* **Scalability** – The system should be able to handle large spikes in transaction volume, especially during peak shopping periods or promotional events.
* **Idempotency** – The system should ensure that retrying the same payment request does not result in duplicate transactions or multiple charges.

Idempotency is already listed as a functional requirement (line 52). Having it in both functional and non-functional requirements is redundant — consider keeping it in one place to avoid confusion.


![](Resources/DiveDeep_HSM.png)

You can think of it like a **secure vault where the key always stays inside**. Normally, we would take the key out of a locker to lock or unlock something. With an HSM, we don’t take the key out. Instead, we bring the item to the vault, the operation happens inside, and you receive the result while the key remains safely inside.

Love this analogy. 'Bring the item to the vault' makes the concept instantly understandable.


You can think of it like a **secure vault where the key always stays inside**. Normally, we would take the key out of a locker to lock or unlock something. With an HSM, we don’t take the key out. Instead, we bring the item to the vault, the operation happens inside, and you receive the result while the key remains safely inside.

### Preventing Payment Data Loss

The durability section covers database replication and WAL well, but it doesn't directly address the specific problem stated: the gateway crashing before saving the bank's approval. How does the system recover in that exact scenario? Event replay from the Payment Event Stream? Reconciliation? The problem is stated but the specific recovery mechanism isn't clearly connected.
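
For the crash-before-save case specifically, one option is a reconciliation pass that compares the gateway's own records against what the bank reports. The following is a minimal sketch, assuming both sides can be reduced to a `payment_id -> status` mapping. The `reconcile` function, the status strings, and the record shapes are hypothetical and not taken from the design:

```python
def reconcile(local_records, bank_records):
    """Return payments whose local status disagrees with the bank's view.

    Both arguments are hypothetical dicts of payment_id -> status.
    The result maps payment_id -> (local_status, bank_status); local_status
    is None when the gateway has no record at all (e.g. it crashed before
    persisting the bank's response).
    """
    mismatches = {}
    for payment_id, bank_status in bank_records.items():
        local_status = local_records.get(payment_id)
        if local_status != bank_status:
            mismatches[payment_id] = (local_status, bank_status)
    return mismatches


# Example: the gateway crashed before saving the approval for "p1"
# and never recorded "p3" at all.
local = {"p1": "AUTH_PENDING", "p2": "AUTHORIZED"}
bank = {"p1": "AUTHORIZED", "p2": "AUTHORIZED", "p3": "AUTHORIZED"}
to_repair = reconcile(local, bank)
```

Each entry in `to_repair` is a candidate for replaying the missing state update. Whether that replay comes from an event stream or a bank settlement file is a design decision the section would need to state.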


---

## API Design

No rate limiting is discussed anywhere in the API design. For a payment system with public-facing endpoints, this is important to prevent abuse.

@@ -0,0 +1,522 @@
# Payment System

<!-- toc -->

The TOC doesn't have entries using the styled section header format, but that's fine since it uses markdown-toc. However, there's no entry for the 'Authorization vs Settlement' or 'Payment State Machine' subsections which are significant parts of the HLD.

* Although the user experiences a simple redirect and OTP verification, several systems — including the payment processor, card network directory server, and the bank’s Access Control Server — coordinate behind the scenes to complete this authentication step.
13. The **Issuing Bank** verifies the OTP and returns the authorization result (`AUTHORIZED` or `DECLINED`).

**Payment State Machine**

The payment state machine is mentioned and there's an image reference, but the valid state transitions aren't enumerated in text. If the image fails to load, readers have no way to understand the states. Can we list the valid transitions (e.g., CREATED → AUTH_PENDING → AUTHORIZED → ...) in text as well?
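
Something along these lines could accompany the diagram. This is only a sketch: `CREATED`, `AUTH_PENDING`, `AUTHORIZED`, `DECLINED`, and a refunded state appear in the design or in this review, while `CAPTURED` and the exact edges below are assumptions that should be replaced with whatever the image actually shows.

```python
from enum import Enum


class PaymentState(Enum):
    CREATED = "CREATED"
    AUTH_PENDING = "AUTH_PENDING"
    AUTHORIZED = "AUTHORIZED"
    DECLINED = "DECLINED"
    CAPTURED = "CAPTURED"    # assumed intermediate state, not confirmed by the doc
    REFUNDED = "REFUNDED"


# Hypothetical transition table; terminal states map to an empty set.
VALID_TRANSITIONS = {
    PaymentState.CREATED: {PaymentState.AUTH_PENDING},
    PaymentState.AUTH_PENDING: {PaymentState.AUTHORIZED, PaymentState.DECLINED},
    PaymentState.AUTHORIZED: {PaymentState.CAPTURED},
    PaymentState.CAPTURED: {PaymentState.REFUNDED},
    PaymentState.DECLINED: set(),
    PaymentState.REFUNDED: set(),
}


def can_transition(current, target):
    """Return True if moving from `current` to `target` is allowed."""
    return target in VALID_TRANSITIONS[current]


def transition(current, target):
    """Return the new state, or raise ValueError for an invalid move."""
    if not can_transition(current, target):
        raise ValueError(f"invalid transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data also makes it easy to render the same transitions as a text list in the doc.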

@github-actions github-actions bot dismissed their stale review March 19, 2026 11:20

Superseded by new review on latest commit.

@github-actions github-actions bot left a comment

Found 14 issue(s) to address. No critical blunders. (11 thing(s) done really well!)

— Automated review by SweetCodey Design Reviewer


## Introduction
### What is a payment system?
Let’s travel back to the early 1990s. Imagine you go shopping at a store. After selecting your items, you go to the checkout counter and pay for them. The most common way to pay is with cash. In some cases, such as large purchases like a car or a house, people use other payment methods like Demand Drafts (DD) or cheques. The payee deposits the DD or cheque in their bank, and the bank transfers the money from the payer’s account to the payee’s account.

Love the storytelling approach here — starting from the 1990s and building up to digital payments is a great way to set context for beginners!


---

## Glossary

The glossary section with clear definitions and real-world examples (HDFC, Stripe, Visa) is excellent. Really helps ground abstract concepts.

<td>Notify merchant backend asynchronously for key payment events (e.g., payment.authorized, payment.failed, payment.refunded) so merchant systems can update order state reliably.</td>
</tr>
<tr>
<td><b>Idempotent write operations</b></td>

Idempotency is great to call out, but it reads more like a cross-cutting technical concern than a functional requirement from the user/merchant perspective. Consider moving it to NFRs or the API design section where it's already discussed in detail.

</tr>
</table>

### Non-Functional Requirements

Can we add a Latency NFR here? For a payment system, response time matters — users sitting on a checkout page expect fast feedback. Something like "Payment initiation should respond within 1-2 seconds" would be useful.

</tr>
</table>

### Non-Functional Requirements

Can we also add a Reliability NFR? For a payment system, exactly-once processing semantics (or at-least-once with idempotency) is critical. A payment should never be charged twice or silently dropped.
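
To illustrate the at-least-once-with-idempotency option, here is a minimal in-memory sketch. `IdempotentProcessor`, `charge_fn`, and the key format are hypothetical. A real service would presumably back this with a durable store and a unique constraint on the key rather than a process-local dict.

```python
import threading


class IdempotentProcessor:
    """Runs a charge at most once per idempotency key (in-memory sketch)."""

    def __init__(self, charge_fn):
        self._charge_fn = charge_fn      # hypothetical callable that performs the charge
        self._results = {}               # idempotency_key -> stored result
        self._lock = threading.Lock()

    def process(self, idempotency_key, request):
        # Holding the lock across the charge keeps the sketch simple: a retry
        # that arrives while the first attempt is in flight waits, then
        # receives the stored result instead of charging again.
        with self._lock:
            if idempotency_key in self._results:
                return self._results[idempotency_key]
            result = self._charge_fn(request)
            self._results[idempotency_key] = result
            return result


calls = []


def fake_charge(request):
    calls.append(request)
    return {"status": "AUTHORIZED", "amount": request["amount"]}


processor = IdempotentProcessor(fake_charge)
first = processor.process("key-123", {"amount": 500})
retry = processor.process("key-123", {"amount": 500})
```

Here the retry returns the stored result and `fake_charge` runs only once, which is the "never charged twice" half of the requirement. The "never silently dropped" half still depends on the client retrying until it gets a response.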

* Firewalls - Install and maintain network security controls.

### Keeping Sensitive Data Secure
A **Hardware Security Module (HSM)** is a special device used to securely store encryption keys. These keys are used to encrypt and decrypt sensitive data such as card numbers. In a normal system, an application might retrieve the encryption key from storage and use it to encrypt or decrypt data. This means that the key temporarily exists in the application server’s memory, which can be risky if the system is compromised.

The HSM explanation with the vault analogy on line 615 is excellent — exactly the kind of beginner-friendly explanation the style guide calls for.


You can think of it like a **secure vault where the key always stays inside**. Normally, we would take the key out of a locker to lock or unlock something. With an HSM, we don’t take the key out. Instead, we bring the item to the vault, the operation happens inside, and you receive the result while the key remains safely inside.

### Preventing Payment Data Loss

The "gateway crashed before saving" scenario is a perfect motivating example. And tying it back to the event stream + reconciliation at line 639 closes the loop beautifully.


---

## High Level Design

The HLD section doesn't mention rate limiting anywhere. For a payment system, rate limiting is critical to prevent abuse (e.g., card testing attacks where bots try thousands of stolen cards). Can we add this?
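
As a starting point, a token bucket is one common way to express this kind of limit. The sketch below is illustrative only: the `TokenBucket` class, its parameters, and the injected `clock` are assumptions, and the design would still need to decide what to key buckets on (merchant, IP address, card fingerprint) and where the limiter lives.

```python
import time


class TokenBucket:
    """Allows short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock                  # injectable so tests can control time
        self.tokens = float(capacity)       # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bot cycling through stolen cards would exhaust its bucket quickly, while a normal checkout flow that makes a handful of requests would stay under the limit.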


---

## High Level Design

There's no mention of monitoring and alerting in the entire design. For a payment system, observability is critical — things like tracking payment success rates, latency percentiles, and alerting on anomalous failure spikes. Can we add at least a brief mention?
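
Even a short example would help make this concrete. Below is a rough sketch of a sliding-window success-rate check. The `SuccessRateMonitor` class, the window size, and the threshold are hypothetical, and a production setup would more likely emit metrics to a dedicated monitoring system than compute them in-process.

```python
from collections import deque


class SuccessRateMonitor:
    """Tracks the success rate over the most recent `window_size` payments."""

    def __init__(self, window_size, min_success_rate):
        self.outcomes = deque(maxlen=window_size)   # oldest outcomes fall off automatically
        self.min_success_rate = min_success_rate

    def record(self, succeeded):
        self.outcomes.append(bool(succeeded))

    def success_rate(self):
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        # Wait for a full window so a single early failure does not page anyone.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.success_rate() < self.min_success_rate
```

The same shape extends to latency percentiles by recording durations instead of booleans.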


---

## Deep Dive Insights

The deep dive section covers durability and security well, but there's no discussion of concurrency handling. What happens if two refund requests for the same payment arrive simultaneously? Or if a retry of the /pay endpoint hits the system while the first request is still processing? Optimistic locking or DB-level constraints should be mentioned.
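
For the double-refund case, optimistic locking could look roughly like the following. This is a sketch under assumptions: the `payments` table, its `version` column, the `CAPTURED` and `REFUNDED` status values, and the `try_refund` helper are all hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE payments ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, version INTEGER NOT NULL)"
    )


def try_refund(conn, payment_id, expected_version):
    """Mark the payment refunded only if nobody else changed it first.

    The UPDATE matches zero rows when the version has moved on or the
    payment is no longer refundable, so the losing request can detect
    the conflict from the row count.
    """
    cur = conn.execute(
        "UPDATE payments SET status = 'REFUNDED', version = version + 1 "
        "WHERE id = ? AND version = ? AND status = 'CAPTURED'",
        (payment_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO payments VALUES ('p1', 'CAPTURED', 1)")

# Two refund requests both read version 1; only the first update wins.
first_ok = try_refund(conn, "p1", expected_version=1)
second_ok = try_refund(conn, "p1", expected_version=1)
```

The second call returns `False` because the row is already at version 2 with status `REFUNDED`, so the caller can respond with a conflict instead of issuing a second refund.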
