Solving for Uncertainty: Refunds
“O aching time! O moments big as years!
All as ye pass swell out the monstrous truth,
And press it so upon our weary griefs
That unbelief has not a space to breathe.”
-John Keats, describing his Refunds experience.
Dude, Where’s my refund?
I have always admired the effort that goes into ensuring that a payment is completed in the fastest possible time frame. Inside Razorpay’s black box, a chain of logically linked systems ensures that a Payment is recorded and pushed in the most seamless manner across geographies, time zones, and Thatte Idlis, all in a matter of milliseconds. The beauty of Razorpay’s Payment system lies in a labyrinth of complicated programs that help in devising the most optimized way of routing a Transaction, while at the same time ensuring simplicity, security, and compliance with the regulations of the Indian Banking ecosystem. It’s a marvelous work of engineering. Card Networks, Banks and Payment Gateways – I have profound respect for the part each and every stakeholder plays in ensuring the success, speed, and safety of a Payment.
In this movie, Refunds, sadly, is the evil twin of Payments. “Why does a refund take time?” is not just the most visited blog entry on Razorpay’s website, but is actually a question that gets answered quite unsatisfactorily every day using a Freshdesk macro in the gazillion refund tickets that pile up with Support (both Razorpay’s and the Merchant’s). The answer to this question talks about mind-bogglingly boring terms like the Reconciliation cycle, clearing of books, file-based uploads and the unavailability of an alternative, all offered by the very same chain of stakeholders who boast about their technical prowess in processing a payment.
Refunds are a frustrating territory. In Razorpay’s case, they used to be a disaster a year and a half ago when I set foot in this firm. In all honesty, I have cribbed several times, in all of my rants on the 1st-floor balcony of the Product office, that Refunds happen to be a 99% problem. Nobody bats an eye at the 99.92% of refunds that succeed. Everyone loses their mind as soon as one of the remaining 0.08% gets escalated on Twitter, more so if a well-known influencer tweets about his delayed refund for a late-night Waffle delivery on Swiggy.
Don’t Book My Show
The problem of refunds started pinching the Core Payments offering as the scale of our payments shot up. After a deep-dive into our refunds data and about a dozen painful calls with almost all of our Key accounts, we identified the buckets where we were facing tons of Refund failures. UPI was found to be one of the major culprits.
For those unaware of the enigma called UPI Refunds: NPCI does not give out any separate API for processing Refunds. This leaves the Payment Service Providers (PSPs), the guardians of the NPCI highway, to use any means necessary for shooting Refunds back to the customer account. So you have one kind of integration where a Refund is triggered as a peer-to-peer (P2P) fund transfer directly to a customer’s account. On the other hand, you have an integration that simply generates a file after a payment facilitator hits its API, which is then uploaded manually to the NPCI portal. Amid this conundrum, the demons of refunds are the Deemed transactions that lurk in the remaining 1% bracket of all your payments and are notorious for their pending responses to API calls and elusive behavior in the Partner Bank MIS report. Basically, if you have a failed payment, with your money debited and a booking failure of your Train ticket, you’ve probably been Deemed.
Similar to payments, in the earlier design, we had a binary system of Refund management. Failed or Processed: a refund could only exist in these two states post its creation. There were numerous problems with this kind of setup. You can have a refund that appears as Failed but is actually successful. You can also have a refund that you’ve retried multiple times but the Gateway is giving out a response that you don’t understand and hence you require further clarity from the Gateway/Banking partner’s tech team before you can make a decision. Also, some refunds are practically timed out so maybe they can be retried via the system after verification. But, there are new failures in the system that need to be addressed. Wait. There’s something wrong with the Bank’s Refund API today. We now have 5000 failed refunds piled up in the system over the weekend. They need to be retried. But, what about the 15-day old case where we have an escalation now? But what about the 45-day old case where the refund is still in a failed state? How many times have we tried that? Any record of it in the system? Shall I pull logs? But I don’t have access to pull logs. Shall I pull the data? But I don’t have access to the data. Let me place a request with Business Analytics and Tech Support team. OMG. Is it lunchtime already? Wait. The containers are empty. What? No Lunch? God. I hate my life. F*** YOU REFUNDS. YOU F*** MY HAPPINESS. GOD I HATE THIS JOB.
In retrospect, the above paragraph was the life of the Refund operations & tech team for a whole year. In order to set up a sound operational process, it was imperative that the right kind of data along with relevant tools was exposed to the agent, with clear instructions on the usage and expectation of the task. Scrooge was born as a savior for battling the uncertainty and reducing the load on Tech and Operations.
Ghost of the Refunds Past
Scrooge is the fancy name we use for our Refund service. It’s an AI-powered (started with the famous if-else statement algorithm) retrying tool for Refund failures. The Data Store for Scrooge records the details of each Refund attempt in a simple State Management system. These records are then acted upon by the Refund Ops team depending upon the nature of the Refund failure.
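To make the idea concrete, here is a minimal Python sketch of what moving from a binary Failed/Processed world to a small state machine with a per-attempt record might look like. This is not Scrooge’s actual code: the state names, the gateway response strings ("success", "timeout"), the `MAX_ATTEMPTS` limit and the `classify` helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List

MAX_ATTEMPTS = 3  # hypothetical retry budget, not a real Scrooge setting


class RefundState(Enum):
    # All state names below are illustrative assumptions.
    CREATED = "created"
    PROCESSED = "processed"
    FAILED_RETRYABLE = "failed_retryable"  # e.g. the gateway timed out
    NEEDS_REVIEW = "needs_review"          # unclear response, hand over to Ops


@dataclass
class Attempt:
    number: int
    gateway_response: str
    resulting_state: RefundState


@dataclass
class Refund:
    refund_id: str
    state: RefundState = RefundState.CREATED
    attempts: List[Attempt] = field(default_factory=list)

    def record_attempt(self, gateway_response: str) -> RefundState:
        """Store one refund attempt and move the refund to its next state."""
        if self.state is RefundState.PROCESSED:
            # Refusing further attempts is one way to guard against double refunds.
            raise ValueError(f"{self.refund_id} is already processed")
        number = len(self.attempts) + 1
        new_state = classify(gateway_response, number)
        self.attempts.append(Attempt(number, gateway_response, new_state))
        self.state = new_state
        return new_state


def classify(gateway_response: str, attempt_number: int) -> RefundState:
    """The famous if-else algorithm: map a gateway response to a state."""
    if gateway_response == "success":
        return RefundState.PROCESSED
    elif gateway_response == "timeout" and attempt_number < MAX_ATTEMPTS:
        return RefundState.FAILED_RETRYABLE
    else:
        # Unknown responses, or timeouts past the retry budget, go to a human.
        return RefundState.NEEDS_REVIEW


if __name__ == "__main__":
    refund = Refund("rfnd_demo")  # hypothetical refund id
    for response in ["timeout", "timeout", "success"]:
        print(refund.record_attempt(response).value)
    print(len(refund.attempts), "attempts on record")
```

The point of the sketch is the attempt list: every try is kept next to the refund itself, so a question like "how many times have we tried that?" no longer needs a log pull.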
The migration of our older system to Scrooge took 4 months. These 4 months were filled with numerous back-and-forths between various internal teams. Honestly, this was a good kind of pain; pain that makes you stronger over time. Scrooge, though loathed in the beginning by almost everyone, was subsequently accepted as a respected guardian against the Refund nightmares. The road was paved with some serious war-room sessions, with members of 3 separate ops, tech and product teams putting their heads together to close the open refund cases. All of the tribal information was slowly (and painfully) codified in Scrooge’s brain. After some critical incidents and mildly catatonic cases of double-refunds, we as an organization finally solved the refund and reconciliation related intricacies of all UPI and non-UPI gateways. With patience and grit, the Refunds platform team, along with other technology and operations teams, fixed things in near-real-time to take Scrooge to the next level, courageously battling the incoming issues at the same time.
This paved the way for us to explore another dimension of the Refund Consumer experience.
Bro, do you even Instant?
Quite interestingly, Refunds via Fund Transfer was something that we had shipped almost a year ago for one of our merchants. One of our BFSI clients ran on something we internally called a bank transfer refund. It processed refunds for Third Party Validation (TPV) Payments, which carry the Bank Account information in the Orders request, directly via Fund Transfer. We had previously implemented a similar payout mechanism for Smart Collect Refunds. The challenge, however, was handling the edge cases of Payout Failures. The timing was also interesting, as there was a lot of buzz around Instant Refunds in the market. This was February and it was only a matter of time before we came up with a fully-fledged platform offering for this product. Learning from the experience of honorable PMs before me, I started by writing the merchant-facing API document. Later, I bounced it off several prospective clients who were interested in the first iteration of the product. The first round of feedback opened my eyes to the bigger picture. It also set us on the right path at the very beginning, as we corrected our course towards building something that our customers really wanted.
All of it followed a simple idea: how to effectively solve our Merchant’s Refunds problem with the least amount of effort from their end. Long story short, post its launch we closed around 25 enterprise customers in no time. It also opened a whole host of possibilities for our Payment Gateway in places where it was previously difficult for us to find our footing.
The 7-Star Refund Experience
The Refunds problem is far from solved. On a daily basis, we see new kinds of errors and issues across gateways and integrations. New integrations come up with their own set of tribulations.
In my opinion, in an organization that aggregates 50+ third-party integrations and caters to millions of transactions every day, that’s just the way of life. If you look into the numbers, we’ve decreased our weekly Refund failures by 99.95% since March. With Scrooge, we were also able to set up an Actionable dashboard so that the Finance Operations team could bring all Refund issues to the forefront in a structured and timely manner, thereby preventing escalations. The process now keeps evolving with the onslaught of new integrations and exciting new FinTech products.
With Instant Refunds now in the fray, we are challenging the status quo by delivering the refund amount to the end-customer within a minute. This breaks the inherent barrier towards refunds in the minds of end-consumers and creates a unique selling point for the merchant using Razorpay’s services. It should also boost the overall internet economy by increasing the trust of the end-consumer in the digital payment ecosystem, absolving her of the hassle of people and processes in this digital world.
The idea is pretty simple. If Payments take less than a minute to complete then why should refunds take 5 days?
If you are interested in solving similar problems and want to march onto a career in Fintech and Payments, we are growing the team across functions and levels. Do check out our current open positions here.
**Image courtesy: imgflip.com, Ethan Anderton – Slashfilm.com