The Mystery of the Redirecting URLs

An overview of a challenge faced by one of our summer interns, and the road to its eventual solution.

By Vedarth Sharma

I recently completed an internship at Gojek, and in the time I spent at the company, I got to work on some interesting problems. This blog is an attempt to document one of them.

The challenge: Accomplish a Central Authentication Service (CAS) login, and gain access to an internal website, programmatically.

There were two conditions imposed on us:
- It had to be part of a project written in Golang.
- We could use nothing but the Go standard library.

Doing it manually was easy enough. There was a website, let’s call it website1.com. When we went there, it asked us to press the login button, which redirected us to another website, let’s call it website2.com?website1encodedurl. In other words, website1 was redirecting us to website2 with its own URL encoded as a query parameter. When we entered the credentials, we were taken to the homepage of website1 (not the login page, mind you). So we were clearly dealing with a chain of redirecting URLs. This seemed straightforward enough at first glance, but it soon became apparent we’d have to approach it in a stepwise manner.

The Road to a Solution

The first step was to do the CAS login programmatically. The CAS login we were dealing with required a One-Time Password (OTP) as the password. This OTP is not fixed; it changes every 30 seconds. To generate it, you need a secret token, which an app such as Google Authenticator turns into the current OTP.

When I was new, I thought that the code was being sent to the browser from the server. But I realised later that the OTP was getting generated even when I was not connected to the Internet! So there must be some algorithm at work here. All I needed to do was to write the code which takes care of this. Luckily, I didn’t have to look far to find an answer.
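Google Authenticator implements the time-based OTP algorithm standardised as TOTP (RFC 6238), which is why it works offline. The following is a minimal sketch of that algorithm using only the Go standard library, assuming the common defaults (HMAC-SHA1, 30-second step, 6 digits, base32-encoded secret). The function names and the secret in `main` are hypothetical, and this is not necessarily the code we ended up using.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/base32"
	"encoding/binary"
	"fmt"
	"strings"
	"time"
)

// totpAt returns the 6-digit TOTP code for a base32 secret at the given
// Unix time, assuming HMAC-SHA1 and a 30-second time step.
func totpAt(secret string, unixSeconds int64) (string, error) {
	// Secrets are often displayed lowercase, with spaces, or with padding.
	clean := strings.ToUpper(strings.ReplaceAll(secret, " ", ""))
	clean = strings.TrimRight(clean, "=")
	key, err := base32.StdEncoding.WithPadding(base32.NoPadding).DecodeString(clean)
	if err != nil {
		return "", fmt.Errorf("decode secret: %w", err)
	}

	// The moving factor is the number of 30-second steps since the epoch.
	var counter [8]byte
	binary.BigEndian.PutUint64(counter[:], uint64(unixSeconds/30))

	mac := hmac.New(sha1.New, key)
	mac.Write(counter[:])
	sum := mac.Sum(nil)

	// Dynamic truncation (RFC 4226): read 31 bits starting at the offset
	// given by the low nibble of the last byte of the MAC.
	offset := int(sum[len(sum)-1] & 0x0f)
	code := binary.BigEndian.Uint32(sum[offset:offset+4]) & 0x7fffffff

	return fmt.Sprintf("%06d", code%1000000), nil
}

func main() {
	// Hypothetical secret, for illustration only.
	otp, err := totpAt("JBSWY3DPEHPK3PXP", time.Now().Unix())
	if err != nil {
		panic(err)
	}
	fmt.Println(otp)
}
```

Taking the timestamp as a parameter keeps the function deterministic, so it can be checked against the published RFC 6238 test vectors.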

Awesome, we now had an OTP. ✌️

All we needed now was to submit the form to do the CAS login. The form was tricky as well. Login pages generally carry a hidden field holding a token, and that token has to be sent back for the form submission to be accepted. Luckily, a simple GET request gave us the token. After we sent a POST request to submit the form, I got a successful login message.
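Since the standard library has no HTML parser, one way to pull the hidden token out of the page returned by the GET request is a small regular-expression scan. This is only a sketch: the field names (`lt`, `username`, `password`) and the sample page are assumptions, and the scan only understands double-quoted attributes.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
	"regexp"
	"strings"
)

var (
	inputTag = regexp.MustCompile(`(?i)<input\b[^>]*>`)
	// Only double-quoted attributes are recognised; enough for a sketch.
	attrPair = regexp.MustCompile(`([A-Za-z-]+)\s*=\s*"([^"]*)"`)
)

// hiddenField scans page for a hidden <input> with the given name and
// returns its value attribute.
func hiddenField(page, name string) (string, bool) {
	for _, tag := range inputTag.FindAllString(page, -1) {
		attrs := map[string]string{}
		for _, m := range attrPair.FindAllStringSubmatch(tag, -1) {
			attrs[strings.ToLower(m[1])] = html.UnescapeString(m[2])
		}
		if strings.EqualFold(attrs["type"], "hidden") && attrs["name"] == name {
			return attrs["value"], true
		}
	}
	return "", false
}

// loginForm builds the POST body. The field names are assumptions and
// would have to match the real login form.
func loginForm(user, otp, token string) url.Values {
	return url.Values{
		"username": {user},
		"password": {otp},
		"lt":       {token},
	}
}

func main() {
	// Stand-in for the body of the GET response.
	page := `<form method="post">
  <input type="hidden" name="lt" value="LT-123-abc" />
  <input type="text" name="username" />
</form>`
	token, ok := hiddenField(page, "lt")
	if !ok {
		panic("token field not found")
	}
	fmt.Println(loginForm("alice", "123456", token).Encode())
}
```

The encoded values can then be sent with `http.Client.PostForm` or as the body of a hand-built POST request.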

Mission accomplished? Not quite.

Even though our POST request managed to log us in, we could not gain access to website1.com. We tried to use the cookies obtained from the successful login, but they were of no use.

It was time to get our hands dirty. We fired up the developer tools and checked all the requests. Two things stood out.

First, there was a Location header which carried the ticket as a query parameter. Second, there were two types of cookies: website2 required a completely different cookie to maintain sessions (and, in our case, to initialise the session). That was why we were denied access by website2 even after the CAS login.
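Reading the ticket out of a Location header takes only a few lines with `net/url`. A small sketch, assuming the query parameter is literally named `ticket`; the example URL and ticket value are made up.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// ticketFromLocation parses the value of a Location header and returns
// its "ticket" query parameter (the parameter name is an assumption).
func ticketFromLocation(location string) (string, error) {
	u, err := url.Parse(location)
	if err != nil {
		return "", fmt.Errorf("parse Location: %w", err)
	}
	ticket := u.Query().Get("ticket")
	if ticket == "" {
		return "", errors.New("no ticket parameter in Location header")
	}
	return ticket, nil
}

func main() {
	// Hypothetical header value, for illustration only.
	ticket, err := ticketFromLocation("https://website1.com/?ticket=ST-42-example")
	if err != nil {
		panic(err)
	}
	fmt.Println(ticket)
}
```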

We had cookie1, but we needed cookie2 as well.

When you hit a roadblock, the best strategy is to get back to the basics.

We tried recreating the request-response cycle in Postman. Much to our surprise, it worked in Postman. What was Postman doing to the requests that our good old browser was not? It almost felt like the required cookie was getting generated magically.

Digging Deeper

It was time to analyse website1 more closely. It was doing something we were clearly missing out on. Remember that when website1 redirected to website2, it also passed along a query parameter? We changed that query parameter to point at our local server, so that the request got redirected there.

Jackpot. We got the ticket.
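The experiment can be sketched roughly as follows: rewrite the query parameter on the login URL so that it points at a local server, and have that server record whatever ticket arrives. The parameter name `service`, the paths, and the port are assumptions, and `demo` fakes the incoming redirect against a throwaway `httptest` server instead of a real CAS.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// withService returns loginURL with its "service" query parameter replaced
// by target. The parameter name is an assumption.
func withService(loginURL, target string) (string, error) {
	u, err := url.Parse(loginURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("service", target)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// ticketCatcher is the local endpoint: it records the ticket it receives.
func ticketCatcher(tickets chan<- string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tickets <- r.URL.Query().Get("ticket")
		fmt.Fprintln(w, "ticket received")
	})
}

// demo stands in for the login server redirecting back to our endpoint.
func demo() (string, error) {
	tickets := make(chan string, 1)
	srv := httptest.NewServer(ticketCatcher(tickets))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/callback?ticket=ST-demo")
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-tickets, nil
}

func main() {
	login, err := withService("https://website2.com/login", "http://localhost:8080/callback")
	if err != nil {
		panic(err)
	}
	fmt.Println(login)

	ticket, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(ticket)
}
```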

So the ticket was being sent to the URL given in the query parameter. Problem 1 was solved; it was a hacky solution, but it worked. That left the second cookie issue, and we were pretty stuck on that one.

Back to the drawing board then. We scrapped the hacky solution and tried CAS login using curl. The curl requests failed to log us in.

Here was the issue: we were going through a lot of redirects. Normally, browsers have a built-in cookie store to help maintain sessions. When we hit website2.com?website1encodedurl, we were supposed to receive cookie2 and send it along before getting redirected to the next URL. But we were not using a cookie store in the curl requests, and in the Golang code the client followed the redirects automatically and took us straight to the last URL.

So here was the solution we finally implemented: we set CheckRedirect on the client so that we could stop at each redirect and move to the next URL ourselves, instead of jumping directly to the last request.

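// Stop at every redirect instead of following it automatically.
client := &http.Client{
   CheckRedirect: func(req *http.Request, via []*http.Request) error {
      // ErrUseLastResponse makes the client return the 3xx response as-is.
      return http.ErrUseLastResponse
   },
}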

After this, everything else was pretty straightforward. We just had to build each request from the data gained from the previous response. We stopped automatic redirection by using the above client, and supplied the necessary cookies with each request. This helped us maintain the session even after getting redirected several times. Voilà, login successful.
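Put together, the hop-by-hop approach might look like the sketch below. It stops at every redirect, collects the cookies from each response, and attaches them to the next request. The `walk` helper, the test server, its paths, and the cookie values are all hypothetical, and the real flow also involved the login POST and the ticket. Note that this sketch sends every cookie to every hop, which is acceptable for a known chain of internal hosts but ignores cookie domain scoping.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// walk follows up to maxHops redirects by hand, carrying along every
// cookie seen so far, and returns the first non-redirect response.
func walk(client *http.Client, start string, maxHops int) (*http.Response, error) {
	cookies := map[string]*http.Cookie{}
	next := start
	for hop := 0; hop <= maxHops; hop++ {
		req, err := http.NewRequest(http.MethodGet, next, nil)
		if err != nil {
			return nil, err
		}
		for _, c := range cookies {
			req.AddCookie(c)
		}
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		for _, c := range resp.Cookies() {
			cookies[c.Name] = c // remember cookies set on this hop
		}
		switch resp.StatusCode {
		case http.StatusMovedPermanently, http.StatusFound, http.StatusSeeOther,
			http.StatusTemporaryRedirect, http.StatusPermanentRedirect:
			loc, err := resp.Location() // resolved against the request URL
			resp.Body.Close()
			if err != nil {
				return nil, err
			}
			next = loc.String()
		default:
			return resp, nil
		}
	}
	return nil, fmt.Errorf("gave up after %d redirects", maxHops)
}

// demo walks a three-step chain in which each step checks the cookies
// handed out by the previous ones.
func demo() (string, error) {
	need := func(r *http.Request, names ...string) bool {
		for _, n := range names {
			if _, err := r.Cookie(n); err != nil {
				return false
			}
		}
		return true
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "cookie2", Value: "b"})
		http.Redirect(w, r, "/ticket", http.StatusFound)
	})
	mux.HandleFunc("/ticket", func(w http.ResponseWriter, r *http.Request) {
		if !need(r, "cookie2") {
			http.Error(w, "missing cookie2", http.StatusUnauthorized)
			return
		}
		http.SetCookie(w, &http.Cookie{Name: "cookie1", Value: "a"})
		http.Redirect(w, r, "/home", http.StatusFound)
	})
	mux.HandleFunc("/home", func(w http.ResponseWriter, r *http.Request) {
		if !need(r, "cookie1", "cookie2") {
			http.Error(w, "missing cookies", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "welcome")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := walk(client, srv.URL+"/login", 5)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```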
