Outside of the most commonly known and supported URI schemes (http:// and https://), most modern web browsers support a full roster of alternate and legacy URI schemes.
Here are some alternate URI schemes you may be familiar with already:
And a few you probably haven’t put much thought into, or didn’t know existed:
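A URI is a combination of a few separate parts. The schemes themselves are often designed by third parties and later adopted into major web browsers. A full URI has the following structure:
and://my:stuff@test.com:8080/location/sublocation?key=things&things=keys#section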
Let’s break this down into separate pieces before discussing where protocols can be a security issue.
and:// - This is where the scheme is declared, and the browser first checks this in order to determine what to do with the following strings.
my:stuff - This is the user information: a username and password separated by a colon. You don’t see this in very many URIs, but in older systems a scheme might declare the username and password for an email account this way.
@test.com - This is the host, which you see most frequently in http:// and https:// URIs. It is used to identify a relevant location.
:8080 - Here we declare the port. Every connection to a web server uses a port, but you don’t see it often in http:// or https:// URIs because browsers silently default to port 80 for http:// and port 443 for https:// if no other port is declared in the URI.
location/sublocation - This is the path, often used to express a hierarchy of data. It refers to a file location on the host.
?key=things&things=keys - The query, used to pass conditional data in the URI as key/value pairs, much like variables.
#section - The URI fragment, also used to pass through conditional data. Typically used to preserve state on the client.
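The breakdown above can be checked with a URL parser. What follows is a minimal sketch in Python, not a definitive reference. It assumes the pieces are joined in the order listed; the full URI string, including the slash before location, is an assumption. urlsplit and parse_qs come from the standard library.

```python
from urllib.parse import urlsplit, parse_qs

# Example URI assembled from the pieces discussed above.
# The exact layout (including the "/" before "location") is an assumption.
uri = "and://my:stuff@test.com:8080/location/sublocation?key=things&things=keys#section"

parts = urlsplit(uri)

# Collect each component under the name used in the breakdown.
components = {
    "scheme": parts.scheme,           # the part before "://"
    "username": parts.username,       # user information, before the colon
    "password": parts.password,       # user information, after the colon
    "host": parts.hostname,           # the part after "@"
    "port": parts.port,               # parsed as an integer
    "path": parts.path,               # file location on the host
    "query": parse_qs(parts.query),   # key/value pairs after "?"
    "fragment": parts.fragment,       # the part after "#"
}

for name, value in components.items():
    print(f"{name:9} {value!r}")
```

Note that Python reports the path with its leading slash and returns the port as a number rather than a string. A browser's own parser may treat unfamiliar schemes differently.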
Why do we need to worry about URI Schemes?
As you have probably guessed, the URI scheme is so flexible that many companies have taken advantage of it to produce their own schemes, which have then been adopted into web browsers. Many of these schemes are still supported for legacy reasons, but have not been updated in years or even decades.
You’ve probably seen this in code before, but not recognized it. The most common use case is:
This trick has been used to create links that do not adhere to normal link behavior in many browsers. What’s really happening is that the link executes the statement void(0);, which returns undefined.
This use case is generally harmless.
Consider the following scenario:
Jon is a hacker who is targeting a popular eCommerce site called BuyStuffNow.
BuyStuffNow is like Amazon, except it allows you to buy from one product page without ever leaving (lots of apps do this). The order process goes like this: click a product -> type in credit card -> click purchase to complete transaction.
Below the product is a comments section, where guests can comment on the item and give reviews / warnings to each other.
Jon devises a plan to craft a malicious link to put in his BuyStuffNow comment. He writes a legitimate-looking review, with a legitimate-looking link:
“Hi friends. IF YOU ARE GOING TO BUY THIS PLEASE READ THIS REVIEW FIRST. I used the product for three hours, then this happened.”
The link will be constructed as such:
To an unsuspecting user, this is simply a link to a legitimate review on another website. The user may enter their credit card information and then decide to read the review, in which case their credit card information is sent to a malicious web server and stored for malicious use.
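One way a site like BuyStuffNow could reduce this risk is to check the scheme of every link submitted in a comment before rendering it. The sketch below is a hedged illustration in Python, not a complete defense. It assumes the site only wants ordinary web links in comments. The names is_safe_link and ALLOWED_SCHEMES are hypothetical, chosen for this example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only ordinary web links are accepted in comments.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_link(href: str) -> bool:
    """Return True only if href is an absolute link with an allowlisted scheme."""
    # Browsers ignore surrounding whitespace and drop tabs and newlines
    # inside a URL, so remove them before looking at the scheme.
    cleaned = "".join(ch for ch in href.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    # Anything not explicitly allowed (javascript:, data:, or no scheme
    # at all) is rejected.
    return scheme in ALLOWED_SCHEMES


if __name__ == "__main__":
    for link in ["https://example.com/review", "javascript:void(0);", "/reviews/123"]:
        print(link, is_safe_link(link))
```

An allowlist is used here rather than a blocklist because the set of schemes a browser understands is large and changes over time. Note that this sketch also rejects relative links, which a real site may want to handle separately.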