URLs

When a user enters text in the address bar of a browser to request a resource, the text must be a well-formed Uniform Resource Identifier (URI).  If the URI identifies a web page, we refer to it as a Uniform Resource Locator (URL).

A URI has the following general form:

 scheme://[user[:password]@]host[:port][/path][?query][#fragment]

Brackets [] mark the optional parts of a URI.  This implies that the only required parts are scheme://host.

  • scheme refers to the name of the protocol being used.  When requesting web pages, this will usually be http or https.  The scheme is followed by a colon and two slashes.
  • user and password identify credentials on the server.
  • host refers to the domain name of the server on which the resource is located.
  • port refers to the port number on which the web server is listening.  Web servers usually listen on port 80 for http and port 443 for https.  If the port is omitted, the browser uses the default port for the scheme.
  • path identifies a specific resource on the server.  The format of path depends on the type of web server that is running on the server.  In our case, we’re using an Apache web server, which stores the web documents on the underlying hierarchical file system, so the path to a resource is the path to the resource from the root directory for the domain (public_html).  The path may or may not include a file name.  If a file name is not included in the path, the server searches the directory for a default file, typically index.html or index.php.
  • query specifies a string of data that can be passed to a server-side script file.
  • fragment specifies a secondary resource identifier.  If the resource is a web page, this often refers to an element in the page that the browser can scroll down to.
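
To make the parts listed above concrete, here is a minimal sketch that splits a URI into its components using Python's standard urllib.parse module.  The URLs, the user name alice, the password secret, and the query values are made up for illustration and do not refer to any real server.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL that uses every optional part of the general form.
url = "https://alice:secret@www.example.com:8080/docs/index.html?topic=urls&page=2#ports"

parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.username)  # alice
print(parts.password)  # secret
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/index.html
print(parts.query)     # topic=urls&page=2
print(parts.fragment)  # ports

# The query string can be decoded into name/value pairs.
# Each name maps to a list because a name may appear more than once.
params = parse_qs(parts.query)
print(params)          # {'topic': ['urls'], 'page': ['2']}

# A URI with only the required parts: scheme://host
minimal = urlsplit("http://www.example.com")
print(minimal.port)    # None (no explicit port was given)
print(minimal.path)    # empty string (no path was given)
```

Note that the parser only reports what is written in the URI.  When the port is missing it returns None rather than 80, so choosing the default port is left to whatever program makes the request.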

© 2017, Eric. All rights reserved.