Creating System.Uri in PHP

For the past ten years, I've been building a library of reusable code in PHP. Initially, it was a set of functions that I could use on multiple projects, simply by copying the files into a new directory.

After I learnt object orientation in 2012, I started rebuilding the library with classes instead of stand-alone functions.

Through extensive programming of C# in .NET, I had discovered the beauty of namespaces, so in 2015 I decided to rebuild the entire library using namespaces.

By 2016, I'd fallen in love with interfaces, and my library took a great leap forward.

Fast forward to 2020, and I'm introducing .NET-style classes and programming techniques such as generics to my library.

Recently, I decided to create the PHP equivalent of the System.Uri class from .NET.

See: docs.microsoft.com

This is still a work in progress, and no doubt I will continue to update this article with more powerful and reliable code.


<?php

namespace System
{
use ACA\Text\URI\Host;
use ACA\Text\URI\QueryString;
use ACA\Text\URI\NamedID;
use ACA\International\Locale;
use ACA\Collections\IReadOnlyStringDictionary;
use ACA\Arrays\INumericArray;
use ACA\Time\DateTime;

/**
* Provides an object representation of a uniform resource identifier (URI) and easy access to the parts of the URI.
* @author Antony Charles Allen
* @since 24th July 2020
* @link docs.microsoft.com
*/
final class Uri implements INumericArray
{
// RFC 3986: a letter followed by any mix of letters, digits, "+", "." and "-"
private const SCHEME_REGEX = '[a-z][a-z0-9+.-]*';
private const MAX_LENGTH = 65519;

public const LEFT_PART_AUTHORITY = 1;
public const LEFT_PART_PATH = 2;
public const LEFT_PART_QUERY = 3;

public const UriSchemeHttp = 'http';
public const UriSchemeHttps = 'https';


private const DEFAULT_PORTS = array(
80 => self::UriSchemeHttp,
443 => self::UriSchemeHttps
);


#region Private fields
private string $scheme = '';
private ?Host $host = null;
private int $port = 0;
private ?Locale $locale = null;
private array $segments = array();
private ?QueryString $query = null;
private string $fragment = '';
#endregion


#region Microsoft .NET Core Properties
/**
* Gets whether the Uri instance is absolute.
*/
public function IsAbsoluteUri() : bool
{
return !is_null($this->Scheme()) && !is_null($this->host);
}

/**
* Gets the scheme name for this URI.
*/
public function Scheme() : ?string
{
if (strlen($this->scheme) > 0) return $this->scheme;

if ($this->port > 0 && array_key_exists($this->port, self::DEFAULT_PORTS))
{
return self::DEFAULT_PORTS[$this->port];
}

return null;
}

/**
* Gets the Domain Name System (DNS) host name or IP address and the port number for a server.
* @link docs.microsoft.com
*/
public function Authority() : string
{
if (!$this->IsAbsoluteUri()) throw new \Exception("This instance represents a relative URI, and this property is valid only for absolute URIs.");

$authority = $this->host->__toString();

if ($this->port > 0 && !$this->IsDefaultPort()) $authority .= ':'.$this->port;

return $authority;
}

/**
* Gets the port number of this URI.
* The port number defines the protocol port used for contacting the server referenced in the URI.
* If a port is not specified as part of the URI, the Port property returns the default value for the protocol.
* If there is no default port number, this property returns -1.
*/
public function Port() : int
{
if (!$this->IsAbsoluteUri()) throw new \Exception("This instance represents a relative URI, and this property is valid only for absolute URIs.");

if ($this->port > 0) return $this->port;

$scheme = $this->Scheme();

if (!is_null($scheme) && in_array($scheme, self::DEFAULT_PORTS))
{
return array_search($scheme, self::DEFAULT_PORTS);
}

return -1;
}

/**
* Gets whether the port value of the URI is the default for this scheme.
* @return bool A Boolean value that is true if the value in the Port property is the default port for this scheme; otherwise, false.
* @link docs.microsoft.com
*/
public function IsDefaultPort() : bool
{
return ($this->port > 0 &&
array_key_exists($this->port, self::DEFAULT_PORTS) &&
self::DEFAULT_PORTS[$this->port] === $this->Scheme());
}

/**
* Gets the absolute path of the URI.
*/
public function AbsolutePath() : string
{
if (!$this->IsAbsoluteUri()) throw new \Exception("This instance represents a relative URI, and this property is valid only for absolute URIs.");

$str = '';

if (!is_null($this->locale)) $str .= '/'.$this->locale;

$str .= $this->Path();

return $str;
}

/**
* Gets an array containing the path segments that make up the specified URI.
* NOTE: This excludes the locale, if present in the URI.
* @return string[]
*/
public function Segments() : array
{
return $this->segments;
}

/**
* Gets any query information included in the specified URI.
*/
public function Query() : string
{
if (!is_null($this->query)) return $this->query->__toString();

return '';
}
#endregion


#region Other properties
/**
* The path excluding the locale
*/
public function Path() : string
{
$str = '/';

$i = count($this->segments);

foreach ($this->segments as $seg)
{
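// rawurlencode() encodes a space as %20; urlencode() would emit "+", which is query-string syntax
$str .= rawurlencode($seg);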

$i--;

if ($i > 0) $str .= '/';
}

return $str;
}

/**
* Whether this URI is a reference to a filename with an extension
*/
public function IsFilename() : bool
{
$top = $this->TopPathSegment();

return preg_match('/^.+\.[a-z0-9]{2,}$/i', $top) === 1;
}
#endregion


/**
* Initializes a new instance of the Uri class with the specified URI.
* @param string $uriString A string that identifies the resource to be represented by the Uri instance.
*/
public function __construct(string $uriString)
{
if (strlen($uriString) === 0) throw new \InvalidArgumentException(self::class.' constructor cannot be blank string!');

if (strlen($uriString) > self::MAX_LENGTH) throw new \InvalidArgumentException("uriString [$uriString] exceeds ".self::MAX_LENGTH.' characters');

$this->Process($uriString);
}

private function Process(string $uriString) : void
{
if (strlen($uriString) === 0) throw new \InvalidArgumentException(__METHOD__." parameter cannot be blank string!");

$uriString = urldecode($uriString);

#region Scheme
if (preg_match('/^('.self::SCHEME_REGEX.'):\/\/'.Host::HOST_FIRST_CHAR_REGEX.'/i', $uriString, $matches))
{
$scheme = strtolower($matches[1]);

if (strlen($scheme) > 1023) throw new \InvalidArgumentException("The scheme [$scheme] exceeds 1023 characters.");

if (static::CheckSchemeName($scheme))
{
$this->scheme = $scheme;
}
else throw new \InvalidArgumentException("The scheme [$scheme] in [$uriString] is not correctly formed.");

$uriString = substr($uriString, strlen($this->scheme)+1);
}
#endregion

#region Authority
if (preg_match('/^\/\/('.Host::HOST_FIRST_CHAR_REGEX.'[a-z0-9-\.]*[a-z0-9])((:([0-9]{1,5}))|\/|$)/i', $uriString, $matches))
{
$host = strtolower($matches[1]);

$this->host = new Host($host);

if (array_key_exists(4, $matches))
{
$this->port = (int)$matches[4];
}

$len = strlen($matches[0]);

if ($matches[2] === '/') $len--;

$uriString = substr($uriString, $len);
}
#endregion

if (strlen($uriString) === 0) return;

#region Check if absolute URI
if (substr($uriString, 0, 1) === '/') // absolute URI
{
$this->segments = array();

$uriString = substr($uriString, 1);
}
#endregion

if (strlen($uriString) === 0) return;

#region Path Segments
if (preg_match('/^([^\?#]+?)(\?|#|$)/', $uriString, $matches))
{
$path = $matches[1];

$len = strlen($path);

do {
$path = str_replace('//', '/', $path, $count);
}
while ($count > 0);

$segments = explode('/', $path);

for ($i=0; $i<count($segments); $i++)
{
$segment = $segments[$i];

if ($segment === '.') continue;

if ($segment === '..')
{
// array_pop() keeps the numeric keys contiguous; unset() would leave a gap on the next append
if (count($this->segments) > 0) array_pop($this->segments);

continue;
}

if ($i === 0 && Locale::IsCompleteLocaleString($segment))
{
$this->locale = new Locale($segment);

continue;
}

$this->segments[] = $segment;
}

$uriString = substr($uriString, $len);
}
#endregion

if (strlen($uriString) === 0) return;

#region Query String
if (preg_match('/^(\?.*?)(#|$)/', $uriString, $matches))
{
$query = $matches[1];

$len = strlen($query);

$this->query = new QueryString($query);

$uriString = substr($uriString, $len);
}
#endregion

if (strlen($uriString) === 0) return;

if (preg_match('/^#(.*)$/', $uriString, $matches))
{
$this->fragment = $matches[1];
}
else throw new \InvalidArgumentException("Could not process the remaining uriString [$uriString]");
}


#region Static Methods
/**
* Determines whether the specified scheme name is valid.
* @param string $schemeName The scheme name to validate.
* @link docs.microsoft.com
*/
public static function CheckSchemeName(string $schemeName) : bool
{
$schemeName = strtolower($schemeName);

return preg_match('/^'.self::SCHEME_REGEX.'$/i', $schemeName) === 1;
}
#endregion


#region Public Methods from .NET
/**
* Gets the specified portion of a Uri instance.
* @param int $part One of the LEFT_PART_ constants on this class
*/
public function GetLeftPart(int $part) : string
{
if ($part <= 0) throw new \InvalidArgumentException("The part [$part] must be greater than zero!");
if ($part > self::LEFT_PART_QUERY) throw new \InvalidArgumentException("The part [$part] must not be greater than ".self::LEFT_PART_QUERY);

if (!$this->IsAbsoluteUri()) throw new \Exception("The current Uri instance is not an absolute instance.");

$str = '';

switch ($part)
{
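case self::LEFT_PART_QUERY:
if ($this->HasQuery() && !$this->QueryString()->IsEmpty())
{
$str = $this->Query();
}
// intentional fall-through: the path is prepended to the query

case self::LEFT_PART_PATH:
$str = $this->AbsolutePath() . $str;
// intentional fall-through: the scheme and authority are prepended to the path

case self::LEFT_PART_AUTHORITY:
$str = $this->Scheme().'://'.$this->Authority() . $str;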
}

return $str;
}
#endregion


public static function Current() : self
{
$str = '';

if (array_key_exists('REQUEST_SCHEME', $_SERVER)) $str .= $_SERVER['REQUEST_SCHEME'].':';

$str .= '//';

if (array_key_exists('HTTP_HOST', $_SERVER) || array_key_exists('SERVER_NAME', $_SERVER))
{
if (array_key_exists('HTTP_HOST', $_SERVER) && array_key_exists('SERVER_NAME', $_SERVER))
{
if (strpos($_SERVER['HTTP_HOST'], $_SERVER['SERVER_NAME']) === false)
{
throw new \Exception($_SERVER['SERVER_NAME'] . ' is not part of ' . $_SERVER['HTTP_HOST']);
}
}

// HTTP_HOST may already carry a port; strip it because SERVER_PORT is appended below
if (array_key_exists('HTTP_HOST', $_SERVER))
{
$str .= preg_replace('/:[\d]+$/', '', $_SERVER['HTTP_HOST']);
}
else $str .= $_SERVER['SERVER_NAME'];
}

if (array_key_exists('SERVER_PORT', $_SERVER)) $str .= ':'.$_SERVER['SERVER_PORT'];

if (array_key_exists('REQUEST_URI', $_SERVER)) $str .= $_SERVER['REQUEST_URI'];

return new self($str);
}


private static function PathSegmentIsNumeric(string $segment) : bool
{
if (is_numeric($segment)) return true;

if (preg_match('/^[0-9]+-/', $segment)) return true;

return false;
}


/**
* Get a path segment, regardless of whether it is numeric or not
* @param int $index The zero-based index of the path segment
*/
public function GetPathSegment(int $index) : string
{
if (array_key_exists($index, $this->segments)) return $this->segments[$index];

return '';
}


public function TopPathSegment() : string
{
// array_key_last() still finds the last segment if the keys are not contiguous
$last = array_key_last($this->segments);

return is_null($last) ? '' : $this->segments[$last];
}


/**
* Number of breadcrumbs in the URI
* @return int
*/
public function NumBreadcrumbs() : int
{
return count($this->Breadcrumbs());
}

/**
* Get an array of page-style breadcrumbs in the path
* @return string[]
*/
public function Breadcrumbs() : array
{
$array = array();

foreach ($this->segments as $segment)
{
if (!self::PathSegmentIsNumeric($segment)
&& !preg_match('/^[a-f0-9]{32}$/i', $segment)
&& !preg_match('/^[a-f0-9]{40}$/i', $segment))
$array[] = $segment;
}

return $array;
}

/**
* Get the string breadcrumb by index
* Returns an empty string if it doesn't exist
* @param int $index The zero-based index
* @return string
*/
public function GetBreadcrumb(int $index) : string
{
// Breadcrumbs() returns a zero-based list, so the index can be used directly
return $this->Breadcrumbs()[$index] ?? '';
}


/**
* Get the last non-numeric string in the path
* @return string
*/
public function TopBreadcrumb() : string
{
$breadcrumbs = $this->Breadcrumbs();

if (count($breadcrumbs) > 0)
{
return $breadcrumbs[count($breadcrumbs)-1];
}

return '';
}


/**
* Number of path segments in the Uri
* @return int
*/
public function NumPathSegments() : int
{
return count($this->segments);
}


public function RemovePathSegment(int $index) : bool
{
if (!array_key_exists($index, $this->segments)) return false;

unset($this->segments[$index]);

$this->segments = array_values($this->segments);

return true;
}


/**
* Get the ID from the path by index
* @param int $index The index of the numeric paths
* @return int
*/
public function GetID(int $index) : int
{
$i = 0;

foreach ($this->segments as $path)
{
if (is_numeric($path))
{
if ($index === $i) return (int)$path; // numeric string, so cast to match the int return type

$i++;
}
else if (NamedID::IsNamedID($path, $id))
{
if ($index === $i)
{
return $id;
}

$i++;
}
}

return 0;
}

public function GetNamedID(int $index) : ?NamedID
{
$i = 0;

foreach ($this->segments as $path)
{
if (is_numeric($path))
{
if ($index === $i) return null;

$i++;
}
else if (NamedID::IsNamedID($path, $id))
{
if ($index === $i)
{
return new NamedID($path);
}

$i++;
}
}

return null;
}


/**
* Get the MD5 or SHA1 hash from the path
* @param int $index
* @return string
*/
public function GetHash(int $index) : string
{
$i = 0;

foreach ($this->Segments() as $path)
{
if (preg_match('/^[0-9a-f]{32}$/i', $path) || preg_match('/^[0-9a-f]{40}$/i', $path))
{
if ($index === $i) return $path;

$i++;
}
}

return '';
}


/**
* Get the date by index from the path segments, otherwise null if no date exists
* @param int $index The zero-based index
*/
public function GetDate(int $index=0) : ?DateTime
{
$i = 0;

foreach ($this->segments as $segment)
{
if (preg_match('/^[0-9]{4}-[01][0-9]-[0-3][0-9]$/', $segment))
{
if ($index === $i && DateTime::IsValidDate($segment)) return new DateTime($segment);

$i++;
}
}

return null;
}


public function RemoveHash(int $index) : void
{
$i = 0;

foreach ($this->segments as $k => $segment)
{
if (preg_match('/^[0-9a-f]{32}$/i', $segment) || preg_match('/^[0-9a-f]{40}$/i', $segment))
{
if ($index === $i)
{
unset($this->segments[$k]);
$this->segments = array_values($this->segments);
return;
}

$i++;
}
}
}


public function HasQuery() : bool
{
return !is_null($this->query);
}

public function QueryString() : ?QueryString
{
return $this->query;
}


/**
* Remove the whole query string from the URI
*/
public function RemoveQueryString() : void
{
$this->query = null;
}


/**
* Remove a key from the query string
* @param string $key The key to remove
*/
public function RemoveQueryStringKey(string $key) : bool
{
if (!$this->HasQuery()) return false;

return $this->QueryString()->RemoveKey($key);
}


/**
* Add a segment to the end of the path
* @param string $segment The segment to append
*/
public function AddSegment(string $segment) : void
{
$this->segments[] = $segment;
}


public function Domain() : string
{
if (!is_null($this->host)) return $this->host->Domain();

else return '';
}


public function ContainsBreadcrumbs(self $URI) : bool
{
$breadcrumbs = $URI->Breadcrumbs();

$count = count($breadcrumbs);

if ($count === 0)
{
if ($this->NumBreadcrumbs() === 0) return true;

else return false;
}

$mycrumbs = $this->Breadcrumbs();

if (count($mycrumbs) < $count) return false;

for ($i=0; $i<$count; $i++)
{
$other = $breadcrumbs[$i];

$mine = $mycrumbs[$i];

if ($other !== $mine) return false;
}

return true;
}


public function Locale() : ?Locale
{
return $this->locale;
}


public function SetLocale(Locale $locale) : void
{
$this->locale = $locale;
}


public function AddQueryString(string $key, ?string $value = null) : void
{
if (!$this->HasQuery()) $this->query = new QueryString('?');

if (is_null($value))
$this->query->Add($key);
else
$this->query->Add($key, $value);
}


/**
* Merge key-values from a query string into the Uri
* @param IReadOnlyStringDictionary $dictionary
*/
public function MergeQueryString(IReadOnlyStringDictionary $dictionary) : void
{
$keys = $dictionary->Keys();

foreach ($keys as $key)
{
$value = '';

if ($dictionary->TryGetValue($key, $value))
{
$this->AddQueryString($key, $value);
}
}
}


public function SetPath(string $path) : void
{
if (strlen($path) === 0) throw new \InvalidArgumentException("Path cannot be empty!");

if (substr($path, 0, 1) !== '/') throw new \InvalidArgumentException("Path [$path] must begin with a forward slash");

if (substr($path, 0, 2) === '//') throw new \InvalidArgumentException("Path [$path] cannot be a host");

if (strpos($path, '?') !== false) throw new \InvalidArgumentException("Path [$path] must not contain a query string");

if (strpos($path, '#') !== false) throw new \InvalidArgumentException("Path [$path] must not contain a fragment");

$this->Process($path);
}


/**
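* Set a segment in the path
* @param int $index The zero-based index; may equal the current segment count to append
* @param string $segment The segment to set
*/
public function SetPathSegment(int $index, string $segment) : void
{
// Reject indexes that would leave gaps in the zero-based segment list
if ($index < 0 || $index > count($this->segments)) throw new \OutOfRangeException("The index [$index] is out of range.");

$this->segments[$index] = $segment;
}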



public function SetFragment(string $fragment) : void
{
$this->Process('#'.$fragment);
}


public function __toString() : string
{
$str = '';

if ($this->IsAbsoluteUri())
{
$str = $this->GetLeftPart(self::LEFT_PART_QUERY);
}
else
{
if (!is_null($this->host))
{
$str .= '//'.$this->host;

if ($this->port > 0) $str .= ':'.$this->port;
}

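// Keep the locale prefix, matching AbsolutePath()
if (!is_null($this->locale)) $str .= '/'.$this->locale;

if (count($this->segments) > 0)
{
foreach ($this->segments as $segment)
{
$str .= '/'.rawurlencode($segment);
}
}
else if (!is_null($this->host)) $str .= '/';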

if ($this->HasQuery() && !$this->query->IsEmpty())
{
$str .= $this->Query();
}
}

// strlen() check so that a fragment of "0" is not treated as empty
if (strlen($this->fragment) > 0) $str .= '#'.$this->fragment;

return $str;
}
}
}
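
The "." and ".." handling inside Process() is easier to follow on its own. Below is a minimal, self-contained sketch of the same idea. The function name normalise_segments() is hypothetical and not part of the class above. Unlike Process(), it also discards empty segments, so a trailing slash is not preserved.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the Uri class): split a path into
 * segments, skipping empty pieces and resolving "." and "..".
 *
 * @return string[]
 */
function normalise_segments(string $path): array
{
    $segments = [];

    foreach (explode('/', $path) as $segment) {
        // Empty pieces come from leading, trailing or repeated slashes
        if ($segment === '' || $segment === '.') {
            continue;
        }

        if ($segment === '..') {
            // Step up one level; array_pop() is a no-op on an empty array
            array_pop($segments);
            continue;
        }

        $segments[] = $segment;
    }

    return $segments;
}

echo implode('/', normalise_segments('/blog//2020/./drafts/../uri-class')), "\n";
// prints "blog/2020/uri-class"
```

Because array_pop() is used rather than unset(), the resulting array always has contiguous zero-based keys, which matters for any code that reads the last segment by index.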


Keen .NET programmers may notice that this class is not identical to .NET's System.Uri: it has some additional methods that let you modify the Uri object. In .NET, Uri is immutable, so changes go through a separate UriBuilder class. You set the properties on the builder and then ask it for a new Uri object. I have chosen not to do this in PHP at the moment, but perhaps I'll come to it later.
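
If I do go down the immutable route, one option in PHP that avoids a separate builder is the "wither" pattern, where each modifying method returns a changed copy. The sketch below only illustrates that idea. The ImmutablePath class and its withSegment() method are hypothetical stand-ins, not part of my library and not how .NET does it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical, stripped-down stand-in for the Uri class, used only to
 * illustrate returning modified copies instead of mutating in place.
 */
final class ImmutablePath
{
    /** @var string[] */
    private array $segments;

    public function __construct(string ...$segments)
    {
        $this->segments = $segments;
    }

    /** Returns a copy with the extra segment; the original is left untouched. */
    public function withSegment(string $segment): self
    {
        $copy = clone $this;          // arrays are copied by value in PHP
        $copy->segments[] = $segment;

        return $copy;
    }

    public function __toString(): string
    {
        return '/' . implode('/', array_map('rawurlencode', $this->segments));
    }
}

$base = new ImmutablePath('blog');
$post = $base->withSegment('uri class');

echo $base, "\n"; // prints "/blog"
echo $post, "\n"; // prints "/blog/uri%20class"
```

The trade-off is an extra object per change, in exchange for instances that are safe to share between different parts of an application.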

Please feel free to copy this code and use it for your own projects. I'll be posting the Host, QueryString, NamedID and Locale classes in separate posts a bit later.