Semantic web and Schema.org

Wednesday 22nd July 2020, 11:55:49 Thursday 30th July 2020, 08:54:09

HTML5 Structured Data Schema

When it comes to web and SEO, you think you've heard it all. There are various ways of representing information in HTML such that it can be presented in the browser and understood by search engines. HTML5 introduced new elements such as article, main, time and many more. These help you, as a web developer, to structure your content and make it easier for search engines to identify what you are trying to convey.

And so began the semantic web.
See: www.w3schools.com for a list of the semantic elements.

The next level is structured data. Using special attributes in HTML (or even JSON, in the form of JSON-LD), you can make it extremely clear what you are describing, be it a person, a business, an article, an event, a product, or a vast range of other types.

In object-oriented terms, you can think of every type of object as a class, all of which inherit from a concrete base class called "Thing".
See: schema.org for the properties defined in a Thing.

Even intangible (non-physical) types, such as a Brand, Language or Occupation, can be described with Schema, and they all inherit from Thing. In this way, it is relatively easy (although time-consuming) to define classes in your code which inherit from each subtype as necessary.

For example, a BlogPosting class inherits from SocialMediaPosting, which inherits from Article, which inherits from CreativeWork, which inherits from Thing.
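As a rough sketch of how that chain could look in code, the classes below mirror the hierarchy in Python. Only a handful of properties are shown, the schema_type and lineage helpers are hypothetical names introduced for illustration, and the field values are sample data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thing:
    # A few of the properties schema.org defines on Thing.
    name: Optional[str] = None
    description: Optional[str] = None
    url: Optional[str] = None

    @classmethod
    def schema_type(cls) -> str:
        # Hypothetical helper: the value an itemtype attribute would carry.
        return f"https://schema.org/{cls.__name__}"


@dataclass
class CreativeWork(Thing):
    headline: Optional[str] = None
    datePublished: Optional[str] = None  # ISO 8601 date string


@dataclass
class Article(CreativeWork):
    articleBody: Optional[str] = None


@dataclass
class SocialMediaPosting(Article):
    pass


@dataclass
class BlogPosting(SocialMediaPosting):
    pass


def lineage(cls: type) -> List[str]:
    # Walk the method resolution order, dropping Python's own `object`.
    return [c.__name__ for c in cls.__mro__ if c is not object]


# Sample values, for illustration only.
post = BlogPosting(name="Semantic web and Schema.org", datePublished="2020-07-22")
print(" > ".join(reversed(lineage(BlogPosting))))
print(post.schema_type())
```

Because every class ultimately derives from Thing, a BlogPosting instance accepts name, description and url without redeclaring them.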
Indeed, if you View Source on this very page, you should find the attributes related to this schema.
Also see: schema.org for a complete list of valid properties.

Some properties are other objects, some properties are strings, numbers, dates, etc., and some are arrays of any of those. It all depends on which property. Schema.org helps you to decide how best to structure it.
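To make the different property shapes concrete, here is a minimal sketch that builds a BlogPosting as a plain Python dictionary and serialises it as JSON-LD, the JSON flavour of structured data. The author name, word count and keywords are invented sample values, and to_json_ld_script is a hypothetical helper rather than part of any library.

```python
import json

# Illustrative values only; the author, word count and keywords are made up.
blog_posting = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Semantic web and Schema.org",  # string
    "datePublished": "2020-07-22",               # date (ISO 8601)
    "wordCount": 850,                            # number
    "author": {                                  # another object, with its own type
        "@type": "Person",
        "name": "Jane Doe",
    },
    "keywords": ["HTML5", "Structured Data", "Schema"],  # array of strings
}


def to_json_ld_script(data: dict) -> str:
    # Serialise the object, escaping "</" so a string value cannot
    # close the surrounding script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_json_ld_script(blog_posting))
```

The resulting script tag can sit anywhere in the page, which keeps the structured data separate from the visible markup.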

The main advantage of doing this, other than clarity of code, is the ability to encourage 'snippets' to appear in search engines such as Google. I have high hopes that in future, more search engines will read Schema properties from HTML code and there will be a semantic web revolution. After all, if you are looking for a specific type of 'Thing' on the web, why not introduce the possibility to filter search results by Schema type? Then introduce the possibility to filter results by the value of specific properties.

Free text search is great, and Google has spent a long time perfecting its analysis of search intent, so it can guess what you're really after. But the semantic web provides clarity of intent, without compromising the design elements of your site.

The nitty gritty

So how does it work? There are three HTML attributes that you need to be aware of:

itemtype
itemscope
itemprop

The itemtype attribute declares which type of item an element contains. Typically, the element is a containing div. The itemtype attribute's value should be prefixed with "https://schema.org/", for example "https://schema.org/Article" for an Article.

You should always specify the itemscope attribute on every element that has an itemtype attribute. The itemscope attribute is boolean, so you don't need to specify a value.

The itemprop attribute is used to indicate which property this tag or object represents. That can be as simple as itemprop="name" for a string element, or you can combine itemprop, itemtype and itemscope to declare that the property is another object, with its own itemprops and subtypes.

As for specifying the content, this is pretty easy. For tags such as p, span or div, place the string content between the opening and closing tags. For the anchor tag a, the value of an itemprop (contentUrl, for example) is taken from the href attribute. For meta tags, use the content attribute. Simple!
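The three attributes and the value rules above can be seen working together in the sketch below. It embeds a small microdata sample in a Python string and reads it back with the standard library's html.parser. The MicrodataParser class is a hypothetical, deliberately simplified reader (it only understands text content, a/link href and meta content), and the URL and author name are placeholders.

```python
from html.parser import HTMLParser

# Sample markup; the URL and author are placeholders.
SAMPLE = """
<div itemscope itemtype="https://schema.org/Article">
  <h1 itemprop="name">Semantic web and Schema.org</h1>
  <meta itemprop="datePublished" content="2020-07-22">
  <a itemprop="url" href="https://example.com/semantic-web">Permalink</a>
  <div itemprop="author" itemscope itemtype="https://schema.org/Person">
    <span itemprop="name">Jane Doe</span>
  </div>
</div>
"""

# Elements without a closing tag must not affect nesting depth.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}


class MicrodataParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []          # top-level items found in the page
        self._open = []          # stack of (item, depth it was opened at)
        self._depth = 0
        self._text_prop = None   # (property name, depth, text chunks)

    def _add(self, name, value):
        # Properties may repeat, so every value is stored in a list.
        if self._open:
            self._open[-1][0].setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag not in VOID_TAGS:
            self._depth += 1
        prop = attrs.get("itemprop")
        if "itemscope" in attrs:
            # A new item: either a property of its parent or a top-level item.
            item = {"@type": attrs.get("itemtype")}
            if prop and self._open:
                self._add(prop, item)
            else:
                self.items.append(item)
            self._open.append((item, self._depth))
        elif prop:
            if tag == "meta":
                self._add(prop, attrs.get("content"))
            elif tag in ("a", "link"):
                self._add(prop, attrs.get("href"))
            else:
                # Value is the text between the opening and closing tags.
                self._text_prop = (prop, self._depth, [])

    def handle_data(self, data):
        if self._text_prop:
            self._text_prop[2].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._text_prop and self._text_prop[1] == self._depth:
            name, _, chunks = self._text_prop
            self._add(name, "".join(chunks).strip())
            self._text_prop = None
        if self._open and self._open[-1][1] == self._depth:
            self._open.pop()
        self._depth -= 1


parser = MicrodataParser()
parser.feed(SAMPLE)
parser.close()
print(parser.items[0])
```

Note how the author property has no text value of its own: because it carries itemscope and itemtype as well as itemprop, its value is the nested Person item.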

How can I verify my markup?

There are a few tools out there.

search.google.com is my personal favourite, but unfortunately this tool is being deprecated.

linter.structured-data.org is a bit crude, but still useful for performing some validation.

search.google.com is Google's new tool, but at the moment it isn't very useful, as it doesn't validate all schema; it only validates those items that Google wants to show in snippets.