Peter Strumolo | Personal Portfolio

HTML Topics

The following HTML topics are important and come up often in front-end developer interviews. This page is where I explore each topic and link to helpful resources.

Doctype and Browser Layout Modes
HTML vs XHTML
Content in Multiple Languages?
Data- Attributes
HTML5 as a Web Platform
Storage
Good Ol' <script> Tag
The Placement of <link> and <script>
Progressive Rendering

Doctype and Browser Layout Modes

The HTML <!doctype> declaration tells the web browser which version of HTML the page is written in.

In HTML5 there is only one doctype. Including it ensures the browser follows the relevant specification. Without it, the browser falls back to quirks mode, which can cause layout and compatibility problems.

Browser layout engines use three modes to support websites: quirks mode, almost standards mode, and full standards mode. Before web standards were established at the W3C, pages were typically written in two versions: one for Netscape Navigator and one for Microsoft Internet Explorer. In quirks mode the browser emulates the nonstandard behavior of Navigator 4 and IE 5, which is essential for supporting websites built before W3C standards were widely adopted. In almost standards mode only a small number of quirks are implemented. Finally, in full standards mode the browser should behave as described in the HTML and CSS specifications.

Bringing `doctype` back

Browsers use the doctype to decide which of the above modes to use. <!doctype html> ensures full standards mode, and all current browsers use full standards mode for this doctype. Important note: anything placed before <!doctype>, such as a comment or an XML declaration, will trigger quirks mode in IE 9 and older.
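To check which mode a page actually ended up in, browsers expose document.compatMode, which reports "BackCompat" for quirks mode and "CSS1Compat" otherwise. Here is a minimal sketch of reading that value. describeCompatMode is a name I made up for illustration, not a browser API.

```typescript
// Maps the value of document.compatMode to a readable description.
// describeCompatMode is a hypothetical helper, not part of any browser API.
function describeCompatMode(compatMode: string): string {
  if (compatMode === "BackCompat") {
    return "quirks mode";
  }
  if (compatMode === "CSS1Compat") {
    // The property does not distinguish almost standards from full standards.
    return "standards or almost standards mode";
  }
  return "unknown mode";
}

// In a browser you would pass the real property:
//   console.log(describeCompatMode(document.compatMode));
console.log(describeCompatMode("CSS1Compat"));
```

Because the property cannot tell almost standards mode from full standards mode, it is mostly useful for spotting accidental quirks mode.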

What’s in quirks mode and almost standards mode?

MDN maintains a list of the differences between standards mode and quirks mode behavior.

The big takeaway is that almost standards mode behaves just like standards mode, except that it follows quirks mode when calculating the height of line boxes and some of the inline elements in them. The layout of images inside table cells shows this in action. Inline boxes that have no non-whitespace text as a child and no border, padding, or margin do not influence the size of the line box (line-height is ignored), and they do not get a height (background) larger than that of their descendants.

HTML vs XHTML

HTML

HyperText Markup Language is a language that specifies webpage structure. HTML documents are plain text, structured with elements that are surrounded by matching opening and closing tags (<...>).

XHTML

Extensible HyperText Markup Language. An HTML document can be sent to a browser in HTML syntax or in XHTML (XML) syntax. XML is a markup language with tags that resemble HTML tags. Originally, XHTML was meant to replace HTML.

HTML5 Standard

HTML5 defines both of these syntaxes. The MIME type indicates which syntax is in use. You can’t select XHTML with a special doctype; the MIME type must be sent in the HTTP Content-Type header. Putting the MIME type in an HTML meta tag has no effect because it is ignored.

For XHTML:

HTTP/1.1 200 OK
Content-Type: application/xhtml+xml

<html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
...
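To make the role of the header concrete, here is a rough sketch of how the syntax choice follows from the Content-Type value alone. syntaxForContentType and the Syntax type are hypothetical names for illustration; real browsers apply more rules than this.

```typescript
type Syntax = "html" | "xml" | "other";

// Decides which syntax a document would be parsed with, based only on the
// Content-Type header value. Hypothetical helper for illustration.
function syntaxForContentType(contentType: string): Syntax {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mimeType = (contentType.split(";")[0] || "").trim().toLowerCase();
  if (mimeType === "text/html") {
    return "html";
  }
  if (mimeType === "application/xhtml+xml") {
    return "xml";
  }
  return "other";
}

console.log(syntaxForContentType("application/xhtml+xml; charset=utf-8"));
```

Note that the markup itself never enters the decision: the same bytes served as text/html would go through the HTML parser.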

Differences

XML is flexible because users can define their own tags.
XML rules are rigorous and make authors work very precisely with regard to syntax.
HTML is a presentation language.
XML is a data description language (more applications than just Web).
HTML syntax derives from SGML.

Problems with serving XHTML

Obviously, many problems occur when you serve a page as text/html while believing you are writing XHTML. But problems also occur when the server sends the application/xhtml+xml content type:

Comment markers sometimes are handled differently.
Special markup characters used in the inline contents of a style or script element will be parsed as markup instead of character data.
CSS and DOM specs have special provisions for HTML that don’t apply to XHTML, causing unexpected behavior.
- Example: a white gap around your page if you have a background on the <body>.
- More examples
document.write() doesn’t work in XHTML treated as XML.
Table structure in the DOM is different.

Content in Multiple Languages

There are a number of important details to note when deciding how to serve content with multiple languages.

First, always set the lang attribute on the html tag to declare the default language of the text.

<html lang="en">

Second, use language attributes on elements surrounding content in other languages on the page. This will allow you to style or process it differently.

<span title="Spanish">
    <a lang="es" href="/spanish">Español</a>
</span>

Note: I put the title attribute in a span so that the title attribute stays in English, not Spanish. We want to hover and see “Spanish” as the title, not “Español”.

Third, use a span or div around content that is in a different language but isn’t already wrapped in an element of its own.

The HTTP Content-Language header can be used to provide metadata about the intended audience of the page, and it can list more than one language. Don’t use the http-equiv attribute on a meta element for this!

Example:
Content-Language: en, hi, pa
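Since the header value is just a comma-separated list of language tags, reading it is simple. A minimal sketch, where parseContentLanguage is a hypothetical helper name:

```typescript
// Splits a Content-Language header value into individual language tags.
// parseContentLanguage is a hypothetical helper for illustration.
function parseContentLanguage(headerValue: string): string[] {
  return headerValue
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
}

console.log(parseContentLanguage("en, hi, pa")); // → ["en", "hi", "pa"]
```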

Resources

w3.org HTML Language Declarations

`data-` Attributes

data- attributes let you associate custom data with a particular element without giving that data any other defined meaning. You can store extra information on a standard element.

Little History

Before HTML5, developers put extra data into class and rel attributes. Here is an example that encodes how many items are in a list, plus an age cutoff for disregarding older messages, in class names.

<ul id="someList" class="user_Pete list-length_5 age_365">…</ul>

`data-` attributes to the rescue

With data attributes you can use any lowercase name prefixed with data- and store anything that can be string encoded. The values are private to the page (ignored by search engine bots).

The result is cleaner markup. Basically, anytime you need to associate data with an element and no existing attribute fits, data- attributes are the answer.
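In script, these attributes show up on element.dataset, with the data- prefix removed and the rest of the name converted to camelCase. The sketch below mirrors that name conversion. datasetKey is a hypothetical helper, and the attribute names in the comment are ones I chose for illustration.

```typescript
// The class-based list could instead be written as (illustrative attribute names):
//   <ul id="someList" data-user="Pete" data-list-length="5" data-age="365">…</ul>

// Converts an attribute name such as "data-list-length" into the key under
// which element.dataset exposes it ("listLength").
function datasetKey(attributeName: string): string {
  const prefix = "data-";
  if (attributeName.slice(0, prefix.length) !== prefix) {
    throw new Error("Not a data- attribute: " + attributeName);
  }
  return attributeName
    .slice(prefix.length)
    // A hyphen followed by a lowercase ASCII letter becomes that letter in uppercase.
    .replace(/-([a-z])/g, (_match: string, letter: string) => letter.toUpperCase());
}

console.log(datasetKey("data-list-length")); // → "listLength"
```

In a browser, the same value would then be read as element.dataset.listLength, and it always comes back as a string.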

HTML5 as a web platform

HTML5 is a platform designed to be usable by all Open Web developers. It is the latest version of the HyperText Markup Language, adding new elements, attributes, and behaviors, and it comes with a larger set of technologies that allow for more diverse and powerful websites and apps.

Semantics

For me, the first building block is more semantic HTML, meaning HTML that reinforces the information and meaning of the document. Core elements of HTML carry more meaning, and additional tools like ARIA help our content grow in meaning.

On top of that, more semantic HTML helps separate the document from the presentation layer (CSS). This means we aren’t structuring the document just so that it works for our styles or design.

HTML is a structural language that allows for additional programmatic attributes that apply presentation and interaction. HTML5 provides more semantic HTML that will give more contextual meaning to your document, if you choose.

Communication with the server

HTML5 contains new technologies for server communication. WebSockets make it possible to open a persistent communication session between the browser and the server, over which non-HTML data can be exchanged. The Server-Sent Events API lets the server push events to the client, which was not previously possible. WebRTC (Real-Time Communications) enables audio and video streaming, and other data sharing, between different clients. You can share application data peer-to-peer without the need to install plug-ins.

Going offline; Storage

Service Workers and AppCache

AppCache let us specify assets to cache very easily, but it made assumptions about what you were trying to do. Service Workers continue the goal of giving users a better offline experience while giving us control over asset caching and custom network requests. A Service Worker lets you set up an app to use cached assets first (a default experience when offline) before getting more data from the network. This is called “Offline First”. This leads to the discussion of native apps.
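The cache-first decision can be sketched in a few lines. This is not Service Worker code: a real worker would use the asynchronous caches.match() and fetch() calls inside a fetch event handler. ResponseCache, NetworkFetcher, and cacheFirst are hypothetical, synchronous stand-ins that only show the order of the lookups.

```typescript
// Synchronous stand-ins for the asynchronous Cache and fetch APIs.
type ResponseCache = { [url: string]: string };
type NetworkFetcher = (url: string) => string;

// Returns the cached response when one exists; otherwise asks the network
// and stores the result so the next request can be served offline.
function cacheFirst(
  cache: ResponseCache,
  url: string,
  fetchFromNetwork: NetworkFetcher
): string {
  if (Object.prototype.hasOwnProperty.call(cache, url)) {
    return cache[url];
  }
  const fresh = fetchFromNetwork(url);
  cache[url] = fresh;
  return fresh;
}
```

With this ordering the network is only touched on a cache miss, which is what lets the app keep working when the network is unavailable.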

Audio & Video

Graphical presentation

Performance

Device Access

Sophisticated Styling

Good Resources

MDN HTML5 Technologies Resource

Storage

Cookies

The technology called “cookies” began in the early days of the web as a way to know when requests came from the same web browser.

document.cookie gets and sets the cookies associated with the current document. Basically, when a server receives an HTTP request, it can send a Set-Cookie header with the response, and subsequent requests to the same server then send the cookie value back in a Cookie HTTP header.
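Reading document.cookie gives back a single string of name=value pairs separated by semicolons, so most code ends up parsing it. A minimal sketch, where parseCookies is a hypothetical helper that leaves values undecoded:

```typescript
// Parses a cookie string such as the one returned by document.cookie
// ("name=value; other=value") into a plain object.
// parseCookies is a hypothetical helper; values are returned as-is.
function parseCookies(cookieString: string): { [name: string]: string } {
  const cookies: { [name: string]: string } = {};
  const pairs = cookieString.split(";");
  for (let i = 0; i < pairs.length; i++) {
    const pair = pairs[i].trim();
    const separator = pair.indexOf("=");
    if (separator === -1) {
      continue; // skip empty or malformed entries in this sketch
    }
    // Only the first "=" separates name from value; later ones belong to the value.
    cookies[pair.slice(0, separator)] = pair.slice(separator + 1);
  }
  return cookies;
}

// In a browser: parseCookies(document.cookie)
console.log(parseCookies("theme=dark; session=abc123"));
```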

REST and reliability issues

The problem with cookies begins with their being attached to an entire set of site resource identifiers instead of to the particular application state (the current set of rendered representations). If the browser’s “back” button (history functionality) is used to go back to a view from before the one reflected by the cookie, the application state no longer matches the state stored in the cookie. So the next request sent to the same server will contain a cookie that does not represent the current application context. The result is confusing and unreliable.

Web Storage API

Using cookies as client storage can hurt performance because they are sent to the server with every request, and most browsers today support the Web Storage API as an alternative. localStorage is one mechanism within the Web Storage API, along with sessionStorage. The Web Storage API lets browsers store key-value pairs in a very intuitive fashion.

sessionStorage maintains a separate storage area for each given origin that's available for the duration of the page session (as long as the browser tab is open, including page reloads and restores).

localStorage does the same thing, but persists even when the browser is closed and reopened.
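Both storage areas share the same getItem/setItem interface and only store strings, so structured values are usually serialized as JSON. Here is a rough sketch. StorageLike, saveJson, loadJson, and createMemoryStorage are hypothetical names; the in-memory stand-in exists only so the example runs outside a browser.

```typescript
// The subset of the Storage interface this sketch relies on.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Web Storage only stores strings, so structured values are serialized as JSON.
function saveJson(storage: StorageLike, key: string, value: unknown): void {
  storage.setItem(key, JSON.stringify(value));
}

// Returns null when the key is missing, matching what getItem does.
function loadJson(storage: StorageLike, key: string): unknown {
  const raw = storage.getItem(key);
  return raw === null ? null : JSON.parse(raw);
}

// In-memory stand-in so the sketch runs outside a browser.
function createMemoryStorage(): StorageLike {
  const items: { [key: string]: string } = {};
  return {
    getItem: (key) =>
      Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null,
    setItem: (key, value) => {
      items[key] = value;
    },
  };
}
```

In a browser you would pass localStorage or sessionStorage instead of the stand-in, for example saveJson(localStorage, "prefs", { theme: "dark" }).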

Good Ol’ `<script>` Tag

The HTML script element can be used to embed or reference an executable script within HTML or XHTML.

The basic <script> without an async (or defer) attribute is fetched and executed immediately, before the browser continues parsing the page.

Where possible, set the boolean async attribute so that the script executes asynchronously. The reason to use async is that the browser does not stop what it is doing while the script downloads. (Note that async only applies to external scripts, not inline ones.) I reach for async when:

Order doesn’t matter to me.
The script depends on nothing in the DOM.

Loading the script at the bottom of the page means the parser is basically done by the time it gets there, so you are already deferring the script. That is why async isn’t needed in that case.

Another drawback of async is that you don’t control the execution order, so the script shouldn’t depend on anything else.

The script element’s defer attribute is another boolean. It executes the script once the page has finished parsing, similar to placing your <script> elements just before </body>. Older browsers may not support defer, so watch out!
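The rules above for classic (non-module) scripts can be summarized as a small decision function. This is my own sketch of the behavior, not a browser API: ScriptAttributes, LoadingBehavior, and loadingBehavior are hypothetical names.

```typescript
// The attributes that decide how a classic (non-module) script is loaded.
interface ScriptAttributes {
  src?: string;
  async?: boolean;
  defer?: boolean;
}

type LoadingBehavior = "inline-blocking" | "blocking" | "async" | "defer";

function loadingBehavior(script: ScriptAttributes): LoadingBehavior {
  if (!script.src) {
    // async and defer have no effect on inline classic scripts.
    return "inline-blocking";
  }
  if (script.async) {
    // Downloads in parallel and runs as soon as it arrives, in no fixed order.
    // async takes precedence when both attributes are present.
    return "async";
  }
  if (script.defer) {
    // Downloads in parallel and runs in document order after parsing finishes.
    return "defer";
  }
  // The parser stops until the script has been fetched and executed.
  return "blocking";
}

console.log(loadingBehavior({ src: "analytics.js", async: true })); // → "async"
```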

Placement of `<link>` and `<script>`

CSS <link> elements belong in the <head> so that the CSS is declared before the <body>. The <head> element is where metadata about the document belongs, including the title and links to scripts and stylesheets, so this makes sense. Also, having the styles available before the body is parsed means a faster perceived page load for the user.

Even though scripts can go in the <head>, it is generally a good idea to position them just before </body>, since the browser will then parse the document first and run the script afterwards. This guarantees that the DOM is ready to be manipulated by the script. If you really want to put a <script> in the <head>, wrap its code so that it waits for the DOM to be loaded (the DOMContentLoaded event).
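The usual way to wait for the DOM is to check document.readyState and, if the document is still loading, listen for DOMContentLoaded. A minimal sketch of that wrapper follows. onDomReady and DocumentLike are hypothetical names; the small interface is there only so the function can be exercised outside a browser.

```typescript
// The parts of Document this sketch uses.
interface DocumentLike {
  readyState: string;
  addEventListener(type: string, listener: () => void): void;
}

// Runs callback once the DOM has been parsed.
function onDomReady(doc: DocumentLike, callback: () => void): void {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the event.
    doc.addEventListener("DOMContentLoaded", callback);
  } else {
    // "interactive" or "complete": parsing has finished, so run right away.
    callback();
  }
}

// In a browser:
//   onDomReady(document, () => { /* safe to query and modify the DOM here */ });
```

Checking readyState first matters because a listener added after the event has already fired would never run.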

When the browser encounters a <script>, it requests the file synchronously and the parser is blocked (it stops parsing HTML). Only after the script has loaded and executed can the parser continue. For scripts early in the page this makes for a horrible user experience. The best practice is to manipulate the DOM after it has loaded.

Performance

If it is possible (meaning all target browsers support the feature and order doesn’t matter), you can add the async attribute to a script and put it in the <head>. It will then begin downloading immediately and execute as soon as it has downloaded, without blocking the parser during the download. Page load could be faster with this approach.

Progressive Rendering

I like this definition of progressive rendering:

Build the experience and improve upon it based on the user’s conditions and network.

Basically, a good metaphor is how Photos in iOS will only download what is necessary of the photo for the view. As you zoom in, it downloads more of the photo.

Similarly, you can create a better experience for the user by rendering content as it is needed. This has the potential to improve perceived load time, and probably actual load time as well.

Historical Context

Progressive HTML rendering goes back to chunked transfer encoding, which allows developers to send pieces of content to the browser, breaking up the page into separate components. This allows progressive display of portions of web pages, potentially sending the most important parts to the client first.

The idea is to give the web browser a head start in downloading and rendering a page by flushing output early and at multiple points. This allows both actual load time and perceived load time to improve.
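On the wire, HTTP/1.1 chunked transfer encoding frames each flushed piece as its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-length chunk marks the end of the body. Here is a rough sketch of that framing. encodeChunk, encodeChunkedBody, and utf8ByteLength are hypothetical helpers, and the sketch assumes well-formed strings sent as UTF-8.

```typescript
// Counts the bytes a string occupies in UTF-8, since chunk sizes are in bytes.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && i + 1 < text.length) {
      // Surrogate pair: one code point outside the BMP, four bytes in UTF-8.
      bytes += 4;
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Frames one piece of content as a chunk: size in hex, CRLF, data, CRLF.
function encodeChunk(data: string): string {
  return utf8ByteLength(data).toString(16) + "\r\n" + data + "\r\n";
}

// Builds a complete chunked body from the pieces a server would flush in turn.
function encodeChunkedBody(pieces: string[]): string {
  let body = "";
  for (let i = 0; i < pieces.length; i++) {
    // Skip empty pieces: a zero-length chunk would end the body early.
    if (pieces[i].length > 0) {
      body += encodeChunk(pieces[i]);
    }
  }
  return body + "0\r\n\r\n";
}
```

A server that flushes the page header first and the slower main content later would emit one chunk per flush, and the browser can start rendering as soon as the first chunk arrives.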

Nowadays, with HTTP/2, smaller resources can be delivered and cached independently.
