Jun 14, 2021

From HTML To The Screen: How Browsers Render Web Pages

The most basic job of a web browser is to fetch an HTML file, along with any CSS it references, interpret them, and display a page to the user. The process is complex, but it rests on basic principles that any developer can understand.

In this post we will learn the basics of HTML rendering and discuss how to improve our HTML and CSS based on what we learned.

Let’s get right to it!

Constructing the DOM

After downloading the HTML file, the first thing the browser does is construct the DOM from it.

The DOM (Document Object Model) is the browser’s internal representation of a page, structured as a tree.

To better understand how the DOM is built, let’s use the following HTML as an example and go over each step of the DOM construction for this file.

<html>
  <head>
    <link href="main.css" rel="stylesheet" >
    <title>From HTML To The Screen: How Browsers Render Web Pages</title>
  </head>
  <body>
    <h1>Lorem Ipsum</h1>
    <p>Dolor sit amet.</p>
  </body>
</html>

To construct the DOM tree, the browser first has to read each character of the HTML text and transform that text into a sequence of tokens.

A token is a representation of a piece of text that has a special meaning and specific rules about how to handle it.

The example HTML above will produce the following sequence of tokens:

HTML > HEAD > LINK > TITLE > TEXT > /TITLE > /HEAD > BODY > H1 > TEXT > /H1 > P > TEXT > /P > /BODY > /HTML

Note that tokens are not limited to HTML tags. The text content of the title, h1 and p tags also produces TEXT tokens.
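As a rough illustration of the tokenizing step, the sketch below uses Python’s standard html.parser module to turn the example HTML into the same flat sequence of token names. The TokenCollector class and the tokenize function are names made up for this example, and whitespace-only text is skipped to keep the output short. A real browser tokenizer follows the far more detailed rules of the HTML specification.

```python
from html.parser import HTMLParser

SAMPLE_HTML = """<html>
  <head>
    <link href="main.css" rel="stylesheet" >
    <title>From HTML To The Screen: How Browsers Render Web Pages</title>
  </head>
  <body>
    <h1>Lorem Ipsum</h1>
    <p>Dolor sit amet.</p>
  </body>
</html>"""


class TokenCollector(HTMLParser):
    """Collects a flat list of token names (hypothetical helper for this post)."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(tag.upper())

    def handle_endtag(self, tag):
        self.tokens.append("/" + tag.upper())

    def handle_data(self, data):
        # Skip the whitespace between tags so the output matches the
        # simplified sequence shown in the article.
        if data.strip():
            self.tokens.append("TEXT")


def tokenize(html_text):
    collector = TokenCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tokens


print(" > ".join(tokenize(SAMPLE_HTML)))
```

Note that the link tag produces a single token with no matching closing token, which is why the sequence has no /LINK entry.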

After transforming the HTML text into tokens, the browser scans each token and arranges them in the DOM tree structure. The browser knows how to create this structure based on the order of these tokens and the rules for each token.

Each node of the DOM tree also stores additional information about the element, such as its tag attributes.
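One common way to arrange a token sequence into a tree is to keep a stack of the elements that are currently open. The sketch below assumes a made-up token format of (kind, value, attrs) tuples and a hypothetical Node class; it only handles well-formed input, whereas real browsers also recover from missing or misplaced tags.

```python
from dataclasses import dataclass, field

# Tags that never have a closing tag (an illustrative, incomplete list).
VOID_TAGS = {"link", "meta", "img", "br", "hr", "input"}


@dataclass
class Node:
    name: str
    attrs: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


def build_tree(tokens):
    """Arrange (kind, value, attrs) tokens into a tree using a stack of open elements."""
    root = Node("#document")
    open_elements = [root]
    for kind, value, attrs in tokens:
        parent = open_elements[-1]
        if kind == "start":
            node = Node(value, attrs)
            parent.children.append(node)
            if value not in VOID_TAGS:
                open_elements.append(node)  # later tokens become its children
        elif kind == "end":
            open_elements.pop()  # assumes the closing tag matches the open one
        elif kind == "text":
            parent.children.append(Node("#text", text=value))
    return root


def outline(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label = node.name + (f" {node.attrs}" if node.attrs else "")
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


TOKENS = [
    ("start", "html", {}),
    ("start", "head", {}),
    ("start", "link", {"href": "main.css", "rel": "stylesheet"}),
    ("start", "title", {}),
    ("text", "From HTML To The Screen: How Browsers Render Web Pages", {}),
    ("end", "title", {}),
    ("end", "head", {}),
    ("start", "body", {}),
    ("start", "h1", {}),
    ("text", "Lorem Ipsum", {}),
    ("end", "h1", {}),
    ("start", "p", {}),
    ("text", "Dolor sit amet.", {}),
    ("end", "p", {}),
    ("end", "body", {}),
    ("end", "html", {}),
]

dom = build_tree(TOKENS)
print("\n".join(outline(dom)))
```

The order of the tokens is what decides the nesting: a start token pushes a new parent onto the stack, and the matching end token pops it off again.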

[Figure: DOM tree for the example HTML]

The entire process can be summarized in this diagram:

[Figure: DOM construction steps]

The CSSOM

The CSSOM (CSS Object Model) is a structure similar to the DOM that the browser builds to store the CSS information for a page.

Like the DOM, the CSSOM is a tree, and the steps to generate it from a CSS file are the same as for the DOM.

The file is read character by character to generate tokens, which are then arranged in a tree structure.

Let’s have a look at how the browser constructs the CSSOM for the following CSS.

body { font-size: 14px; }
p { line-height: 25px; }
span { color: red }
p span { color: green }

The tree structure of the CSSOM is constructed from the most generic to the most specific rule.

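Each node of the CSSOM contains the style rules declared for it in the CSS, in addition to the values it inherits from its parent nodes. Passing values from parent to child is called inheritance. The “cascading” in Cascading Style Sheets refers to the related process of combining every rule that applies to an element and deciding which one wins.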

[Figure: CSSOM tree for the example CSS]

In the example above, you can see that the font-size defined for body is inherited by p and span.

Another interesting thing to note in this example is rule overriding. The CSS declares that all span elements should be red, but any span inside a p tag will be rendered green because the more specific rule p span overrides the more general one.
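The inheritance and overriding described above can be sketched in a few lines. This is a heavy simplification under stated assumptions: selectors are tuples of tag names, specificity is just the number of tags in the selector, and every property is treated as inherited (which happens to be true for font-size, line-height and color). The names RULES, matches and computed_style are invented for this example.

```python
# The example stylesheet, as (selector, declarations) pairs in source order.
RULES = [
    (("body",), {"font-size": "14px"}),
    (("p",), {"line-height": "25px"}),
    (("span",), {"color": "red"}),
    (("p", "span"), {"color": "green"}),
]


def matches(selector, path):
    """True if a descendant selector like ("p", "span") matches the last element of path."""
    if selector[-1] != path[-1]:
        return False
    ancestors = iter(path[:-1])
    # Each remaining selector part must appear, in order, among the ancestors.
    return all(part in ancestors for part in selector[:-1])


def computed_style(path):
    """Compute the style of the element at the end of path (root first)."""
    style = {}
    # Walk from the root down so each element starts from its parent's values.
    for depth in range(1, len(path) + 1):
        element_path = path[:depth]
        matching = [
            (len(selector), order, declarations)
            for order, (selector, declarations) in enumerate(RULES)
            if matches(selector, element_path)
        ]
        # Apply less specific rules first so more specific ones overwrite them.
        for _, _, declarations in sorted(matching, key=lambda m: m[:2]):
            style.update(declarations)
    return style


print(computed_style(["body", "p", "span"]))
print(computed_style(["body", "span"]))
```

A span inside a p ends up green with the body font size and the paragraph line height, while a span directly inside body stays red.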

The Render Tree

After creating the DOM tree and the CSSOM tree, the browser combines them into a new tree called the “render tree”.

The render tree contains only the nodes that need to be rendered. DOM nodes for tags like <script> or <link> won’t be present in the render tree, and neither will nodes hidden with CSS rules like display: none.
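The filtering described above can be sketched as a recursive copy of the DOM that drops anything invisible. The nested-dictionary DOM below, including the hidden div, is a hypothetical example rather than the page from earlier in this post, and the list of non-visual tags is illustrative, not exhaustive.

```python
# Tags that never produce visible output (an illustrative, incomplete list).
NON_VISUAL_TAGS = {"head", "title", "link", "meta", "script", "style"}


def build_render_tree(node):
    """Return a pruned copy of a DOM node, or None if it should not be rendered."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    if node.get("style", {}).get("display") == "none":
        return None  # a hidden node also hides all of its descendants
    children = (build_render_tree(child) for child in node.get("children", []))
    return {"tag": node["tag"], "children": [c for c in children if c is not None]}


def tags(node):
    """Flatten a tree into a list of tag names, parents before children."""
    result = [node["tag"]]
    for child in node["children"]:
        result.extend(tags(child))
    return result


# Hypothetical DOM, with the computed style already attached to each node.
DOM = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "link"}, {"tag": "title"}]},
    {"tag": "body", "children": [
        {"tag": "h1"},
        {"tag": "p"},
        {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "span"}]},
        {"tag": "script"},
    ]},
]}

render_tree = build_render_tree(DOM)
print(tags(render_tree))
```

The DOM itself is left untouched: the hidden div and the script element still exist in it and still had to be parsed, they are just missing from the render tree.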

Layout Calculation

In this phase the browser walks the render tree, starting at the root node, and visits each node to calculate the exact size and position, in pixels, of every HTML element on the page.

This step is necessary because positions, widths and heights can be defined in CSS using relative units like % or em, which must be converted to absolute pixel values based on things such as the browser viewport size and the element’s font size.
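To make the unit conversion concrete, here is a minimal sketch of how a relative length could be resolved to pixels. The to_pixels function and its parameters are hypothetical, it supports only %, em and px, and it assumes a percentage is resolved against the parent’s width. Real layout engines handle many more units and rules.

```python
def to_pixels(value, *, parent_px, font_size_px):
    """Convert a CSS length such as "50%", "2em" or "300px" to pixels."""
    if value.endswith("%"):
        # Percentages are resolved against the size of the parent.
        return parent_px * float(value[:-1]) / 100
    if value.endswith("em"):
        # 1em equals the element's font size.
        return font_size_px * float(value[:-2])
    if value.endswith("px"):
        return float(value[:-2])
    raise ValueError(f"unsupported unit in {value!r}")


viewport_width = 1200  # assumed viewport width in pixels

body_width = to_pixels("100%", parent_px=viewport_width, font_size_px=14)
column_width = to_pixels("50%", parent_px=body_width, font_size_px=14)
padding = to_pixels("2em", parent_px=column_width, font_size_px=14)

print(body_width, column_width, padding)
```

Because each result depends on the parent’s size, the tree has to be walked from the root down, which matches the traversal order described above.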

Painting

And finally, the last step. With every position defined in exact pixel units, the browser can paint each pixel of the screen to render the page.

Optimizing The Rendering Process

The page rendering process is long and complex. Based on what we have learned so far, there are 3 easy optimizations we can make to help the browser get through this entire process and display the page to our users as fast as possible.

First, remove all unnecessary HTML and CSS from your pages. HTML and CSS that will never be displayed make the browser do a lot of useless work.

That’s because the browser parses all of the HTML and CSS and builds the DOM and CSSOM trees for it, and only works out what should be displayed or hidden during the render tree construction step.

Second, improve your server response time. The earlier the browser receives the HTML and CSS files, the earlier it can start the whole rendering process.

Third, reduce the HTML and CSS file sizes. That can be done by removing unused HTML/CSS or by enabling file compression on the server. This reduces the time the browser needs to download all the necessary files and lets it start parsing earlier.
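To get a feel for what compression can save, the sketch below gzips a made-up, repetitive chunk of HTML with Python’s standard gzip module and compares the sizes. In practice compression is usually switched on in the web server or CDN configuration rather than written by hand, and the savings on a real page will differ from this artificial sample.

```python
import gzip

# Made-up sample markup; real pages are less repetitive than this.
html = ("<p>Dolor sit amet.</p>\n" * 200).encode("utf-8")

compressed = gzip.compress(html)

print(len(html), "bytes before,", len(compressed), "bytes after gzip")

# The browser decompresses the response and gets the original bytes back.
restored = gzip.decompress(compressed)
```

Markup tends to compress well because tag names and class names repeat throughout a page.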

In Conclusion

Browsers are really complex pieces of software.

But some basic knowledge of how browsers work, especially how they render web pages, helps us create better pages.

If you have any questions, comments or feedback, reach me on Twitter at @decode64.
