At the heart of any browser engine is networking: connecting with services and other users. Unlike other engines, WebKit approaches this more abstractly by leaving a large portion of the networking up to individual ports. This includes network protocols such as HTTP, WebSockets, and WebRTC. The upside to this approach is a higher level of integration with the system-provided libraries and features, so WebKit will behave similarly to other software on the platform, often with more centralized configuration.
Due to this abstraction, there are a few independent layers that make up the networking stack of WPE. In this post, I’ll break down what each layer accomplishes as well as give some insight into the codebase’s structure.
Before we get into the libraries used for WPE, let’s discuss WebKit itself. Despite abstracting out a lot of the protocol handling, WebKit itself still needs to understand a lot of fundamentals of HTTP.
WebCore (discussed in WPE Overview) understands HTTP requests, headers, and cookies, as they are required to implement many higher-level features. What it does not do is the network operations, most parsing, or on-disk storage. In the codebase, these are represented by ResourceRequest and ResourceResponse objects, which map to general HTTP functionality.
A core part of modern web engine security is the multi-process model. To defend against exploits, each website runs in its own isolated process (a WebProcess) that does not have network access. To reach the network, that process must talk over IPC to a dedicated NetworkProcess, typically one per browser instance. The NetworkProcess receives a ResourceRequest, creates a NetworkDataTask with it to download the data, and responds with a ResourceResponse to the WebProcess.
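To make that division of labour concrete, here is a minimal sketch in C of the round trip. It is a toy model, not WebKit's code: the struct fields, the network_process_load() and web_process_fetch() functions, and the canned response are all hypothetical, and a plain function call stands in for the IPC hop.

```c
#include <stdio.h>

/* Hypothetical, heavily simplified stand-ins for WebCore's types. */
typedef struct {
    char method[8];
    char url[256];
} ResourceRequest;

typedef struct {
    int  status;     /* HTTP status code */
    char body[512];  /* response payload */
} ResourceResponse;

/* Stand-in for a NetworkDataTask: owns one request until it completes. */
typedef struct {
    const ResourceRequest *request;
    int done;
} NetworkDataTask;

/* Pretend to perform the transfer; a real task would use the network. */
static void network_data_task_run(NetworkDataTask *task, ResourceResponse *out)
{
    out->status = 200;
    snprintf(out->body, sizeof out->body, "fetched %s", task->request->url);
    task->done = 1;
}

/* "NetworkProcess" side: the only code allowed to touch the network. */
ResourceResponse network_process_load(const ResourceRequest *request)
{
    NetworkDataTask task = { request, 0 };
    ResourceResponse response = { 0, "" };
    network_data_task_run(&task, &response);
    return response;
}

/* "WebProcess" side: builds a request and hands it over (IPC in reality). */
ResourceResponse web_process_fetch(const char *url)
{
    ResourceRequest request;
    snprintf(request.method, sizeof request.method, "GET");
    snprintf(request.url, sizeof request.url, "%s", url);
    return network_process_load(&request);
}
```

The point of the shape is that web_process_fetch() never opens a socket itself; everything network-facing sits behind the one call that crosses the process boundary.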
WPE implements the platform-specific versions of the classes above, such as NetworkDataTaskSoup, primarily using a library called libsoup.
The libsoup library was originally created for the GNOME project’s email client and has since grown to be a very featureful HTTP implementation, now maintained by Igalia.
At a high level, libsoup's main task is to manage connections and queued requests to websites and then efficiently stream the responses back to WPE. Properly implementing HTTP is a fairly large task; a non-exhaustive list of the features it implements includes HTTP/1.1, HTTP/2, WebSockets, cookies, decompression, multiple authentication standards, HSTS, and HTTP proxies.
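For a sense of what the simplest item on that list involves, the sketch below serializes a bare HTTP/1.1 GET request the way it appears on the wire. This is not libsoup's code: format_get_request() is a hypothetical helper, and the two optional headers are chosen purely for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Write a minimal HTTP/1.1 GET request into buf.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int format_get_request(char *buf, size_t size, const char *host, const char *path)
{
    /* Request line, headers, then an empty line; every line ends in CRLF. */
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Accept-Encoding: gzip\r\n"
                     "Connection: keep-alive\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Everything else in the list (connection reuse, HTTP/2 framing, authentication, cookies, and so on) is layered on top of this basic exchange, which is why a dedicated library is worth having.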
On its own, libsoup is really focused on the HTTP layer and uses the GLib library to implement many of its networking features in a portable way. This is where TCP, DNS, and TLS are handled. GLib is also used directly by WebKit for URI parsing and DNS pre-caching.
Using GLib also helps standardize behavior across modern Linux systems. It allows configuration of a global proxy resolver that WebKit, along with other applications, can use.
Another unique detail of our stack is that TLS is fully abstracted inside of GLib by a project called GLib-Networking. This project provides multiple implementations of TLS that can be chosen at runtime, including OpenSSL and GnuTLS on Linux. The benefit here is that clients can choose the implementation they prefer, whether for licensing, certification, or technical reasons.
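The general idea of choosing an implementation at runtime can be sketched with a small table of backends looked up by name. This is only an illustration of the pattern, not how GLib-Networking is written: the TlsBackend struct, select_tls_backend(), and the choice of the first table entry as the default are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor; real backends implement GIO's TLS interfaces. */
typedef struct {
    const char *name;
    const char *(*describe)(void);
} TlsBackend;

static const char *describe_gnutls(void)  { return "handshake via GnuTLS"; }
static const char *describe_openssl(void) { return "handshake via OpenSSL"; }

static const TlsBackend backends[] = {
    { "gnutls",  describe_gnutls  },  /* first entry doubles as the default */
    { "openssl", describe_openssl },
};

/* Pick a backend by name; NULL or an unknown name falls back to the default. */
const TlsBackend *select_tls_backend(const char *requested)
{
    size_t count = sizeof backends / sizeof backends[0];
    if (requested != NULL) {
        for (size_t i = 0; i < count; i++) {
            if (strcmp(backends[i].name, requested) == 0)
                return &backends[i];
        }
    }
    return &backends[0];
}
```

GIO documents a GIO_USE_TLS environment variable for overriding the default backend, so passing getenv("GIO_USE_TLS") to a function like this would roughly mimic that behaviour. Callers only ever see the common interface, which is what lets the rest of the stack stay unaware of which TLS library is in use.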
Let’s go step by step to see some real-world usage. If we call webkit_web_view_load_uri() for a new domain, it will:
- Create a ResourceRequest in WebCore that represents an HTTP request with a few basic headers set.
  - ResourceRequestSoup will create its own internal representation for the request using libsoup.
- This is passed to the NetworkProcess to load this request as a NetworkDataTask.
  - NetworkDataTaskSoup will send/receive the request/response with soup_session_send(), which queues the message to be sent.
- libsoup will connect to the host using GSocketClient, which does a DNS lookup and TCP connection.
  - If this is a TLS connection, GTlsClientConnection will use a library such as GnuTLS to do a TLS handshake.
- libsoup will write the HTTP request and read from the socket, parsing the HTTP responses and eventually returning the data to WebKit.
- WebKit receives this data, along with periodic updates about the state of the request, and sends it out of the NetworkProcess back to the main process as a ResourceResponse, where the data is eventually loaded.
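As a small illustration of the "read from the socket, parsing the HTTP responses" step, the sketch below extracts the status code from an HTTP/1.x status line. It is not libsoup's parser: parse_status_line() is a hypothetical helper that handles only the first line of a response and ignores headers and body.

```c
#include <stdio.h>

/*
 * Parse an HTTP/1.x status line such as "HTTP/1.1 200 OK".
 * Returns the status code, or -1 if the line is malformed.
 */
int parse_status_line(const char *line)
{
    int major = 0, minor = 0, status = 0;

    /* Expect "HTTP/<digit>.<digit> <up to three digits>". */
    if (sscanf(line, "HTTP/%1d.%1d %3d", &major, &minor, &status) != 3)
        return -1;

    /* Valid status codes fall in the 1xx to 5xx classes. */
    if (status < 100 || status > 599)
        return -1;
    return status;
}
```

A real implementation also has to cope with partial reads, header parsing, chunked bodies, and HTTP/2's binary framing, which is the work libsoup takes on in the walkthrough above.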
In conclusion, WebKit provides a very flexible abstraction for platforms, and WPE leverages mature system libraries to provide a portable implementation. It has many layers, but they are all well organized and suited to their tasks.