Selenium 4 Architecture — W3C WebDriver Protocol
Selenium has come a long way since its inception in 2004, and the foundations of the Selenium project are WebDriver, Grid, and IDE. WebDriver, the interface that lets users run cross-browser tests, no longer uses the JSON Wire Protocol; its client-server architecture is now fully standardized on the W3C specification. In this post, I explain the basics of the W3C WebDriver protocol.
If you’ve landed here and are already familiar with the W3C WebDriver protocol, use this post as a refresher. 😊
Selenium with JSON Wire Protocol
As per the legacy Selenium documentation, “All implementations of WebDriver that communicate with the browser, or a RemoteWebDriver server shall use a common wire protocol. This wire protocol defines a RESTful web service using JSON over HTTP.”
The main drawback of this protocol was that there was no direct communication between the client libraries (Java, Ruby, Python, etc.) and the browser driver. The JSON Wire Protocol acted as a mediator between the client libraries and the browser driver.
Because the server understands only the protocol and not the programming language, every command goes through serialization (converting an object’s data to JSON) and deserialization (converting JSON back into an object). This sometimes resulted in exceptions, slower test execution, and flakier tests.
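To make the serialization and deserialization step concrete, here is a minimal sketch in Python. The NavigateCommand class and the two helper functions are hypothetical names I made up for illustration; they are not part of Selenium, and a real client library does considerably more work than this.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class NavigateCommand:
    """Hypothetical client-side object describing a 'go to URL' command."""
    session_id: str
    url: str


def serialize(command: NavigateCommand) -> str:
    # Serialization: turn the object's data into JSON text for the wire.
    return json.dumps(asdict(command))


def deserialize(payload: str) -> NavigateCommand:
    # Deserialization: turn JSON text back into an object.
    return NavigateCommand(**json.loads(payload))


command = NavigateCommand(session_id="abc123", url="https://www.selenium.dev/")
wire_text = serialize(command)    # what would travel over HTTP
restored = deserialize(wire_text)  # what the receiving side rebuilds
print(wire_text)
print(restored == command)
```

Every extra round of this conversion is a place where a field can be renamed, dropped, or interpreted differently, which is where the mismatches between the two sides tend to creep in.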
Selenium with W3C Protocol
According to the W3C website, “WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers.”
Interestingly, in Selenium 3.8 both the JSON Wire Protocol and the W3C protocol existed side by side in WebDriver. Since version 4.0, Selenium uses only the World Wide Web Consortium (W3C) standard.
Now that almost all browsers and language bindings use the W3C protocol, communication between the client and the server (driver) is much easier. It is like two people who speak the same language talking to each other.
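The difference between the two dialects is easiest to see in the request that starts a session. The sketch below builds the body in both shapes: the legacy protocol used a top-level "desiredCapabilities" key, while the W3C protocol nests capabilities under "capabilities" and "alwaysMatch". The function names are my own, and the combined payload reflects my understanding of how transitional clients covered both dialects, so treat it as an illustration rather than Selenium's actual code.

```python
import json


def json_wire_new_session(caps: dict) -> dict:
    # Legacy JSON Wire Protocol shape for the "new session" body.
    return {"desiredCapabilities": dict(caps)}


def w3c_new_session(caps: dict) -> dict:
    # W3C WebDriver shape: capabilities are nested under "alwaysMatch".
    return {"capabilities": {"alwaysMatch": dict(caps)}}


def dual_dialect_new_session(caps: dict) -> dict:
    # Assumption: a transitional client sends both keys in one body so that
    # either kind of server can pick the part it understands.
    payload = json_wire_new_session(caps)
    payload.update(w3c_new_session(caps))
    return payload


def detect_dialect(response: dict) -> str:
    # W3C servers wrap the session data inside "value"; legacy servers
    # returned "status" and "sessionId" at the top level.
    value = response.get("value")
    if isinstance(value, dict) and "sessionId" in value and "capabilities" in value:
        return "w3c"
    if "status" in response and "sessionId" in response:
        return "json-wire"
    return "unknown"


caps = {"browserName": "chrome"}
print(json.dumps(dual_dialect_new_session(caps), indent=2))
```

With Selenium 4 only the W3C shape remains, so the client no longer has to guess which dialect the driver speaks.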
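Because the protocol is plain JSON over HTTP, a command is just an HTTP method, a path, and an optional JSON body. The sketch below builds (but does not send) two requests using the W3C endpoints for "Navigate To" and "Get Title". The base URL, the session id, and the helper names are assumptions for illustration only; a real session id comes from the driver when the session is created.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:4444"  # assumed driver or Grid address


def build_request(method: str, path: str, body: Optional[dict] = None) -> Request:
    # Encode the body as JSON only when the command carries one.
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json; charset=utf-8"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def navigate_to(session_id: str, url: str) -> Request:
    # W3C "Navigate To": POST /session/{session id}/url
    return build_request("POST", f"/session/{session_id}/url", {"url": url})


def get_title(session_id: str) -> Request:
    # W3C "Get Title": GET /session/{session id}/title
    return build_request("GET", f"/session/{session_id}/title")


nav = navigate_to("abc123", "https://www.selenium.dev/")
title = get_title("abc123")
print(nav.get_method(), nav.full_url, nav.data)
print(title.get_method(), title.full_url)
```

Nothing here depends on the language of the client library, which is the point of a language-neutral wire protocol: any binding that produces the same requests can drive the browser.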
Below is the list of improvements in Selenium 4 with the new protocol:
- Standard: common code, since almost all browsers and client libraries have already adopted the W3C protocol.
- Stability: fewer browser-specific runtime exceptions.
- Actions API: combined keyboard and mouse actions are supported, such as pressing two keys at once or zoom in and zoom out operations.
- Introduction of ChromiumDriver (ChromeDriver and EdgeDriver now extend ChromiumDriver).
- Optimized Desired Capabilities.
- The Selenium codebase is leaner and cleaner because the legacy code has been removed.
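As an illustration of the Actions API item above, here is a rough sketch of the JSON body that a "press two keys together" action could produce for the W3C "Perform Actions" endpoint (POST /session/{session id}/actions). The key_chord helper and the "keyboard" input source id are names I chose for this example; the client libraries build this structure for you.

```python
import json

CONTROL = "\ue009"  # WebDriver code point for the Control key


def key_chord(modifier: str, key: str) -> dict:
    # One "key" input source that holds the modifier, taps the key,
    # then releases both in reverse order.
    return {
        "actions": [
            {
                "type": "key",
                "id": "keyboard",  # arbitrary id chosen for this sketch
                "actions": [
                    {"type": "keyDown", "value": modifier},
                    {"type": "keyDown", "value": key},
                    {"type": "keyUp", "value": key},
                    {"type": "keyUp", "value": modifier},
                ],
            }
        ]
    }


payload = key_chord(CONTROL, "a")
print(json.dumps(payload, indent=2))
```

The whole sequence travels in a single request, so the driver can replay it as one coordinated gesture instead of a series of separate commands.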
Conclusion
Several stable versions have been released since Selenium 4.0, and users are benefiting from more consistent and stable test execution across browsers. Execution time has improved because the encoding and decoding of requests and responses was removed and the codebase was optimized. The Actions API offers multiple keyboard and mouse actions, including zoom in and zoom out operations. Selenium 4 also brings some interesting new features, such as an enhanced Selenium Grid, an upgraded Selenium IDE, support for the Chrome DevTools Protocol, and relative locators (originally called friendly locators).
Thanks for reading this post. As it is my first in the Selenium series, I felt it would be good to start with the basics and explain the new WebDriver protocol and its architecture.
Reference: https://www.selenium.dev/