Mobile Automation Testing: Appium: Guts and Glory

Appium follows a client/server architecture. It is a web server that exposes a REST API.

  1. It receives a connection from a client, listens for commands, executes those commands on a mobile device, and responds with an HTTP response representing the result of the command execution.

Clients initiate a session with the server in ways specific to each library, but they all end up sending a POST /session request to the server with a JSON object called the ‘desired capabilities’ object. At this point the server starts the automation session and responds with a session ID, which is used for sending further commands.
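To make the handshake concrete, here is a minimal sketch in Python that builds the POST /session request and reads a session ID out of a response body. It uses only the standard library and does not contact a server. The address `http://localhost:4723/wd/hub` is an assumption (a commonly used Appium 1.x default), the helper names are hypothetical, and the sample response is a hand-written illustration rather than captured server output.

```python
import json
import urllib.request

# Assumption: a commonly used Appium 1.x default address; adjust for your setup.
APPIUM_URL = "http://localhost:4723/wd/hub"


def build_new_session_request(desired_caps, base_url=APPIUM_URL):
    """Build (but do not send) the POST /session request (hypothetical helper)."""
    body = json.dumps({"desiredCapabilities": desired_caps}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/session",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def extract_session_id(response_body):
    """Return the session ID from a JSON response body (hypothetical helper)."""
    payload = json.loads(response_body)
    # Some servers return sessionId at the top level, others nest it under "value".
    if payload.get("sessionId"):
        return payload["sessionId"]
    return payload["value"]["sessionId"]


request = build_new_session_request({"platformName": "iOS"})

# Illustrative response body only; a real server returns its own ID and capabilities.
sample_response = '{"status": 0, "sessionId": "abc-123", "value": {}}'
session_id = extract_session_id(sample_response)
```

In practice the client libraries perform this exchange for you and then include the session ID in the URL of every later command, so you rarely build these requests by hand.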

Desired capabilities are a set of keys and values (i.e. a map or hash) sent to the Appium server to tell the server what kind of automation session we’re interested in starting up.

  1. There are various capabilities that can modify the behavior of the server during automation. For example:
    • We might set the ‘platformName’ capability to iOS to tell Appium that we want an ‘iOS’ session, rather than an Android one
    • Or we might set the ‘safariAllowPopups’ capability to ‘true’ in order to ensure that, during a Safari automation session, we’re allowed to use JavaScript to open up new windows
    • The complete list of capabilities is in Appium’s documentation
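As a rough illustration, the two capabilities mentioned above can be written as a plain Python dictionary and serialized to the JSON that is sent to the server. The `browserName` entry is an assumption added to make this a Safari session; it is not named in the text above.

```python
import json

# Capabilities from the examples above, expressed as a plain key/value map.
desired_caps = {
    "platformName": "iOS",       # ask for an iOS session rather than Android
    "browserName": "Safari",     # assumption: not mentioned in the text above
    "safariAllowPopups": True,   # let JavaScript open new windows in Safari
}

# Python's True becomes JSON true when the map is serialized.
serialized = json.dumps(desired_caps, indent=2, sort_keys=True)
print(serialized)
```

Because the capabilities are just a map, adding or overriding a setting is a matter of setting another key before the session is requested.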

Appium is a server written in Node.js. It can be built and installed from source or installed directly from npm.

Appium has client libraries (in Java, Ruby, Python, PHP, JavaScript, and C#) which support its extensions to the WebDriver protocol. When using Appium, you should use these client libraries instead of the regular WebDriver clients.

GUI wrappers around the Appium server are also available for download. These come bundled with everything required to run the Appium server, so you don’t need to worry about Node. They also come with an Inspector, which lets you examine the hierarchy of your app. This can come in handy when writing tests.