maciej@home:~/blog$

About security, penetration testing, python

Communication between Selenium, page script and Firefox extension

7 minutes to read

A few days ago I wrote a tool that automates the KNOXSS extension. At first, my tool could not communicate with the extension, so I couldn’t read its status and so on. The information I needed was generated by the extension in a background script, and I had to receive it in a page script and handle it in Python. The problem was that every solution I found described communication from an HTML page to the extension, but I needed the opposite direction. Second problem: I’m not a JavaScript programmer :)

Toolbelt:

  • geckodriver 0.23.0 (2018-10-04)
  • Mozilla Firefox 67.0b4 Developer Edition
  • Selenium 3.141.0
  • KNOXSS Add-on 1.2.0
  • Python 3.6.5

What I found:

I found plenty of Stack Overflow threads, but they were either too complicated for my problem or described communication in the opposite direction.

  1. Communication between HTML and your extension - describes the use of custom events, which gave me the idea to try them
  2. Communicating with background scripts
  3. Can my webdriver script catch an event from the webpage - catching JavaScript events by Selenium Webdriver

What I tried:

  1. Intercepting Console API entries in Selenium - Support for Selenium’s logging interface - didn’t work for me
  2. Communicating through an alert box (editing the extension code, adding alert(status) somewhere in the msgKnoxss function and handling it with Selenium as described here) - possible but not elegant.

Some theory about extensions:

Extensions are HTML and JavaScript code running in the browser. Their files have the XPI extension and are standard ZIP archives, so we can edit them easily. Let’s look at the structure of a typical extension:

/assets/webextension-anatomy.png
Figure 1. https://mdn.mozillademos.org/files/13669/webextension-anatomy.png

Add-ons have their own permission system, which describes how much they can interfere with web pages. We need a permission called "activeTab", which was already set by the author.
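
Because an XPI is a plain ZIP archive, the manifest and the permissions it declares can be inspected with nothing but Python’s standard library. The following is a minimal sketch and not part of the original tool; the function name read_permissions is an assumption made for illustration.

```python
import json
import zipfile


def read_permissions(xpi_path):
    """Return the list of permissions declared in an extension's manifest.json."""
    # An XPI file is a regular ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(xpi_path) as xpi:
        # WebExtensions keep their metadata in manifest.json at the archive root.
        manifest = json.loads(xpi.read("manifest.json").decode("utf-8"))
    # Extensions without a "permissions" key simply declare none.
    return manifest.get("permissions", [])
```

Calling read_permissions with the path to a local copy of the add-on should return a list that includes "activeTab" if the permission is declared as described above.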

There are three different types of JavaScript code when talking about extensions:


Name              | Properties
Page script       | Code running in the context of the web page. I can inject code here using Selenium.
Content script    | Part of the extension, but running in the context of the web page. A so-called proxy.
Background script | Logic of the extension. It cannot communicate with the page script directly (which is what I needed), but it can inject code as a content script.


As mentioned here:

There are two basic patterns for communicating between the background scripts and content scripts: you can send one-off messages, with an optional response, or you can set up a longer-lived connection between the two sides, and use that connection to exchange messages.

Everything looks great at first sight, but let’s look at the code:

// content-script.js

window.addEventListener("click", notifyExtension);

function notifyExtension(e) {
  if (e.target.tagName != "A") {
    return;
  }
  browser.runtime.sendMessage({"url": e.target.href});
}

// background-script.js

browser.runtime.onMessage.addListener(notify);

function notify(message) {
  browser.notifications.create({
    "type": "basic",
    "iconUrl": browser.extension.getURL("link.png"),
    "title": "You clicked a link!",
    "message": message.url
  });
}

This would be helpful if we wanted to send a message from a content script and receive it in a background script - the wrong direction. The second pattern, a longer-lived connection, would do it for us, but it looked too complicated for me.

I knew that JavaScript allows us to dispatch events on DOM objects, and I saw that Selenium can catch those events, so let’s take a closer look at that mechanism.

Custom events:

We have a function called msgKnoxss in the add-on’s background script, which is responsible for showing a notification each time KNOXSS has done its job. I added two lines of code that fire a custom event on the DOM object called document, a standard object that is available regardless of the contents of the HTML. If you read carefully, you already know that we are injecting this code as a content script:

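function msgKnoxss(text) {
   text = text.replace(/(\r\n|\n|\r)/gm, " ");

   // Inject a content script that fires a custom event on `document`.
   // JSON.stringify quotes and escapes the text, so quotes inside it
   // cannot break the injected code.
   var code = "var myEvent = new CustomEvent('knoxss_status', {'detail': " + JSON.stringify(text) + "});" +
              "document.dispatchEvent(myEvent);";
   browser.tabs.executeScript(null, { code: code });

   browser.notifications.create({
     "type": "basic",
     "iconUrl": browser.extension.getURL("icons/k.png"),
     "title": "KNOXSS Msg Service",
     "message": text
   });
}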
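
After editing the background script, the change has to go back into the archive. Python’s zipfile module cannot modify a member in place, so one way is to copy the archive and swap the file on the way. This is only a sketch under assumptions: repack_xpi is a hypothetical helper, and the member name used in the note below is a placeholder, not necessarily the real name inside the KNOXSS add-on.

```python
import zipfile


def repack_xpi(src_path, dst_path, replacements):
    """Copy an XPI archive, swapping in new contents for selected members.

    replacements maps archive member names to their new contents (bytes).
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            # Use the patched contents when provided, the original otherwise.
            data = replacements.get(item.filename, src.read(item.filename))
            # Reusing the ZipInfo keeps the member name and timestamp.
            dst.writestr(item, data)
```

A call could look like repack_xpi("knoxss.xpi", "knoxss-patched.xpi", {"background.js": patched_bytes}), where both file names and "background.js" are placeholders. The patched archive is no longer signed, so it presumably has to be loaded as a temporary add-on; as far as I know Selenium’s Firefox driver offers install_addon(path, temporary=True) for that, but check this against your Selenium version.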

Now we have to receive that information on the other side, in Python, using Selenium’s execute_script function, which injects our code into the page as a page script. Let’s assume we have a WebDriver object and have already loaded a web page with it:

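import time

# Store the detail of every knoxss_status event on the window object.
driver.execute_script("document.addEventListener(\"knoxss_status\", function(e){window.knoxss_status = e.detail}, false);")

# Poll until the event has fired and left a value behind.
while True:
    text = driver.execute_script("return window.knoxss_status")
    if text is not None:
        # Reset the value so that the next event can be detected.
        driver.execute_script("window.knoxss_status = null")
        print('Got KNOXSS event: {}'.format(text))
        break
    else:
        print("Waiting for KNOXSS event: {}".format(str(text)))
        time.sleep(0.5)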

JavaScript events are asynchronous. That’s why we have to store the detail of the custom event in a property of the window object and read it back in a loop.
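
The polling loop above waits forever if the event never arrives. A possible variant with a timeout is sketched below. It is not the code from my tool: wait_for_page_value and its parameters are names invented for this example, and it only relies on the driver having an execute_script method that accepts extra arguments, which Selenium exposes to the script as arguments[0].

```python
import time


def wait_for_page_value(driver, name, timeout=30.0, poll=0.5):
    """Poll window[name] in the page until it is set, then clear and return it."""
    deadline = time.monotonic() + timeout
    while True:
        # Selenium passes extra Python arguments to the script as arguments[i].
        value = driver.execute_script("return window[arguments[0]]", name)
        if value is not None:
            # Reset the property so that the next event can be detected.
            driver.execute_script("window[arguments[0]] = null", name)
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "window.{} was not set within {} seconds".format(name, timeout))
        time.sleep(poll)
```

With the listener from the previous snippet registered, wait_for_page_value(driver, "knoxss_status") would return the event detail or raise TimeoutError after 30 seconds.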

Summary

It might look trivial to an experienced JavaScript programmer, but for me it took a lot of trial and error. From a security point of view, modifying extensions is not something that should be done, because content scripts have access to all the web pages we browse. Moreover, we added a critical function similar to eval: executeScript runs whatever code string it is given. Personally, I don’t like JavaScript. Its standard library is so poor that even simple things like string escaping need to be implemented as a custom function.
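
To make the escaping point concrete, here is a small sketch of how such a code string can be built safely. It is written in Python purely for illustration, and build_event_code is a hypothetical name. On the JavaScript side, JSON.stringify plays the same role as json.dumps does here.

```python
import json


def build_event_code(text):
    """Build the JavaScript snippet that dispatches the knoxss_status event."""
    # json.dumps returns a double-quoted literal in which quotes, backslashes
    # and newlines are escaped, so the text cannot break out of the string.
    return (
        "document.dispatchEvent(new CustomEvent('knoxss_status', "
        "{'detail': " + json.dumps(text) + "}));"
    )
```

Plain concatenation between single quotes, by contrast, produces broken code (or worse, code chosen by someone else) as soon as the status text itself contains a single quote.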



Posted by Maciej Piechota on