Build your question-answering Slack bot in Python

This post will help you create your own know-it-all Slack bot in Python in a few very easy steps. The idea is to create a Slack bot that will respond to your questions in a public Slack channel with information it gathers from the internet. No AI will be used in this guide ;)

NOTE: If you just want to see the code, click here.

Prerequisites

First things first, a couple of prerequisites before you get started: a Slack account with a workspace you can manage, and a working Python 3 installation.

As always, I recommend you make use of venv to install the required Python packages once we get to the coding part :)
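If you have not used venv before, creating and activating one looks roughly like this (the directory name venv is just a convention, pick whatever you like):

python3 -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate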

1. Configuring Slack

By this point, you should have a Slack account along with your workspace (if you don’t, make sure you create one). Head over to the Slack apps dashboard and click Create New App. Set the App name to “QA Bot” (or something similar to your liking), select your workspace, and click Create App. Now that your bot is created, we need to configure it. Go to the “OAuth & Permissions” tab (under “Features”) on the left panel. Under “Bot Token Scopes”, add the following:

  • app_mentions.read

  • chat:write

Bot Token Scopes configuration

Now we need to enable “Socket Mode” for our bot. Socket Mode lets our bot receive events directly over a WebSocket connection instead of us having to expose a public endpoint for Slack to ping whenever an event occurs. Go to the “Socket Mode” tab under “Settings” on the left side and activate the “Enable Socket Mode” toggle. You should see a popup asking you to configure an app-level token. Fill in the token name (for example qa-app-token). For the scope, fill in:

  • connections:write

You will be presented with a token after you hit the “Generate” button. Save that token somewhere as we will need it later on (APP_TOKEN).

Once Socket Mode is enabled, we can (and will) enable Event Subscriptions. Go to “Event Subscriptions” under “Features” in the left panel and toggle “Enable events”. Configure “Subscribe to bot events” as per the image below; you want the app_mention event, which matches the app_mentions.read scope we added earlier.

Subscribe to bot events configuration

Under “Settings”, head over to the “Install App” tab and add the app to your workspace. This will give you the second token we need for later (BOT_TOKEN), shown under “Bot User OAuth Token”.

The only thing left is to create a public channel in your Slack workspace and invite (add) the bot user to it.

2. ScraperBox

Now that our Slack bot is configured and has the proper permissions, head over to ScraperBox and create a free account. Since we will be gathering information from the internet to answer the questions that you (or other Slack users) ask the QA Bot, we need a way to get search results from Google. Because scraping the search results directly is against Google’s TOS, we will use ScraperBox as a third-party service to do this for us via API calls. After the registration process is done, head over to the dashboard to get your API Token.

ScraperBox dashboard

Save the token somewhere as it will be needed for our Python app to work.

3. Python app

Since we have met all the other prerequisites, we can start building our Python app. Create a new Python project and, alongside it, a new venv that will hold our Python packages. Install the required packages:

pip install beautifulsoup4
pip install requests
pip install slack-bolt
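If you prefer to keep dependencies in one place, the same three packages can also live in a requirements.txt file (the file name is just the usual convention, nothing specific to this guide):

# requirements.txt
beautifulsoup4
requests
slack-bolt

and then be installed in one go with pip install -r requirements.txt.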

Once that is done, we can build our main app.py file:

from typing import Callable

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler


# Tokens obtained while configuring the Slack app
APP_TOKEN = "YOUR-APP-TOKEN"  # app-level token (Socket Mode)
BOT_TOKEN = "YOUR-BOT-TOKEN"  # Bot User OAuth Token
app = App(token=BOT_TOKEN)


@app.event("app_mention")
def mention_handler(body: dict, say: Callable):
    # For now, just dump the whole event payload and acknowledge the mention
    print(body)
    say("got it")


if __name__ == "__main__":
    handler = SocketModeHandler(app, APP_TOKEN)
    handler.start()
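A quick aside: hardcoding the tokens is fine for a toy project like this one, but if you would rather keep them out of the source, a small tweak could look like the sketch below (the SLACK_APP_TOKEN and SLACK_BOT_TOKEN environment variable names are my own choice, not something Slack requires):

import os

# Read the tokens from environment variables instead of hardcoding them
APP_TOKEN = os.environ["SLACK_APP_TOKEN"]  # app-level token (Socket Mode)
BOT_TOKEN = os.environ["SLACK_BOT_TOKEN"]  # Bot User OAuth Token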

The idea is to test our bot and see how much information we get once it reacts to users tagging it inside the Slack channel. Run app.py, head over to your Slack client, and tag the bot with some dummy text.

Dummy test of our new bot

Head over to your console (from where you ran app.py) and check the output. You should see a dictionary containing all the information our bot gets from a simple mention inside the Slack channel:

{
   "token":"*******",
   "team_id":"TEAM_ID",
   "api_app_id":"APP_ID",
   "event":{
      "client_msg_id":"ea28d05d-***********",
      "type":"app_mention",
      "text":"<@BOT_ID> test",
      "user":"SENDER_ID",
      "ts":"1633536422.000600",
      "team":"TEAM_ID",
      "blocks":[
         {
            "type":"rich_text",
            "block_id":"Sq8",
            "elements":[
               {
                  "type":"rich_text_section",
                  "elements":[
                     {
                        "type":"user",
                        "user_id":"SENDER_ID"
                     },
                     {
                        "type":"text",
                        "text":" test"
                     }
                  ]
               }
            ]
         }
      ],
      "channel":"CHANNEL_ID",
      "event_ts":"1633536422.000600"
   },
   "type":"event_callback",
   "event_id":"Ev02GW7QE6QK",
   "event_time":1633536422,
   "authorizations":[
      {
         "enterprise_id":"None",
         "team_id":"TEAM_ID",
         "user_id":"BOT_ID",
         "is_bot":true,
         "is_enterprise_install":false
      }
   ],
   "is_ext_shared_channel":false,
   "event_context":"some_random_text"
}

From here we can see some information we can use to make our bot somewhat smart. We can see the Slack IDs for both the sender and the bot, as well as the text that the user sent. Let’s rewrite our mention_handler to make it a bit smarter:

@app.event("app_mention")
def mention_handler(body: dict, say: Callable):
    # Let the sender know we are working on it
    sender_id = f"<@{body.get('event', {}).get('user')}>"
    say(f"Let me check that for you {sender_id}")
    # The text starts with the bot mention (e.g. "<@BOT_ID> question"),
    # so strip it out to keep only the question itself
    bot_id = body.get("event", {}).get("text").split()[0]
    message = body.get("event", {}).get("text")
    message = message.replace(bot_id, "").strip()
    answer = get_answer(message)
    say(answer)

The idea here is simple: get the sender and bot IDs, then extract the text the sender wrote after the bot mention. The missing piece is the get_answer function that will fetch an answer for the text the user sent to the bot. Let’s go ahead and make that one work. Create a new file called scraperbox.py and put this inside of it:

import requests
from bs4 import BeautifulSoup


API_TOKEN = "YOUR-SCRAPERBOX-API-TOKEN"


def _get_json_response(query: str) -> dict:
    # Ask ScraperBox to run a Google search for the query and
    # return the full HTML of the results page
    params = {
        "token": API_TOKEN,
        "q": query,
        "proxy_location": "gb",
        "return_html": "true",
    }
    resp = requests.get("https://api.scraperbox.com/google", params=params)
    return resp.json()


def get_answer(query: str) -> str:
    # Fallback used whenever Google has no knowledge panel for the query
    answer = "No idea how to answer that :("
    resp = _get_json_response(query)

    if "html" not in resp:
        return answer

    # Parse the results page and look for the knowledge panel description
    soup = BeautifulSoup(resp["html"], "html.parser")
    el = soup.find("div", class_="kno-rdesc")

    if not el:
        return answer

    return el.span.text

A lot is going on in the above code, so I will run through it briefly (feel free to explore it more by yourself). The function _get_json_response sends the API request to the ScraperBox service, asking it to perform a Google search for the text the user tagged our bot with. The params are taken from the ScraperBox docs for Google. The idea is to run a Google search for the given text and get the whole HTML of the results page back (inside a Python dictionary).
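If you want to poke at the raw response before wiring anything into Slack, a quick check from a Python shell in the project directory (using the names defined in scraperbox.py above) could look like this:

from scraperbox import _get_json_response

resp = _get_json_response("who was Nikola Tesla")
# With return_html set to "true", the results page HTML should be in the response
print("html" in resp, len(resp.get("html", "")))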

Let’s make a simple Google search for the term “who was Nikola Tesla” and see what we can get.

Google search for the term “who was Nikola Tesla”

As you can see in the image above, the right side contains a short, precise summary of the term we searched for. We also have this information in our Python app, since the whole HTML response is stored; we only need a way to extract it from the returned HTML. This is where beautifulsoup4 comes in handy, since it is made for parsing HTML content with Python. If we inspect the HTML of the above Google results, we will find that the information we need lives inside a div element with the class kno-rdesc.

Div element containing the wanted information
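To make the parsing step concrete, here is a toy example; the HTML string below is made up just to mirror the structure shown above:

from bs4 import BeautifulSoup

fake_html = '<div class="kno-rdesc"><span>...summary text...</span></div>'
soup = BeautifulSoup(fake_html, "html.parser")
# Find the knowledge-panel div by its class, then read the child span's text
print(soup.find("div", class_="kno-rdesc").span.text)  # -> ...summary text...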

We can use this to parse out the wanted content, so let’s take a look at the function get_answer. First, we set the default value of the answer variable to No idea how to answer that :( which will be used in case Google doesn’t have an answer to the user’s question. Then we call _get_json_response with the query (the text that the user sent to the bot) to get the Google search response containing the complete HTML. If html is not present in the response dictionary, we respond with the default answer. Next, we parse the HTML into the soup object and try to find the div element whose class attribute is set to kno-rdesc, as per the image above. If that fails, we again return the default answer; otherwise, we look up the child span element (as per the structure in the image above) and return its inner text. The last thing to do is to add the needed import to our app.py:

from scraperbox import get_answer
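If you want to test the scraping part in isolation first, you can also append a tiny manual test to the bottom of scraperbox.py; whether you get a real answer back depends on Google actually showing the kno-rdesc knowledge panel for your query:

if __name__ == "__main__":
    # Quick manual check: python scraperbox.py
    print(get_answer("who was Nikola Tesla"))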

Feel free to run app.py, and if everything goes well you should get your questions answered by our QA Bot :)

Complete code can be found here.

Examples of our new Bot trying to answer our questions

There we go: you just made a Siri/Alexa alternative for your Slack workspace. As always, thanks for reading!