Training

ChatterBot includes tools that help simplify the process of training a chat bot instance. ChatterBot’s training process involves loading example dialog into the chat bot’s database. This either creates or builds upon the graph data structure that represents the sets of known statements and responses. When a chat bot trainer is provided with a data set, it creates the necessary entries in the chat bot’s knowledge graph so that the statement inputs and responses are correctly represented.

[Figure: ChatterBot training statement graph]

Several training classes come built in with ChatterBot. These utilities range from allowing you to update the chat bot’s database knowledge graph based on a list of statements representing a conversation, to tools that allow you to train your bot based on a corpus of pre-loaded training data.

You can also create your own training class. This is recommended if you wish to train your bot with data you have stored in a format that is not already supported by one of the pre-built classes listed below.

Setting the training class

ChatterBot comes with training classes built in, or you can create your own if needed. To use a training class, import it and pass it to the set_trainer() method before calling train().

Training classes

Training via list data

chatterbot.trainers.ListTrainer(storage, **kwargs)

Allows a chat bot to be trained using a list of strings where the list represents a conversation.

For the training process, you will need to pass in a list of statements where the order of each statement is based on its placement in a given conversation.

For example, if you were to run both of the following training calls, then the resulting chat bot would respond to both statements of “Hi there!” and “Greetings!” by saying “Hello”.

from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

chatterbot = ChatBot("Training Example")
chatterbot.set_trainer(ListTrainer)

chatterbot.train([
    "Hi there!",
    "Hello",
])

chatterbot.train([
    "Greetings!",
    "Hello",
])

You can also provide longer lists of training conversations. This will establish each item in the list as a possible response to its predecessor in the list.

chatterbot.train([
    "How are you?",
    "I am good.",
    "That is good to hear.",
    "Thank you",
    "You are welcome.",
])
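One way to picture what the training calls above do is as a mapping from each statement to the replies that have followed it. The sketch below is a plain-Python illustration of that idea, not ChatterBot's storage implementation; `build_response_graph` and the dictionary layout are hypothetical names chosen for this example.

```python
from collections import defaultdict


def build_response_graph(conversations):
    """Map each statement to the set of statements that followed it.

    Illustration only: ChatterBot keeps this information in its storage
    backend, not in a dictionary like this one.
    """
    graph = defaultdict(set)
    for conversation in conversations:
        # Pair every item with its successor: (item 0, item 1), (item 1, item 2), ...
        for statement, response in zip(conversation, conversation[1:]):
            graph[statement].add(response)
    return dict(graph)


graph = build_response_graph([
    ["Hi there!", "Hello"],
    ["Greetings!", "Hello"],
    ["How are you?", "I am good.", "That is good to hear.", "Thank you", "You are welcome."],
])

for statement, responses in graph.items():
    print(statement, "->", sorted(responses))
```

Both “Hi there!” and “Greetings!” end up pointing at “Hello”, and the final item of the longer conversation has no entry of its own because nothing follows it.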

Training with corpus data

chatterbot.trainers.ChatterBotCorpusTrainer(storage, **kwargs)

Allows the chat bot to be trained using data from the ChatterBot dialog corpus.

ChatterBot comes with corpus data and a utility module that make it easy to quickly train your bot to communicate. To do so, simply specify the corpus data modules you want to use.

from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatterbot = ChatBot("Training Example")
chatterbot.set_trainer(ChatterBotCorpusTrainer)

chatterbot.train(
    "chatterbot.corpus.english"
)

Specifying corpus scope

It is also possible to import individual subsets of ChatterBot’s corpus at once. For example, if you only wish to train based on the English greetings and conversations corpora, then you would simply specify them.

chatterbot.train(
    "chatterbot.corpus.english.greetings",
    "chatterbot.corpus.english.conversations"
)

You can also specify file paths to corpus files or directories of corpus files when calling the train method.

chatterbot.train(
    "./data/greetings_corpus/custom.corpus.json",
    "./data/my_corpus/"
)
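To try the file-path form, you need a corpus file on disk. The sketch below writes one with the standard library. It assumes, without confirmation from this page, that a corpus file is a JSON object with one category name mapping to a list of conversations; compare it against the corpus files bundled with your installed ChatterBot version before relying on it. The helper `write_corpus` is a hypothetical name.

```python
import json
import os
import tempfile

# Assumed layout: one category name mapping to a list of conversations,
# each conversation being a list of strings in spoken order.
custom_corpus = {
    "greetings": [
        ["Good morning!", "Good morning to you."],
        ["How do you do?", "Very well, thank you."],
    ]
}


def write_corpus(directory, filename, corpus):
    """Write `corpus` as UTF-8 JSON and return the path of the new file."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(corpus, handle, ensure_ascii=False, indent=4)
    return path


# A temporary directory stands in for ./data/greetings_corpus/ here.
corpus_dir = tempfile.mkdtemp()
corpus_path = write_corpus(corpus_dir, "custom.corpus.json", custom_corpus)

# Read the file back to confirm it round-trips cleanly.
with open(corpus_path, encoding="utf-8") as handle:
    loaded = json.load(handle)
```

The resulting `corpus_path` is the kind of value that would be passed to the train method in the call above.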

Training with the Twitter API

chatterbot.trainers.TwitterTrainer(storage, **kwargs)

Allows the chat bot to be trained using data gathered from Twitter.

Parameters: random_seed_word – The seed word to be used to get random tweets from the Twitter API. This parameter is optional. By default it is the word ‘random’.

Create a new app using your Twitter account. Once created, it will provide you with the following credentials that are required to work with the Twitter API.

Parameter                     Description
twitter_consumer_key          Consumer key of the Twitter app.
twitter_consumer_secret       Consumer secret of the Twitter app.
twitter_access_token_key      Access token key of the Twitter app.
twitter_access_token_secret   Access token secret of the Twitter app.

Twitter training example

# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from settings import TWITTER
import logging


'''
This example demonstrates how you can train your chat bot
using data from Twitter.

To use this example, create a new file called settings.py.
In settings.py define the following:

TWITTER = {
    "CONSUMER_KEY": "my-twitter-consumer-key",
    "CONSUMER_SECRET": "my-twitter-consumer-secret",
    "ACCESS_TOKEN": "my-access-token",
    "ACCESS_TOKEN_SECRET": "my-access-token-secret"
}
'''

# Comment out the following line to disable verbose logging
logging.basicConfig(level=logging.INFO)

chatbot = ChatBot("TwitterBot",
    logic_adapters=[
        "chatterbot.logic.BestMatch"
    ],
    input_adapter="chatterbot.input.TerminalAdapter",
    output_adapter="chatterbot.output.TerminalAdapter",
    database="./twitter-database.db",
    twitter_consumer_key=TWITTER["CONSUMER_KEY"],
    twitter_consumer_secret=TWITTER["CONSUMER_SECRET"],
    twitter_access_token_key=TWITTER["ACCESS_TOKEN"],
    twitter_access_token_secret=TWITTER["ACCESS_TOKEN_SECRET"],
    trainer="chatterbot.trainers.TwitterTrainer"
)

chatbot.train()

chatbot.logger.info('Trained database generated successfully!')
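Hardcoding credentials in settings.py makes them easy to commit by accident. As one possible alternative, the sketch below builds the same TWITTER dictionary from environment variables. The helper `load_twitter_settings` and the environment variable names are hypothetical and not part of ChatterBot.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
_ENV_NAMES = {
    "CONSUMER_KEY": "TWITTER_CONSUMER_KEY",
    "CONSUMER_SECRET": "TWITTER_CONSUMER_SECRET",
    "ACCESS_TOKEN": "TWITTER_ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_settings(environ=os.environ):
    """Build the TWITTER dictionary from a mapping of environment variables.

    Raises KeyError listing every variable that is missing or empty.
    """
    missing = [name for name in _ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in _ENV_NAMES.items()}


# Demonstration with an explicit mapping instead of the real environment.
example = load_twitter_settings({
    "TWITTER_CONSUMER_KEY": "my-twitter-consumer-key",
    "TWITTER_CONSUMER_SECRET": "my-twitter-consumer-secret",
    "TWITTER_ACCESS_TOKEN": "my-access-token",
    "TWITTER_ACCESS_TOKEN_SECRET": "my-access-token-secret",
})
```

In settings.py you could then write `TWITTER = load_twitter_settings()` and leave the rest of the example unchanged.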

Training with the Ubuntu dialog corpus

chatterbot.trainers.UbuntuCorpusTrainer(storage, **kwargs)

Allows the chat bot to be trained using data from the Ubuntu Dialog Corpus.

This training class makes it possible to train your chat bot using the Ubuntu dialog corpus. Because of the file size of the Ubuntu dialog corpus, the download and training process may take a considerable amount of time.

This training class will handle the process of downloading the compressed corpus file and extracting it. If the file has already been downloaded, it will not be downloaded again. If the file is already extracted, it will not be extracted again.
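The skip-if-present behavior described above can be sketched as two small checks against the file system. This is an illustration of the pattern only, not the code used by UbuntuCorpusTrainer; `ensure_downloaded`, `ensure_extracted`, and `fake_fetch` are hypothetical names, and a locally built archive stands in for the real download.

```python
import os
import tarfile
import tempfile


def ensure_downloaded(archive_path, fetch):
    """Call `fetch(archive_path)` only if the archive is not on disk yet.

    Returns True if a download happened, False if it was skipped.
    """
    if os.path.exists(archive_path):
        return False
    fetch(archive_path)
    return True


def ensure_extracted(archive_path, target_dir):
    """Extract the archive only if `target_dir` does not exist yet.

    Returns True if extraction happened, False if it was skipped.
    """
    if os.path.isdir(target_dir):
        return False
    with tarfile.open(archive_path) as archive:
        archive.extractall(target_dir)
    return True


def fake_fetch(archive_path):
    """Stand-in for a real download: writes a tiny archive locally."""
    source = os.path.join(os.path.dirname(archive_path), "dialog.txt")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("Hello\nHi there\n")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source, arcname="dialog.txt")


work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, "corpus.tgz")
target_dir = os.path.join(work_dir, "corpus")

# The first pass does the work; the second pass finds it already done.
first = (ensure_downloaded(archive_path, fake_fetch), ensure_extracted(archive_path, target_dir))
second = (ensure_downloaded(archive_path, fake_fetch), ensure_extracted(archive_path, target_dir))
print(first, second)
```

Checking for the existing file or directory before acting is what makes repeated training runs cheap after the first one.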

Creating a new training class

You can create a new trainer to train your chat bot from your own data files. You may choose to do this if you want to train your chat bot from a data source in a format that is not directly supported by ChatterBot.

Your custom trainer should inherit from the chatterbot.trainers.Trainer class. Your trainer will need to have a method named train that can take any parameters you choose.
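As a rough sketch of that shape, the example below defines a trainer that reads statement and response pairs from a CSV file. To keep it runnable on its own, it uses a stand-in `Trainer` base class whose constructor mirrors the `(storage, **kwargs)` signature shown for the built-in trainers; in a real project you would import the base class from chatterbot.trainers instead. `CsvTrainer`, `InMemoryStorage`, and its `record` method are hypothetical, and real storage backends expose a different interface.

```python
import csv
import os
import tempfile


class Trainer:
    """Stand-in for chatterbot.trainers.Trainer so this sketch runs alone.

    In a real project, import the base class from chatterbot.trainers
    instead of defining this one.
    """

    def __init__(self, storage, **kwargs):
        self.storage = storage


class InMemoryStorage:
    """Hypothetical storage that just records (statement, response) pairs."""

    def __init__(self):
        self.pairs = []

    def record(self, statement, response):
        self.pairs.append((statement, response))


class CsvTrainer(Trainer):
    """Hypothetical trainer that reads statement,response rows from a CSV file."""

    def train(self, csv_path):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                if len(row) != 2:
                    continue  # Skip blank or malformed rows.
                statement, response = row
                self.storage.record(statement, response)


# Write a small CSV file to train from.
csv_path = os.path.join(tempfile.mkdtemp(), "dialog.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as handle:
    csv.writer(handle).writerows([
        ["Hi there!", "Hello"],
        ["How are you?", "I am good."],
    ])

storage = InMemoryStorage()
CsvTrainer(storage).train(csv_path)
```

The only fixed requirements taken from the text are the base class and the method name train; the CSV format and the parameters of train are choices made for this example.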

Take a look at the existing trainer classes on GitHub for examples.