Making a Local Instance of ConceptNet

This note explains how to get a local instance of ConceptNet running.

Heads-up

This note is built on top of the official guide:

Hard disk requirement: ~400GB in total (can be two separate blocks of 200GB).

Step-by-Step

1. Fetch the data

First, use Puppet to download the data onto your local system. You can work inside a virtual environment or not; I found it makes little difference (explained below).

Before you run any of the commands in this step, make sure about 200GB of space is free under the /home directory.
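The free-space requirement can be checked up front. This is a minimal sketch, assuming a POSIX shell with df and awk available; has_free_gb is a hypothetical helper name, not something shipped with ConceptNet.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEED_GB gigabytes available.
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P gives one line per filesystem; with -k, column 4 is available KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example use before running the Puppet scripts:
# has_free_gb /home 200 || echo "less than 200GB free under /home" >&2
```

The same check can be pointed at the second disk later on, if the data ends up on a different filesystem.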

cd [ANY_DIRECTORY]
git clone https://github.com/commonsense/conceptnet-puppet
cd conceptnet-puppet
sudo sh puppet-setup.sh
sudo sh puppet-apply.sh

It does not matter where you clone the GitHub repo, because it will not contain the data.

The puppet-setup.sh script creates a sudo-type user called conceptnet. You will not have this user's password, so you cannot run sudo as conceptnet. Later on you will have to switch between your own sudo account and the conceptnet account to make things work.

The puppet-setup.sh script downloads about 24GB of compressed data into /home/conceptnet/ and decompresses it to about 200GB. The process can be VERY slow. When the script finishes, you will be left in a shell as the conceptnet user. Exit it to return to your sudo user.

Running puppet-apply.sh then produces a flood of warnings about servers not being found and the like. Simply ignore them.

2. Build the database

Stay logged in as your sudo user. Start the database service:

sudo service postgresql start

This cannot be done as the conceptnet user, since you do not have its password :)

Switch to the conceptnet user:

sudo su conceptnet
cd ~/conceptnet5

You will need another 200GB of space here (at ~/conceptnet5/data/) to continue smoothly.

There is already data in ~/conceptnet5/data/. If your disk is running short, you can attach another 200GB of storage, create a soft link to it at ~/conceptnet5/data2, and move everything in ~/conceptnet5/data/ there. Then remove the now-empty data directory and rename data2 to data.
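The relocation can be sketched as a small shell function. This is a slight variant of the steps described here, not the exact procedure: it moves the whole directory to the larger disk first and then leaves a soft link at the old path. relocate_dir is a hypothetical helper name, and /mnt/bigdisk in the usage comment is a placeholder for wherever your extra 200GB is mounted.

```shell
# Hypothetical helper: move SRC to DEST and leave a symlink at SRC, so that
# anything using the old path keeps working.
relocate_dir() {
    src=$1
    dest=$2
    # Only act on a real directory; bail out if SRC is missing or already a link.
    [ -d "$src" ] && [ ! -L "$src" ] || return 1
    # Refuse to overwrite anything that already exists at DEST.
    [ ! -e "$dest" ] || return 1
    mv "$src" "$dest" && ln -s "$dest" "$src"
}

# Example use as the conceptnet user (the mount point is an assumption):
# relocate_dir ~/conceptnet5/data /mnt/bigdisk/conceptnet-data
```

The conceptnet user needs write permission on the destination, so the target directory may have to be created and chown'ed from your sudo account first.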

Still as the conceptnet user, run this in ~/conceptnet5/ before moving on:

pip install -e '.[vectors]'

Still as the conceptnet user, build and test the database. For me the process turned out to be very fast, and it should finish without errors.

./build.sh
./test.sh

3. Run the backend

Still as the conceptnet user, start the conceptnet service (I am not sure whether it is already running at this point):

systemctl restart conceptnet

This actually requires sudo, so it will ask which account to authenticate with. Simply select your sudo account.

Now switch back to your sudo user (simply type exit). Start a screen session, then switch to the conceptnet user again:

sudo su conceptnet
cd ~/conceptnet5/web
python conceptnet_web/api.py

The backend will be running at http://127.0.0.1:8084/. Press Ctrl+A, then D, to detach the screen session and leave the backend running in the background.
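After detaching, it can be handy to wait until the backend actually answers before sending queries. The following is a minimal sketch, assuming curl is installed; wait_for_url is a hypothetical helper name, and the retry count and one-second pause are arbitrary choices.

```shell
# Hypothetical helper: poll URL until it answers successfully, or give up
# after TRIES attempts (default 10).
wait_for_url() {
    url=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # -s hides progress output; -f turns HTTP errors (>= 400) into failures.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Example use from any shell on the same machine:
# wait_for_url http://127.0.0.1:8084/ 30 && echo "backend is up"
```

To get back to the running server later, `screen -ls` lists the sessions and `screen -r` reattaches to one.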

4. Query

Now let’s have fun with it. First, this should run without errors:

curl http://localhost:8084/

Then try:

curl http://localhost:8084/c/en/example

which prints a JSON response.

As a real-world query, try this:

curl "http://localhost:8084/search?rel=/r/Synonym&end=/c/en/play"

which prints the full JSON query result for Synonym edges ending in the word play. The URL is quoted because an unquoted & would make the shell cut the command off there and run it in the background.
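For repeated searches, the URL can be assembled by a small function. This is a minimal sketch; conceptnet_search_url is a hypothetical helper name, and the default base address is simply the local backend address used in this note. It does no URL encoding, so it only suits plain node and relation paths like the ones shown here.

```shell
# Hypothetical helper: print the /search URL for a relation and an end node.
# The optional third argument overrides the base address.
conceptnet_search_url() {
    rel=$1
    end=$2
    base=${3:-http://localhost:8084}
    printf '%s/search?rel=%s&end=%s\n' "$base" "$rel" "$end"
}

# Example use; keep the double quotes so the shell does not act on the &:
# curl -s "$(conceptnet_search_url /r/Synonym /c/en/play)" | python -m json.tool
```

Piping through `python -m json.tool` pretty-prints the response, which makes the edges much easier to read than the raw single-line JSON.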

Cheers:)