The OAPEN Suggestion Service uses natural-language processing to suggest books based on their content similarities. To protect user privacy, we utilize text analysis rather than usage data to provide recommendations. This service is built on the proof-of-concept and paper by Ronald Snijder from the OAPEN Foundation, and you can [read the paper here](https://liberquarterly.eu/article/view/10938).
And copy the public key to your clipboard. If you have a key on your computer already, you can use that.
6. Under "Choose Authentication Method" choose "SSH Key" and click "New SSH Key", and in the popup window paste the public key you copied to your clipboard. Make sure it is selected.
1. From the DigitalOcean dashboard, click "Databases" > "Create Database".
2. Ideally, select the same region & datacenter as the Droplet you just created, so they can be part of the same VPC network.
3. Choose "PostgreSQL v15".
4. Select any sizing plan, but the cheapest one will suffice.
5. Give the database a name, and click "Create Database Cluster".
6. Once the database has finished creating (this can take a few minutes), find the "Connection details" section on the new database's page; you will need these details later.
> Feel free to replace `1G` in the first command with `4G`. Although the service should never use this much memory, extra swap never hurts if you have the disk space to spare. More on swap [here](https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-20-04).
> Add information on how to retrieve certificate from DigitalOcean managed DB.
Create a directory in `api` called `certificates`. Once you have acquired a certificate for your managed database, copy it into `api/certificates`. **Make sure that this file is named `ca-certificate.crt`, or ensure that the name of your certificate matches the `CA_CERT` variable in your `.env`.**
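As a rough illustration of what the certificate is for, the sketch below builds a libpq-style connection URI that asks the client to verify the database server against `ca-certificate.crt`. The function name `build_dsn`, the placeholder credentials, and the choice of `sslmode=verify-full` are assumptions made for this example; the service itself may assemble its connection differently.

```python
from urllib.parse import quote, urlencode


def build_dsn(user, password, host, port, dbname, ca_cert_path,
              sslmode="verify-full"):
    """Build a PostgreSQL connection URI that points at a CA certificate.

    `sslmode` and `sslrootcert` are standard libpq connection parameters.
    All argument values are supplied by the caller; nothing here is read
    from the project's `.env`.
    """
    query = urlencode({"sslmode": sslmode, "sslrootcert": ca_cert_path})
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?{query}"
    )


# Placeholder values for illustration only.
print(build_dsn("user", "secret", "db.example.com", 5432, "mydb",
                "api/certificates/ca-certificate.crt"))
```

Special characters in the user name and password are percent-encoded, so the resulting URI stays valid even if the password contains characters such as `@` or `/`.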
#### SSL for API
To set up SSL for the API endpoint, first ensure that the proper ports are open, both in DigitalOcean's built-in firewall and on the droplet itself using `ufw`. DigitalOcean's firewall is sufficient on its own, so if you like you can just run `sudo ufw disable`.
If you'd like to keep both `ufw` and the DigitalOcean firewall running, enable the rules in `ufw`:
```bash
sudo ufw allow http
sudo ufw allow https
```
Next, enable ports `80` and `443` in the DigitalOcean dashboard for the droplet. `443` is for HTTPS traffic and `80` is for HTTP traffic, which is needed for certbot to re-issue certificates when they expire. Don't worry, nginx will redirect all non-certbot traffic to HTTPS automatically.
For certbot to issue an SSL certificate, your `DOMAIN` specified in `.env` must already have the proper DNS records pointing to the droplet's IPv4 address.
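If you want to confirm the DNS record before requesting a certificate, a small check along these lines may help. It is only a sketch: the function name `domain_points_to` is made up for this example, and the domain and IP address you pass in are whatever you configured for your own droplet.

```python
import socket


def domain_points_to(domain, expected_ip, resolve=socket.gethostbyname):
    """Return True if `domain` resolves to `expected_ip` (IPv4 only).

    `resolve` defaults to the system resolver and can be swapped out,
    for example in tests.
    """
    try:
        return resolve(domain) == expected_ip
    except socket.gaierror:
        # The name could not be resolved at all.
        return False
```

Call it as `domain_points_to("<domain>", "<droplet IPv4 address>")`. Keep in mind that DNS changes can take a while to propagate, so a `False` result shortly after editing the record is not necessarily an error.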
Then, just make sure the scripts are executable:
```bash
chmod +x setup-ssh.sh ready-ssh.sh
```
And run them in this order:
```bash
./setup-ssh.sh
./ready-ssh.sh
```
> Wait for `setup-ssh.sh` to run to completion before running `ready-ssh.sh`.
The API should now be accessible over HTTPS only, at `https://<domain>/api`!
However, to ensure that certificates are renewed before they expire, add a `cron` job that renews the certificate automatically. First, open the cron editor:
```bash
crontab -e
```
And add a line, replacing `/home/oapen/oapen-suggestion-service` with wherever you cloned the repository to locally:
> _NOTE_: The `-d` flag runs the services in the background, so you can safely exit the session and the services will continue to run. The `--build` flag ensures any changes to the code are reflected in the containers.
Log files are automatically generated by Docker for each container. The log files can be found in `/var/lib/docker/containers/<container-id>/*-json.log`.
After some time, log files may take up too much disk space. To clear all logs on the host machine, run `truncate -s 0 /var/lib/docker/containers/*/*-json.log`.
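Before truncating anything, it can be useful to see how much space each log file actually occupies. The sketch below lists file sizes for the log path pattern mentioned above; the function name `log_sizes` is hypothetical, and reading `/var/lib/docker` normally requires root privileges.

```python
import glob
import os

# Same path pattern as the truncate command above.
DEFAULT_PATTERN = "/var/lib/docker/containers/*/*-json.log"


def log_sizes(pattern=DEFAULT_PATTERN):
    """Map each log file matching `pattern` to its size in bytes."""
    return {path: os.path.getsize(path) for path in glob.glob(pattern)}


def total_log_bytes(pattern=DEFAULT_PATTERN):
    """Total size in bytes of all log files matching `pattern`."""
    return sum(log_sizes(pattern).values())
```

If the pattern matches nothing (for example, when run without sufficient permissions), both functions simply report no files and a total of zero.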
`threshold` (optional): sets the minimum similarity score a suggestion must have to be returned. Default is 0, which returns all suggestions.
#### Examples
> **NOTE**: You won't need to worry about the forward slash in handles causing problems; this is handled server-side.
- `/api/20.400.12657/47581`
  Returns suggestions for [the book](https://library.oapen.org/handle/20.500.12657/37041) with the handle `20.400.12657/47581`.
- `/api/20.400.12657/47581?threshold=3`
  Returns suggestions with a similarity score of 3 or more for [the book](https://library.oapen.org/handle/20.500.12657/37041) with the handle `20.400.12657/47581`.
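The requests above could be issued from Python roughly as follows. This is a sketch under assumptions: the helper names `suggestions_url` and `fetch_json` are invented for this example, the base URL is whatever host your deployment runs on, and the response is assumed to be JSON, which this section does not state explicitly.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def suggestions_url(base_url, handle, threshold=None):
    """Build the suggestions URL for a handle, optionally with a threshold.

    The handle is inserted as-is, since the forward slash it contains
    is handled server-side.
    """
    url = f"{base_url.rstrip('/')}/api/{handle}"
    if threshold is not None:
        url += "?" + urlencode({"threshold": threshold})
    return url


def fetch_json(url, timeout=10):
    """Fetch `url` and parse the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Handle copied from the examples above; the host is a placeholder.
print(suggestions_url("https://example.org", "20.400.12657/47581", threshold=3))
```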
### GET /api/{handle}/ngrams
Returns the ngrams and their occurrences for the book with the specified handle.
#### Path Parameters
`{handle}` (required): the handle of the book to retrieve.
#### Example
`/api/20.400.12657/47581/ngrams`
Returns ngrams and their occurrences for [the book](https://library.oapen.org/handle/20.500.12657/37041) with the handle `20.400.12657/47581`.
This engine is written in Python, and generates the recommendation data for users.
Our suggestion service is centered around the trigram semantic inferencing algorithm. This script should be run as a job on a cron schedule to periodically ingest new texts added to the OAPEN catalog through their API. It populates the database with pre-processed lists of suggestions for each entry in the catalog.
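To make the idea of trigrams more concrete, here is a minimal sketch that counts word trigrams in a text and scores two texts by the trigrams they share. The tokenization, the function names, and the overlap score are illustrative assumptions, not the engine's actual implementation, which may preprocess text and weight ngrams differently.

```python
import re
from collections import Counter


def trigrams(text):
    """Count word trigrams (runs of three consecutive words) in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(zip(words, words[1:], words[2:]))


def shared_trigram_count(text_a, text_b):
    """Number of trigram occurrences the two texts have in common.

    Counter intersection keeps the minimum count of each shared trigram.
    """
    return sum((trigrams(text_a) & trigrams(text_b)).values())


a = "open access books are free to read"
b = "these open access books are online"
print(shared_trigram_count(a, b))
```

In this toy scoring, a higher count means the two texts share more three-word sequences, which is the kind of content similarity a `threshold` value can then filter on.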
The embed script is a drop-in snippet of HTML, CSS, and JavaScript that can be added to the [library.oapen.org](https://library.oapen.org/) site, and adds book recommendation functionality to the sidebar of each book page.
You can find the code for the embed script in `embed-script/`, and read more about it in [`embed-script/README.md`](embed-script/README.md).
This is a web-app demo that can be used to query the API engine and see suggested books. It does not have to be maintained if the API is used on another site, but it is useful for development and as a tech demo.
This project uses Docker. You can find instructions for installing Docker [here](https://docs.docker.com/get-docker/). Note that if you do not install Docker with Docker Desktop (which is recommended), you will have to install Docker Compose separately; instructions for that are [here](https://docs.docker.com/compose/install/#scenario-two-install-the-compose-plugin).
You can find instructions for installing PostgreSQL on your machine [here](https://www.postgresql.org/download/).
Or you can create a PostgreSQL server with Docker:
```bash
docker run -d --name postgres -p 5432:5432 -e POSTGRES_PASSWORD=postgrespw postgres
```
> The username and database name will both be `postgres` and the password will be `postgrespw`. You can connect via the hostname `host.docker.internal` over port `5432`.