oapen-suggestion-service/oapen-engine
Peter Rauscher 40cac8ba55
DB SSL, API format changes, new endpoints, and unit testing with Actions
* Run unit tests with Github Actions on each push

* Change job timeout to 10 minutes

* Fix for sslmode in API connection string

* Select lists of suggestions or ngrams with /api or /api/ngrams respectively, JSONify ngrams response

* Added better documentation of API endpoints

* Switch from connection string to connection object in API
2023-04-18 19:48:09 -04:00
scripts DB SSL, API format changes, new endpoints, and unit testing with Actions 2023-04-18 19:48:09 -04:00
src Change database build & rebuild behavior, removes RUN_CLEAN option (#59) 2023-04-18 11:09:47 -04:00
.gitignore Fix makefile for linux and variable python (#27) 2022-12-04 21:54:40 -05:00
Dockerfile Change database build & rebuild behavior, removes RUN_CLEAN option (#59) 2023-04-18 11:09:47 -04:00
Makefile Basic testing (#45) 2023-03-22 14:52:38 -04:00
Pipfile add schedule package to daemon (#48) 2023-03-31 15:33:41 -04:00
README.md Change database build & rebuild behavior, removes RUN_CLEAN option (#59) 2023-04-18 11:09:47 -04:00
pipenv-proper-names.txt Create mining engine boilerplate (#2) 2022-09-27 15:07:50 -04:00
pyvenv.cfg OAP-22: Set up python build job in GH actions (#4) 2022-09-30 15:49:04 -04:00

README.md

Suggestion Engine

Updating/migrating the database

When you make database changes or add new stopwords, you'll want to completely re-run the harvesting and suggestion creation for the database. This happens weekly by default, but you have some more immediate options:

To erase & recreate the database NOW, you can run:

docker compose run oapen-engine clean now

WARNING: You will lose ALL database data! Reruns are resource-intensive and lengthy, so be sure before running this. Running it while the service is active could cause unexpected errors, in which case you will need to restart the container.

To erase & recreate the database on the next run, you can run:

docker compose run oapen-engine clean true

WARNING: You will lose ALL database data! Reruns are resource-intensive and lengthy, so be sure before running this. This is safer than the previous command and should not cause any breakage, even if the service is actively using the database.

To cancel the operation above, so the database is not erased on the next run, you can run:

docker compose run oapen-engine clean false

How it works

Those last two operations work by creating or deleting a table called migrate in the oapen_suggestions schema in the database. On startup, the daemon checks whether this table exists. If it does, the daemon drops and recreates the schema, tables, and types, and then deletes the migrate table. If it does not, the database is left as-is. You can also manually create the migrate table via an SQL query in any database admin tool, and the database will be re-created on the next run.
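The startup check described above could look roughly like the sketch below. It is not the engine's actual code: the function names, the `rebuild` callback, and the SQL statements are assumptions made for illustration, and the SQL assumes a PostgreSQL database accessed through a DB-API cursor (for example from psycopg2).

```python
# Hypothetical sketch of the "migrate" flag-table check; names and SQL are
# assumptions, not taken from the oapen-engine source.

# Assumed PostgreSQL query: does oapen_suggestions.migrate exist?
MIGRATE_CHECK_SQL = """
SELECT EXISTS (
    SELECT 1
    FROM information_schema.tables
    WHERE table_schema = 'oapen_suggestions'
      AND table_name = 'migrate'
)
"""

# Assumed statements for setting and clearing the flag by hand.
# PostgreSQL accepts a table with no columns, which is enough for a flag.
REQUEST_MIGRATION_SQL = "CREATE TABLE IF NOT EXISTS oapen_suggestions.migrate ()"
CLEAR_MIGRATION_SQL = "DROP TABLE IF EXISTS oapen_suggestions.migrate"


def migration_requested(cursor) -> bool:
    """Return True if the migrate flag table currently exists."""
    cursor.execute(MIGRATE_CHECK_SQL)
    return bool(cursor.fetchone()[0])


def run_startup_check(cursor, rebuild) -> bool:
    """Rebuild the database only when the migrate flag table is present.

    `rebuild` is a caller-supplied function that drops and recreates the
    schema, tables, and types. Returns True if a rebuild happened.
    """
    if not migration_requested(cursor):
        return False  # no flag table: leave the database as-is
    rebuild()
    # Clear the flag so the next startup does not rebuild again.
    cursor.execute(CLEAR_MIGRATION_SQL)
    return True
```

Using a table as the flag means the request survives container restarts and can be set from any database admin tool, which matches the manual option mentioned above.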

Running the engine alone

Ensure that you have followed the setup instructions, then run:

docker-compose up -d --build

Refreshing items + suggestions manually

./scripts/refresh.sh

How to remove/filter out bad ngrams

People with access to the repository can create a pull request to edit the stopwords used to filter out bad trigrams:

oapen-engine/src/model/stopwords_*.txt

Changes to the stopwords will not be reflected until the next harvest, which occurs weekly by default.
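As a rough illustration of how stopword files can be used to filter trigrams, here is a minimal sketch. The function names and the filtering rule (drop a trigram if any of its words is a stopword) are assumptions for illustration, as is the one-word-per-line file layout; the engine's actual logic may differ.

```python
# Hypothetical sketch of stopword-based trigram filtering; not the engine's
# actual implementation.
from pathlib import Path


def load_stopwords(directory):
    """Collect stopwords from every stopwords_*.txt file in `directory`.

    Assumes one stopword per line; blank lines are ignored and words are
    lowercased.
    """
    words = set()
    for path in sorted(Path(directory).glob("stopwords_*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            word = line.strip().lower()
            if word:
                words.add(word)
    return words


def filter_trigrams(trigrams, stopwords):
    """Keep only trigrams in which no word is a stopword (assumed rule)."""
    return [
        trigram
        for trigram in trigrams
        if not any(word.lower() in stopwords for word in trigram)
    ]
```

For example, with "the" as a stopword, the trigram ("the", "open", "access") would be dropped while ("open", "access", "book") would be kept.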