OAPEN Suggestion Service Web Application - SIT Senior Capstone

OAPEN Suggestion Engine

The OAPEN Suggestion Engine will suggest ebooks based on other books with similar content.

Running the project

The project runs in Docker, so you need Docker installed before you start. Installation instructions are in the official Docker documentation.

1. Clone the repository

git clone https://github.com/EbookFoundation/oapen-suggestion-service.git

2. Install PostgreSQL

The project uses PostgreSQL as its database. Installation instructions are in the official PostgreSQL documentation. Make sure the server is running and that a database has been created.

Create a database.ini file in oapen-engine/src with the following:

[postgresql]
host=127.0.0.1
database=postgres
user=<username>
password=<your-password>
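
As a minimal sketch of how a Python script could read this file, the snippet below uses the standard-library configparser module. The function name load_db_params is hypothetical and is not taken from the engine's code. Interpolation is disabled so that a password containing % is read literally.

```python
from configparser import ConfigParser
from pathlib import Path


def load_db_params(path, section="postgresql"):
    """Return the key/value pairs of one INI section as a dict.

    Hypothetical helper for illustration; not part of the project.
    """
    # ConfigParser.read() silently skips missing files, so check explicitly.
    if not Path(path).is_file():
        raise FileNotFoundError(f"{path} not found")

    # interpolation=None keeps '%' characters in passwords intact.
    parser = ConfigParser(interpolation=None)
    parser.read(path)

    if not parser.has_section(section):
        raise KeyError(f"section [{section}] missing from {path}")
    return dict(parser.items(section))
```

The returned dict uses the keys host, database, user and password, which psycopg2.connect accepts as keyword arguments, so it could plausibly be passed on as psycopg2.connect(**load_db_params("oapen-engine/src/database.ini")).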

Edit config.env in api/ with the following:

DATABASE_URL="postgres://<username>:<your-password>@127.0.0.1:5432/postgres"
PORT=3001
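
The DATABASE_URL packs the same credentials as database.ini into a single string. As a quick way to check that the two files agree, the sketch below splits such a URL into its parts with the standard library. The function name parse_database_url is hypothetical, and the default port of 5432 is an assumption for URLs that omit it.

```python
from urllib.parse import unquote, urlsplit


def parse_database_url(url):
    """Split a postgres:// URL into the fields used in database.ini.

    Hypothetical helper for illustration; not part of the project.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed default when no port is given
        "database": parts.path.lstrip("/"),
        # Credentials may be percent-encoded in a URL, so decode them.
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
    }
```

For the example above, host, database, user and password should match the values in database.ini.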

3. Run the project

docker compose up

Try connecting to the API at http://localhost:3001/. You can also run all the servers together with ./all-dev.sh, after installing the dependencies with . ./setup.sh (note the leading dot, which sources the script into the current shell).
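
To check from a script that the API is reachable, something like the following sketch could be used. It only tests whether anything answers HTTP at the base URL; it assumes nothing about the API's routes, which are not described here. The function name api_is_up is hypothetical.

```python
import urllib.error
import urllib.request


def api_is_up(base_url="http://localhost:3001/", timeout=3.0):
    """Return True if a server answers HTTP at base_url, whatever the status."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status (e.g. 404), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_up())
```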

Monorepo components

This project is a monorepo, with multiple pieces that can be added or removed as necessary for deployment.

Mining Engine (Core)

This engine is written in Python and generates the recommendation data for users. The suggestion service is centered on a trigram semantic inferencing algorithm. The engine should be run as a job on a cron schedule so that it periodically ingests new texts added to the OAPEN catalog through the OAPEN API. It populates the database (see Database section) with pre-processed lists of suggestions for each entry in the catalog.
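
To give a rough feel for trigram-based matching, here is a minimal sketch using only the standard library. It is not the engine's implementation: the tokenizer, the choice of the n most frequent trigrams, and the Jaccard overlap score are all assumptions made for illustration, and every function name is hypothetical.

```python
import re
from collections import Counter


def trigrams(text):
    """Lowercase the text, split it into words, and return all 3-word windows."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def top_trigrams(text, n=10):
    """Return the set of the n most frequent trigrams in the text."""
    return {gram for gram, _ in Counter(trigrams(text)).most_common(n)}


def similarity(text_a, text_b, n=10):
    """Jaccard overlap of the two texts' top trigrams, between 0.0 and 1.0."""
    a, b = top_trigrams(text_a, n), top_trigrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Under these assumptions, a batch job could score each pair of books once and store the best matches per book, so that serving suggestions later needs only a lookup.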

You can find the code for the mining engine in oapen-engine/.

Information about running the mining engine is in oapen-engine/README.md.

Base dependencies:

  • Python v3.10
  • PIP package manager
  • make

Automatically-installed dependencies:

  • nltk -- Natural language toolkit.
    • Maintained on GitHub by 300+ contributors.
    • Most recent update: 8 days ago
  • requests -- HTTP request library
    • Maintained on GitHub by 600+ contributors, and backed by sponsors.
    • Most recent update: 1 month ago.
  • psycopg2 -- PostgreSQL Database Adapter
    • Maintained on GitHub by 100+ contributors, and used by 480,000+ packages.
    • Most popular PostgreSQL database adapter for Python
  • pandas -- data analysis library
    • Maintained by PyData, with many sponsors and 2,700+ contributors.
  • scikit-learn -- machine learning library

API Engine (Core)

This API server returns a list of recommended books from the database.

You can find the code for the API engine in api/.

Configuration info for the API engine is in api/README.md.

Base dependencies:

  • NodeJS 14.x+
  • NPM package manager

Automatically-installed dependencies:

  • express -- basic HTTP server
  • pg-promise -- basic PostgreSQL driver
  • dotenv -- loads environment variables from .env

Web Demo (Optional)

This is a web-app demo that can be used to query the API engine and see suggested books. This does not have to be maintained if the API is used on another site, but is useful for development and a tech demo.

You can find the code for the web demo in web/.

Configuration info for the web demo is in web/README.md.

Base dependencies:

  • NodeJS 14.x+
  • NPM package manager

Automatically-installed dependencies:

  • next -- React framework for production web apps
    • Maintained by Vercel and the open source community
  • react -- Frontend UI library
    • Maintained by Meta.
    • One of the most widely used frontend web UI libraries.
    • (Alternative considered: Angular. Note that it is Angular's predecessor, AngularJS, that Google has discontinued; Angular itself is still maintained.)
  • pg -- basic PostgreSQL driver
  • typescript -- Types for JavaScript
    • Maintained by Microsoft and the open source community.