Justin O'Boyle
376545450d
Basic testing ( #45 )
...
* finished changes to stopwords and languages
* final changes to stopwords
* basic testing
* add tests
* Remove formatter for now
* fix merge
* cd
* touch __init__
* Relative path issue?
* run tests before app
* Move tests to inside docker
* exit when any command fails
---------
Co-authored-by: Max Zaremba <max.zaremba@gmail.com>
2023-03-22 14:52:38 -04:00
Celina Peralta
884872cf60
celinanperalta/OAP-58, OAP-59: Update suggestion task, remove unnecessary collections from harvest ( #42 )
...
* Each thread inserts into DB using one synchronized conn
* Fix formatting for get_empty query
* OAP-59: Filter out unnecessary collections from harvest
* Add endpoints table check
* Fix typo in get_empty description
2023-03-22 13:15:53 -04:00
Celina Peralta
15f801a19a
Bug fixes for prod harvest ( #40 )
...
* fix tqdm counter and add request headers
* fix typo in generate suggestions, add limit to seed.py
2023-03-07 12:41:36 -05:00
Celina Peralta
ebaaa7cab3
fix tqdm counter and add request headers ( #39 )
2023-03-06 15:37:33 -05:00
Celina Peralta
888e73e6e2
OAP-56 and logging ( #38 )
...
* incremental progress, logging, synchronization
* update refresh/harvest period
* add logger StreamHandler, add Dockerfile nltk download, tweak seed parameters
* update readme and add scripts for manual commands
2023-03-06 14:44:45 -05:00
Justin O'Boyle
f4b9ed39ab
Embed script deployment fixes ( #37 )
...
* Don't EncodeURIComponent
* Fix CORS
* Explicitly define CORS header
2023-03-03 09:13:53 -05:00
Justin O'Boyle
83938f73b5
Correct CORS headers and first pass at embedded API item ( #36 )
...
* Ignore cross origin
* Add script
* Dynamic host
2023-03-03 08:58:02 -05:00
Celina Peralta
e772dc2b87
OAP-54: Full harvest for DB, add threshold ( #34 )
...
* Fix harvest synchronization, add threshold parameter
* Move daemon env vars to docker-compose.yml
2023-02-23 19:23:23 -05:00
Celina Peralta
f7c33c07e9
OAP-53 Fix engine Dockerfile, build psycopg2 from source not binary, write daemon ( #32 )
...
* add libpq5 to build
* remove psycopg2-binary
* add punkt as resource
* Use multiprocessing.cpu_count for max suggestion workers
2023-02-10 07:45:44 -05:00
Peter Rauscher
32bb124706
Fixed minor error with handle validation ( #33 )
2023-02-09 16:46:22 +00:00
Celina Peralta
27b9a77f78
OAP-48, OAP-50 ( #30 )
2022-12-13 08:25:40 -05:00
Justin O'Boyle
7ac8bd7af8
Make docker not run on localhost ( #31 )
2022-12-13 08:25:25 -05:00
Justin O'Boyle
535715932d
Setup docker ( #26 )
...
* basic config
* Add github action
* Fix makefile for linux and variable python (#27 )
* fix makefile
* remove out
* add to gitignore
* Fix makefile for linux and variable python (#27 )
* fix makefile
* remove out
* add to gitignore
* Fix dockerfile
* stash changes
* Make makefile dynamic (#28 )
* Remove broken docker packages for now
* Add web
* Make Black Formatter happy?
Co-authored-by: Max Zaremba <max.zaremba@gmail.com>
2022-12-13 07:46:08 -05:00
Justin O'Boyle
924d2b2539
Make makefile dynamic ( #28 )
2022-12-05 20:57:25 -05:00
Max Zaremba
63619d3aa9
Fix makefile for linux and variable python ( #27 )
...
* fix makefile
* remove out
* add to gitignore
2022-12-04 21:54:40 -05:00
Celina Peralta
9bfdc51e5c
Tweak seed task params ( #25 )
...
* No get text
* remove pytest
* add OAP-39 work.
* add weekly item endpoint
* get weekly items
* refresh + generate suggestions
* thread seed tasks
* fix generate_suggestions concurrency
* fix typos
* draft concurrent data ingest
* seed, refresh tasks
* change seed config params
2022-11-29 01:19:16 -05:00
Celina Peralta
922ff68a17
celinanperalta/OAP 23 ( #22 )
...
* No get text
* remove pytest
* add OAP-39 work.
* add weekly item endpoint
* get weekly items
* refresh + generate suggestions
* thread seed tasks
* fix generate_suggestions concurrency
* fix typos
* draft concurrent data ingest
* seed, refresh tasks
2022-11-28 22:17:43 -05:00
Justin O'Boyle
a8563e48be
Fix build job ( #24 )
...
* Fix type
* fix formatting
2022-11-28 19:36:41 -05:00
j-sofia
cdf4659146
OAP-37: Read stopwords from txt ( #23 )
...
* read stopwords from txt
  read stopwords from txt and README change
* leftover code removed
* formatting
* formatting again
* formatting last try
2022-11-16 17:22:34 -05:00
Justin O'Boyle
9435a69032
OAP-40 Align API more closely with ngram generation, fix environment ( #21 )
...
* First commit
* Update gitignore
* Update schema
* Remove todo
2022-11-14 15:20:43 -05:00
Celina Peralta
1aa611231b
OAP-36, OAP-39: No get_text() in OapenItem, register adapters for DB objects in Python ( #20 )
...
* No get text
* remove pytest
* add OAP-39 work.
2022-11-09 19:36:20 -05:00
Peter Rauscher
013fef0f0d
OAP-38: Add /ngrams endpoint to API ( #19 )
...
* Added data function to query ngrams within the api
* removed cleaning and seeding tasks from API level, handled at engine level
* Added /:handle/ngrams endpoint to API routes
* Reflected change from uuid to handle in log messages within API
* Just some API readme changes
* Added regex to routes to mitigate url decoding, plus added validation function for handle
Co-authored-by: j-sofia <joey.sofia1@gmail.com>
Co-authored-by: Peter Rauscher <peterrauscher@protonmail.com>
2022-11-09 18:45:48 -05:00
Celina Peralta
4333d4fcc3
[Draft] OAP-32 Ngram Caching ( #18 )
...
* start caching ngrams
* fix build warnings
* add timestamp
* resolve comments
* pull out mogrify
* remove pytest from hook for now
2022-11-02 23:07:56 -04:00
j-sofia
ccbdda287e
OAP-17: PostgreSQL integration into API with pg-promise, data function to read from DB, dotenv to read DB credentials from environment variables ( #9 )
...
* local db connection and data functions
  added pg-promise package to interface with PostgreSQL, added data functions, changed api to port 3001, updated README.md
* pr review changes
* dotenv
* Update README.md with api dependencies
* Update README.md
* PR changes
* typo
2022-10-26 03:07:10 +00:00
Celina Peralta
cf9569a358
celinanperalta/OAP 33 ( #16 )
...
* make db use handle not uuid
* remove lib
* remove lib
2022-10-24 19:35:43 -04:00
Celina Peralta
fd7f30ca31
celinanperalta/OAP 31 ( #17 )
...
* Get items by handle
* Get items by handle
* remove lib
* update clean/seed tasks
* refactor ngrams
* fix typo in oapen.py
* why is this not being ignored
2022-10-24 19:30:57 -04:00
Celina Peralta
06468a650c
Merge branch 'celinanperalta/OAP-31' into main
2022-10-24 19:20:50 -04:00
Celina Peralta
5a974ff0f1
merge main
2022-10-24 19:19:31 -04:00
Celina Peralta
162eb86497
fix typo in oapen.py
2022-10-23 19:53:48 -04:00
Celina Peralta
ee45695fb6
Merge branch 'main' of https://github.com/EbookFoundation/oapen-suggestion-service into main
2022-10-23 19:52:52 -04:00
Celina Peralta
1520f08b05
Fix pre-commit hook + linting jobs for OAPEN engine ( #14 )
...
* sync upstream
* isort, black, flake8 precommit hook
* Ignore bin
* reset bin
* reset bin
* try to fix black
* remove bin!
* update gh action
2022-10-23 19:51:47 -04:00
Celina Peralta
d2668491ab
refactor ngrams
2022-10-18 13:58:04 -04:00
Celina Peralta
0b97c79bde
Merge remote-tracking branch 'upstream/main' into celinanperalta/OAP-31
2022-10-18 13:40:36 -04:00
Celina Peralta
6589faa2e3
update clean/seed tasks
2022-10-18 13:39:16 -04:00
Celina Peralta
2bba7eaf98
remove lib
2022-10-18 13:24:35 -04:00
Celina Peralta
ed617affd5
Get items by handle
2022-10-18 13:21:55 -04:00
Celina Peralta
fded0c1344
Get items by handle
2022-10-18 13:21:35 -04:00
Max Zaremba
033fc1e56e
OAP 26 ( #12 )
...
We are disregarding the linting job failure, as it may be an environment issue. It will be fixed in subsequent PRs.
2022-10-18 11:31:49 -04:00
Celina Peralta
417c55ed33
Merge branch 'main' of https://github.com/EbookFoundation/oapen-suggestion-service into main
2022-10-18 09:42:41 -04:00
Justin O'Boyle
09ec61b7d7
OAP-35 Connect `api/` and `web/`, fix querying between them, add running documentation & make dev environment easier to use ( #13 )
2022-10-18 08:10:11 -04:00
Justin O'Boyle
962f2d0972
OAP-21 Add project details and dependency maintenance info to README ( #10 )
2022-10-11 15:54:45 -04:00
Celina Peralta
3392f79665
Merge branch 'EbookFoundation:main' into main
2022-10-11 14:11:41 -04:00
Celina Peralta
2e0f398055
OAP-14: DB connection and seeding ( #8 )
...
Add seeding for database, fix pre-commit hooks, and add Makefile
2022-10-11 14:08:35 -04:00
Celina Peralta
48e79bbc4f
Merge branch 'EbookFoundation:main' into main
2022-10-06 18:21:14 -04:00
Peter Rauscher
4d7afab630
Split API from demo app and began Express framework ( #7 )
2022-10-06 18:49:58 +00:00
Celina Peralta
f0956854ea
Merge https://github.com/EbookFoundation/oapen-suggestion-service into main
2022-10-04 08:24:05 -04:00
Celina Peralta
0f247eac8c
OAP-15, OAP-22: Data ingest + text preprocessing ( #6 )
...
* sync upstream
* db skeleton
* update readme
* basic api calls
* api calls
* data ingest + text preprocessing
* update gitignore
* remove lib changes
* lint
* remove unused imports
* gitignore updates
* update python job
* ignore flake8 warnings
2022-10-04 08:22:55 -04:00
Justin O'Boyle
4a4a318c68
OAP-18 Create basic framework to view books through OAPEN (rough is OK) ( #5 )
...
* Create items/ route
* Add basic query
* Vercel
* Fix typo
* Fix type
* Cleaner exports
* Explain expandable
2022-10-03 21:34:11 -04:00
Celina Peralta
b55f906143
Merge branch 'EbookFoundation:main' into main
2022-09-30 17:13:31 -04:00
Celina Peralta
e005787bbe
OAP-22: Set up python build job in GH actions ( #4 )
...
* sync upstream
* Add linting and testing, update to python 3.10
* push engine workflow
* fix workflow version
* fix workflow version 2
* change setup-python to v3
* workflow: cd oapen-engine
* workflow
* workflow
* workflow
* add lock file
* remove unnecessary cd
* remove unnecessary cd
* remove isort
* update workflow
* job
* job
* job
* job
* job
* job
* isort why
* isort why
* isort why
* isort why
* isort why
* isort why
* isort why
* job
* job
* job
* job
* job
* job
* job
* job
* job
* job
* job
* lint
* job
* hooks
* remove lib
2022-09-30 15:49:04 -04:00