39 Commits (5e02c87657e3a6055dc2fa660859b5e67183685b)

Author SHA1 Message Date
Mike Benowitz 5e02c87657 Select most complete date
In `copyrightEntry` XML entries, date elements of the same type are frequently repeated. Their values can differ and be more or less complete. The most common example is an entry with two `pub_date` values, such as `1932` and `1932-06-15`.

Previously this process simply took the first date. This updates the date parser to sort all dates of the same type by length so that the most complete dates are parsed first. If they cannot be parsed, the other, less complete dates are used.

This was prompted by the presence of false January 1st dates and also the discovery that publication dates are often used as replacement registration dates in renewals.
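The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the parser from this commit: the function names and the list of accepted formats are assumptions.

```python
from datetime import date, datetime

# Hypothetical formats, most complete first; the real parser may accept others.
FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")


def parse_date(text):
    """Return a date for the first format that matches, or None if none do."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def select_most_complete(values):
    """Try the longest (most complete) string first, then fall back to shorter ones."""
    candidates = [v for v in values if v]
    for text in sorted(candidates, key=len, reverse=True):
        parsed = parse_date(text)
        if parsed is not None:
            return parsed
    return None
```

With the two `pub_date` values from the example, `select_most_complete(["1932", "1932-06-15"])` picks the full date, whereas taking `1932` first would have produced a January 1st date.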
2019-08-21 11:29:22 -04:00
Mike Benowitz 910d514e08
Merge pull request #5 from NYPL/SFR-539-add-fields-to-source
SFR-539 Add new fields to source block
2019-08-13 15:50:23 -04:00
Mike Benowitz 0c1d1eaa75 SFR-539 Add new fields to source block
To help researchers identify the source for copyright entries, the following fields (already stored in the database) are added to the API response:

- volume number
- group
- matter
- number (for 3rd series entries only)

The API checks to see if `3` appears in the `series` field and if so adds the `number` field.

This also adds these fields to the object definition in the Swagger documentation.
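As a rough sketch of the conditional described above, assuming the stored fields arrive as a plain dictionary (the function name `build_source` and the key names are hypothetical, not taken from the repository):

```python
def build_source(entry):
    """Assemble a source block from a dict of stored fields (key names are assumptions)."""
    source = {
        "series": entry.get("series"),
        "volume": entry.get("volume"),
        "group": entry.get("group"),
        "matter": entry.get("matter"),
    }
    # Only 3rd series entries carry a number, so look for "3" in the series field.
    if "3" in str(entry.get("series") or ""):
        source["number"] = entry.get("number")
    return source
```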
2019-08-12 15:46:45 -04:00
Mike Benowitz 335080583f Finally resolve date parsing and text issues 2019-08-07 13:34:35 -04:00
Mike Benowitz 5a81f5e12f Change sqlalchemy error type 2019-08-07 12:38:21 -04:00
Mike Benowitz 0cf7f4e464 Catch SQLAlchemy data errors 2019-08-07 11:33:52 -04:00
Mike Benowitz 7ede495a31 Fix bug in publisher and author updates 2019-08-07 09:46:24 -04:00
Mike Benowitz ffa3ed6d5b HOTFIX Resolve renewal source issue 2019-08-06 15:05:16 -04:00
Mike Benowitz 07c1d72265 HOTFIX Handle None dates 2019-08-06 14:38:24 -04:00
Mike Benowitz ab6ca6bb73
Merge pull request #4 from NYPL/SFR-535-use-standard-date
SFR-535 Format date response to ISO 8601 standard
2019-08-06 14:13:43 -04:00
Mike Benowitz b7bf84e137 SFR-535 Format date response to ISO 8601 standard
This includes a fix to standardize the output of dates to ISO 8601, replacing the current output, which displays dates as entered in the copyright volumes.

This also includes several code formatting fixes and updates to the `search` and `uuid` endpoints to ensure that the proper status code is returned with non-200 responses.
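A minimal sketch of ISO 8601 output for a stored date, assuming the value is a `datetime.date` or `None`. The helper name `format_date` is hypothetical.

```python
from datetime import date


def format_date(value):
    """Return an ISO 8601 string (YYYY-MM-DD), or None when no date is stored."""
    if value is None:
        return None
    return value.isoformat()
```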
2019-08-06 09:53:54 -04:00
Mike Benowitz 1c0b8a3bf9
Merge pull request #3 from NYPL/api-accuracy-updates
API accuracy updates
2019-08-05 15:31:52 -04:00
Mike Benowitz 70e72105d2 Merge branch 'development' into api-accuracy-updates 2019-08-05 15:21:49 -04:00
Mike Benowitz 73834ff63f Add error handling for 404 and 500 results from PostgreSQL 2019-08-05 15:20:27 -04:00
Mike Benowitz 0a8fb1cde4 Fix issues with delete-cascade
The important tables `xml` and `registration` were not properly set for their `CASCADE` behavior. In addition, `XML` needed to have the `single_parent` option enabled to allow for cascading deletes (since otherwise a single entry could be referenced by both an entry and an error).
2019-08-05 14:41:27 -04:00
Mike Benowitz 4b1c0b5f1d Fix missing index issue for indexing updates 2019-08-05 14:40:32 -04:00
Mike Benowitz 0f8d50ad4d Further test additions and linting fixes 2019-07-10 17:27:42 -04:00
Mike Benowitz 08e969c6ec Fix elastic indexer time 2019-07-09 17:33:13 -04:00
Mike Benowitz 17a5f33f0a Initial commit of testing configuration using `pytest` 2019-07-09 17:32:22 -04:00
Mike Benowitz 58b8cb30e8 Fix issue in CCE UPDATE operation 2019-07-09 13:14:55 -04:00
Mike Benowitz 5cf334a029 Add swap space configuration to EBS instance 2019-07-09 10:47:39 -04:00
Mike Benowitz 587947b3d4 Comment out empty environment variable block 2019-07-08 16:57:15 -04:00
Mike Benowitz 0675146f38 Update API code to function in all (local/dev/production) environments 2019-07-08 16:55:30 -04:00
Mike Benowitz b526c072a1 Add EBS configuration options
This adds `.ebextensions` options to the repository that control how the Elastic Beanstalk environment is configured. The two files perform different tasks:

- `sfr-bardo-copyright-development.config` is an empty file for environment variables (empty because at present ENV variables contain secrets that cannot be committed to source control)
- `cron-linux.config` contains configuration details for a nightly cron task that checks for updates from the source git repositories
2019-07-08 14:29:57 -04:00
Mike Benowitz 386a8515a7 Updates to maintenance script for EBS compatibility 2019-07-08 12:18:37 -04:00
Mike Benowitz c1e7fa936c Fix error XML model import 2019-07-08 11:43:19 -04:00
Mike Benowitz 1c1c53543a Remove comment from necessary line 2019-07-08 10:16:31 -04:00
Mike Benowitz f0a112456b Only use config.yaml in local development 2019-07-05 15:19:01 -04:00
Mike Benowitz 0542d85b89 Update app.py to work with EBS WSGI
The Elastic Beanstalk application looks for an object named `application` to run with `WSGI`. The application object was previously created with the `create_app` method and named `app`.
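The naming requirement can be illustrated with a plain WSGI callable. This is a sketch only: the real `create_app` builds a Flask application, and the stand-in below is an assumption used to keep the example dependency-free.

```python
def create_app():
    """Build and return a WSGI callable (a stand-in for the Flask app factory)."""
    def app(environ, start_response):
        body = b"ok"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    return app


# The WSGI server imports this module and looks for the name `application`.
application = create_app()
```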
2019-07-05 14:59:08 -04:00
Mike Benowitz 02b6462b9e Add flasgger to requirements.txt 2019-07-05 14:52:17 -04:00
Mike Benowitz cd9f98e302 Fix app main and api main scripts 2019-07-05 14:06:54 -04:00
Mike Benowitz 4afecd043b
Merge pull request #1 from NYPL/SFR-479-add-api
SFR 479 Add API
2019-07-05 14:04:43 -04:00
Mike Benowitz 6e7ea63ac4 SFR-479 Delete superseded Swagger file
Removes a Swagger YAML file that is not currently used. It was too difficult to maintain Swagger in a set of separate YAML files, so a single one was created and loaded in the main Flask app.
2019-06-04 17:14:04 -04:00
Mike Benowitz f890aaf87b SFR-479 Remove dangling return
Removes old `base` response that was never reached in any case
2019-06-04 17:12:04 -04:00
Mike Benowitz 2a0dc11a62 SFR-479 Update README.md 2019-06-04 17:04:37 -04:00
Mike Benowitz d8f676c88f SFR-479 Add Basic API
Add a basic `Flask` API that responds to queries for copyright data. This includes 5 basic endpoints:

- `/search/fulltext`: queries all text fields in the Registration and Renewal records
- `/search/registration/<regnum>`: queries the collection for a specific copyright registration by registration number
- `/search/renewal/<rennum>`: queries the collection for a specific copyright renewal by renewal number
- `/registration/<uuid>`: fetches a single registration record by internally assigned UUID
- `/renewal/<uuid>`: fetches a single renewal record by internally assigned UUID

The API can be run with the standard `python -m flask run` from the root of the project and by default will run in `production` mode. To use `development` mode, run `export FLASK_ENV=development` before starting the application.
2019-06-04 16:43:08 -04:00
Mike Benowitz 27f91b9774 SFR-479 Update import structure to improve loading 2019-06-04 16:42:04 -04:00
Mike Benowitz 812d0b7dce SFR-479 Add Claimants to ElasticSearch index
This adds a full `Claimant` object to the ElasticSearch index, including the `claimant_type` field which helps users see the specific relationship a claimant has to a renewal. It would be good to provide translations of these codes in the future, but this is not currently necessary.
2019-06-04 16:40:22 -04:00
Mike Benowitz edd14d9cdb SFR-453 Initial Commit
Includes an initial version of the utility script used to generate
the copyright entry/renewal database along with instructions on how
to run the script and create a version of the database locally
2019-05-29 12:47:03 -04:00