webapp for unglue.it
 
 
 
 
 
 
Eric Hellman f8023913a0
Merge pull request #1065 from Gluejar/maintenance-2024
simplify sitemap, remove expensive cover_image query
2024-10-28 13:32:23 -04:00
api more optimized queryset access 2024-01-19 15:18:34 -05:00
bisac nit 2020-02-14 13:55:30 -05:00
bookdata update ubiquity sites 2020-06-25 14:21:37 -04:00
booxtream end support for mobi 2022-07-28 15:05:13 +02:00
core remove expensive cover_image query 2024-10-28 13:30:55 -04:00
deploy unused 2020-02-26 23:11:27 -05:00
distro refactor ebf(url) 2020-08-15 20:21:56 -04:00
docs https 2017-07-27 10:33:13 -04:00
frontend precompute the number of free books per subject 2024-08-19 18:45:51 -04:00
libraryauth test the encoder 2024-03-21 14:37:47 -04:00
marc better line endings 2020-09-26 12:30:13 -04:00
payment optimize getting first entry of a queryset 2024-01-19 13:27:43 -05:00
questionnaire remove questionnaire, replace with redirector 2023-12-28 14:12:13 -05:00
selenium Adding explicit waits to selenium payment tests in order to wait for very slow js when running headless on ec2 2012-01-26 19:50:14 +00:00
settings add openlibrary switch 2024-10-21 18:46:05 -04:00
static add one more css 2023-01-11 14:53:19 -05:00
sysadmin exception syntax 2020-02-12 17:56:04 -05:00
test optimize getting first entry of a queryset 2024-01-19 13:27:43 -05:00
utils fix title cleaner 2023-07-31 17:57:43 -04:00
.gitattributes fix problems for please 2017-01-06 18:35:49 -08:00
.gitignore requirements and source pipfile 2023-12-28 14:10:07 -05:00
.python-version add author string cleanup 2022-09-20 20:18:34 -04:00
CODE_OF_CONDUCT.md add 3rd party licenses 2017-01-19 11:18:26 -05:00
LICENSE.txt add license 2017-01-19 10:37:59 -05:00
Pipfile PdfMerger is removed 2024-10-02 20:40:43 -04:00
README.md february cleanup 2024-03-04 15:24:37 -05:00
__init__.py odd circ import with ansible install 2020-02-27 17:32:59 -05:00
admin.py refactor admin 2016-07-26 10:34:45 -04:00
celery_module.py odd circ import with ansible install 2020-02-27 17:32:59 -05:00
context_processors.py is_anonymous and is_authenticated are properties 2018-07-23 22:17:05 -04:00
manage.py update manage.py for Django 1.6 2016-04-11 13:17:16 -07:00
requirements.txt PdfMerger is removed 2024-10-02 20:40:43 -04:00
urls.py simplify sitemaps 2024-10-28 13:30:19 -04:00

README.md

regluit - "The Unglue.it web application and website"

Another repo - https://github.com/EbookFoundation/regluit will eventually be the place for collaborative development for Unglue.it. Add issues and submit pull requests there. As of September 1, 2019, https://github.com/Gluejar/regluit is still being used for production builds.

The first version of the unglue.it codebase was a services-oriented project named "unglu". We decided that "unglu" was too complicated, so we started over and named the new project "regluit". regluit is a Django project that contains four main applications: core, frontend, api and payment. It can be deployed and configured on as many ec2 instances as are needed to support traffic. The partitioning between these modules is not as clean as would be ideal. payment is particularly messy because we had to retool it twice, switching from Paypal to Amazon Payments to Stripe.

regluit was originally developed on Django 1.3 (Python 2.7) and currently runs on Django 1.11 (Python 3.8).

Develop

Here are some instructions for setting up regluit for development on an Ubuntu system. If you are on OS X see notes below.

  • Ensure MySQL 5.7 and Redis are installed & running on your system.
  1. Create a MySQL database and user for unglueit.
  2. sudo apt-get upgrade gcc
  3. sudo apt-get install python-setuptools git python-lxml build-essential libssl-dev libffi-dev python3.8-dev libxml2-dev libxslt-dev libmysqlclient-dev
  4. sudo easy_install virtualenv virtualenvwrapper
  5. git clone git@github.com:Gluejar/regluit.git
  6. cd regluit
  7. mkvirtualenv regluit
  8. pip install -r requirements.txt
  9. add2virtualenv ..
  10. cp settings/dev.py settings/me.py
  11. mkdir settings/keys/
  12. cp settings/dummy/* settings/keys/
  13. Edit settings/me.py with proper mysql and redis configurations.
  14. Edit settings/keys/common.py and settings/keys/host.py with account and key information OR if you have the ansible vault password, run ansible-playbook create_keys.yml inside the vagrant directory.
  15. echo 'export DJANGO_SETTINGS_MODULE=regluit.settings.me' >> ~/.virtualenvs/regluit/bin/postactivate
  16. deactivate ; workon regluit
  17. django-admin.py migrate --noinput
  18. django-admin.py loaddata core/fixtures/initial_data.json core/fixtures/bookloader.json to populate the database with the test data it needs to run properly.
  19. redis-server to start the task broker
  20. celery -A regluit worker --loglevel=INFO to start the celery daemon, which performs asynchronous tasks like adding related editions and displays logging information in the foreground. Add --logfile=logs/celery.log if you want the logs to go into a log file.
  21. celery -A regluit beat --loglevel=INFO to start the celerybeat daemon to handle scheduled tasks.
  22. django-admin.py runserver 0.0.0.0:8000 (you can change the port number from the default value of 8000)
  23. make sure a redis server is running
  24. Point your browser to http://localhost:8000/
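As a rough illustration of step 13, the edits to settings/me.py might look something like the sketch below. Every value is a placeholder, and the names BROKER_URL, the database name, and the user name are assumptions rather than something taken from the repository. settings/dev.py, which me.py is copied from, may already define some of them differently.

```python
# Hypothetical excerpt of settings/me.py. All values are placeholders.

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "unglueit",       # assumed name of the database created in step 1
        "USER": "unglueit",       # assumed name of the MySQL user created in step 1
        "PASSWORD": "change-me",  # placeholder, use your own password
        "HOST": "localhost",
        "PORT": "3306",           # MySQL's default port
    }
}

# Assumed Celery broker setting pointing at the local Redis server started in
# step 19. The exact setting name depends on how the project configures Celery.
BROKER_URL = "redis://localhost:6379/0"
```

If the development server cannot reach MySQL or Redis at startup, these two settings are a reasonable first place to look.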

CSS development

  1. We used Less version 2.8 for CSS. http://incident57.com/less/. We use minified CSS.
  2. New CSS development is using SCSS. Install libsass and django-compressor.

Production Deployment

See http://github.com/EbookFoundation/regluit-provisioning

OS X Developer Notes

To run regluit on OS X you should have XCode installed.

Install MySQL:

    brew install mysql@5.7
    mysql_secure_installation
    mysqld_safe --user=root -p

We use pyenv and pipenv to set up an environment.

Edit or create .bashrc in ~ to enable virtualenvwrapper commands:

  1. pipenv install -r requirements.txt

  2. Edit .zshrc to include the following lines:

    eval "$(pyenv init -)"
    export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/10/bin
    export PATH=$PATH:/usr/local/opt/mysql-client/bin:$PATH
    export ANSIBLE_VAULT_PASSWORD_FILE=PATH_TO_VAULT_PASSWORD

If you get EnvironmentError: mysql_config not found, you might need to set a path to mysql_config.

You may need to set utf8 in /etc/my.cnf:

    collation-server = utf8_unicode_ci
    init-connect='SET NAMES utf8'
    character-set-server = utf8

MARC Records

For unglued books with existing print edition MARC records

  1. Get the MARCXML record for the print edition from the Library of Congress.
    1. Find the book in catalog.loc.gov
    2. Click on the permalink in its record (will look something like lccn.loc.gov/2009009516)
    3. Download MARCXML
  2. At /marc/ungluify/ , enter the unglued edition in the Edition field, upload file, choose license
  3. The XML record will be automatically...
    • converted to suitable MARCXML and .mrc records, with both direct and via-unglue.it download links
    • written to S3
    • added to a new instance of MARCRecord
    • provided to ungluers at /marc/
  1. Use /admin to create a new MARC record instance
  2. Upload the MARC records to s3 (or wherever)
  3. Add the URLs of the .xml and/or .mrc record(s) to the appropriate field(s)
  4. Select the relevant edition
  5. Select an appropriate marc_format:
    • use DIRECT if it links directly to the ebook file
    • use UNGLUE if it links to the unglue.it download page
    • if you have both DIRECT and UNGLUE links, put them in two separate MARCRecord instances, as marc_format can only take one value
      ungluify_record.py should only be used to modify records of print editions of unglued ebooks. It will not produce appropriate results for CC/PD ebooks.
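The DIRECT/UNGLUE rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and deciding by hostname is an assumed heuristic, not how regluit itself assigns marc_format.

```python
from urllib.parse import urlparse

DIRECT = "DIRECT"
UNGLUE = "UNGLUE"


def guess_marc_format(link_url):
    """Return UNGLUE for links hosted on unglue.it, DIRECT otherwise.

    Assumption: any link on the unglue.it host is a download page and any
    other link points straight at an ebook file.
    """
    host = (urlparse(link_url).hostname or "").lower()
    if host == "unglue.it" or host.endswith(".unglue.it"):
        return UNGLUE
    return DIRECT


def records_needed(link_urls):
    """Group links by format. Each non-empty group needs its own MARCRecord,
    because marc_format can only take one value per record."""
    groups = {}
    for url in link_urls:
        groups.setdefault(guess_marc_format(url), []).append(url)
    return groups
```

A book with one link of each kind yields two groups, which matches the rule that such a book needs two MARCRecord instances.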

For unglued ebooks without print edition MARC records, or CC/PD books without ebook MARC records

  1. Get a contract cataloger to produce quality records (.xml and .mrc formats)
    • we are using "ung" followed by a number as the format for our accession numbers, where the number is the id of the MARCRecord instance, plus leading zeroes
  2. Upload those records to s3 (or wherever)
  3. Create a MARCRecord instance in /admin
  4. Add the URLs of the .xml and .mrc records to the appropriate fields
  5. Select the relevant edition
  6. Select an appropriate marc_format:
    • use DIRECT if it links directly to the ebook file
    • use UNGLUE if it links to the unglue.it download page
    • if you have both DIRECT and UNGLUE links, put them in two separate MARCRecord instances, as marc_format can only take one value
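The accession number format mentioned in step 1 (the prefix "ung", then the MARCRecord id with leading zeroes) could be generated along the following lines. The helper name and the default width of 8 digits are assumptions. The README does not say how many digits are used.

```python
def accession_number(record_id, width=8):
    """Build an accession number: 'ung' plus the zero-padded MARCRecord id.

    width is a hypothetical default; adjust it to the padding actually in use.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return f"ung{record_id:0{width}d}"
```

For example, accession_number(42) gives "ung00000042" with the assumed width of 8.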

MySQL Migration

5.7 - 8.0 Notes

  • Many migration blockers were removed by dumping, then restoring the database.
  • After that, RDS was able to migrate.
  • We needed to create the unglueit user from the mysql client.
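The notes above do not say which tools were used for the dump and restore. A minimal sketch, assuming the standard mysqldump and mysql command-line clients, is shown below. The helper names, host, user, and file names are all hypothetical, and the functions only build the command lines without running anything.

```python
def dump_command(host, user, database, outfile):
    """Build a mysqldump command that writes a logical dump to outfile.

    -p makes the client prompt for the password instead of taking it
    on the command line.
    """
    return [
        "mysqldump", "-h", host, "-u", user, "-p",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        "--routines",                 # include stored procedures and functions
        f"--result-file={outfile}",
        database,
    ]


def restore_command(host, user, database):
    """Build a mysql client command; the dump file is fed in on stdin."""
    return ["mysql", "-h", host, "-u", user, "-p", database]


# Hypothetical usage, not executed here:
#   import subprocess
#   subprocess.run(dump_command("db.example.org", "admin", "unglueit", "dump.sql"), check=True)
#   with open("dump.sql") as f:
#       subprocess.run(restore_command("db.example.org", "admin", "unglueit"), stdin=f, check=True)
```

Building the commands as lists rather than shell strings avoids quoting problems if a database or file name contains unusual characters.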