Commit Graph

6863 Commits (c142533898b3f8e2763d2ec5d4d7a3ceee91b4fa)

Author SHA1 Message Date
eric c142533898 db cleaning 2019-03-27 21:22:56 -04:00
eric e563da9655 refactor lang validation 2019-03-27 21:22:37 -04:00
eric 6fd33d989c don't create bad works 2019-03-27 21:21:25 -04:00
eric e53450fb01 use newer PyPFD 2019-03-25 12:47:46 -04:00
eric 5fc6a2ee82 harvest more ebooks 2019-03-25 12:47:20 -04:00
eric fe05ff9f88 don't stall on super big pdf files 2019-03-25 12:47:04 -04:00
eric 2396e23ae4 fix missing lang string 2019-03-25 12:46:20 -04:00
eric 174b46abd1 add mobied to ebf admin 2019-03-25 12:45:53 -04:00
eshellman 6031aa8733 Merge pull request #833 from Gluejar/more_online: fix undefined "stapled" 2019-03-08 23:51:58 -05:00
eric c190fc0bb1 fix undefined "stapled" 2019-03-08 23:45:54 -05:00
eshellman bb15b48569 Merge pull request #832 from Gluejar/more_online: bugfix 2019-03-05 12:20:17 -05:00
eric 9b12418ada catch more pdf errors 2019-03-05 12:02:42 -05:00
eric cefbc7c56f bugfix 2019-03-05 10:12:51 -05:00
eshellman e618e5da8d Merge pull request #831 from Gluejar/more_online: add exception handling in stapler 2019-03-04 19:48:16 -05:00
eric d87578c5a0 harden stapler 2019-03-04 17:27:55 -05:00
eshellman 11090aa0d5 Merge pull request #830 from Gluejar/more_online: bugfix 2019-03-02 21:01:33 -05:00
eric 52b1621633 bugfix 2019-03-02 20:55:42 -05:00
eshellman 4ece9034e6 Merge pull request #829 from Gluejar/more_online: refinements 2019-03-02 20:50:06 -05:00
eric 7c33cae82e refinements
- handle dropbox urls with no params
- catch exceptions in stapler
- fix dedupe summary
2019-03-02 19:16:47 -05:00
eshellman b91d7d4156 Merge pull request #828 from Gluejar/more_online: fix degruyter signifier for make_dl 2019-03-02 16:34:29 -05:00
eric 9bf2d85108 fix degruyter signifier; also propagate user_agent 2019-03-02 16:00:11 -05:00
eshellman 588bbcf9fd Merge pull request #827 from Gluejar/more_online: whoops 2019-03-02 11:18:22 -05:00
eric 943031ca22 whoops 2019-03-01 22:38:46 -05:00
eshellman 18ffdc4f09 Merge pull request #826 from Gluejar/more_online: online ebook provider cleanup 2019-03-01 22:33:42 -05:00
eric 02170c9bc2 management commands
1. run an update of providers
2. dedupe the online ebooks
3. should have half the onlines to harvest
2019-03-01 21:26:39 -05:00
eric ac5c241e09 resolve doi in doab provider
- resolve the doi before setting the provider
- strip "www." from netloc
- strip url before setting provider
2019-03-01 21:23:54 -05:00
eshellman fa8a1c2c07 Merge pull request #825 from Gluejar/more_online: improve harvesting of online ebooks 2019-02-28 21:06:46 -05:00
eric 1fdac9c548 remove dead code 2019-02-28 16:34:14 -05:00
eric 0282ed8136 delint 2019-02-28 16:22:23 -05:00
eric 72a40976bc add degruyter handling
- move harvest to separate module
- add ratelimiter class
- add pdf stapler
- add a googlebot UA
- add base url storage in get_soup
2019-02-28 15:32:41 -05:00
eshellman 14ecd864f0 Merge pull request #824 from Gluejar/paginate_search: Paginate search 2019-02-27 17:04:49 -05:00
eric e162308191 change to a fulltext query and indices (this is only a ~20% improvement) 2019-02-27 16:40:21 -05:00
eric 25161ca4a7 paginate unglue.it search 2019-02-27 15:23:01 -05:00
eric 2c37a2cb7e Update requirements_versioned.pip: version conflicts with gitberg 2019-02-25 10:53:18 -05:00
eshellman 6d6bd74ed5 Merge pull request #823 from Gluejar/fix_ku: missing import 2019-02-18 15:46:19 -05:00
eric 390f403e6c missing import 2019-02-18 15:29:16 -05:00
eshellman e421660f65 Merge pull request #822 from Gluejar/fix_ku: change to ku sso 2019-02-18 15:17:33 -05:00
eric 1a8f22411a change to ku sso 2019-02-18 15:06:40 -05:00
eshellman 9a3b41ed0d Merge pull request #821 from Gluejar/lencrypt: Enable acme challenges for let's encrypt 2019-01-31 15:41:29 -05:00
eric a078e4e68e wrong regex 2019-01-30 15:18:28 -05:00
eric fdaa875d19 plugin really wants that file path 2019-01-30 13:06:46 -05:00
eric db14da1d44 enable acme challenge 2019-01-24 16:54:23 -05:00
eshellman bb88c1c4b1 Merge pull request #820 from Gluejar/maintenance: Maintenance 2019-01-24 13:04:14 -05:00
eric 8652ce0b77 add rounds to ku 2019-01-18 12:03:04 -05:00
eric 92f333fc48 sort sitemaps 2019-01-18 12:02:45 -05:00
eshellman 02ec8af4c5 Merge pull request #819 from Gluejar/dropboxct: Dropbox contenttype 2018-12-10 14:52:27 -05:00
eric c6771f2eed fix limit on harvest_online 2018-12-10 14:30:54 -05:00
eric 260650ba92 handle application/binary 2018-12-10 14:28:39 -05:00
eshellman f1da5a495a Merge pull request #818 from Gluejar/username-screening: save the last used 2018-12-07 19:51:15 -05:00
eric 767b8fac06 forgot to save the last used 2018-12-07 16:00:44 -05:00