eshellman
|
e618e5da8d
|
Merge pull request #831 from Gluejar/more_online
add exception handling in stapler
|
2019-03-04 19:48:16 -05:00 |
eric
|
d87578c5a0
|
harden stapler
|
2019-03-04 17:27:55 -05:00 |
eshellman
|
11090aa0d5
|
Merge pull request #830 from Gluejar/more_online
bugfix
|
2019-03-02 21:01:33 -05:00 |
eric
|
52b1621633
|
bugfix
|
2019-03-02 20:55:42 -05:00 |
eshellman
|
4ece9034e6
|
Merge pull request #829 from Gluejar/more_online
refinements
|
2019-03-02 20:50:06 -05:00 |
eric
|
7c33cae82e
|
refinements
- handle dropbox urls with no params
- catch exceptions in stapler
- fix dedupe summary
|
2019-03-02 19:16:47 -05:00 |
eshellman
|
b91d7d4156
|
Merge pull request #828 from Gluejar/more_online
fix degruyter signifier for make_dl
|
2019-03-02 16:34:29 -05:00 |
eric
|
9bf2d85108
|
fix degruyter signifier
also propagate user_agent
|
2019-03-02 16:00:11 -05:00 |
eshellman
|
588bbcf9fd
|
Merge pull request #827 from Gluejar/more_online
whoops
|
2019-03-02 11:18:22 -05:00 |
eric
|
943031ca22
|
whoops
|
2019-03-01 22:38:46 -05:00 |
eshellman
|
18ffdc4f09
|
Merge pull request #826 from Gluejar/more_online
online ebook provider cleanup
|
2019-03-01 22:33:42 -05:00 |
eric
|
02170c9bc2
|
management commands
1. run an update of providers
2. dedupe the online ebooks
3. should have half the onlines to harvest
|
2019-03-01 21:26:39 -05:00 |
eric
|
ac5c241e09
|
resolve doi in doab provider
- resolve the doi before setting the provider
- strip "www." from netloc
- strip url before setting provider
|
2019-03-01 21:23:54 -05:00 |
eshellman
|
fa8a1c2c07
|
Merge pull request #825 from Gluejar/more_online
improve harvesting of online ebooks
|
2019-02-28 21:06:46 -05:00 |
eric
|
1fdac9c548
|
remove dead code
|
2019-02-28 16:34:14 -05:00 |
eric
|
0282ed8136
|
delint
|
2019-02-28 16:22:23 -05:00 |
eric
|
72a40976bc
|
add degruyter handling
- move harvest to separate module
- add ratelimiter class
- add pdf stapler
- add a googlebot UA
- add base url storage in get_soup
|
2019-02-28 15:32:41 -05:00 |
eshellman
|
14ecd864f0
|
Merge pull request #824 from Gluejar/paginate_search
Paginate search
|
2019-02-27 17:04:49 -05:00 |
eric
|
e162308191
|
change to a fulltext query and indices
(this is only a ~20% improvement)
|
2019-02-27 16:40:21 -05:00 |
eric
|
25161ca4a7
|
paginate unglue.it search
|
2019-02-27 15:23:01 -05:00 |
eric
|
2c37a2cb7e
|
Update requirements_versioned.pip
version conflicts with gitberg
|
2019-02-25 10:53:18 -05:00 |
eshellman
|
6d6bd74ed5
|
Merge pull request #823 from Gluejar/fix_ku
missing import
|
2019-02-18 15:46:19 -05:00 |
eric
|
390f403e6c
|
missing import
|
2019-02-18 15:29:16 -05:00 |
eshellman
|
e421660f65
|
Merge pull request #822 from Gluejar/fix_ku
change to ku sso
|
2019-02-18 15:17:33 -05:00 |
eric
|
1a8f22411a
|
change to ku sso
|
2019-02-18 15:06:40 -05:00 |
eshellman
|
9a3b41ed0d
|
Merge pull request #821 from Gluejar/lencrypt
Enable acme challenges for let's encrypt
|
2019-01-31 15:41:29 -05:00 |
eric
|
a078e4e68e
|
wrong regex
|
2019-01-30 15:18:28 -05:00 |
eric
|
fdaa875d19
|
plugin really wants that file path
|
2019-01-30 13:06:46 -05:00 |
eric
|
db14da1d44
|
enable acme challenge
|
2019-01-24 16:54:23 -05:00 |
eshellman
|
bb88c1c4b1
|
Merge pull request #820 from Gluejar/maintenance
Maintenance
|
2019-01-24 13:04:14 -05:00 |
eric
|
8652ce0b77
|
add rounds to ku
|
2019-01-18 12:03:04 -05:00 |
eric
|
92f333fc48
|
sort sitemaps
|
2019-01-18 12:02:45 -05:00 |
eshellman
|
02ec8af4c5
|
Merge pull request #819 from Gluejar/dropboxct
Dropbox contenttype
|
2018-12-10 14:52:27 -05:00 |
eric
|
c6771f2eed
|
fix limit on harvest_online
|
2018-12-10 14:30:54 -05:00 |
eric
|
260650ba92
|
handle application/binary
|
2018-12-10 14:28:39 -05:00 |
eshellman
|
f1da5a495a
|
Merge pull request #818 from Gluejar/username-screening
save the last used
|
2018-12-07 19:51:15 -05:00 |
eric
|
767b8fac06
|
forgot to save the last used
|
2018-12-07 16:00:44 -05:00 |
eshellman
|
0fe102c1ea
|
Merge pull request #817 from Gluejar/username-screening
add username screening
|
2018-12-07 15:00:27 -05:00 |
eric
|
aa12cc75f9
|
add username screening
|
2018-12-07 14:51:25 -05:00 |
eshellman
|
9d8b129477
|
Merge pull request #816 from Gluejar/fix-rh_admin
change to dj 1.11 template syntax
|
2018-11-13 14:19:39 -05:00 |
eric
|
7a44accfa2
|
change to dj111 syntax
|
2018-11-13 13:39:55 -05:00 |
eshellman
|
8dcfcfe6fd
|
Merge pull request #815 from Gluejar/ku2
bugfix
|
2018-11-05 20:28:20 -05:00 |
eric
|
24ab902e00
|
added ebook activation
|
2018-11-05 18:48:35 -05:00 |
eric
|
ed64dc2b3f
|
bugfix
|
2018-11-05 18:17:46 -05:00 |
eric
|
6535505e4d
|
Revert "Merge branch 'master' into master"
This reverts commit bd52df020d, reversing
changes made to e455d9a766.
|
2018-11-03 17:23:07 -04:00 |
eshellman
|
bd52df020d
|
Merge branch 'master' into master
|
2018-11-03 17:06:09 -04:00 |
eshellman
|
e455d9a766
|
Merge pull request #814 from Gluejar/ku
Implement Knowledge Unlatched Harvest
|
2018-11-03 16:39:56 -04:00 |
eric
|
f4d7e6f888
|
working ku code
|
2018-11-03 14:47:41 -04:00 |
eric
|
f98de7114e
|
add oapn id
|
2018-11-03 14:33:23 -04:00 |
eric
|
add0375ac3
|
working scraper
|
2018-11-02 14:03:30 -04:00 |