Raymond Yee
8a85aa88d3
first cut at producing iterator interfaces to Stripe objects -- here I implement one for events
...
don't yield empty page in bookdata.py's grouper
remove extraneous import in gutenberg.py
expose stripe test key as a module variable to make it easier to create a StripeClient that will be in test mode (sc=StripeClient(api_key=TEST_STRIPE_SK))
2012-10-18 06:59:08 -07:00
Raymond Yee
359ff71984
Add g_seed_isbn.json, which holds the Gutenberg editions I'm loading.
2012-02-27 13:19:58 -08:00
Raymond Yee
7c9b6f9eba
Compute similarity measures and allow filtering of Gutenberg editions by these measures
2012-02-27 12:12:06 -08:00
Raymond Yee
dcfc24e380
Code to repick the seed isbn to find isbns that are more likely to be found in a wide variety of data sources
2012-02-27 08:46:34 -08:00
Raymond Yee
a8f1c157be
Check in current progress so that I can focus on a change in the master branch to add missing isbns to Editions
2012-02-15 16:06:40 -08:00
Raymond Yee
9fb57a6b4e
At this point, I have logic in regluit.test.bookloader.load_gutenberg_books to read the data from regluit/experimental/gutenberg/g_seed_isbn.json and load books into the db. Still shaking out bugs from the process though.
2012-02-14 18:01:13 -08:00
Raymond Yee
a9c91bf9c8
Changed the number of Gutenberg books to process
2012-02-11 18:01:37 -08:00
Raymond Yee
cfc3dd3549
Code that I'm now running in quasi-production on my laptop to compute the seed isbn. Let's see how it goes
2012-02-10 19:15:35 -08:00
Raymond Yee
b5c663f82f
basics of database structure for running through all the Gutenberg books.
...
Generating a report on each seed isbn calc
2012-02-10 10:56:08 -08:00
Raymond Yee
d3a183bc61
OK: I'm able to return a single candidate isbn seed now while at the same time caching the results
2012-02-08 14:28:46 -08:00
Raymond Yee
3bc5da4685
Now able to cluster isbns by language of work
2012-02-08 10:44:18 -08:00
Raymond Yee
d06ee6a67e
Progress towards calculating the seed isbn: calculating a union of Freebase + OpenLibrary ISBNs -- then clustered with thingisbn and feeding these ISBNs to Google Books
2012-02-07 22:52:50 -08:00
Raymond Yee
6e5f52db4b
work in progress, especially openlibrary xisbn
2012-02-02 23:07:25 -08:00
Raymond Yee
6c21074ee7
Added some comments to gutenberg.py
...
Trying to debug zotero_books.py -- pyzotero seems to be quite broken now
2012-01-12 17:52:10 -08:00
Raymond Yee
4818e92ba2
Writing out the mapping of Gutenberg epub file to OpenLibrary workid
2011-12-12 10:49:33 -08:00
Raymond Yee
d1b58c89ad
Added bookdata.json_for_olid to pull out metadata for any given OpenLibrary ID (olid), including work, edition, author
...
Added map_refine_fb_links_to_openlibrary_work_ids in gutenberg.py to do the mapping of Freebase IDs -> OpenLibrary work ids and capture in database
2011-12-10 14:18:22 -08:00
Raymond Yee
a349cb0adf
Current results post-Refine processing of Gutenberg etext_id -> Freebase IDs (via Wikipedia links)
2011-12-05 09:47:52 -08:00
Raymond Yee
810e8ac3e7
Code so far to parse Project Gutenberg catalog, extract Wikipedia links, do some Google Refine munging, and then map Freebase ids to OpenLibrary Work IDs
2011-12-05 09:23:17 -08:00