Commit Graph

41 Commits (f72796acdfe615387035fbb3ec606e84da46e302)

Author SHA1 Message Date
Raymond Yee b9c272ae74 Removing stuff I had put into ry branch, specifically the experimental branch 2012-04-18 15:12:15 -07:00
Raymond Yee 5a9c8fe207 Simple experimental view 2012-03-02 09:31:13 -08:00
Raymond Yee f4e2960ae1 Putting in a skeleton to build views in experimental 2012-03-01 20:06:00 -08:00
Raymond Yee ddd60f5a34 Deleting lt_data.json.gz from git repo 2012-02-27 14:16:23 -08:00
Raymond Yee c98258e459 Add g_seed_isbn.json which holds the Gutenberg editions I'm loading. 2012-02-27 13:58:47 -08:00
Raymond Yee 446907109f Compute similarity measures and allow filtering of Gutenberg editions by these measures 2012-02-27 13:58:34 -08:00
Raymond Yee 86fb15b8bc Code to repick the seed isbn to find isbns that are more likely to be found in a wide variety of data sources 2012-02-27 13:58:17 -08:00
Raymond Yee f7220d9812 Programs and data for fighting Frankenworks 2012-02-24 12:06:24 -08:00
Raymond Yee 1d001f33ba Now I think I'm able to calculate the timedate of when the latest "frankenwork" merging is happening 2012-02-21 08:54:12 -08:00
Raymond Yee e4c23500fb Putting a copy of the Librarything data into the repo 2012-02-17 13:48:57 -08:00
Raymond Yee 2e079b2c2e Now I have booktests to recalculate clusters 2012-02-17 10:30:09 -08:00
eric 471cb62fd2 changed core.tasks to not use models 2012-02-16 13:19:36 -05:00
Raymond Yee a8f1c157be Check current progress in so that I can focus on a change in the master branch to add missing isbns to Editions 2012-02-15 16:06:40 -08:00
Raymond Yee 9fb57a6b4e At this point, I have logic in regluit.test.bookloader.load_gutenberg_books to read the data from regluit/experimental/gutenberg/g_seed_isbn.json and load books into the db. Still shaking out bugs from the process though. 2012-02-14 18:01:13 -08:00
Raymond Yee c04aacec4a Putting away my work for ry...hope it's ok 2012-02-13 11:28:21 -08:00
Raymond Yee a9c91bf9c8 Changed the number of Gutenberg books to process 2012-02-11 18:01:37 -08:00
Raymond Yee cfc3dd3549 Code that I'm now running in quasi-production on my laptop to compute the seed isbn. Let's see how it goes 2012-02-10 19:15:35 -08:00
Raymond Yee b5c663f82f basics of database structure for running through all the Gutenberg books. Generating a report on each seed isbn calc 2012-02-10 10:56:08 -08:00
Raymond Yee d3a183bc61 OK: I'm able to return a single candidate isbn seed now while at the same time caching the results 2012-02-08 14:28:46 -08:00
Raymond Yee 3bc5da4685 Now able to cluster isbns by language of work 2012-02-08 10:44:18 -08:00
Raymond Yee d06ee6a67e Progress towards calculating the seed isbn: calculating a union of Freebase + OpenLibrary ISBNs -- then clustered with thingisbn and feeding these ISBNs to Google Books 2012-02-07 22:52:50 -08:00
Raymond Yee 2d98cf9b0a Now looking at thingisbn data and printing out more data from Google Books (publication data, publisher) 2012-02-03 10:08:48 -08:00
Raymond Yee 9cf875c62a ol.xisbn working now. Running a test comparing OL, Freebase and Google Books on editions for Surfacing 2012-02-03 09:00:52 -08:00
Raymond Yee 6e5f52db4b work in progress, especially openlibrary xisbn 2012-02-02 23:07:25 -08:00
Raymond Yee 6d6f9a2724 Small change to the basic hello world tests 2012-01-13 09:39:46 -08:00
Raymond Yee a08944c465 Make sure there are creators before printing them 2012-01-13 09:36:40 -08:00
Raymond Yee 33b5548b41 Changing Zotero.items() -> Zotero.top() and put exception handling to see what does work vs what doesn't. 2012-01-13 09:20:54 -08:00
Raymond Yee 16d8716f87 Adding a "hello world" test file to test basic functionality of pyzotero 2012-01-13 07:37:25 -08:00
Raymond Yee 6c21074ee7 Added some comments to gutenberg.py. Trying to debug zotero_books.py -- pyzotero seems to be quite broken now 2012-01-12 17:52:10 -08:00
Ed Summers 55656e2d3d now getting subjects from openlibrary instead of from googlebooks. You will need to APPLY MIGRATIONS! 2011-12-19 01:33:13 -05:00
Raymond Yee 4818e92ba2 Writing out the mapping of Gutenberg epub file to OpenLibrary workid 2011-12-12 10:49:33 -08:00
Raymond Yee d1b58c89ad Added bookdata.json_for_olid to pull out metadata for any given OpenLibrary ID (olid), including work, edition, author. Added map_refine_fb_links_to_openlibrary_work_ids in gutenberg.py to do the mapping of Freebase IDs -> OpenLibrary work ids and capture in database 2011-12-10 14:18:22 -08:00
eric 167dccf574 Wishlists are now filled using the Wishes intermediate table. It's named the same as previous intermediate table, and I've edited the migration so data is not lost. Also, I've added methods on Wishlists to add and remove Works. There are "source" and created columns on the Wishes table 2011-12-08 18:22:20 -05:00
Raymond Yee a349cb0adf Current results post-Refine processing of Gutenberg etext_id -> Freebase IDs (via Wikipedia links) 2011-12-05 09:47:52 -08:00
Raymond Yee 810e8ac3e7 Code so far to parse Project Gutenberg catalog, extract Wikipedia links, do some Google Refine munging, and then map Freebase ids to OpenLibrary Work IDs 2011-12-05 09:23:17 -08:00
Raymond Yee e121e07e72 Added xisbn-like method based on Freebase data; Added a Freebase /book/book id to OpenLibrary work id mapper 2011-12-05 09:19:07 -08:00
Ed Summers 30e6dc38cd experimental scripts to try to match metadata in oai-pmh feeds (online books page) to googlebooks 2011-12-04 21:45:53 -05:00
Raymond Yee 31edebe769 Fleshing out Freebase book data search 2011-11-09 09:09:58 -08:00
Raymond Yee 68b4da17d1 Some code for OpenLibrary, Freebase, HathiTrust to explore the nature of the data available in those sources 2011-11-06 07:55:07 -05:00
Raymond Yee 820107bd4d Got oauth signing to work with goodreads reviews_list 2011-11-04 14:04:32 -07:00
Raymond Yee 29104f6347 Setting up an experimental folder to hold proof of concept code 2011-11-02 17:48:38 -07:00