regluit

Commit Graph

Author	SHA1	Message	Date
Raymond Yee	ddd60f5a34	Deleting lt_data.json.gz from git repo	2012-02-27 14:16:23 -08:00
Raymond Yee	c98258e459	Add g_seed_isbn.json which holds the Gutenberg editions I'm loading.	2012-02-27 13:58:47 -08:00
Raymond Yee	446907109f	Compute similarity measures and allow filtering of Gutenberg editions by these measures	2012-02-27 13:58:34 -08:00
Raymond Yee	86fb15b8bc	Code to repick the seed isbn to find isbns that are more likely to be found in a wide variety of data sources	2012-02-27 13:58:17 -08:00
Raymond Yee	f7220d9812	Programs and data for fighting Frankenworks	2012-02-24 12:06:24 -08:00
Raymond Yee	1d001f33ba	Now I think I'm able to calculate the timedate of when the latest "frankenwork" merging is happening	2012-02-21 08:54:12 -08:00
Raymond Yee	e4c23500fb	Putting a copy of the Librarything data into the repo	2012-02-17 13:48:57 -08:00
Raymond Yee	2e079b2c2e	Now I have booktests to recalculate clusters	2012-02-17 10:30:09 -08:00
eric	471cb62fd2	changed core.tasks to not use models	2012-02-16 13:19:36 -05:00
Raymond Yee	a8f1c157be	Check current progress in so that I can focus on a change in the master branch to add missing isbns to Editions	2012-02-15 16:06:40 -08:00
Raymond Yee	9fb57a6b4e	At this point, I have logic in regluit.test.bookloader.load_gutenberg_books to read the data from regluit/experimental/gutenberg/g_seed_isbn.json and load books into the db. Still shaking out bugs from the process though.	2012-02-14 18:01:13 -08:00
Raymond Yee	c04aacec4a	Putting away my work for ry...hope it's ok	2012-02-13 11:28:21 -08:00
Raymond Yee	a9c91bf9c8	Changed the number of Gutenberg books to process	2012-02-11 18:01:37 -08:00
Raymond Yee	cfc3dd3549	Code that I'm now running in quasi-production on my laptop to compute the seed isbn. Let's see how it goes	2012-02-10 19:15:35 -08:00
Raymond Yee	b5c663f82f	basics of database structure for running through all the Gutenberg books. Generating a report on each seed isbn calc	2012-02-10 10:56:08 -08:00
Raymond Yee	d3a183bc61	OK: I'm able to return a single candidate isbn seed now while at the same time caching the results	2012-02-08 14:28:46 -08:00
Raymond Yee	3bc5da4685	Now able to cluster isbns by language of work	2012-02-08 10:44:18 -08:00
Raymond Yee	d06ee6a67e	Progress towards calculating the seed isbn: calculating a union of Freebase + OpenLibrary ISBNs -- then clustered with thingisbn and feeding these ISBNs to Google Books	2012-02-07 22:52:50 -08:00
Raymond Yee	2d98cf9b0a	Now looking at thingisbn data and printing out more data from Google Books (publication data, publisher)	2012-02-03 10:08:48 -08:00
Raymond Yee	9cf875c62a	ol.xisbn working now. Running a test comparing OL, Freebase and Google Books on editions for Surfacing	2012-02-03 09:00:52 -08:00
Raymond Yee	6e5f52db4b	work in progress, especially openlibrary xisbn	2012-02-02 23:07:25 -08:00
Raymond Yee	6d6f9a2724	Small change to the basic hello world tests	2012-01-13 09:39:46 -08:00
Raymond Yee	a08944c465	Make sure there are creators before printing them	2012-01-13 09:36:40 -08:00
Raymond Yee	33b5548b41	Changing Zotero.items() -> Zotero.top() and put exception handling to see what does work vs what doesn't.	2012-01-13 09:20:54 -08:00
Raymond Yee	16d8716f87	Adding a "hello world" test file to test basic functionality of pyzotero	2012-01-13 07:37:25 -08:00
Raymond Yee	6c21074ee7	Added some comments to gutenberg.py. Trying to debug zotero_books.py -- pyzotero seems to be quite broken now	2012-01-12 17:52:10 -08:00
Ed Summers	55656e2d3d	now getting subjects from openlibrary instead of from googlebooks. You will need to APPLY MIGRATIONS!	2011-12-19 01:33:13 -05:00
Raymond Yee	4818e92ba2	Writing out the mapping of Gutenberg epub file to OpenLibrary workid	2011-12-12 10:49:33 -08:00
Raymond Yee	d1b58c89ad	Added bookdata.json_for_olid to pull out metadata for any given OpenLibrary ID (olid), including work, edition, author. Added map_refine_fb_links_to_openlibrary_work_ids in gutenberg.py to do the mapping of Freebase IDs -> OpenLibrary work ids and capture in database	2011-12-10 14:18:22 -08:00
eric	167dccf574	Wishlists are now filled using the Wishes intermediate table. It's named the same as previous intermediate table, and I've edited the migration so data is not lost. Also, I've added methods on Wishlists to add and remove Works. There are "source" and created columns on the Wishes table	2011-12-08 18:22:20 -05:00
Raymond Yee	a349cb0adf	Current results post-Refine processing of Gutenberg etext_id -> Freebase IDs (via Wikipedia links)	2011-12-05 09:47:52 -08:00
Raymond Yee	810e8ac3e7	Code so far to parse Project Gutenberg catalog, extract Wikipedia links, do some Google Refine munging, and then map Freebase ids to OpenLibrary Work IDs	2011-12-05 09:23:17 -08:00
Raymond Yee	e121e07e72	Added xisbn-like method based on Freebase data; Added a Freebase /book/book id to OpenLibrary work id mapper	2011-12-05 09:19:07 -08:00
Ed Summers	30e6dc38cd	experimental scripts to try to match metadata in oai-pmh feeds (online books page) to googlebooks	2011-12-04 21:45:53 -05:00
Raymond Yee	31edebe769	Fleshing out Freebase book data search	2011-11-09 09:09:58 -08:00
Raymond Yee	68b4da17d1	Some code for OpenLibrary, Freebase, HathiTrust to explore the nature of the data available in those sources	2011-11-06 07:55:07 -05:00
Raymond Yee	820107bd4d	Got oauth signing to work with goodreads reviews_list	2011-11-04 14:04:32 -07:00
Raymond Yee	29104f6347	Setting up an experimental folder to hold proof of concept code	2011-11-02 17:48:38 -07:00

38 Commits (0aba595e050648b9a99f4ac54de260c88e5c4730)