layout | title | permalink |
---|---|---|
default | Volunteers' FAQ \| Project Gutenberg | /help/volunteers_faq.html |
Volunteers' FAQ
Project Gutenberg welcomes contributions of eBooks from people with the interest, time, and skillset needed to meet our submission standards. Details of the process and the standards are at our copyright clearance site copy.pglaf.org and upload site upload.pglaf.org.
Join Distributed Proofreaders, Instead
For most people interested in producing eBooks, we recommend starting with Distributed Proofreaders (https://www.pgdp.net). With Distributed Proofreaders, you can get involved with different portions of the production pipeline described below. This is a much easier way to get started, and it results in very high-quality eBooks.
If you only want to suggest a book for digitization, DP has online forums for this, or you can send an email (contact information is on the site).
Distributed Proofreaders maintains canonical guidance on production. See especially:
- The Post-Processing FAQ
- Easy Epub. This is a guide to preparing HTML so that epubmaker converts it into passable EPUB and MOBI files.
- HTML Best Practices. This was written a while back, but DP tries to keep it up to date.
Being a Solo Producer
If you are interested in producing an eBook yourself, without involving Distributed Proofreaders, here is some guidance. But start with the material above, including the DP links.
In a nutshell, the production process typically involves the following:
- Identify a candidate printed book. Confirm it is not already in the collection, or in process by other volunteers. Use the Collection Development Policy to guide you on eligibility.
- Obtain a copyright clearance for the printed book. Usually this is based on scans of the title page and verso page demonstrating that the printed book was published more than 95 years ago. See the Copyright How-To.
- Obtain scans of the book. This may be done using your own scanner, or there might be online scans available for reuse. Scans must come from the exact same print edition as your copyright clearance.
- Perform optical character recognition (OCR) on the scans, to make an approximate representation of the book in plain text.
- Proofread, proofread, proofread: Carefully correct any errors in the OCR output. Remove page headers and footers. De-hyphenate words split across line breaks. Add back italics and other formatting.
- Format: Generate valid and well-formed HTML source. Various tools are available for this, and most involve editing the HTML source code directly. Note that many tools produce convoluted, non-standard, or invalid HTML, which can be very difficult to clean up for Project Gutenberg. Poor HTML is not accepted, even if it is valid.
- Check, and recheck. The upload site has various tools, including ones that test conversion to derived formats.
- Upload your work, using the copyright clearance key generated earlier.
- Coordinate with the Project Gutenberg production volunteers (known as "whitewashers," after the Mark Twain book) on final formatting and presentation.
- Once the eBook is added to the Project Gutenberg collection, confirm that it displays correctly and that all metadata are correct.
- If possible, stay in touch after publication. If we receive errata reports that require access to source material, or that are stylistic or subjective in nature, we might contact you to discuss potential changes.
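
Some of the mechanical clean-up in the proofreading step above (removing page headers and de-hyphenating) can be partly scripted before proofreading by hand. The following is a minimal Python sketch, not a Project Gutenberg or Distributed Proofreaders tool. The function names, the `HEADER` pattern (a page number next to an all-caps running title), and the sample page are assumptions made for illustration, and real scans will need their own pattern.

```python
import re

# Hypothetical running-header shape: "12 THE TITLE" or "THE TITLE 12".
# Adjust this to match the headers in your own OCR output.
HEADER = re.compile(r"\d+\s+[A-Z .']+|[A-Z .']+\s+\d+")


def strip_running_header(page: str, header_pattern: re.Pattern = HEADER) -> str:
    """Drop the first line of a page if it looks like a running header."""
    lines = page.split("\n")
    if lines and header_pattern.fullmatch(lines[0].strip()):
        lines = lines[1:]
    return "\n".join(lines)


def dehyphenate(text: str) -> str:
    """Rejoin words that were split with a hyphen at a line break.

    The rest of the word (plus any trailing punctuation) is pulled up onto
    the first line, and the line break is kept after it.
    Caution: a genuine compound such as "well-\\nknown" is also joined,
    so the result still needs to be checked against the page scans.
    """
    return re.sub(r"(\w)-\n(\w+[^\s\w]*)[ \t]*", r"\1\2\n", text)


def clean_page(page: str) -> str:
    """Apply both clean-up passes to the OCR text of one page."""
    return dehyphenate(strip_running_header(page))


if __name__ == "__main__":
    sample = "12 THE TITLE\nThis is an exam-\nple, of OCR text."
    print(clean_page(sample))
```

A script like this only reduces the routine work. It cannot tell a line-break hyphen from a real one, so every change should still be compared with the scans during proofreading.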