Collection Development Policy

Project Gutenberg is a library of free electronic books (eBooks). The Project Gutenberg collection has been built by the efforts of volunteers who, over many years, have selected and digitized a variety of written and other works. The collection continues to grow as new works are submitted.

What types of works are eligible?

Project Gutenberg accepts only donations of eBooks (i.e., written works) that are not currently protected by copyright in the United States. Such works are in the public domain. New Project Gutenberg eBooks are typically digitized versions of books that were published long ago and for which any US copyright has expired.

Project Gutenberg's collection development focuses on literature and other written works. Selections are made by volunteers with diverse interests, and essentially all eligible submissions are welcome.

The basic eligibility criteria are:

  • Submitted eBooks are digitized versions of printed books, or similar items such as manuals, pamphlets, periodicals, travelogues, theses, journals, or chapbooks.
  • Evidence is submitted via the copyright portal at https://copy.pglaf.org to enable Project Gutenberg to confirm that the source (that is, the printed item or items from which the eBook is derived) is in the public domain in the US.
  • The eBook is submitted via the upload portal at https://upload.pglaf.org, and meets the requirements there for formatting and for proofreading accuracy.
  • The resulting eBook is entirely free of any US copyright. In particular, the "sweat of the brow" effort of digitizing the source is not an act of authorship, and so is not eligible for copyright. Any incidental or supplemental additions, such as transcriber's notes, indices, improvements or supplements to artwork, or new cover art, are granted permanently to the public domain.

Some types of items which are ineligible include:

  • Scans of books or other sources that have not been converted to machine-readable text and undergone proofreading and formatting to the requirements of the upload portal.
  • Unpublished contemporary items, even if they are granted to the public domain by the author.
  • Items that were not previously published or distributed.

Project Gutenberg eBooks are new works, derived from existing printed works. Project Gutenberg does not require that its eBooks be exact representations of their printed sources (i.e., facsimiles). Instead, the printed works are transformed into modern digital formats. This process typically includes removing page headers/footers, de-hyphenation, formatting or relocating footnotes and endnotes, adding internal links for table of contents and indices, and many other improvements that are intended to yield an enjoyable reading experience.
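As an illustration only, two of the transformations named above (removing page headers and de-hyphenation) can be sketched in a few lines of Python. This is a minimal sketch and not Project Gutenberg tooling: the function names, the sample text, and the assumption that each page arrives as a string whose first line is the running header are all hypothetical. The actual formatting requirements are those documented in the upload portal.

```python
import re


def strip_running_header(page: str, header: str) -> str:
    """Drop the first line of a page when it matches the running header.

    Assumes (hypothetically) that each page is one string and that the
    header, if present, is the page's first line.
    """
    lines = page.split("\n")
    if lines and lines[0].strip() == header:
        lines = lines[1:]
    return "\n".join(lines)


def dehyphenate(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Naive on purpose: it cannot tell a line-break hyphen from a genuine
    compound such as "well-known", so the result still needs proofreading.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


# Hypothetical sample pages, each starting with the same running header.
pages = [
    "THE OLD HOUSE\nThe garden was over-\ngrown and silent.",
    "THE OLD HOUSE\nNobody had lived there for years.",
]

cleaned = [dehyphenate(strip_running_header(p, "THE OLD HOUSE")) for p in pages]
text = "\n".join(cleaned)
print(text)
```

The sketch rejoins every word split at a line end, including true compounds, which is one reason the eligibility criteria call for proofreading rather than relying on automated conversion alone.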

The upload portal documents strict criteria for file formats and compliance checks, but Project Gutenberg gives the volunteers who produce new eBooks latitude in how they go about digitization. The producer may make stylistic choices about page layout and images, whether to indicate the original page numbers from the source, how to handle footnotes or endnotes, and other aspects of how the printed item is digitized. Producers are encouraged to focus on presenting the content and structure of the eBook rather than on a particular visual presentation or layout, because Project Gutenberg eBooks are intended to be enjoyable no matter how or where they are read, now and in the future. Production choices should therefore not inhibit the creation of new derived formats, whether automated or manual.

What topics and subject matter are accepted?

The eBooks in the Project Gutenberg collection are freely offered to readers for their enjoyment, enlightenment, education, and entertainment. The collection includes eBooks on many topics. There is emphasis on literary works and reference items of historical significance, because volunteers have focused on digitizing such works. Any eligible item, on any topic, is welcome.

Project Gutenberg follows the principles of the American Library Association's Freedom to Read Statement (FTR), which may be found online at www.ala.org.

This commitment means that Project Gutenberg does not avoid difficult or unpopular topics. It also means that Project Gutenberg adds eBooks to its collection that contain language or ideas that are outdated, incorrect, offensive, or otherwise inconsistent with today's societal views, standards or morals.

The FTR relies on the US Constitution's First Amendment right to freedom of speech and freedom of the press. Project Gutenberg was founded on the idea that free, unlimited access to the world's literature is a pathway to literacy, education, opportunity, and enlightenment. It is inimical to these principles that the collection, or access to it, be restricted due to content.

Project Gutenberg's readers and contributors are encouraged to read the entire FTR document. It presents a vision for how libraries and publishers, and the people behind them, may work together to "enrich the quality and diversity of thought and expression."

The FTR makes it clear that inclusion of an item in a library collection does not mean the ideas within it are endorsed by the library. "Publishers, librarians, and booksellers do not need to endorse every idea or presentation they make available. It would conflict with the public interest for them to establish their own political, moral, or aesthetic views as a standard for determining what should be published or circulated."

The final paragraph of the FTR summarizes Project Gutenberg's commitment to building a diverse and vibrant collection, and to not excluding eBooks because of their topics or the ideas within them: "We state these propositions neither lightly nor as easy generalizations. We here stake out a lofty claim for the value of the written word. We do so because we believe that it is possessed of enormous variety and usefulness, worthy of cherishing and keeping free. We realize that the application of these propositions may mean the dissemination of ideas and manners of expression that are repugnant to many persons. We do not state these propositions in the comfortable belief that what people read is unimportant. We believe rather that what people read is deeply important; that ideas can be dangerous; but that the suppression of ideas is fatal to a democratic society. Freedom itself is a dangerous way of life, but it is ours."

Historical context

The Project Gutenberg collection includes a number of items that do not meet the public domain or formatting criteria described above. The founder of Project Gutenberg, Michael Hart, invented eBooks in 1971, and the online library grew substantially in the 1980s and 1990s. During that period, there were few free collections of diverse literary works, and some of the modern standards we now rely on had not yet emerged (such as Unicode for representing character sets and HTML for textual markup).

Project Gutenberg worked with many different content types, including audio books, digitized sheet music, some movies, and quite a few copyrighted items that were donated by contemporary authors. By the early 2000s, it was clear that Project Gutenberg was not as well-suited for those different content types as for public domain literature. There are now many other outlets for these other types of works, including a self-publishing portal for contemporary authors that is operated by a Project Gutenberg affiliate.

Project Gutenberg will not remove or deprecate these previous items. They were all donated and accepted with the best of intentions, and with the understanding that Project Gutenberg would provide for their long-term stewardship and unlimited redistribution. Michael Hart expressed unending gratitude to all the people who contributed content, and who digitized previous works. Project Gutenberg remains grateful to all of its contributors and volunteers.

Status of this policy

The Project Gutenberg collection development policy was approved by the Board of Directors of the Project Gutenberg Literary Archive Foundation (PGLAF) in November 2019. It was also endorsed by the Trustees of the Distributed Proofreaders Foundation.

Day-to-day operation of the Project Gutenberg website, along with associated workflows and procedures, is carried out by volunteers. The specific mechanisms applied to grow the Project Gutenberg collection are subject to change over time, within the guidance of this collection development policy.