update for 2022-2023

pull/15/head
eric 2022-08-29 13:21:20 +02:00
parent 3e29f46937
commit 5486d766b3
3 changed files with 103 additions and 22 deletions


@ -1,27 +1,25 @@
# capstone-projects
This repo is used to organize Free Ebook Foundation projects for Stevens Institute of Technology Senior-year computer science capstone projects.
## 2022-2023
Proposed projects:
- [OAPEN Suggestion Service](oapen-doab.md#oapen-suggestion-service)
- [Expert System for Open Access Book Website Analysis](oapen-doab.md#expert-system-for-open-access-book-website-analysis)
Students interested in these projects should use GitHub issues and pull requests to develop and propose teams. For example, students interested in a project but needing team members, and teams needing additional members, should create an issue describing their interest and needs. Use issues to ask questions or seek clarification about the projects. To propose a team for a specific project, create a pull request adding the names of team members to the project page. You may also want to include the roles, capabilities, and approach of the team.
We will not accept a proposal PR until September 14. But do not wait until then to start a pull request, even if your team is incomplete or you're still deciding: we will comment on PRs with the goal of improving them, and you can close the PR to withdraw the proposal. If there are competing proposals, we will give preference to the best-developed proposal. We are happy to schedule a Q&A session via Zoom; just request one via GitHub issues.
We expect to meet with teams weekly via Zoom and at least once in person. We will use Slack for meetings and discussions. Note that the Suggestion Service project will need a regularly scheduled meeting before noon to enable conferencing with Dr. Snijder, who resides in Amsterdam.
## 2021-2022
Proposed Team: - [Free-Programming-Books-Search](fpb.md)
- Brogan Clements - [repo](https://github.com/EbookFoundation/free-programming-books-search/)
- Paul Kelly - [search page](https://EbookFoundation.github.io/free-programming-books-search/)
- Leo Ouyang
- Dan Pekata
- Nick Quidas
- Dylan Regan
Proposed project:
- [Free-Programming-Books](fpb.md)
Students interested in this project should use GitHub issues and pull requests to develop and propose teams. For example, students interested in a project but needing team members, and teams needing additional members, should create an issue describing their interest and needs. Use issues to ask questions or seek clarification about the projects. To propose a team for a specific project, create a pull request adding the names of team members to the project page. You may also want to include the roles, capabilities, and approach of the team.
I will not accept a proposal PR until September 16. But do not wait until then to start a pull request, even if your team is incomplete or you're still deciding: I will comment on PRs with the goal of improving them, and you can close the PR to withdraw the proposal. If there are competing proposals, I will give preference to the best-developed proposal. I am happy to schedule a Q&A session via Zoom; just request one via GitHub issues.
I expect to meet with teams weekly - I will use Slack for meetings and discussions. Ideally we'll be able to meet as a team in person, Covid permitting.
## 2020-2021

9
fpb.md

@ -42,8 +42,11 @@ Free Programming Books applies Github's workflow to the creation and maintenance
## Team
Add names and links here. - Brogan Clements
- Paul Kelly
- Leo Ouyang
- Dan Pekata
- Nick Quidas
- Dylan Regan
## More about our team
Describe your team here.

80
oapen-doab.md Normal file

@ -0,0 +1,80 @@
# OAPEN/DOAB projects
# OAPEN Suggestion Service
## (Project #1)
## Background
[OAPEN](https://oapen.org/) promotes and supports the transition to open access for academic books by providing open infrastructure services to stakeholders in scholarly communication. Over 24,000 books and book chapters are available from the OAPEN platform.
OAPEN has experimented with a suggestion service that uses semantic inferencing based on trigrams.[1] OAPEN is interested in seeing whether this concept can be turned into a web service and integrated into its online offering.
## Goals
This project will build
- an analysis/mining engine that will ingest the 24,000 texts at OAPEN and produce a trigram map
- a web-service application that will use the trigram map to allow websites to present suggestions from the OAPEN catalog based on a book identifier.
The team will use off-the-shelf components such as Django, Node, and Postgres for deployment on Digital Ocean.
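As a rough illustration of the two goals above, here is a minimal sketch in plain Python. It assumes that "trigram" means a sequence of three consecutive words and that relatedness is scored by counting shared trigrams; both are assumptions made for illustration and may differ from the method described in [1]. The function names, book identifiers, and sample texts are hypothetical.

```python
import re
from collections import Counter


def word_trigrams(text):
    """Return a Counter of consecutive three-word sequences in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(zip(words, words[1:], words[2:]))


def build_trigram_map(books):
    """Map each book identifier to the trigram counts of its text."""
    return {book_id: word_trigrams(text) for book_id, text in books.items()}


def suggest(trigram_map, book_id, limit=5):
    """Rank other books by the number of distinct trigrams shared with book_id.

    Assumption: a simple overlap count stands in for whatever scoring
    the real service would use.
    """
    target = set(trigram_map[book_id])
    scores = {
        other: len(target & set(trigrams))
        for other, trigrams in trigram_map.items()
        if other != book_id
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(other, score) for other, score in ranked[:limit] if score > 0]


# Hypothetical identifiers and toy texts standing in for full book texts.
books = {
    "book-a": "open access books support scholarly communication",
    "book-b": "publishers of open access books support authors",
    "book-c": "trigram models for text mining",
}
trigram_map = build_trigram_map(books)
print(suggest(trigram_map, "book-a"))
```

In a deployed service the trigram map would presumably be computed offline and stored in a database, with `suggest` exposed behind a web endpoint keyed by book identifier.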
### Advisors
- Ronald Snijder, OAPEN
- Eric Hellman, Free Ebook Foundation
### Proposed Team
-
-
-
-
-
### More about the team
### Reference
[1] Snijder, R. (2021). Words Algorithm Collection - finding closely related open access books using text mining techniques. LIBER Quarterly: The Journal of the Association of European Research Libraries, 31(1), 122. [https://doi.org/10.53377/lq.10938](https://doi.org/10.53377/lq.10938)
# Expert System for Open Access Book Website Analysis
## (Project #2)
## Background
The Directory of Open Access Books ([DOAB](https://doabooks.org/)) is a database of 60,000 academic peer-reviewed books and book chapters. The information in the database, including links, is contributed by open-access publishers around the world. This year, the Free Ebook Foundation has begun a joint project with OAPEN, the operator of DOAB, to verify and improve the links in this database.
While many of the books in DOAB are stored in the OAPEN database, more of them are not. 30,000 of the links in the DOAB database point at HTML access pages rather than directly to document files such as PDFs or EPUBs. It can be difficult for a software agent to recognize whether these links are "good", i.e. usable by researchers to access a book, or "bad", i.e. links that need to be corrected by the publisher. In addition, identification of PDF and EPUB links on these pages can enable additional services such as indexing and archival.
## Goals
This project will build an expert system to analyze webpages linked from the DOAB System. The components of such a system would include:
- a classifier that recognizes the common types of website access pages
- a classifier that recognizes the common types of error pages
- a collection of parsers that extract ebook document links from access pages
The project will build on the Free Ebook Foundation's ebook loader modules written in Python, using Beautiful Soup, a Python library for pulling data out of HTML and XML files.
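The sketch below illustrates what one of the link-extracting parsers might look like. To stay self-contained it uses the standard library's `html.parser` as a stand-in for Beautiful Soup; it does not use the Foundation's existing loader modules. The class name, the extension list, and the rule that a page with at least one document link counts as "good" are assumptions for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: a document link is recognized by its file extension alone.
EBOOK_EXTENSIONS = (".pdf", ".epub")


class EbookLinkParser(HTMLParser):
    """Collect absolute URLs of anchors that point at PDF or EPUB files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the access page's URL.
        url = urljoin(self.base_url, href)
        # Check the path only, so query strings do not hide the extension.
        if urlparse(url).path.lower().endswith(EBOOK_EXTENSIONS):
            self.links.append(url)


def extract_ebook_links(html, base_url):
    """Return the ebook document links found in an access page."""
    parser = EbookLinkParser(base_url)
    parser.feed(html)
    return parser.links


def classify_page(html, base_url):
    """Hypothetical rule: 'good' if any document link is found, else 'bad'."""
    return "good" if extract_ebook_links(html, base_url) else "bad"


# Hypothetical access page and URL.
page = (
    '<a href="/files/book.pdf">PDF</a>'
    '<a href="book.epub?download=1">EPUB</a>'
    '<a href="/about">About</a>'
)
base = "https://publisher.example/books/42/"
print(extract_ebook_links(page, base))
print(classify_page("<p>Page not found</p>", base))
```

Real access pages vary widely between publishers, which is why the project calls for classifiers and a collection of parsers rather than a single extension check like this one.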
### Advisors
- Eric Hellman, Free Ebook Foundation
- Ronald Snijder, OAPEN
### Proposed Team
-
-
-
-
-
### More about the team