Scraping and parsing Goodreads user books data in Node.js
Since Goodreads no longer supports fetching a user's book data through its API, I decided to scrape that data from the user's RSS feed and parse it in Node.js.
The idea here is to use the `rss-parser` package to parse the RSS feed and extract the book data.
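As a minimal sketch of that setup, the snippet below creates a parser and asks it to keep a few Goodreads-specific item fields through the `customFields` option of `rss-parser`. The field names (`book_id`, `author_name`, and so on) and the `GoodreadsFeedItem` type are assumptions about what the feed contains, so check them against a real feed before relying on them.

```typescript
// Install first: npm install rss-parser
import Parser from "rss-parser";

// Assumed Goodreads-specific item fields. Every value arrives as a string.
type GoodreadsFeedItem = {
  book_id?: string;
  author_name?: string;
  user_rating?: string;
  user_read_at?: string;
  user_shelves?: string;
  book_description?: string;
};

// Non-standard tags are dropped unless they are listed in customFields.
const parser = new Parser<Record<string, unknown>, GoodreadsFeedItem>({
  customFields: {
    item: [
      "book_id",
      "author_name",
      "user_rating",
      "user_read_at",
      "user_shelves",
      "book_description",
    ],
  },
});
```

Standard RSS fields such as `title` and `link` are parsed without any extra configuration.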
Then you can fetch the data from the RSS feed using the `parser` object and process it as needed.
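One way to sketch the fetching step is a small helper that calls `parseURL` and returns the feed items. The `FeedParser` interface and the `fetchShelfItems` name are hypothetical; the interface only describes the one method used here, which a parser created with `rss-parser` provides.

```typescript
// The only capability needed from the parser: parseURL resolving to a feed.
interface FeedParser<T> {
  parseURL(url: string): Promise<{ items?: T[] }>;
}

// Fetch the feed and return its items, or an empty array if there are none.
async function fetchShelfItems<T>(
  parser: FeedParser<T>,
  feedUrl: string
): Promise<T[]> {
  const feed = await parser.parseURL(feedUrl);
  return feed.items ?? [];
}
```

In practice you would pass in your `rss-parser` instance and the user's RSS feed URL, then loop over the returned items.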
NOTE
You can get a Goodreads user's RSS feed URL by going to their profile, navigating to the bookshelf page, and copying the RSS feed URL from there. For example, this is my bookshelf page: https://www.goodreads.com/review/list/179720035
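If you would rather build the feed URL in code, here is a rough sketch. It assumes that public shelves are served under `/review/list_rss/<userId>` with a `shelf` query parameter, so copying the link from the page remains the more reliable route. The function names are hypothetical.

```typescript
// Pull the numeric user ID out of a bare ID or a bookshelf URL.
function extractUserId(input: string): string {
  const trimmed = input.trim();
  if (/^\d+$/.test(trimmed)) return trimmed;
  const match = trimmed.match(/\/review\/list\/(\d+)/);
  if (!match) {
    throw new Error(`Could not find a Goodreads user ID in: ${input}`);
  }
  return match[1];
}

// Assumption: the RSS feed lives at /review/list_rss/<userId>?shelf=<name>.
function goodreadsShelfRssUrl(bookshelfUrlOrId: string, shelf = "read"): string {
  const userId = extractUserId(bookshelfUrlOrId);
  const url = new URL(`https://www.goodreads.com/review/list_rss/${userId}`);
  url.searchParams.set("shelf", shelf);
  return url.toString();
}
```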
Now that you have the data, you may want to clean it up before storing it or using it in your application, since the feed delivers it in a raw format.
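A cleanup pass could look roughly like this. The raw field names, the output field names, and the rule that a rating of `0` means "not rated" are all assumptions made for illustration; adjust them to whatever your feed actually returns.

```typescript
// Assumed raw item shape: every value arrives as a string.
type RawGoodreadsItem = {
  book_id?: string;
  title?: string;
  author_name?: string;
  user_rating?: string;
  user_read_at?: string;
  user_shelves?: string;
  book_description?: string;
};

// Parse a numeric string; empty or non-numeric values become null.
function toNumberOrNull(value: string | undefined): number | null {
  const trimmed = (value ?? "").trim();
  if (trimmed === "") return null;
  const parsed = Number(trimmed);
  return Number.isFinite(parsed) ? parsed : null;
}

// Convert a date string to ISO 8601; empty or unparseable values become null.
function toIsoDateOrNull(value: string | undefined): string | null {
  const trimmed = (value ?? "").trim();
  if (trimmed === "") return null;
  const time = Date.parse(trimmed);
  return Number.isNaN(time) ? null : new Date(time).toISOString();
}

// Replace HTML tags with spaces, then collapse runs of whitespace.
function stripHtml(value: string | undefined): string {
  return (value ?? "").replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Turn one raw feed item into a tidier record.
function prettifyItem(raw: RawGoodreadsItem) {
  const rating = toNumberOrNull(raw.user_rating);
  return {
    id: (raw.book_id ?? "").trim(),
    title: stripHtml(raw.title),
    author: stripHtml(raw.author_name),
    // Assumption: a rating of 0 means the book has not been rated.
    rating: rating === 0 ? null : rating,
    readAt: toIsoDateOrNull(raw.user_read_at),
    shelves: (raw.user_shelves ?? "")
      .split(",")
      .map((shelf) => shelf.trim())
      .filter((shelf) => shelf !== ""),
    description: stripHtml(raw.book_description),
  };
}
```

Mapping missing values to `null` rather than empty strings or `NaN` keeps "no data" distinguishable from real values when the records are stored.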
The `GoodreadsBook` type is defined here:
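As a rough sketch, a `GoodreadsBook` type for the cleaned-up data might look like the following. The field names are assumptions chosen for illustration, and the `isGoodreadsBook` guard is a hypothetical helper for checking a record before you store it.

```typescript
// Hypothetical shape of one cleaned-up book record.
interface GoodreadsBook {
  id: string;
  title: string;
  author: string;
  rating: number | null; // null when the book has not been rated
  readAt: string | null; // ISO 8601 date, or null if not finished
  shelves: string[];
  description: string;
}

// Runtime check that an unknown value has the GoodreadsBook shape.
function isGoodreadsBook(value: unknown): value is GoodreadsBook {
  if (typeof value !== "object" || value === null) return false;
  const book = value as Record<string, unknown>;
  return (
    typeof book.id === "string" &&
    typeof book.title === "string" &&
    typeof book.author === "string" &&
    (book.rating === null || typeof book.rating === "number") &&
    (book.readAt === null || typeof book.readAt === "string") &&
    Array.isArray(book.shelves) &&
    book.shelves.every((shelf) => typeof shelf === "string") &&
    typeof book.description === "string"
  );
}
```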
Caveat
The Goodreads RSS feed does not update instantly when you change your books on Goodreads. You may need to wait a few hours before the latest data shows up in the feed.
Happy scraping!