
Web Scraping in Ruby on Rails Using HTTParty and Nokogiri

2 min read · Feb 24, 2025

Web scraping is a technique used to extract data from websites. In Ruby on Rails, we can achieve this using HTTParty for making HTTP requests and Nokogiri for parsing HTML. This blog will walk you through why web scraping is useful, how to implement it in Ruby, and how to save the extracted data into a CSV file.

Why Use Web Scraping?

Web scraping is useful for:
✅ Collecting product prices from e-commerce websites
✅ Extracting news articles or blog posts
✅ Monitoring competitor data
✅ Automating repetitive data extraction tasks

Step 1: Create a Folder and Add scraper.rb

Create a new folder for your project:

mkdir web_scraper
cd web_scraper
touch scraper.rb

Step 2: Install the Required Gems

gem install httparty nokogiri
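If you prefer Bundler over global installs, an equivalent Gemfile works too (a sketch; pin versions as needed for your project):

```ruby
# Gemfile
source "https://rubygems.org"

gem "httparty"
gem "nokogiri"
```

Then run `bundle install` once and execute the script with `bundle exec ruby scraper.rb`.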

Step 3: Copy and paste the following script into scraper.rb:

require "httparty"
require "nokogiri"
require "csv"

# fetching the target page with a browser-like User-Agent header
response = HTTParty.get("https://www.scrapingcourse.com/ecommerce/", {
  headers: {
    "User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
  }
})

document = Nokogiri::HTML(response.body)
Product = Struct.new(:url, :image, :name, :price)
html_products = document.css("li.product")

# initializing the list of objects
# that will contain the scraped data
products = []

# iterating over the list of HTML products
html_products.each do |html_product|
  # extracting the data of interest
  # from the current product HTML element
  url = html_product.css("a").first.attribute("href").value
  image = html_product.css("img").first.attribute("src").value
  name = html_product.css("h2").first.text
  price = html_product.css("span").first.text

  # storing the scraped data in a Product object
  product = Product.new(url, image, name, price)

  # adding the Product to the list of scraped objects
  products.push(product)
end

puts products

# defining the header row of the CSV file
csv_headers = ["url", "image", "name", "price"]
CSV.open("output.csv", "wb", write_headers: true, headers: csv_headers) do |csv|
  # adding each product as a new row
  # to the output CSV file
  products.each do |product|
    csv << product.to_a
  end
end

puts "Data saved to output.csv successfully!"

How It Works

🔹 HTTParty sends an HTTP request to fetch the webpage content.
🔹 Nokogiri parses the HTML and extracts product details using CSS selectors.
🔹 Ruby's built-in CSV library writes each scraped record as a row in output.csv.
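One caveat: the selector calls in the script assume every product element contains the expected tags. On a page where one is missing, `.first` returns nil and the following `attribute`/`text` call raises NoMethodError. A small nil-safe helper makes this more robust (a sketch; the `extract` name is my own, not part of the original script):

```ruby
# Nil-safe extraction from a Nokogiri-style node: returns the attribute value
# (or stripped text) of the first element matching the CSS selector,
# or nil when the selector matches nothing.
def extract(node, selector, attribute = nil)
  element = node.css(selector).first
  return nil unless element

  if attribute
    element.attribute(attribute)&.value
  else
    element.text.strip
  end
end

# Usage inside the scraping loop:
#   url  = extract(html_product, "a", "href")
#   name = extract(html_product, "h2")
```

With this helper, a product card missing an image simply yields a nil field instead of crashing the whole run.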

Running the Script

Save the script as scraper.rb and run:

ruby scraper.rb

After execution, the output.csv file will contain the extracted product details.
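To verify the output programmatically rather than opening it in a spreadsheet, the same CSV library can read the file back (a minimal sketch; `read_products` is a hypothetical helper, not part of the script above):

```ruby
require "csv"

# Reads the scraped CSV back, returning each row as a hash
# keyed by the header names ("url", "image", "name", "price").
def read_products(path = "output.csv")
  CSV.read(path, headers: true).map(&:to_h)
end
```

Calling `read_products.first["name"]` then returns the name of the first scraped product.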

Conclusion

Web scraping in Ruby on Rails using HTTParty and Nokogiri is a powerful way to automate data extraction. This method is useful for price tracking, data analysis, and automation.

🚀 Now, try modifying this script to scrape data from other websites!

Written by BharteeTechRubyOnRails

Ruby on Rails Developer || React Js || Rspec || Node Js
