What Is Product Scraping?
Product scraping is the automated extraction of data from your Shopify store. Bots visit your product pages, collect information like titles, descriptions, pricing, images, and inventory levels, and store that data in a database. This happens without your permission and often without your knowledge.
Scrapers target Shopify stores for several reasons:
- Competitor intelligence — Rival stores monitor your pricing and product catalog to undercut you or copy your offerings
- Clone site creation — Scammers scrape your entire catalog to build fake stores that impersonate your brand
- Price comparison sites — Aggregators pull your data to display on third-party platforms you did not authorize
- Dropshipping theft — Operators scrape your product listings to resell your items without any relationship with you
- Market research — Data brokers collect product information at scale to sell as market intelligence
The impact goes beyond lost data. Scrapers consume server resources, slow down your store for real customers, and can even cause downtime during high-traffic periods.
Signs Your Store Is Being Scraped
Before you can stop scraping, you need to recognize it. Watch for these warning signs.
Unusual Traffic Patterns
- Sudden spikes in page views, especially on product and collection pages
- High traffic from a single IP address or a narrow range of IPs
- Traffic that follows a systematic pattern (visiting every product page in order rather than browsing naturally)
- Visits concentrated during off-peak hours when real customers are less active
Server Performance Issues
- Slower page load times without any changes to your store
- Increased bandwidth usage that does not correspond to increased sales
- Server errors or timeouts during periods of normal customer traffic
Data Appearing Elsewhere
- Your product descriptions showing up on other websites word-for-word
- Your product images used on sites you do not control
- Competitors suddenly matching your pricing with suspicious accuracy
- Your products listed on marketplaces you have not authorized
Analytics Anomalies
- Very low bounce rates from specific traffic sources (bots visit many pages per session)
- Zero conversion rate from high-volume traffic sources
- Unusual user agent strings in your server logs
- Requests to your store’s JSON endpoints or API routes that you did not initiate
How Scrapers Work
Understanding how scrapers operate helps you defend against them effectively.
Basic HTML Scrapers
The simplest scrapers fetch your product page HTML and parse out the data. They look for structured elements like product titles in heading tags, prices in specific CSS classes, and image URLs. These are relatively easy to detect because they often lack JavaScript execution and do not load assets like CSS or fonts.
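The asset-loading gap described above is something you can check for yourself in raw access logs. Below is a minimal sketch of that idea; the log format and IP addresses are invented for illustration (real server logs carry more fields, and SecurEcommerce does this analysis for you):

```python
from collections import defaultdict

# Hypothetical access-log lines in the form "IP METHOD PATH"
# (invented for illustration; real logs include timestamps, status codes, etc.)
LOG_LINES = [
    "203.0.113.7 GET /products/blue-shirt",
    "203.0.113.7 GET /products/red-shirt",
    "203.0.113.7 GET /products/green-shirt",
    "198.51.100.2 GET /products/blue-shirt",
    "198.51.100.2 GET /assets/theme.css",
    "198.51.100.2 GET /assets/theme.js",
]

ASSET_SUFFIXES = (".css", ".js", ".woff2", ".png", ".jpg")

def flag_asset_skippers(lines):
    """Return IPs that fetched pages but never any static assets.

    Real browsers load CSS, JS, and fonts alongside the HTML;
    basic HTML scrapers fetch only the page itself."""
    pages, assets = defaultdict(int), defaultdict(int)
    for line in lines:
        ip, _, path = line.split()
        if path.endswith(ASSET_SUFFIXES):
            assets[ip] += 1
        else:
            pages[ip] += 1
    return [ip for ip in pages if assets[ip] == 0]

print(flag_asset_skippers(LOG_LINES))  # 203.0.113.7 never loaded CSS, JS, or fonts
```

This is a rough heuristic, not proof: a visitor with heavily cached assets can also appear asset-free, which is why it works best combined with the other signals in this guide.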
Headless Browser Scrapers
More sophisticated scrapers use headless browsers (like Puppeteer or Playwright) that render your pages fully, including JavaScript. These are harder to detect because they mimic real browser behavior. However, they are slower and more resource-intensive, which affects the scraper’s patterns.
API Scrapers
Shopify stores expose certain data through predictable JSON endpoints. Scrapers that target these endpoints can extract product data very efficiently without ever loading a full page. Endpoints like /products.json and /collections.json are common targets.
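To see why these endpoints are so attractive, it helps to look at the shape of the data they return. The sample below is abbreviated and the values are invented, but it reflects the general structure of a storefront `/products.json` response:

```python
import json

# Abbreviated, invented sample of a /products.json-style response
# (the real payload includes many more fields per product and variant).
sample = json.loads("""
{"products": [
  {"title": "Blue Shirt",
   "handle": "blue-shirt",
   "variants": [{"price": "29.99"}],
   "images": [{"src": "https://cdn.example.com/blue-shirt.jpg"}]}
]}
""")

# One request yields a whole page of titles, prices, and image URLs
# as clean structured data -- no HTML parsing needed, which is why
# scrapers favor these endpoints over product pages.
for product in sample["products"]:
    for variant in product["variants"]:
        print(product["title"], variant["price"])
```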
Distributed Scrapers
Advanced operations distribute scraping across many IP addresses, rotate user agents, and add random delays to avoid detection. These require more sophisticated countermeasures.
Step 1: Block Known Scraper IPs
Start by blocking IP addresses that are already identified as scraping sources.
Review Your Blocking Logs
- Open SecurEcommerce and go to Analytics > Traffic Logs
- Sort by page views per session — scrapers typically have unusually high counts
- Filter for visitors that accessed more than 50 pages in a single session
- Note the IP addresses and check whether they belong to known hosting providers or data centers
Block Identified IPs
- Navigate to Blocking > IP Blocking
- Click Add IP for each identified scraper IP
- Add a note such as “Scraper - accessed 200+ product pages on 2026-01-28”
- For IPs from the same range, consider blocking the entire CIDR range
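If you are unsure whether a suspicious IP falls inside a range you already block, Python's standard `ipaddress` module makes the check easy. This sketch uses an invented blocklist; SecurEcommerce applies the same logic internally when you add a CIDR range:

```python
import ipaddress

# Hypothetical blocklist: one individual IP plus an entire CIDR range
BLOCKED = [ipaddress.ip_network(n) for n in ("192.0.2.10/32", "198.51.100.0/24")]

def is_blocked(ip: str) -> bool:
    """True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED)

print(is_blocked("198.51.100.77"))  # True: inside 198.51.100.0/24
print(is_blocked("203.0.113.5"))    # False: not on the blocklist
```

A `/24` covers 256 addresses, so blocking the range catches scrapers that hop between neighboring IPs on the same provider.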
Enable Data Center Blocking
Most scrapers run from cloud hosting providers rather than residential internet connections:
- Go to Blocking > Advanced Settings
- Enable Data Center IP Detection
- Choose your action: block, challenge, or flag
- This blocks traffic from major hosting providers like AWS, Google Cloud, and DigitalOcean
Note: Some legitimate services (like payment processors or SEO tools) also use data center IPs. Monitor the impact after enabling this setting and whitelist any services that need access.
Step 2: Enable Bot Detection
SecurEcommerce includes automated bot detection that identifies scraping behavior in real time.
Configure Bot Detection
- Open SecurEcommerce and go to Protection > Bot Detection
- Toggle Enable Bot Detection to on
- Set the detection sensitivity:
  - Low — Catches only obvious bots with known signatures
  - Medium — Catches bots exhibiting automated browsing patterns (recommended)
  - High — Aggressive detection that may occasionally flag power users
- Choose the action for detected bots: block, challenge with CAPTCHA, or log only
- Click Save
How Detection Works
SecurEcommerce analyzes visitor behavior looking for signals that indicate automation:
- Request timing — Bots tend to make requests at unnaturally consistent intervals
- Navigation patterns — Real users browse contextually; bots follow systematic crawl patterns
- Browser fingerprinting — Headless browsers have detectable differences from real browsers
- JavaScript execution — Simple scrapers do not execute JavaScript, which is a strong signal
- Mouse and interaction events — Real users generate mouse movements and scroll events that bots often do not
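The request-timing signal above can be made concrete with a simple statistic: the coefficient of variation (standard deviation divided by mean) of the gaps between requests. The timestamps below are invented, and SecurEcommerce's actual scoring is more sophisticated, but the intuition holds:

```python
from statistics import mean, stdev

def timing_regularity(timestamps):
    """Coefficient of variation of inter-request gaps.

    Bots running on a fixed delay produce near-zero values;
    human browsing tends to be bursty and irregular."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return stdev(gaps) / mean(gaps)

bot   = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]   # metronomic: one request every 2s
human = [0.0, 1.2, 7.5, 9.1, 30.4, 33.0]  # bursts and long pauses

print(round(timing_regularity(bot), 3))    # 0.0
print(round(timing_regularity(human), 3))  # well above 1.0
```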
Step 3: Set Up Rate Limiting
Rate limiting restricts how many requests a single visitor can make in a given time period. This is one of the most effective defenses against scraping.
Configure Rate Limits
- Go to Protection > Rate Limiting
- Set the maximum requests per minute per IP (a reasonable starting point is 60 requests per minute for normal browsing)
- Set the maximum product page views per session (consider 30-50 for a generous limit)
- Choose the action when limits are exceeded: throttle, challenge, or block
- Click Save
Recommended Settings
| Setting | Recommended Value | Notes |
|---|---|---|
| Requests per minute | 60 | Covers fast browsing without allowing bulk scraping |
| Product pages per session | 40 | Generous for shoppers, restrictive for scrapers |
| API requests per minute | 20 | Limits JSON endpoint abuse |
| Action on exceed | Challenge | Lets real users through while stopping bots |
Tip: Start with generous limits and tighten them over time based on your traffic patterns. Check your analytics to understand what normal browsing behavior looks like for your store before setting restrictive limits.
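Under the hood, per-IP rate limiting amounts to counting recent requests inside a moving time window. This is a minimal in-memory sketch of that mechanism; a production setup (including SecurEcommerce's, whose internals are not shown here) would use shared storage such as Redis so limits hold across multiple app servers:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per client IP."""

    def __init__(self, limit=60, window=60.0):
        self.limit, self.window = limit, window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, client_ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[client_ip]
        while q and now - q[0] > self.window:  # evict hits outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False                       # over limit: challenge or block
        q.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window=60.0)
print([limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3)])
# first three requests pass, the fourth in the same window is rejected
```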
Step 4: Enable Content Protection Features
SecurEcommerce offers content-level protections that make scraping more difficult even if a bot reaches your pages.
Right-Click and Copy Protection
- Go to Protection > Content Protection
- Enable Disable Right-Click on product pages
- Enable Copy Protection to prevent text selection of product descriptions
- These measures will not stop determined scrapers, but they raise the effort required for casual copying
Image Protection
- In the same Content Protection settings, enable Image Protection
- This prevents direct image downloads and disables drag-and-drop saving
- Consider enabling image watermarking for an additional layer of protection
Source Code Obfuscation
- Enable Source Obfuscation to make your page source harder to parse
- This adds complexity for scrapers that rely on predictable HTML structure
- It does not affect how your pages render for real visitors
For a complete walkthrough of all content protection features, see the Enable Content Protection guide.
Step 5: Restrict API Access
Shopify’s built-in JSON endpoints are a prime target for scrapers. While you cannot fully disable them, you can add protections.
Monitor API Endpoints
- In SecurEcommerce, go to Analytics > API Access Logs
- Review which endpoints are being accessed and by whom
- Look for high-volume access to /products.json, /collections.json, and similar endpoints
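If you also keep raw server logs, a quick tally of hits per IP against the JSON endpoints can confirm what the dashboard shows. The log lines and query strings below are invented for illustration:

```python
from collections import Counter

JSON_ENDPOINTS = ("/products.json", "/collections.json")

# Hypothetical access-log lines in the form "IP PATH"
LOG_LINES = [
    "203.0.113.7 /products.json?limit=250&page=1",
    "203.0.113.7 /products.json?limit=250&page=2",
    "203.0.113.7 /products.json?limit=250&page=3",
    "198.51.100.2 /collections/sale",
]

hits = Counter()
for line in LOG_LINES:
    ip, path = line.split()
    # strip the query string, then match against the JSON endpoints
    if path.split("?")[0].endswith(JSON_ENDPOINTS):
        hits[ip] += 1

print(hits.most_common())  # [('203.0.113.7', 3)]
```

A single IP paging through the product endpoint like this is the classic API-scraper signature from the "How Scrapers Work" section.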
Protect JSON Endpoints
- Go to Protection > API Protection
- Enable JSON Endpoint Rate Limiting
- Set strict rate limits for API-style requests (lower than general page rate limits)
- Enable API Bot Detection to apply additional scrutiny to JSON requests
Step 6: Monitor and Adapt
Scraping prevention is not a set-it-and-forget-it task. Scrapers evolve their techniques, and you need to stay ahead.
Set Up Alerts
- Go to Settings > Alerts
- Enable notifications for:
  - Unusual traffic spikes
  - Rate limit triggers
  - Bot detection events
  - Multiple blocked requests from the same source
- Choose email or in-app notifications based on your preference
Regular Review Schedule
Build these checks into your routine:
- Daily — Glance at the traffic dashboard for obvious anomalies
- Weekly — Review blocking logs and bot detection events in detail
- Monthly — Analyze traffic trends, update rate limits if needed, and review your blocklist for entries to add or remove
- Quarterly — Audit your entire protection configuration against current scraping techniques
Watch for Evasion Tactics
Sophisticated scrapers will try to work around your defenses:
- IP rotation — If you see the same scraping pattern from many different IPs, consider tightening bot detection sensitivity rather than relying solely on IP blocking
- Residential proxies — Some scrapers use residential IP addresses to bypass data center blocking. Behavioral detection catches these better than IP-based rules
- Slow scraping — Scrapers that throttle themselves to stay under rate limits can be caught by session-level analysis of browsing patterns
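One concrete session-level signal for slow scrapers is how often a visitor's page sequence follows catalog order. This sketch is a toy version of that idea, with an invented catalog and visit histories (real analysis would also weigh dwell time and referrers):

```python
def in_order_fraction(visited, catalog):
    """Fraction of consecutive visits that step forward by exactly one
    catalog position. Slow scrapers stay under rate limits but still
    tend to walk the catalog sequentially; shoppers jump around."""
    idx = [catalog.index(p) for p in visited if p in catalog]
    steps = [b - a for a, b in zip(idx, idx[1:])]
    return sum(1 for s in steps if s == 1) / len(steps)

CATALOG = [f"/products/item-{i}" for i in range(100)]

scraper = [f"/products/item-{i}" for i in range(10)]  # item-0..item-9 in order
shopper = ["/products/item-42", "/products/item-3", "/products/item-87"]

print(in_order_fraction(scraper, CATALOG))  # 1.0: perfectly sequential
print(in_order_fraction(shopper, CATALOG))  # 0.0: no sequential steps
```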
Best Practices Summary
- Layer your defenses — No single technique stops all scrapers. Combine IP blocking, bot detection, rate limiting, and content protection for the best coverage.
- Start with detection, then enforce — Use “log only” mode initially to understand your traffic before enabling blocking. This prevents false positives.
- Protect your most valuable data — Focus protection on product pages, pricing information, and high-resolution images.
- Keep legitimate access open — Ensure search engines, payment processors, and other services you depend on are whitelisted.
- Document your configuration — Record why each protection rule exists so future team members understand the setup.
What’s Next
With anti-scraping measures in place, continue strengthening your store’s protection:
- Enable Content Protection — Deep dive into all content protection features
- Configure Access Blocking — Fine-tune your overall blocking strategy to complement anti-scraping measures
- Review Blocked Traffic — Analyze the traffic being stopped to ensure your rules are effective