Activity by Crawler trafilatura/2.0.0

Crawler

Name: trafilatura/2.0.0
Type: Indexer/Scraper
Type Description: Identifies HTML pages for indexing and/or scraping
Is Identified: Yes
Web Page: https://github.com/adbar/trafilatura
Email Address:
Note:
User Agent Strings: 1
IP Addresses: 1
Resources: 1
Requests: 1
Requests for robots.txt: 0
Earliest Request: 2025-04-20 17:49:03
Latest Request: 2025-04-20 17:49:03

User Agent String

1 User Agent String
ID | User Agent String | Requests
1031 | trafilatura/2.0.0 (+https://github.com/adbar/trafilatura) | 1

IP Address

1 IP Address–Crawler Combination
IP Address | Host | Crawler | Requests
34.169.175.120 | 120.175.169.34.bc.googleusercontent.com | trafilatura/2.0.0 | 1

Resources

Domain sphaerula.com

Resources Refused or Not Found

1 Resource–Status Code Combination, 1 Request
Resource | Resource Type | Status Code | Requests
/wordpress/ | Nonexistent | 404 | 1

Request

1 Request
ID | Method | Domain | Resource | Referrer | Status | IP Address | User Agent String ID | Timestamp | Crawler
146832 | GET | sphaerula.com | /wordpress/ | | 404 | 34.169.175.120 | 1031 | 2025-04-20 17:49:03 | trafilatura/2.0.0