Kleinanzeigen Boosted
A web-based map visualization tool for searching and exploring listings from kleinanzeigen.de with real-time geographic display on OpenStreetMap.
Features
- 🗺️ Interactive map visualization
- 🔍 Advanced search with price range filtering (more options planned)
- 📍 Automatic geocoding of listings via Nominatim API
- ⚡ Parallel scraping with concurrent workers
- 📊 Prometheus-compatible metrics endpoint
- 🎯 Real-time progress tracking with ETA
- 💾 ZIP code caching to minimize API calls
- 🌐 User location display on map
Architecture
Backend: Flask API server with multi-threaded scraping
Frontend: Vanilla JavaScript with Leaflet.js for maps
Data Sources: kleinanzeigen.de, OpenStreetMap/Nominatim
Requirements
Python Packages
pip install flask flask-cors beautifulsoup4 lxml urllib3 requests
System Requirements
- Python 3.8+
- nginx (for production deployment)
Installation
1. Create System User
mkdir -p /home/kleinanzeigenscraper/
useradd --system -K MAIL_DIR=/dev/null -d /home/kleinanzeigenscraper kleinanzeigenscraper
chown -R kleinanzeigenscraper:kleinanzeigenscraper /home/kleinanzeigenscraper
2. Clone Repository
cd /home/kleinanzeigenscraper/
mkdir git
cd git
git clone https://git.mosad.xyz/localhorst/kleinanzeigen-boosted.git
cd kleinanzeigen-boosted
git checkout main
3. Install Dependencies
pip install flask flask-cors beautifulsoup4 lxml urllib3 requests
4. Configure Application
Create config.json in the backend/ directory (the systemd unit below uses it as its working directory):
{
"server": {
"host": "127.0.0.1",
"port": 5000,
"debug": false
},
"scraping": {
"session_timeout": 300,
"listings_per_page": 25,
"max_workers": 5,
"min_workers": 2,
"rate_limit_delay": 0.5,
"geocoding_delay": 1.0
},
"cache": {
"zip_cache_file": "zip_cache.json"
},
"apis": {
"nominatim": {
"url": "https://nominatim.openstreetmap.org/search",
"user_agent": "kleinanzeigen-scraper"
},
"kleinanzeigen": {
"base_url": "https://www.kleinanzeigen.de"
}
},
"user_agents": [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
]
}
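The backend presumably reads this file at startup. As an illustration of how the sections above can be consumed, here is a generic loader with per-section fallback defaults; the function name and the `DEFAULTS` values are illustrative, not the project's actual code:

```python
import json

# Illustrative fallback defaults; the real backend may define more or different keys.
DEFAULTS = {
    "server": {"host": "127.0.0.1", "port": 5000, "debug": False},
    "scraping": {"max_workers": 5, "min_workers": 2, "rate_limit_delay": 0.5},
}

def load_config(path="config.json"):
    """Load config.json and overlay it on the defaults (shallow per-section merge)."""
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    try:
        with open(path) as f:
            user = json.load(f)
    except FileNotFoundError:
        return config  # fall back entirely to defaults
    for section, values in user.items():
        config.setdefault(section, {}).update(values)
    return config
```

With this merge strategy, a config.json that only sets `"server": {"port": 8080}` still inherits the default host and debug flag.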
5. Create Systemd Service
Create /lib/systemd/system/kleinanzeigenscraper.service:
[Unit]
Description=Kleinanzeigen Scraper API
After=network.target systemd-networkd-wait-online.service
[Service]
Type=simple
User=kleinanzeigenscraper
WorkingDirectory=/home/kleinanzeigenscraper/git/kleinanzeigen-boosted/backend/
ExecStart=/usr/bin/python3 scrape_proxy.py
Restart=on-failure
RestartSec=10
StandardOutput=append:/var/log/kleinanzeigenscraper.log
StandardError=append:/var/log/kleinanzeigenscraper.log
[Install]
WantedBy=multi-user.target
6. Enable and Start Service
systemctl daemon-reload
systemctl enable kleinanzeigenscraper.service
systemctl start kleinanzeigenscraper.service
systemctl status kleinanzeigenscraper.service
7. Configure nginx Reverse Proxy
Create /etc/nginx/sites-available/kleinanzeigenscraper:
server {
listen 80;
server_name your-domain.com;
# Redirect HTTP to HTTPS
return 301 https://$server_name$request_uri;
}
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /path/to/ssl/cert.pem;
ssl_certificate_key /path/to/ssl/key.pem;
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
location / {
proxy_pass http://127.0.0.1:5000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300;
}
location /api/ {
proxy_pass http://127.0.0.1:5000/api/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300;
}
}
Enable site:
ln -s /etc/nginx/sites-available/kleinanzeigenscraper /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx
API Endpoints
POST /api/search
Start a new search session.
Request Body:
{
"search_term": "Fahrrad",
"num_listings": 25,
"min_price": 0,
"max_price": 1000
}
Response:
{
"session_id": "uuid-string",
"total": 25
}
GET /api/scrape/<session_id>
Get the next scraped listing from an active session.
Response:
{
"complete": false,
"listing": {
"title": "Mountain Bike",
"price": 450,
"id": 123456,
"zip_code": "76593",
"address": "Gernsbach",
"date_added": "2025-11-20",
"image": "https://...",
"url": "https://...",
"lat": 48.7634,
"lon": 8.3344
},
"progress": {
"current": 5,
"total": 25
}
}
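Putting the two endpoints together, a minimal standard-library Python client can start a search and stream results as they arrive. The base URL assumes the default local deployment from config.json; everything else follows the request/response shapes documented above:

```python
import json
import urllib.request

API = "http://127.0.0.1:5000/api"  # assumes the default host/port from config.json

def build_search_payload(search_term, num_listings=25, min_price=0, max_price=1000):
    """JSON body for POST /api/search."""
    return {
        "search_term": search_term,
        "num_listings": num_listings,
        "min_price": min_price,
        "max_price": max_price,
    }

def api_request(url, payload=None):
    """GET, or POST JSON when a payload is given; returns the decoded JSON response."""
    data = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if payload is not None else {}
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def stream_listings(search_term, **kwargs):
    """Start a session, then poll GET /api/scrape/<id> until it reports complete."""
    session = api_request(f"{API}/search", build_search_payload(search_term, **kwargs))
    while True:
        result = api_request(f"{API}/scrape/{session['session_id']}")
        if result.get("complete"):
            return
        listing = result.get("listing")
        if listing:
            yield listing
```

Usage: `for listing in stream_listings("Fahrrad", max_price=500): ...`. To abort a session early, POST to /api/scrape/&lt;session_id&gt;/cancel instead of polling to completion.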
POST /api/scrape/<session_id>/cancel
Cancel an active scraping session and delete cached listings.
Response:
{
"cancelled": true,
"message": "Session deleted"
}
GET /api/health
Health check endpoint.
Response:
{
"status": "ok"
}
GET /api/metrics
Prometheus-compatible metrics endpoint.
Response (text/plain):
# HELP search_requests_total Total number of search requests
# TYPE search_requests_total counter
search_requests_total 42
# HELP scrape_requests_total Total number of scrape requests
# TYPE scrape_requests_total counter
scrape_requests_total 1050
# HELP uptime_seconds Application uptime in seconds
# TYPE uptime_seconds gauge
uptime_seconds 86400
# HELP active_sessions Number of active scraping sessions
# TYPE active_sessions gauge
active_sessions 2
# HELP zip_code_cache_size Number of cached ZIP codes
# TYPE zip_code_cache_size gauge
zip_code_cache_size 150
# HELP kleinanzeigen_http_responses_total HTTP responses from kleinanzeigen.de
# TYPE kleinanzeigen_http_responses_total counter
kleinanzeigen_http_responses_total{code="200"} 1000
kleinanzeigen_http_responses_total{code="error"} 5
# HELP nominatim_http_responses_total HTTP responses from Nominatim API
# TYPE nominatim_http_responses_total counter
nominatim_http_responses_total{code="200"} 150
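The plain-text exposition format above is also easy to consume without a full Prometheus server. A small hand-rolled parser (shown here instead of a client library) can turn the response body into a dict keyed by metric name, including labeled series:

```python
def parse_metrics(text):
    """Parse Prometheus text exposition into {metric_or_labeled_series: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a numeric sample
    return samples
```

For example, `parse_metrics(body)['kleinanzeigen_http_responses_total{code="200"}']` yields the success count for upstream requests.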
Configuration Options
Server Configuration
- host: Bind address (default: 0.0.0.0)
- port: Port number (default: 5000)
- debug: Debug mode (default: false)
Scraping Configuration
- session_timeout: Session expiry in seconds (default: 300)
- listings_per_page: Listings per page on kleinanzeigen.de (default: 25)
- max_workers: Maximum number of parallel scraping threads (default: 4)
- min_workers: Minimum number of parallel scraping threads (default: 2)
- rate_limit_delay: Delay between batches in seconds (default: 0.5)
- geocoding_delay: Delay between geocoding requests in seconds (default: 1.0)
Cache Configuration
- zip_cache_file: Path to ZIP code cache file (default: zip_cache.json)
Monitoring
View logs:
tail -f /var/log/kleinanzeigenscraper.log
Check service status:
systemctl status kleinanzeigenscraper.service
Monitor metrics (Prometheus):
curl http://localhost:5000/api/metrics
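To have Prometheus collect these metrics automatically, a minimal scrape job could look like the following (the job name and interval are arbitrary, and the target assumes the default local port):

```yaml
scrape_configs:
  - job_name: "kleinanzeigenscraper"
    scrape_interval: 30s
    metrics_path: /api/metrics
    static_configs:
      - targets: ["localhost:5000"]
```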
Development
Run the backend directly (set "debug": true in config.json to enable Flask debug mode):
python3 scrape_proxy.py
Frontend files are located in web/:
- index.html - Main HTML file
- css/style.css - Stylesheet
- js/config.js - Configuration
- js/map.js - Map functions
- js/ui.js - UI functions
- js/api.js - API communication
- js/app.js - Main application
License
This project is provided as-is for educational purposes. Respect kleinanzeigen.de's terms of service and robots.txt when using this tool.
Credits
Built with:
- Flask (Python web framework)
- Leaflet.js (Interactive maps)
- BeautifulSoup4 (HTML parsing)
- OpenStreetMap & Nominatim (Geocoding)