Compare commits

13 commits

Author SHA1 Message Date
Marc b1e305362f 🚀 release version 1.1.0 2024-09-17 22:08:27 +02:00
Marc cb0582997b implement a maximum time delta for appointment notification 2024-09-17 22:08:27 +02:00
Marc dcffe2f85a 🚀 release version 1.0.3 2024-09-17 22:08:27 +02:00
Marc 32cbe7625d fix regex (The regex in config.yaml doesn't need to be escaped twice.) 2024-09-17 22:08:27 +02:00
Marc 3707804459 🚀 release verion 1.0.2 2024-09-17 22:08:27 +02:00
Marc 18587d1afa require python >=3.12 2024-09-17 22:08:27 +02:00
Marc bd4af1cecc adapt pypi download url 2024-09-17 22:08:27 +02:00
Marc 157a3798a6 🚀 release version 1.0.1 2024-09-17 22:08:27 +02:00
Marc d4e393e8ac add documentation 2024-09-17 22:08:27 +02:00
Marc 8255892e41 fix support for csv file argument 2024-09-17 22:08:27 +02:00
Marc b0d0274f0f 🚀 release version 1.0.0 2024-09-17 22:08:27 +02:00
Marc 32fecf9792 add argument parsing 2024-09-17 22:08:27 +02:00
Marc d9fa7af5de remove import for typing.override; fix config.yaml creation 2024-09-17 22:08:26 +02:00
7 changed files with 336 additions and 37 deletions

View file

@ -4,6 +4,7 @@
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/.venv" />
<excludeFolder url="file://$MODULE_DIR$/dist" />
</content>
<orderEntry type="jdk" jdkName="Python 3.12 (.venv)" jdkType="Python SDK" />
<orderEntry type="sourceFolder" forTests="false" />

177
README.md
View file

@ -1 +1,178 @@
# cgn-appointments
This program scrapes the appointments from the website of the city of Cologne
(Stadt Köln) and sends a notification via ntfy when a new appointment is
available.
## Installation
A good way to install `cgn-appointments` is via [pipx](https://pipx.pypa.io/stable/).
```bash
pipx install --index-url https://git.pub.solar/api/packages/marc/pypi/simple/ --pip-args="--extra-index-url https://pypi.org/simple" cgn-appointments
```
## Configuration
Execute `cgn-appointments` for the first time to create the configuration file.
Then edit the configuration file and adapt it to your needs.
```yaml
---
# If this url does not work anymore, visit https://www.stadt-koeln.de/service/produkte/00416/index.html,
# copy the url of the "Termin Buchen" tile and replace the url below.
url: 'https://termine.stadt-koeln.de/m/kundenzentren/extern/calendar/?uid=b5a5a394-ec33-4130-9af3-490f99517071&wsid=e570a1ea-7b3d-43f6-bf43-3e60b3d7d888&lang=de&set_lang_ui=de&rev=rfOtF#top'
# Which services should be checked for free appointments?
# Copy the exact name of the service from the website.
services:
- 'Personalausweis - Antrag'
- 'Reisepass - Antrag (seit 01.01.2024 auch für Kinder unter 12 Jahren)'
# In which locations should the services be checked?
# Use the name as displayed on the website.
locations:
- Ehrenfeld
- Kalk
# Path to the CSV file to store the scraped appointments
# csv_path: ~/Termine.csv
# Name of the CSV file to store the scraped appointments
csv_name: 'appointments.csv'
# Regex to extract the date from the website
# There should be no need to change this
date_regex: '(\\d{2}\\.\\d{2}\\.\\d{4}\\s\\d{2}:\\d{2})'
# Date format to store the date in the CSV file (should match the date_regex)
# There should be no need to change this
date_format: '%d.%m.%Y %H:%M'
# ntfy configuration
# See https://ntfy.sh/ for more information
# Choose a topic name that is unique to you
ntfy:
  server: https://ntfy.sh/
  topic: public_cgn_appintments_83e0c8db1f51a7044b6431ddb2814c11
  title: 'A new appointment is available!'
  message: 'A new appointment is available in %s: %s'
  tags:
    - tada
  priority: 3 # 1-5
# Configure logging
# Advanced users can change the logging configuration here
# The loglevel can be set to DEBUG, INFO, WARNING, ERROR, CRITICAL
logging:
  version: 1
  disable_existing_loggers: false
  formatters:
    simple:
      format: '[%(levelname)s|%(module)s|L%(lineno)d] %(asctime)s: %(message)s'
      datefmt: '%Y-%m-%dT%H:%M:%S%z'
    json:
      fmt_keys:
        level: levelname
        message: message
        timestamp: timestamp
        logger: name
        module: module
        function: funcName
        line: lineno
        thread_name: threadName
  handlers:
    stderr:
      class: logging.StreamHandler
      formatter: simple
      stream: ext://sys.stderr
      level: DEBUG
    file:
      class: logging.handlers.RotatingFileHandler
      formatter: json
      level: INFO
      maxBytes: 10000000
      backupCount: 3
    queue_handler:
      class: logging.handlers.QueueHandler
      handlers:
        - stderr
        - file
      respect_handler_level: true
  loggers:
    root:
      handlers:
        - queue_handler
      level: DEBUG
```
## Usage
You can run `cgn-appointments` with only the configuration file, or pass
additional command line arguments, which override the values set in the
configuration file.
When using the `--services` flag, put `"double quotes"` around any service
name that contains spaces.
```bash
usage: cgn-appointments [-h] [-s SERVICES [SERVICES ...]]
                        [-l LOCATIONS [LOCATIONS ...]]
                        [--config-file CONFIG_FILE] [--csv-file CSV_FILE]
                        [--log-file LOG_FILE]
                        [--log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]

Scrapes appointments from termine.stadt-koeln.de an sends a message to a ntfy
server.

options:
  -h, --help            show this help message and exit
  -s SERVICES [SERVICES ...], --services SERVICES [SERVICES ...]
                        Services to check
  -l LOCATIONS [LOCATIONS ...], --locations LOCATIONS [LOCATIONS ...]
                        Locations to check
  --config-file CONFIG_FILE
                        Path to the configuration file
  --csv-file CSV_FILE   Path to the csv file, which stores the last fetched
                        dates
  --log-file LOG_FILE   Path to logfile
  --log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}
                        Logging Level
```
### Example
```bash
cgn-appointments \
  --services "Personalausweis - Antrag" "Reisepass - Antrag (seit 01.01.2024 auch für Kinder unter 12 Jahren)" \
  --locations Ehrenfeld Kalk \
  --config-file /path/to/config.yaml \
  --csv-file /path/to/csvfile.csv \
  --log-file /path/to/logfile.log \
  --log-level INFO
```
## Scheduled Execution via `cron`
On Linux systems, you can use `cron` to schedule the execution of
`cgn-appointments`.
Find the path to the `cgn-appointments` executable:
```bash
which cgn-appointments
```
Open the crontab file:
```bash
crontab -e
```
Add the following line to the crontab file to execute `cgn-appointments` every
30 minutes:
```bash
*/30 * * * * /path/to/cgn-appointments > /dev/null 2>&1
```

View file

@ -7,7 +7,7 @@ name = "cgn-appointments"
dynamic = ["version"]
description = 'Scrapes appointments from termine.stadt-koeln.de an sends a message to a ntfy server.'
readme = "README.md"
-requires-python = ">=3.8"
+requires-python = ">=3.12"
license = "MIT"
keywords = ["selenim", "cologne", "scraper"]
authors = [

View file

@ -1,4 +1,4 @@
# SPDX-FileCopyrightText: 2024-present Marc Koch <marc-koch@posteo.de>
#
# SPDX-License-Identifier: MIT
-__version__ = "1.0.0rc0"
+__version__ = "1.1.0"

View file

@ -1,3 +1,4 @@
import argparse
import csv
import json
import logging
@ -22,50 +23,153 @@ PROJECT_ROOT = Path(__file__).parent.parent
logger = logging.getLogger(__package__)
def parse_arguments() -> dict:
"""
Parse the arguments.
:return: dict
"""
argparser = argparse.ArgumentParser(
prog="cgn-appointments",
description="Scrapes appointments from termine.stadt-koeln.de an sends a message to a ntfy server.",
)
argparser.add_argument(
"-s",
"--services",
action="store",
nargs='+',
type=str,
help="Services to check",
required=False,
)
argparser.add_argument(
"-l",
"--locations",
action="store",
nargs='+',
type=str,
help="Locations to check",
required=False,
)
argparser.add_argument(
"-t",
"--max-timedelta",
action="store",
type=int,
help="Maximum timedelta in days to notify about new appointments",
required=False,
)
argparser.add_argument(
"--config-file",
action="store",
type=Path,
help="Path to the configuration file",
required=False,
)
argparser.add_argument(
"--csv-file",
action="store",
type=Path,
help="Path to the csv file, which stores the last fetched dates",
required=False,
)
argparser.add_argument(
"--log-file",
action="store",
type=Path,
help="Path to logfile",
required=False,
)
argparser.add_argument(
"--log-level",
action="store",
type=str,
choices=["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"],
help="Logging Level",
required=False,
)
return argparser.parse_args().__dict__
def update_config_with_args(config: dict, args: dict) -> dict:
"""
Update the configuration with the arguments.
:param config:
:param args:
:return: dict
"""
update_config = {
"services": args.get("services"),
"locations": args.get("locations"),
"max_timedelta": args.get("max_timedelta"),
"csv_path": args.get("csv_file"),
}
for key, value in update_config.items():
if value is not None:
config[key] = value
if args.get("log_file") is not None:
config["logging"]["handlers"]["file"]["filename"] = args.get("log_file")
if args.get("log_level") is not None:
config["logging"]["loggers"]["root"]["level"] = args.get("log_level")
return config
def get_config() -> dict:
"""
Get the configuration from the config.yaml file.
:return:
:return: dict
"""
config_yaml = Path(user_config_dir()) / "cgn-appointments" / "config.yaml"
args = parse_arguments()
if args.get("config_file") is not None:
config_yaml = args.get("config_file")
else:
config_yaml = Path(user_config_dir()) / "cgn-appointments" / "config.yaml"
if not config_yaml.exists():
logger.info(f"""config.yaml not found.
Creating a new one under '{config_yaml}'.
Please fill in the required information.""")
print(f"""config.yaml not found.
Creating a new one under '{config_yaml}'.
Please fill in the required information.""")
config_yaml.parent.mkdir(parents=True, exist_ok=True)
Path("./config_template.yaml").write_text(config_yaml.read_text())
config_yaml.touch()
config_yaml.write_text(Path(Path(__file__).parent, "config_template.yaml").read_text())
exit(0)
try:
with open(config_yaml, "r") as file:
return dict(yaml.safe_load(file))
config = dict(yaml.safe_load(file))
except FileNotFoundError:
print("config.yaml not found")
print(f"config.yaml not found in '{config_yaml}'.")
exit(1)
# Replace config values with given arguments
return update_config_with_args(config, args)
def define_csv_path(csv_path: str|None, csv_name: str|None) -> Path:
"""
Define the path to the csv file.
:param csv_path:
:param csv_name:
:return:
:return: Path
"""
csv_path = Path(csv_path) if csv_path else None
csv_name = Path(csv_name) if csv_name else None
if csv_path is not None and csv_path.is_file():
return csv_path
elif csv_path is not None and csv_path.is_dir() and csv_name is not None:
if csv_path is not None and csv_path.is_dir() and csv_name is not None:
return csv_path / csv_name
elif csv_path is not None and csv_path.is_dir() and csv_name is None:
return csv_path / "cgn-appointments.csv"
elif csv_path is not None:
csv_path.touch()
return csv_path
elif csv_name is not None:
return Path(user_data_dir()) / csv_name
csv_path = Path(user_data_dir()) / csv_name
csv_path.touch()
return csv_path
else:
return Path(user_data_dir()) / "cgn-appointments.csv"
csv_path = Path(user_data_dir()) / "cgn-appointments"
csv_path.touch()
return csv_path / "cgn-appointments.csv"
def select_options(services, selects):
@ -162,6 +266,7 @@ def main():
url = config.get("url")
services = config.get("services")
check_locations = config.get("locations")
max_timedelta = config.get("max_timedelta")
csv_name = config.get("csv_name", "cgn-appointments.csv")
csv_path = define_csv_path(config.get("csv_path"), csv_name)
date_regex = config.get("date_regex")
@ -232,16 +337,16 @@ def main():
# Get location containers
location_containers = driver.find_elements(By.CLASS_NAME, "location-container")
for location_container in location_containers:
loc_title = location_container.find_element(By.CLASS_NAME, "location_title")
for loc in check_locations:
for loc in check_locations:
for location_container in location_containers:
loc_title = location_container.find_element(By.CLASS_NAME, "location_title")
if loc in loc_title.text:
locations.update({loc: {"location_container": location_container}})
if len(locations) > 0:
logger.debug(f"Location containers found",
extra={"locations": locations})
else:
logger.warning("No location containers found.")
if len(locations) > 0:
logger.debug(f"Location containers found",
extra={"locations": locations})
else:
logger.warning("No location containers found.")
# Get earliest date for each location
for loc in locations.keys():
@ -287,17 +392,31 @@ def main():
if new_date and new_date != previous_date:
logger.info(f"New appointment found for {name}: {new_date}",
extra={"location": name, "previous_date": previous_date,
"new_date": new_date})
"new_date": new_date, "max_timedelta": max_timedelta})
lines.append((name, new_date.strftime(date_format)))
ntfy(
ntfy_server,
ntfy_topic,
ntfy_title,
ntfy_message % (name, new_date),
session_url,
ntfy_tags,
ntfy_priority,
)
# Send notification if new date is within timedelta or
# timedelta is not set
time_delta = (new_date - datetime.now()).days if max_timedelta > 0 else False
if time_delta == False or time_delta <= max_timedelta:
logger.info(f"Sending notification for new appointment.",
extra={"location": name, "new_date": new_date,
"time_delta": time_delta,
"max_timedelta": max_timedelta})
ntfy(
ntfy_server,
ntfy_topic,
ntfy_title,
ntfy_message % (name, new_date),
session_url,
ntfy_tags,
ntfy_priority,
)
else:
logger.info(f"New appointment is not within timedelta.",
extra={"location": name, "new_date": new_date,
"time_delta": time_delta,
"max_timedelta": max_timedelta})
elif previous_date is not None:
lines.append((name, previous_date.strftime(date_format)))

View file

@ -11,6 +11,10 @@ locations:
- Ehrenfeld
- Kalk
# Max time between today and a new appointment to notify about the new
# appointment. Set to -1 to notify about all new appointments.
max_timedelta: 14
# Path to the CSV file to store the scraped appointments
# csv_path: ~/Termine.csv
@ -18,7 +22,7 @@ locations:
csv_name: 'appointments.csv'
# Regex to extract the date from the website
-date_regex: '(\\d{2}\\.\\d{2}\\.\\d{4}\\s\\d{2}:\\d{2})'
+date_regex: '(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})'
# Date format to store the date in the CSV file (should match the date_regex)
date_format: '%d.%m.%Y %H:%M'
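The effect of the `date_regex` change in this hunk can be checked with a short sketch. YAML single-quoted scalars do not process backslash escapes, so the two raw strings below stand in for the old and new config values as Python would receive them. The sample text is made up for illustration and is not taken from the website.

```python
import re
from datetime import datetime

# Stand-ins for the config values before and after the fix
OLD_DATE_REGEX = r'(\\d{2}\\.\\d{2}\\.\\d{4}\\s\\d{2}:\\d{2})'
NEW_DATE_REGEX = r'(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})'
DATE_FORMAT = '%d.%m.%Y %H:%M'

sample = 'Frühester Termin: 24.09.2024 08:15 Uhr'  # hypothetical page text

# The doubled backslashes ask for literal backslash characters, so no match
old_match = re.search(OLD_DATE_REGEX, sample)
# The single-escaped pattern matches digits, dots and whitespace as intended
new_match = re.search(NEW_DATE_REGEX, sample)

print(old_match)           # None
print(new_match.group(1))  # 24.09.2024 08:15

# date_format parses exactly what date_regex captures
parsed = datetime.strptime(new_match.group(1), DATE_FORMAT)
print(parsed.isoformat())  # 2024-09-24T08:15:00
```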

View file

@ -4,7 +4,6 @@ import json
import logging
import logging.config
from pathlib import Path
from typing import override
from platformdirs import user_log_dir
@ -85,7 +84,6 @@ class JSONFormatter(logging.Formatter):
super().__init__()
self.fmt_keys = fmt_keys if fmt_keys is not None else {}
@override
def format(self, record: logging.LogRecord) -> str:
message = self._prepare_log_dict(record)
return json.dumps(message, default=str)