hc – holiday converter

Contents:

hc introduction



Supports the following inputs and outputs:

Supports the following data serialization languages:

  • YAML
  • JSON

Repositories

Documentation

Installation

Latest release

You can install hc by invoking the following commands:

gpg --recv-keys 'C505 B5C9 3B0D B3D3 38A1  B600 5FE9 2C12 EE88 E1F0'
mkdir --parent /tmp/hc && cd /tmp/hc
wget -r -nd -l 1 https://pypi.python.org/pypi/hc --accept-regex '^https://(test)?pypi\.python\.org/packages/.*\.whl.*'
current_release="$(find . -type f -name '*.whl' | sort | tail -n 1)"
gpg --verify "${current_release}.asc" "${current_release}" && pip3 install --upgrade "${current_release}"

Refer to Verifying PyPI and Conda Packages for more details. Note that this might pull down dependencies in an unauthenticated way! You might want to install the dependencies yourself beforehand.

Or if you feel lazy and agree that pip/issues/1035 should be fixed you can also install hc like this:

pip3 install hc

Development version

If you want to be more on the bleeding edge of hc development consider cloning the git repository and installing from it:

gpg --recv-keys 'EF96 BC32 AC57 CFC7 2DF0  1D8C 489A 4D5E C353 C98A'
git clone --recursive https://gitlab.com/ypid/hc.git
cd hc && git verify-commit HEAD
echo 'Check if the HEAD commit has a good signature and only proceed in that case!' && read -r fnord
echo 'Then choose one of the commands below to install hc and its dependencies:'
pip3 install --upgrade .
./setup.py develop --user
./setup.py install --user
./setup.py install

Cloning also gets you the cache, which is tracked in git so that integration tests can run over the whole dataset. Make sure hc uses this cache by symlinking it to your user cache directory. The following should do the trick:

hc_cache="$(python3 -c 'from appdirs import user_cache_dir; print(user_cache_dir("hc"))')"
ln -sT "$PWD/tests/cache/" "$hc_cache"

CLI interface

Holiday converter tool

usage: hc [-h] [-V] [-d] [-v] [-q] [-n] [-c CACHE_DIR] [-i INPUT_FILE]
          [-f {schulferien_html}] [-F FROM_DATE] [-T TO_DATE] [-u]
          [-t {yaml,json}] [-s {opening_hours.js}] [-D]
          output-file

Positional Arguments

output-file Where to write the output file. ‘-’ will write to STDOUT.

Named Arguments

-V, --version show program’s version number and exit
-d, --debug Write debugging and higher to STDOUT|STDERR.
-v, --verbose Write information and higher to STDOUT|STDERR.
-q, --quiet, --silent
 Only write errors and higher to STDOUT|STDERR.
-n, --no-cache
 Do not cache intermediary files. Default: True

-c, --cache-dir
 Cache directory, defaults to the default cache directory of your operating system.
-i, --input-file
 File path to the input file to process. ‘-’ will read from STDIN.
-f, --input-format, --from
 Format of the input file. Possible choices: schulferien_html. Default: “schulferien_html”.

-F, --from-date
 Process the date range starting at the given RFC 3339 date. Default: the current year and month (for example “2020-04”).

-T, --to-date
 Process the date range ending at the given RFC 3339 date. Default: one year in the future (for example “2021-04”).

-u, --update-output
 Update the output file instead of constructing it from scratch. Implementation incomplete. Default: False

-t, --output-format, --to
 Format of the output file. Possible choices: yaml, json. Default: “yaml”.

-s, --output-structure
 Structure of the output file. Possible choices: opening_hours.js. Default: “opening_hours.js”.

-D, --dry-run
 Don’t write output. Default: False
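
The options above can also be assembled programmatically. The following is a minimal sketch and not part of hc: default_date_range and build_hc_command are hypothetical helper names, and the date arithmetic only mirrors the documented defaults (the current year and month, and the same month one year later), not hc's actual implementation. Only flags listed above are used, and the command is built but not executed.

```python
import datetime
import shlex


def default_date_range(today=None):
    """Return (from_date, to_date) as 'YYYY-MM' strings.

    Hypothetical helper: mirrors the documented defaults of
    --from-date (current year and month) and --to-date (one year later).
    """
    today = today or datetime.date.today()
    from_date = "{:04d}-{:02d}".format(today.year, today.month)
    to_date = "{:04d}-{:02d}".format(today.year + 1, today.month)
    return from_date, to_date


def build_hc_command(output_file, from_date=None, to_date=None,
                     output_format="yaml", dry_run=False):
    """Build an argument list for the hc CLI without running it."""
    default_from, default_to = default_date_range()
    command = [
        "hc",
        "--from-date", from_date or default_from,
        "--to-date", to_date or default_to,
        "--output-format", output_format,
    ]
    if dry_run:
        command.append("--dry-run")
    # The positional output-file argument goes last; '-' means STDOUT.
    command.append(output_file)
    return command


# Show the command line as it could be typed into a shell.
example = build_hc_command("-", "2020-04", "2021-04", "json")
print(" ".join(shlex.quote(arg) for arg in example))
```

The resulting list can be passed to subprocess.run once hc is installed. Building a list instead of a single string avoids shell quoting problems.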

History

This tool was created because Germany, as of 2017, seems to be unable or unwilling to provide school holidays, or holidays in general, in a machine-readable format. There are sites like http://www.schulferien.org/ which do a really good job of getting the data anyway through various sources and “providing them”. Back in 2013, everything was great: schulferien.org simply provided all the iCal files it had for the school holidays of the current year and the following years, as far as they are defined by the German Kultusministerkonferenz. A Perl script was used to parse all the iCal files and convert them (ref: convert_ical_to_json).

Unfortunately, those days are over. After checking all the available sources, the least bad option was to parse the HTML table of schulferien.org, because the HTML version still provides all the data. schulferien.org was contacted beforehand to find a better solution, but none was found. One concern of schulferien.org is the use of (faulty) scripts which put load on their servers. It is therefore a key design goal of this tool to make as few requests to external resources as possible and to use extensive caching. This has been implemented; see Design principles.

Refer to this issue for more details.

Design principles

  • Generic

    When you look around on the Internet, you find hundreds of public and/or school holiday APIs, libraries and websites providing HTML calendars, iCal files and PDFs, at least for Germany. Most of these have some kind of artificial limitation or restriction. This is an attempt to harvest holidays (which are generally not copyright protected), convert them and provide them without any limitations.

    The available tools were found useless for the use case of bundling holiday definitions for opening_hours.js, which is why this tool has been written.

  • Free Software

    All sources are provided under the GNU Affero General Public License v3 (AGPL-3.0). Resources such as holiday data are released under Creative Commons Zero v1.0 Universal. Enjoy.

  • Idempotent

    The program can be run against its own output and should not make any changes to it. This property is checked by integration testing.

  • Caching

    Make as few requests to external resources as possible and use extensive caching. The cache is provided as a separate git repository (hc-tests-cache) so that it can also be used during CI testing, which is done against a support matrix of Python versions and environments and therefore runs a number of times in parallel for each commit.

  • Extensible

    Convert from anything to anything using a common internal data structure.
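
The caching principle above can be sketched as follows. This is an illustrative assumption about how an on-disk cache might avoid repeated requests, not hc's actual code: cached_fetch, fake_fetch and the SHA-256 file naming are hypothetical, and the URL is a placeholder.

```python
import hashlib
import os
import tempfile


def cached_fetch(url, fetch, cache_dir):
    """Return the body for url, calling fetch(url) only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Derive a file-system-safe cache key from the URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_path = os.path.join(cache_dir, key)
    if os.path.exists(cache_path):
        # Cache hit: no request to the external resource is made.
        with open(cache_path, encoding="utf-8") as cache_file:
            return cache_file.read()
    body = fetch(url)
    with open(cache_path, "w", encoding="utf-8") as cache_file:
        cache_file.write(body)
    return body


calls = []


def fake_fetch(url):
    """Stand-in for a real HTTP request; records every call."""
    calls.append(url)
    return "<html>holidays for {}</html>".format(url)


cache_dir = tempfile.mkdtemp()
first = cached_fetch("https://example.org/2020-04", fake_fetch, cache_dir)
second = cached_fetch("https://example.org/2020-04", fake_fetch, cache_dir)
```

The second call is served from disk, so fake_fetch runs only once. Committing such a cache directory to git is what lets parallel CI runs share it without contacting the external site.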

hc package

Submodules

hc.cli module

Command line interface of hc

hc.cli.main()

hc.datatypes module

Data types definition

class hc.datatypes.MonthDayList

Bases: list

class hc.datatypes.PhData

Bases: list

hc.datatypes.fix_data_types(dataset)
hc.datatypes.fix_ph_data(dataset)

hc.defaults module

hc defaults

hc.helpers module

hc helpers

hc.helpers.get_date_from_relative_month(relative_month)
hc.helpers.get_month_number(month_name)
hc.helpers.get_relative_month(date)

hc.opening_hours_js module

OpenStreetMap opening_hours.js format. Refer to https://github.com/opening-hours/opening_hours.js/blob/master/holidays/README.md for the “spec”.

class hc.opening_hours_js.OpeningHoursJS(defs=None)

Bases: object

FIRST_LEVEL_SORTING = {'PH': '20', 'SH': '30', '_nominatim_url': '10'}
SH_DATA_SORTING = {'name': '0'}
get_school_holidays(out=None)
read(in_defs)
static update_sh_format(sh_data)
hc.opening_hours_js.find_ind(lst, key, value)
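
How FIRST_LEVEL_SORTING is used internally is not documented here. One plausible reading is that its values act as sort keys for the top-level keys of a holiday definition. The sketch below illustrates that reading only: sort_first_level is a hypothetical function, and the fallback of sorting unknown keys by their own name is an assumption.

```python
from collections import OrderedDict

# Copied from the class attribute documented above.
FIRST_LEVEL_SORTING = {'PH': '20', 'SH': '30', '_nominatim_url': '10'}


def sort_first_level(data):
    """Return an OrderedDict with known keys ordered by their sort value.

    Assumption: keys without an entry in FIRST_LEVEL_SORTING are sorted
    by their own name.
    """
    return OrderedDict(
        sorted(
            data.items(),
            key=lambda item: FIRST_LEVEL_SORTING.get(item[0], item[0]),
        )
    )


definition = {'SH': [], 'Bayern': {}, 'PH': [], '_nominatim_url': 'placeholder'}
ordered_keys = list(sort_first_level(definition).keys())
```

Under these assumptions, _nominatim_url sorts before PH, PH before SH, and region keys such as Bayern come last, because digits sort before letters.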

hc.schulferien_org module

schulferien.org interface

class hc.schulferien_org.SchulferienOrg(defs=None, cache=True, cache_dir=None)

Bases: object

get_month_dataset(date)
get_school_holidays(from_date, to_date)

hc.yaml module

YAML representation

class hc.yaml.PrettyHolidayYAMLDumper(stream, default_style=None, default_flow_style=None, canonical=None, indent=None, width=None, allow_unicode=None, line_break=None, encoding=None, explicit_start=None, explicit_end=None, version=None, tags=None, block_seq_indent=None, top_level_colon_align=None, prefix_colon=None)

Bases: ruamel.yaml.dumper.RoundTripDumper

YAML dumper optimized for human readability of the holiday format.

represent_dict(data)

write out tag if saved on loading

represent_list(data)
yaml_representers

Mapping of Python, ruamel.yaml and hc types to their representer methods. Most entries come from SafeRepresenter and RoundTripRepresenter; dict and collections.OrderedDict are handled by PrettyHolidayYAMLDumper.represent_dict, and hc.datatypes.MonthDayList and hc.datatypes.PhData by PrettyHolidayYAMLDumper.represent_list.

hc.yaml.dump_holidays_as_yaml(unserialized_data, add_vspacing=True)
hc.yaml.get_clean_yaml(serialized_data, add_vspacing=False)

Module contents

Holiday converter tool

Contributing and issue reporting

You can contribute and report issues in the usual way as documented by GitHub. Unit and integration tests can be run locally and are automatically run in CI. Acceptable contributions need to pass all of them.

If you found a security vulnerability that might put users at risk please send your report/patch to ypid@riseup.net. Please consider using OpenPGP to encrypt your email.

FAQ

Am I allowed to use the data the script gathers?

The author hopes so, but keep in mind that he is not a lawyer. As this is about German law, § 5 UrhG should apply, according to which content like school holidays in Germany is not copyright protected. The official source is https://www.kmk.org/service/ferien.html.

Changelog

This project adheres to Semantic Versioning and keeps a human-readable changelog.

hc master - unreleased

hc v0.1.1 - 2017-06-04

Changed

  • Pin ruamel.yaml on 0.14.X for now as there will be API changes in 0.15+. [ruamel]

hc v0.1.0 - 2017-03-02

Added

  • Initial coding and design. [ypid]