Releases: html-extract/hext
Hext v1.0.0
Changes
- New syntax: Nested rules (#22)
  # match <div> elements that have a descendant <a> at any depth
  <div> { <a/> } </div>
- Abort extraction after a specified number of searches (4dff797): Added a new parameter max_searches to Rule::extract and htmlext. It is disabled by default (value 0). If running untrusted Hext templates, I recommend setting max_searches to some high value, like 10000, to protect against resource exhaustion. Nested rules can cause nasty runtime performance (see #22 for an example).
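To illustrate why a search budget helps, here is a minimal toy matcher in Python. It is a sketch under stated assumptions, not Hext's implementation: Node, SearchBudget, SearchLimitExceeded, has_descendant, find_all and deep_chain are hypothetical names invented for this example. Only the meaning of max_searches (0 disables the limit) is taken from the note above.

```python
class SearchLimitExceeded(Exception):
    """Raised when the search budget is used up (hypothetical name)."""


class Node:
    """A bare-bones element: a tag name and a list of child nodes."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)


class SearchBudget:
    """Counts node visits; max_searches == 0 means no limit."""

    def __init__(self, max_searches=0):
        self.max_searches = max_searches
        self.count = 0

    def tick(self):
        self.count += 1
        if self.max_searches and self.count > self.max_searches:
            raise SearchLimitExceeded(
                f"gave up after {self.max_searches} searches")


def has_descendant(node, tag, budget):
    """Return True if any descendant of node (at any depth) has the tag."""
    for child in node.children:
        budget.tick()
        if child.tag == tag or has_descendant(child, tag, budget):
            return True
    return False


def find_all(root, outer, inner, budget):
    """Collect every `outer` element that has an `inner` descendant."""
    matches = []
    stack = [root]
    while stack:
        node = stack.pop()
        budget.tick()
        if node.tag == outer and has_descendant(node, inner, budget):
            matches.append(node)
        stack.extend(node.children)
    return matches


def deep_chain(depth):
    """Build `depth` nested <div> elements around a single <span>."""
    node = Node("span")
    for _ in range(depth):
        node = Node("div", [node])
    return node


# Every <div> re-scans all of its descendants looking for an <a> that
# is never found, so the work grows quadratically with nesting depth.
unlimited = SearchBudget()  # 0: limit disabled
find_all(deep_chain(200), "div", "a", unlimited)
print("searches without a limit:", unlimited.count)

try:
    find_all(deep_chain(200), "div", "a", SearchBudget(10000))
except SearchLimitExceeded as error:
    print("aborted:", error)
```

In this toy model the nested search repeats work for every outer candidate, which is the kind of blow-up a budget is meant to cut short. How Hext itself counts searches may differ.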
Hext v0.8.3
Static binary releases
Install the htmlext command-line utility and Hext for Python (v3.8 or earlier):
pip install hext
Install Hext for Node (v12 or earlier):
npm install hext
Both are compatible with Linux (x86_64) and Mac OS X ≥ 10.11.
Hext for WebAssembly
See https://github.com/html-extract/hext-emscripten/releases
Changes
- Rules can now match custom-tags
- Custom-tags are matched in a case-insensitive manner
Releases can be verified with the public key at https://thomastrapp.com/public_key.asc, which has the key ID 086653AA8CC7270E and the fingerprint E6EA EFD0 2CBB 0EFF C010 1324 0866 53AA 8CC7 270E.
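As a quick sanity check before trusting a downloaded key: for OpenPGP v4 keys, the long key ID is the last 64 bits (16 hex digits) of the 160-bit fingerprint. The Python sketch below compares the two values quoted above. The helper names normalize_hex and key_id_matches are made up for this example. This does not replace verifying a signature with an OpenPGP tool.

```python
def normalize_hex(text):
    """Drop whitespace and upper-case a hex string."""
    return "".join(text.split()).upper()


def key_id_matches(fingerprint, key_id):
    """Check that a 16-digit key ID equals the tail of a 40-digit fingerprint."""
    fp = normalize_hex(fingerprint)
    kid = normalize_hex(key_id)
    if len(fp) != 40 or len(kid) != 16:
        raise ValueError("expected a 40-digit fingerprint and a 16-digit key ID")
    int(fp, 16)   # raises ValueError if the fingerprint is not valid hex
    int(kid, 16)  # same for the key ID
    return fp.endswith(kid)


FINGERPRINT = "E6EA EFD0 2CBB 0EFF C010 1324 0866 53AA 8CC7 270E"
KEY_ID = "086653AA8CC7270E"

print(key_id_matches(FINGERPRINT, KEY_ID))  # True
```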
Hext v0.8.2
Notable but minor changes:
- make install now uses CMake's GNUInstallDirs. This allows for finer control of what gets installed where, when configuring the project (67afed7).
- htmlext's Version.cpp and libhext's Version.cpp are generated out of the source tree (9e8125e).
- Removed libhext's custom Doxygen theme (93dab11).
Releases can be verified with the public key at http://thomastrapp.com/public_key.asc, which has the key ID 086653AA8CC7270E and the fingerprint E6EA EFD0 2CBB 0EFF C010 1324 0866 53AA 8CC7 270E.
Hext v0.8.0
Static binary releases
Install the htmlext command-line utility and Hext for Python:
pip install hext
Install Hext for Node:
npm install hext
Both are compatible with Linux (x86_64) and Mac OS X ≥ 10.11. Visit https://hext.thomastrapp.com/download.
Major changes
- Syntax: Added greedy rules. Existing Hext snippets do not need to be changed. Documentation.
- CMake and FindPackage Hext: When using libhext in a CMake project, the target hext::hext must be used. The variables Hext_LIBRARY and Hext_INCLUDE_DIRS no longer exist. See https://hext.thomastrapp.com/download#using-libhext
- New dependency: RapidJSON. Previously, RapidJSON (which generates the JSON output of htmlext) was downloaded automatically if not found. This is no longer the case. RapidJSON is installable through most package managers (rapidjson-dev). See https://github.com/Tencent/rapidjson
- Mac OS X support: Hext can now be built and installed on Mac OS X.
Minor changes
- Modernized CMake usage. CMake 3.8 or later required.
- Boost 1.70 support (#8)
- Compatibility with Node v10, v11 and v12 (#6)
- Added a man page for htmlext
- Added build scripts for releases
- Travis CI tests
- libhext examples are no longer built automatically
- libhext/test: Google Test is now a hard dependency
- Python2 unicode string support
- PHP bindings now require PHP7
Acknowledgments
I'd like to thank the following people for their feedback and help:
- @freedmand for suggesting Mac OS X support.
- @impredicative for suggesting pip install hext.
- @samatt for reporting a Node v10 incompatibility and suggesting npm install hext.
- @brandonrobertz for suggesting "greedy rules".
- And the staff at https://www.npmjs.com/, for letting me take over the previously occupied package name "hext".
👍
Releases can be verified with the public key at http://thomastrapp.com/public_key.asc, which has the key ID 086653AA8CC7270E and the fingerprint E6EA EFD0 2CBB 0EFF C010 1324 0866 53AA 8CC7 270E.
Hext v0.7.0
Notable changes:
- A C++17 capable compiler is now required to build from source
- RapidJSON was updated to v1.1.0
- Improved Python 3 support, courtesy of @brandonrobertz
Attached to this release you will find four experimental builds:
- htmlext-static-binary-amd64-v0.7.0.tar.gz: A statically compiled version of the htmlext command line utility. Confirmed to work in Ubuntu 18.04, Debian 8 and 9.
- hext-amd64-ubuntu18.04-v0.7.0.deb: A Debian package of the htmlext command line utility and the libhext library for Ubuntu 18.04.
- hext-python3.6-amd64-ubuntu18.04-v0.7.0.deb: A Debian package of the Hext python module for Ubuntu 18.04 with Python 3.6. Requires the Hext Debian package.
- hext-python2.7-amd64-ubuntu18.04-v0.7.0.deb: A Debian package of the Hext python module for Ubuntu 18.04 with Python 2.7. Requires the Hext Debian package.
SHA256SUM can be verified with SHA256SUM.asc using my public key at http://thomastrapp.com/public_key.asc, which has the key ID 086653AA8CC7270E and the fingerprint E6EA EFD0 2CBB 0EFF C010 1324 0866 53AA 8CC7 270E.
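Once the signature on SHA256SUM has been checked with an OpenPGP tool, the downloaded files still need to be compared against the checksums. The Python sketch below covers only that second step. It assumes SHA256SUM uses the usual sha256sum layout (hex digest, whitespace, file name), which these notes do not state. The helper names sha256_of and check_sums are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sums(sum_file):
    """Map each file named in sum_file to True (digest matches) or False.

    Files are looked up next to sum_file; missing files count as False.
    """
    sum_file = Path(sum_file)
    results = {}
    for line in sum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with '*'
        target = sum_file.parent / name
        results[name] = (target.is_file()
                         and sha256_of(target) == expected.lower())
    return results
```

Calling check_sums("SHA256SUM") in the download directory returns a dictionary such as {"htmlext-static-binary-amd64-v0.7.0.tar.gz": True} when the file is intact.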
The binaries were built using docker, Ubuntu 18.04 and build-hext-binary-releases.sh.
Hext v0.6.0
Notable changes:
- libhext: reduce amount of exported symbols
This will be the last release before switching to C++17.
Hext v0.5.2
Notable changes:
- libhext/NodeUtil: add proper serialization of HTML for InnerHtml
- libhext: use versioned soname
Hext v0.5.1
Notable changes:
- htmlext and libhext can now be compiled with Visual Studio 2015
- libhext/bindings: htmlext.{js,php,py,rb}: Handle hext syntax error
Hext v0.5.0
Public API changes:
- libhext/bindings/ruby: convert hext::Result to native types
Hext v0.4.0
Public API changes:
- libhext/bindings/python: convert hext::Result to native types