Modernize things a bit
Use `#pragma once` instead of header guard macros, `std::array` instead of
C-style arrays, and so on. Generally prepare for the 1.0.0 release.
RauliL committed Feb 8, 2024
1 parent 0c0d19b commit 1a2a31b
Showing 37 changed files with 985 additions and 1,045 deletions.
14 changes: 7 additions & 7 deletions .github/workflows/build.yml
@@ -9,10 +9,10 @@ jobs:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
steps:
- uses: actions/checkout@v2
- name: Build
uses: ashutoshvarma/action-cmake-build@master
with:
build-dir: ${{ runner.workspace }}/build
build-type: Release
run-test: true
- uses: actions/checkout@v4
- name: Build
uses: ashutoshvarma/action-cmake-build@master
with:
build-dir: ${{ runner.workspace }}/build
build-type: Release
run-test: true
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
/.vscode
/build
/doxygen
*.o
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -1,8 +1,8 @@
CMAKE_MINIMUM_REQUIRED(VERSION 3.12)
CMAKE_MINIMUM_REQUIRED(VERSION 3.6)

PROJECT(
PeeloUnicode
VERSION 0.2.0
VERSION 1.0.0
DESCRIPTION "Header only C++ Unicode utilities."
HOMEPAGE_URL "https://github.com/peelonet/peelo-unicode"
LANGUAGES CXX
2 changes: 1 addition & 1 deletion Doxyfile
@@ -858,7 +858,7 @@ EXCLUDE_PATTERNS =
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories use the pattern */test/*

EXCLUDE_SYMBOLS = *::internal
EXCLUDE_SYMBOLS = *::utils

# The EXAMPLE_PATH tag can be used to specify one or more files or directories
# that contain example code fragments that are included (see the \include
6 changes: 3 additions & 3 deletions LICENSE.md
@@ -1,13 +1,13 @@
Copyright (c) 2018, peelo.net
Copyright (c) 2018-2024, peelo.net
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
- Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
- Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

102 changes: 101 additions & 1 deletion README.md
@@ -2,8 +2,108 @@

![Build](https://github.com/peelonet/peelo-unicode/workflows/Build/badge.svg)

Collection of various Unicode related utility functions for C++17.
A collection of simple-to-use [Unicode] utilities for C++17.

[Doxygen generated API documentation.][API]

[Unicode]: https://en.wikipedia.org/wiki/Unicode
[API]: https://peelonet.github.io/peelo-unicode/index.html

## Character testing functions

The library ships with a Unicode version of the [ctype.h] header, containing
the following functions inside the `peelo::unicode::ctype` namespace:

- `isalnum()`
- `isalpha()`
- `isblank()`
- `iscntrl()`
- `isdigit()`
- `isgraph()`
- `islower()`
- `isprint()`
- `ispunct()`
- `isspace()`
- `isupper()`
- `isxdigit()`
- `tolower()`
- `toupper()`
- An additional `isvalid()` function, which tests whether the given value is a
valid Unicode code point.

[ctype.h]: https://en.cppreference.com/w/cpp/header/cctype
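
As a rough illustration of what a code point validity check involves, here is a
minimal sketch. It assumes the usual Unicode definition of a scalar value (at
most U+10FFFF and outside the surrogate range U+D800 to U+DFFF). The name
`is_valid_code_point` is hypothetical, and the library's own `isvalid()` may
apply different or additional rules.

```cpp
// Hypothetical stand-in for illustration; not the library's isvalid().
constexpr bool
is_valid_code_point(char32_t c)
{
  // Code points above U+10FFFF lie outside the Unicode code space.
  if (c > 0x10ffff)
  {
    return false;
  }

  // U+D800..U+DFFF are reserved for UTF-16 surrogates.
  return c < 0xd800 || c > 0xdfff;
}

static_assert(is_valid_code_point(U'A'));
static_assert(!is_valid_code_point(0xd800));
static_assert(!is_valid_code_point(0x110000));
```

Because the check is `constexpr`, the `static_assert` lines verify it at
compile time.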

### Example

```cpp
#include <iostream>
#include <peelo/unicode/ctype.hpp>

int
main()
{
using namespace peelo::unicode::ctype;

std::cout << isalnum(U'Ä') << std::endl;
std::cout << isdigit(U'൧') << std::endl;
std::cout << isgraph(U'€') << std::endl;
std::cout << ispunct(U'\u2001') << std::endl;
std::cout << std::hex;
std::cout << tolower(U'Ä') << std::endl;
std::cout << toupper(U'ä') << std::endl;
}
```

## Character encodings

The library also provides functions for encoding to and decoding from Unicode
character encodings. Both validating and non-validating variants are provided;
the non-validating ones ignore all encoding/decoding errors.

Supported character encodings are:

- [UTF-8]
- [UTF-16BE][UTF-16]
- [UTF-16LE][UTF-16]
- [UTF-32BE][UTF-32]
- [UTF-32LE][UTF-32]

[UTF-8]: https://en.wikipedia.org/wiki/UTF-8
[UTF-16]: https://en.wikipedia.org/wiki/UTF-16
[UTF-32]: https://en.wikipedia.org/wiki/UTF-32

### Example

```cpp
#include <peelo/unicode/encoding.hpp>

int
main()
{
using namespace peelo::unicode::encoding;

// Decode UTF-8 input, ignoring any decoding errors.
std::u32string utf8_decoded = utf8::decode("\xe2\x82\xac");

// Encode it back to byte string, ignoring any encoding errors.
std::string utf8_encoded = utf8::encode(utf8_decoded);

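// Decode UTF-32BE input with validation. The explicit length keeps the
// embedded NUL bytes, at which a plain string literal would be cut off.
std::u32string utf32be_decoded;
if (utf32be::decode_validate(std::string("\x00\x00 \xac", 4), utf32be_decoded))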
{
// Given input is valid UTF-32BE.
} else {
// Given input is invalid UTF-32BE.
}

// Encode it back to byte string, with validation.
std::string utf32be_encoded;
if (utf32be::encode_validate(utf32be_decoded, utf32be_encoded))
{
// Given input contained only valid Unicode code points.
} else {
// Given input contained invalid Unicode code points.
}
}
```
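
To make the UTF-8 round trip in the example more concrete, the following sketch
shows how a single code point maps to one to four bytes. It is a generic,
textbook-style encoder written for this illustration: the helper name
`append_utf8` is hypothetical, and the library's `utf8::encode()` may well be
implemented differently.

```cpp
#include <string>

// Hypothetical helper for illustration; not part of peelo-unicode.
// Appends the UTF-8 encoding of `c` to `out`. Returns false, leaving `out`
// untouched, if `c` is a surrogate or lies above U+10FFFF.
inline bool
append_utf8(char32_t c, std::string& out)
{
  if (c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
  {
    return false;
  }

  if (c < 0x80)
  {
    // One byte: 0xxxxxxx
    out += static_cast<char>(c);
  }
  else if (c < 0x800)
  {
    // Two bytes: 110xxxxx 10xxxxxx
    out += static_cast<char>(0xc0 | (c >> 6));
    out += static_cast<char>(0x80 | (c & 0x3f));
  }
  else if (c < 0x10000)
  {
    // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xe0 | (c >> 12));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3f));
    out += static_cast<char>(0x80 | (c & 0x3f));
  }
  else
  {
    // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xf0 | (c >> 18));
    out += static_cast<char>(0x80 | ((c >> 12) & 0x3f));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3f));
    out += static_cast<char>(0x80 | (c & 0x3f));
  }

  return true;
}
```

Calling `append_utf8(U'€', out)` on an empty string yields the three bytes
`"\xe2\x82\xac"`, the same input that the example above decodes.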
7 changes: 2 additions & 5 deletions include/peelo/unicode/ctype.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2018-2020, peelo.net
* Copyright (c) 2018-2024, peelo.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -24,8 +24,7 @@
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef PEELO_UNICODE_CTYPE_HPP_GUARD
#define PEELO_UNICODE_CTYPE_HPP_GUARD
#pragma once

#include <peelo/unicode/ctype/isalnum.hpp>
#include <peelo/unicode/ctype/isalpha.hpp>
@@ -42,5 +41,3 @@
#include <peelo/unicode/ctype/isxdigit.hpp>
#include <peelo/unicode/ctype/tolower.hpp>
#include <peelo/unicode/ctype/toupper.hpp>

#endif /* !PEELO_UNICODE_CTYPE_HPP_GUARD */
53 changes: 53 additions & 0 deletions include/peelo/unicode/ctype/_utils.hpp
@@ -0,0 +1,53 @@
/*
* Copyright (c) 2018-2024, peelo.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#pragma once

#include <array>
#include <utility>

namespace peelo::unicode::ctype::utils
{
using range = std::pair<char32_t, char32_t>;

template<std::size_t Size>
inline bool table_lookup(const std::array<range, Size>& table, char32_t c)
{
const auto size = table.size();

for (std::size_t i = 0; i < size; ++i)
{
const auto& range = table[i];

if (c >= range.first && c <= range.second)
{
return true;
}
}

return false;
}
}
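
The helper above scans the table linearly. The sketch below shows how such a
range table is declared with `std::array` and how the lookup could use binary
search instead. It assumes the ranges are sorted by their first element and do
not overlap, which the visible table entries suggest but the commit does not
state. The names `ascii_alnum_table` and `sorted_table_lookup` are hypothetical
and not part of the library.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <iterator>
#include <utility>

using range = std::pair<char32_t, char32_t>;

// Hypothetical three-entry table: ASCII digits, upper case and lower case.
// The outer braces initialize the std::array itself, the inner braces the
// C array it wraps, which is why the tables in this commit use `{{ ... }}`.
static const std::array<range, 3> ascii_alnum_table =
{{
  { 0x0030, 0x0039 }, { 0x0041, 0x005a }, { 0x0061, 0x007a }
}};

// Hypothetical alternative to the linear scan. Only correct when the table
// is sorted by range start and the ranges do not overlap.
template<std::size_t Size>
inline bool
sorted_table_lookup(const std::array<range, Size>& table, char32_t c)
{
  // Find the first range that starts after c.
  const auto it = std::upper_bound(
    table.begin(),
    table.end(),
    c,
    [](char32_t value, const range& r) { return value < r.first; }
  );

  // The only candidate is the range just before that one, if there is one.
  return it != table.begin() && c <= std::prev(it)->second;
}
```

With this table, `sorted_table_lookup(ascii_alnum_table, U'a')` returns `true`
and `sorted_table_lookup(ascii_alnum_table, U'@')` returns `false`. For tables
with several hundred entries, the binary search needs around nine comparisons
where the linear scan may need hundreds, at the cost of relying on the sort
order.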
28 changes: 10 additions & 18 deletions include/peelo/unicode/ctype/isalnum.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2018-2020, peelo.net
* Copyright (c) 2018-2024, peelo.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -24,18 +24,20 @@
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef PEELO_UNICODE_CTYPE_ISALNUM_HPP_GUARD
#define PEELO_UNICODE_CTYPE_ISALNUM_HPP_GUARD
#pragma once

#include <peelo/unicode/ctype/_utils.hpp>

namespace peelo::unicode::ctype
{
/**
* Determines whether the given Unicode code point is alphanumeric.
*/
inline bool isalnum(char32_t c)
inline bool
isalnum(char32_t c)
{
static const char32_t alnum_table[436][2] =
{
static const std::array<utils::range, 436> alnum_table =
{{
{ 0x0030, 0x0039 }, { 0x0041, 0x005a }, { 0x0061, 0x007a },
{ 0x00aa, 0x00aa }, { 0x00b5, 0x00b5 }, { 0x00ba, 0x00ba },
{ 0x00c0, 0x00d6 }, { 0x00d8, 0x00f6 }, { 0x00f8, 0x0241 },
@@ -182,18 +184,8 @@ namespace peelo::unicode::ctype
{ 0x1d78a, 0x1d7a8 }, { 0x1d7aa, 0x1d7c2 }, { 0x1d7c4, 0x1d7c9 },
{ 0x1d7ce, 0x1d7ff }, { 0x20000, 0x2a6d6 }, { 0x2f800, 0x2fa1d },
{ 0xe0100, 0xe01ef }
};

for (int i = 0; i < 436; ++i)
{
if (c >= alnum_table[i][0] && c <= alnum_table[i][1])
{
return true;
}
}
}};

return false;
return utils::table_lookup(alnum_table, c);
}
}

#endif /* !PEELO_UNICODE_CTYPE_ISALNUM_HPP_GUARD */
28 changes: 10 additions & 18 deletions include/peelo/unicode/ctype/isalpha.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2018-2020, peelo.net
* Copyright (c) 2018-2024, peelo.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -24,18 +24,20 @@
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef PEELO_UNICODE_CTYPE_ISALPHA_HPP_GUARD
#define PEELO_UNICODE_CTYPE_ISALPHA_HPP_GUARD
#pragma once

#include <peelo/unicode/ctype/_utils.hpp>

namespace peelo::unicode::ctype
{
/**
* Determines whether the given Unicode code point is alphabetic.
*/
inline bool isalpha(char32_t c)
inline bool
isalpha(char32_t c)
{
static const char32_t alpha_table[418][2] =
{
static const std::array<utils::range, 418> alpha_table =
{{
{ 0x0041, 0x005a }, { 0x0061, 0x007a }, { 0x00aa, 0x00aa },
{ 0x00b5, 0x00b5 }, { 0x00ba, 0x00ba }, { 0x00c0, 0x00d6 },
{ 0x00d8, 0x00f6 }, { 0x00f8, 0x0241 }, { 0x0250, 0x02c1 },
@@ -176,18 +178,8 @@ namespace peelo::unicode::ctype
{ 0x1d770, 0x1d788 }, { 0x1d78a, 0x1d7a8 }, { 0x1d7aa, 0x1d7c2 },
{ 0x1d7c4, 0x1d7c9 }, { 0x20000, 0x2a6d6 }, { 0x2f800, 0x2fa1d },
{ 0xe0100, 0xe01ef }
};

for (int i = 0; i < 418; ++i)
{
if (c >= alpha_table[i][0] && c <= alpha_table[i][1])
{
return true;
}
}
}};

return false;
return utils::table_lookup(alpha_table, c);
}
}

#endif /* !PEELO_UNICODE_CTYPE_ISALPHA_HPP_GUARD */