Commit 0178577

Fixed up documentation server, added documentation README, fixed up documentation in regex modules and util
1 parent 4268a66 commit 0178577

6 files changed, +81 -39 lines changed


Makefile

+1 -1
@@ -23,5 +23,5 @@ test:
 
 .PHONY: doc_server
 doc_server:
-	swipl -f src/documentation_server.pl
+	cd src; swipl -f documentation_server.pl
 

README.md

+8 -6
@@ -9,6 +9,7 @@ that matches that regular expression, as output.
 
 * [Introduction](#Introduction)
 * [How To Build](#How-To-Build)
+* [How To Use Regexc](#How-To-Use-Regexc)
 * [Regular Expression Syntax](#Regular-Expression-Syntax)
 * [For Developers](#For-Developers)
 * [Future Work](#Future-Work)
@@ -55,19 +56,20 @@ dot -Tsvg -o /tmp/nfa.svg nfa.dot &&
 open /tmp/ast.svg /tmp/nfa.svg
 ```
 
-Regexc should provide a useful error message if it fails to parse a regular expression. A generic
-error is provided, but I am working towards eliminating the case where it is seen.
+Regexc should provide a useful error message if it fails to parse a regular expression.
 
 ```
-# An example of a useful error
 $ regexc -r "what is (this|that"
 ERROR: No closing parenthesis at 8
 what is (this|that
         ^
 ERROR: No strings were parsed successfully
 Exiting due to above Errors...
+```
+
+A generic error is provided, but I am working towards eliminating the case where it is seen.
 
-# The generic (bad) error
+```
 $ regexc -r "\d{2-3}"
 ERROR: Could not parse string at 0
 \d{2-3}
@@ -111,7 +113,7 @@ Operator characters must be backslash escaped to be taken literally.
 Control Symbols = ['\\', '(', ')', '[', ']', '-']
 Operator Symbols = ['+', '*', '?', '{', '}', '|', '.']
 ```
-"\(" Would then match the '(' instead of being interpreted as starting a group.
+For example, "\\\(" Would then match the '(' instead of being interpreted as starting a group.
 
 There are also classes that can match one of a set of characters. These are specified with `[ Members ]`
 notations.
@@ -126,7 +128,7 @@ Member -> Single_Char # A single char to include in the clas
        -> Single_Char - Single_Char # A range of characters to include in the class
 ```
 
-"[a-z\_]" would match any lowercase letter or '\_'.
+For example, "[a-z\_]" would match any lowercase letter or '\_'.
 
 We also provide some shortcuts to commonly used classes. These class shortcuts can be used freely outside
 of class defintions, but can only be used in place of a range when in a class defintions. That is, one

src/regex_ast.pl

+12 -1
@@ -503,6 +503,12 @@
     }.
 gram_single(Ast, Errors) --> gram_symbol(Ast, Errors).
 
+%! ast_to_dot(+Ast, +Stream) is det.
+%
+% Write the dot representation of the Ast to the specified stream.
+%
+% @arg Ast The Ast to write out
+% @arg Stream The stream to write the dot represenation too
 ast_to_dot(Ast, Stream) :-
     format(Stream, "digraph AST {~n", []),
     ast_to_dot_r(Stream, Ast, 0, _),
@@ -559,9 +565,14 @@
     format(Stream, "\t~d -> ~d;~n", [Current_Index, Sub_Ast_L_Index]),
     format(Stream, "\t~d -> ~d;~n", [Current_Index, Sub_Ast_R_Index]).
 
+
+%! combined_asts(+Asts, -Combined_Ast) is det.
+%! combined_asts(-Asts, +Combined_Ast) is det.
 %
-% Takes a list of Asts, and combines them with the or operator
+% Combined_Ast is Asts combined with logical OR.
 %
+% @arg Asts The list of Asts to combine with logical OR
+% @arg Combined_Ast The result of combining Asts with logical OR
 combined_asts([First_Ast | Rest_Of_Asts], Combined_Ast) :-
     foldl(combined_asts_fold, Rest_Of_Asts, First_Ast, Combined_Ast).
 
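
Note on `combined_asts/2`: the helper `combined_asts_fold/3` is not part of this diff, so the following is only a minimal sketch of how the fold might work in the `(+Asts, -Combined_Ast)` direction. The names `combined_asts_sketch/2` and `or_fold_sketch/3` are hypothetical, and the left-nested `ast_or/2` shape is an assumption based on the `ast_or(ast_range(97, 97), ast_range(98, 98))` term that appears in the updated test.

```prolog
:- use_module(library(apply)).   % foldl/4

% Sketch only: combined_asts_sketch/2 and or_fold_sketch/3 are hypothetical
% names; the real combined_asts_fold/3 is not shown in this diff.

%! combined_asts_sketch(+Asts:list, -Combined) is semidet.
%
%  Fold a non-empty list of ASTs into nested ast_or/2 terms.
%  Fails for the empty list, since there is nothing to combine.
combined_asts_sketch([First | Rest], Combined) :-
    foldl(or_fold_sketch, Rest, First, Combined).

% Assumed fold step: OR the accumulator (left) with the next AST (right).
or_fold_sketch(Ast, Acc, ast_or(Acc, Ast)).
```

Under these assumptions, `combined_asts_sketch([a, b, c], Ast)` binds `Ast` to `ast_or(ast_or(a, b), c)`, and a single-element list is returned unchanged.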

src/regex_parsing.pl

+44 -25
@@ -1,20 +1,24 @@
 :- module(regex_parsing,
     [
+        parse_regex_strings/4,
+        format_errors/3,
+        format_error/3,
         parse_regex_strings/4
     ]
 ).
 
 :- use_module(regex_ast).
 
-%! print_errors(+Input:string, +Errors:list) is det.
+%! format_errors(+Output_Stream:stream, +Input:string, +Errors:list) is det.
 %
 % This predicate prints out the error list in a nicely formated way
 %
+% @arg Output_Stream Where to write the formatted errors
 % @arg Input The original string being parsed.
 % @arg Errors The list of errors to print
-print_errors(_, _, []).
-print_errors(Output_Stream, Input, Errors) :-
-    maplist(print_error(Output_Stream, Input), Errors).
+format_errors(_, _, []).
+format_errors(Output_Stream, Input, Errors) :-
+    maplist(format_error(Output_Stream, Input), Errors).
 
 write_single_arrow(Output_Stream, 0) :-
     format(Output_Stream, '^~n', []), !.
@@ -23,54 +27,69 @@
     M is N - 1,
     write_single_arrow(Output_Stream, M).
 
-%! print_errors(+Input:string, +Error:list) is det.
+%! format_error(+Ouput_Stream:stream, +Input:string, +Error:list) is det.
 %
 % This predicate prints out the error in a nicely formated way
 %
+% @arg Output_Stream Where to write the formatted error
 % @arg Input The original string being parsed.
 % @arg Error The list of errors to print
 % TODO: We should probably propagate information about where regex came from
-print_error(Output_Stream, Input, error(Message, some(Pos))) :-
+format_error(Output_Stream, Input, error(Message, some(Pos))) :-
     format(Output_Stream, 'ERROR: ~s at ~d~n', [Message, Pos]),
     format(Output_Stream, '~w~n', [Input]),
     write_single_arrow(Output_Stream, Pos).
 
-print_error(Output_Stream, _Input, error(Message)) :-
+format_error(Output_Stream, _Input, error(Message)) :-
     format(Output_Stream, 'ERROR: ~s~n', [Message]).
 
-% TODO: I think it should be format_error instead of print_error
-
 %! process_regex_string
 %
 % This is used by parse_regex_strings to both transform the string into an AST,
 % and to handle formatting the errors.
 %
-process_regex_string(_Output_Stream, Regex_String, (Asts, Error_Flag), ([Ast | Asts], Error_Flag)) :-
-    regex_ast:string_ast(Regex_String, Ast, []), !.
-
-% TODO: we should collapse these so that we only call string_ast once
-process_regex_string(Output_Stream, Regex_String, (Asts, _), (Asts, true)) :-
-    regex_ast:string_ast(Regex_String, _, Errors),
-    print_errors(Output_Stream, Regex_String, Errors).
+process_regex_string(Output_Stream, Regex_String, (Asts, Error_Flag), ([Ast | Asts], New_Error_Flag)) :-
+    regex_ast:string_ast(Regex_String, Ast, Errors), !,
+    handle_parse_errors(Output_Stream, Regex_String, Errors, Error_Flag, New_Error_Flag).
 
-process_regex_string(Output_Stream, Regex_String, (Asts, _), (Asts, true)) :-
+% If regex_ast:string_ast fails, we should catch that here.
+% Note that we don't get an AST here
+process_regex_string(Output_Stream, Regex_String, (Asts, Error_Flag), (Asts, New_Error_Flag)) :-
     Errors = [error("Could not parse string", some(0))],
-    print_errors(Output_Stream, Regex_String, Errors).
+    handle_parse_errors(Output_Stream, Regex_String, Errors, Error_Flag, New_Error_Flag).
+
+%
+% Handling Errors means formatting them and keeping track of
+% whether we've seen any with a flag
+%
+handle_parse_errors(_Output_Stream, _Regex_String, [], Error_Flag, Error_Flag).
 
+handle_parse_errors(Output_Stream, Regex_String, Errors, _Error_Flag, true) :-
+    format_errors(Output_Stream, Regex_String, Errors).
 
+%
+% Once we have a list of ASTS,
+% we need at least one
+% We need to cominbe them
+%
 handle_asts(Output_Stream, [], _, _, true) :-
     writeln(Output_Stream, "ERROR: No strings were parsed successfully").
 
 handle_asts(_, Asts, Ast, Error_Flag, Error_Flag) :-
     regex_ast:combined_asts(Asts, Ast).
 
 
-%! parse_regex_strings
+%! parse_regex_strings(+Output_Stream:stream, +Regex_Strings:list, -Ast, -Error_Found_Flag) is det.
 %
-% This is the highest level handle for parsing strings, it takes in a list of strings
+% This is the highest level handle for parsing strings.
+% It takes in a list of strings,
 % transforms them all into one Ast (by OR'ing them together),
 % and formats the errors into an output_stream.
 %
+% @arg Ouput_Stream Where to write any formatted errors
+% @arg Regex_Strings The strings to parse as regular expressions
+% @arg AST The resulting AST
+% @arg Error_Found_Flag Will be true if any errors were found
 parse_regex_strings(
     Output_Stream,
     Regex_Strings,
@@ -113,13 +132,13 @@
         test_write_single_arrow(Num, Correct_Arrow)
     ).
 
-test_print_error(Error, Correct_Output) :-
+test_format_error(Error, Correct_Output) :-
     with_output_to(string(Arrow),
-        print_error(current_output, "aaaa", Error)
+        format_error(current_output, "aaaa", Error)
     ),
     assertion(Arrow = Correct_Output).
 
-test(print_error) :-
+test(format_error) :-
     Arrows = [
         (
             error("Wut", some(0)),
@@ -135,7 +154,7 @@
         )
     ],
     forall(member((Error, Correct_Output), Arrows),
-        test_print_error(Error, Correct_Output)
+        test_format_error(Error, Correct_Output)
     ).
 
 test_parse_regex_strings(Strings, Correct_Output, Correct_Ast, Correct_Error_Flag) :-
@@ -169,7 +188,7 @@
         (
             ["(a", "b"],
             "ERROR: No closing parenthesis at 0\n(a\n^\n",
-            ast_range(98, 98),
+            ast_or(ast_range(97, 97), ast_range(98, 98)),
             true
         ),
         (
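
Note on `handle_parse_errors/5`: as written in the diff above, the second clause also unifies with an empty error list, so on backtracking it could yield a second solution with the flag set to `true`. The following is a minimal sketch of one way to keep the two cases disjoint, by matching `[E | Es]` in the second clause. The names `handle_parse_errors_sketch/5` and `report_sketch/3` are hypothetical, and the reporter is a simplified stand-in for `format_error/3` that prints only the message.

```prolog
:- use_module(library(apply)).   % maplist/2

% Sketch only: names ending in _sketch are hypothetical, not part of the commit.

% No errors: the incoming flag passes through unchanged.
handle_parse_errors_sketch(_Stream, _Input, [], Flag, Flag).
% At least one error: report each one and force the flag to true.
% Matching [E | Es] keeps this clause from also applying to the empty list.
handle_parse_errors_sketch(Stream, Input, [E | Es], _Flag, true) :-
    maplist(report_sketch(Stream, Input), [E | Es]).

% Simplified stand-in for format_error/3: message only, no caret line.
report_sketch(Stream, _Input, error(Message, _Pos)) :-
    format(Stream, "ERROR: ~s~n", [Message]).
report_sketch(Stream, _Input, error(Message)) :-
    format(Stream, "ERROR: ~s~n", [Message]).
```

With this shape, an empty error list has exactly one solution (the flag is unchanged), while any non-empty list sets the flag to `true` after reporting.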

src/statemachine.pl

+2 -3
@@ -17,11 +17,10 @@
     F: A set of accepting states
 
 We assume that all finite Automotan here share the same set in input symbols, bytes.
-For the purposes of specifying input in transitions we have three options.
+For the purposes of specifying input in transitions we have two options.
 
-    byte(Byte),
     range(Min, Max),
-    any.
+    wildcard.
 
 Also note that a finite automaton is non-determinisitic unless E = [].
 
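
Note on the two transition inputs: the module comment names `range(Min, Max)` and `wildcard` but the diff does not show how they are matched. A minimal sketch of what such a check might look like follows. The predicate `input_matches_sketch/2` is hypothetical, and treating `range/2` as inclusive on both ends and `wildcard` as "any byte 0..255" are assumptions, not something the source confirms.

```prolog
% Sketch only: input_matches_sketch/2 is a hypothetical helper,
% not a predicate from statemachine.pl.

%! input_matches_sketch(+Input, +Byte:integer) is semidet.
%
%  True if Byte is accepted by the transition input.
input_matches_sketch(range(Min, Max), Byte) :-
    integer(Byte),
    Min =< Byte,
    Byte =< Max.            % assumed inclusive on both ends
input_matches_sketch(wildcard, Byte) :-
    integer(Byte),
    between(0, 255, Byte).  % assumed: any single byte
```

Under these assumptions the removed `byte(Byte)` form can be expressed as `range(Byte, Byte)`, which is consistent with terms such as `ast_range(97, 97)` in the parser tests.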

src/util.pl

+14 -3
@@ -1,6 +1,8 @@
 :- module(util,
     [
-        enumeration/2
+        enumeration/2,
+        write_to_file/2,
+        file_diff/3
     ]).
 
 /** <module> util
@@ -11,20 +13,29 @@
 @license MIT
 */
 
-%! enumeration(+List:list, +Enumerated_List:kist) is semidet.
+%! enumeration(+List:list, +Enumerated_List:list) is semidet.
 %
 % This relates a list to a list of tuples with the element and their index.
 enumeration([], []).
 enumeration(Ls, Es) :- enumeration_r(Ls, Es, 0).
 enumeration_r([], [], _).
 enumeration_r([L|Ls], [(L, C)|Es], C) :- N is C + 1, enumeration_r(Ls, Es, N).
 
+%! write_to_file(:Goal, +Path) is det.
+%
+% This predicate will open the file at Path for writing and call Goal with that Output Stream.
+% The Goal should normally be called like `goal(..., Output_Stream)`.
 write_to_file(Goal, Path) :-
     absolute_file_name(Path, Absolute_Path),
     open(Absolute_Path, write, File_Output),
-    call(Goal, File_Output),
+    call(Goal, File_Output), !,
     close(File_Output).
 
+%! file_diff(+Path_1, +Path_2, -Diff) is det.
+%
+% This predicate just shells out to git diff.
+% I use it for testing.
+% This predicate asserts that both Paths must exist.
 file_diff(Path_1, Path_2, Diff) :-
     absolute_file_name(Path_1, Absolute_Path_1),
     absolute_file_name(Path_2, Absolute_Path_2),
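
Note on `write_to_file/2`: in the version shown above, the stream is closed only when `Goal` succeeds, so a failing or throwing `Goal` would leave the file open. A minimal sketch of an alternative using SWI-Prolog's `setup_call_cleanup/3` follows. The names `write_to_file_sketch/2` and `write_greeting_sketch/1` are hypothetical and are not part of the commit.

```prolog
% Sketch only: write_to_file_sketch/2 and write_greeting_sketch/1 are
% hypothetical names, not part of the commit.

:- meta_predicate write_to_file_sketch(1, +).

%! write_to_file_sketch(:Goal, +Path) is semidet.
%
%  Open Path for writing, call Goal with the stream appended as its
%  last argument, and close the stream however Goal terminates.
write_to_file_sketch(Goal, Path) :-
    absolute_file_name(Path, Absolute_Path),
    setup_call_cleanup(
        open(Absolute_Path, write, Stream),
        once(call(Goal, Stream)),   % commit to the first solution, like the cut
        close(Stream)               % runs on success, failure, or exception
    ).

% Example goal: takes the output stream as its last argument.
write_greeting_sketch(Stream) :-
    format(Stream, "hello~n", []).
```

A call such as `write_to_file_sketch(write_greeting_sketch, 'out.txt')` would write one line to `out.txt`. Unlike the committed version, this sketch fails (after closing the stream) when `Goal` fails, which may or may not be the behaviour the author wants.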
