Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved start.jar dry-run command line quoting #10090

Closed
wants to merge 12 commits into from
Original file line number Diff line number Diff line change
Expand Up @@ -38,31 +38,82 @@ public static String findJavaBin()
return "java";
}

/**
* Perform an optional quoting of the argument, being intelligent with spaces and quotes as needed. If a subString is set in quotes it won't the subString
* won't be escaped.
*
* @param arg the argument to quote
* @return the quoted and escaped argument
* @deprecated no replacement, quoting is done by {@link #toQuotedString()} now.
*/
@Deprecated
public static String quote(String arg)
private final StringBuilder commandLine = new StringBuilder();
private final List<String> args = new ArrayList<>();
private final String separator;

public CommandLineBuilder()
{
return "'" + arg + "'";
this(false);
}

private List<String> args;

public CommandLineBuilder()
public CommandLineBuilder(boolean multiline)
{
args = new ArrayList<String>();
separator = multiline ? (" \\" + System.lineSeparator() + " ") : " ";
}

public CommandLineBuilder(String bin)
/**
* Quote a string suitable for use with a command line shell using double quotes.
* <p>This method applies doubles quoting as described for the unix {@code sh} commands:
* Enclosing characters within double quotes preserves the literal meaning of all characters except
* dollarsign ($), backquote (`), and backslash (\).
* The backslash inside double quotes is historically weird, and serves
* to quote only the following characters: {@code $ ` " \ newline}.
* Otherwise it remains literal.
* </p><p>Additionally, a string is deemed to need quoting if
* it contains a single quote or a {@code bash} meta character: A character that,
* when unquoted, separates words. One of the following: {@code | & ; ( ) < > space tab newline}
* </p>
*
* @param input The string to quote if needed
* @return The quoted string or the original string if quotes are not necessary
*/
public static String shellQuoteIfNeeded(String input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of characters for bash (haven't looked at other shells yet) that have special meaning ...

Character Meaning
~ home directory
` Single quote for command substitution
# comment
$ variable
& background
* string wildcard
( start substitution
) end substitution
\ quote next char
| pipe to other command
[ start character-set wildcard
] end character-set wildcard
{ start command block
} end command block
; shell command separator
' strong quote
" weak quote
< input redirection
> output redirection
/ pathname separator
? single character wildcard
! pipeline logical not operator

Each of those should trigger quoting and in some cases escaping

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joakime I think we don't need every special character, we just need metacharacters that are used to split a command line and characters involved with quoting itself.....
Or do we need to also protect wild cards? Hmmm probably yes....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be simpler to have a set of characters that we don't need to quote for: a-z A-Z 0-9 _-/

Copy link
Contributor

@joakime joakime Jul 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a longer breakdown of special characters for the shells: csh, sh, bash, ksh, zsh

Character(s) Shells Meaning
~ sh, csh, bash, ksh, zsh home directory
` bash Single quote for command substitution
# csh, sh, bash comment
# ksh, bash variable replacement
^ sh only pipe character
^ csh, bash previous command
$ csh, sh, bash, ksh, zsh variable
& csh, sh, bash, ksh, zsh background job. or combine stdin and stdout
* csh, sh, bash string wildcard (match zero or more characters)
( ) bash, ksh command substitution
\ csh, sh, bash, ksh, zsh quote next char
| csh, sh, bash, ksh pipe to other command
| zsh job control
[ ] csh, sh, bash, ksh character-set wildcard (match range of characters)
{ } csh, bash in-line expansion
{ } csh, bash, ksh, zsh command block
: sh evaluate args. or separate values in paths.
: csh variable modifier
; csh, sh, bash, ksh, zsh shell command separator
; sh end of case element
' bash strong quote
" csh, sh, bash weak quote
< csh, sh, bash, ksh, zsh input redirection
> csh, sh, bash, ksh, zsh output redirection
/ csh, sh, bash, ksh, zsh pathname separator
- sh, bash, ksh, zsh stdin or stdout reference
? csh, sh, bash single character wildcard
? sh, bash optional variable
! csh, bash pipeline logical not operator
! ksh, zsh exit status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be simpler to have a set of characters that we don't need to quote for: a-z A-Z 0-9 _-/

I wouldn't include - (dash) or / (slash) in that list.

The - (dash) is used as stdin or stdout reference in many shells.
And the / (slash) is used as a hard path separator in some shells (forces the shell to intepret the text around the / in path terms, applying path behaviors to that string), and a soft separator in others (if present, the matching logic applies to the strings adjacent too)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Windows, the rules are a bit different ...

You can only use the \ back-slash to escape non-control characters when executing in CMD (such as a space character).

In Powershell, the escaping is done with ^ character.

We will likely need separate quoting and escaping methods for Unix/Linux/OSX and Windows. (Maybe even a special method for those users on Powershell)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can only really be output for unix shells, and then only the common ones. It is a tool for human debugging and for doing start scripts.
If we keep the output constant, then others can transform to any special usages

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the usage.txt file to make it clear that the output is only suitable for the unix sh command. I do not think we can come up with an output format that will work on every shell on every platform. I also think that any attempt to cover multiple platforms is just going to result in hard to handle variable output. Best to just give fixed output and allows any users on other platforms either to use a sh port or do a transformation.

{
this();
args.add(bin);
if (input == null || input.length() == 0)
return input;

int i = 0;
boolean needsQuoting = false;
while (!needsQuoting && i < input.length())
{
char c = input.charAt(i++);
needsQuoting = switch (c)
{
// sh quotes
case '$', '`', '"', '\\',

// sh single quotes
'\'',

// bash metacharacter: A character that, when unquoted, separates words.
// One of the following: | & ; ( ) < > space tab newline
'|', '&', ';', '(', ')', '<', '>', ' ', '\t', '\n'
-> true;
default -> false;
};
}

if (!needsQuoting)
return input;

StringBuilder builder = new StringBuilder(input.length() * 2);
builder.append('"');
builder.append(input, 0, --i);

while (i < input.length())
{
char c = input.charAt(i++);
switch (c)
{
case '"', '\\', '`', '$' -> builder.append('\\').appendCodePoint(c);
default -> builder.appendCodePoint(c);
}
}

builder.append('"');

return builder.toString();
}

/**
Expand All @@ -77,29 +128,25 @@ public void addArg(String arg)
if (arg != null)
{
args.add(arg);
if (commandLine.length() > 0)
commandLine.append(separator);
commandLine.append(shellQuoteIfNeeded(arg));
}
}

/**
* Similar to {@link #addArg(String)} but concats both name + value with an "=" sign, quoting were needed, and excluding the "=" portion if the value is
* undefined or empty.
*
* <pre>
* addEqualsArg("-Dname", "value") = "-Dname=value"
* addEqualsArg("-Djetty.home", "/opt/company inc/jetty (7)/") = "-Djetty.home=/opt/company\ inc/jetty\ (7)/"
* addEqualsArg("-Djenkins.workspace", "/opt/workspaces/jetty jdk7/") = "-Djenkins.workspace=/opt/workspaces/jetty\ jdk7/"
* addEqualsArg("-Dstress", null) = "-Dstress"
* addEqualsArg("-Dstress", "") = "-Dstress"
* </pre>
*
* @param name the name
* @param value the value
*/
public void addEqualsArg(String name, String value)
public void addArg(String name, String value)
{
if (commandLine.length() > 0)
commandLine.append(separator);
commandLine.append(shellQuoteIfNeeded(name));
if ((value != null) && (value.length() > 0))
{
args.add(name + "=" + value);
commandLine.append('=').append(shellQuoteIfNeeded(value));
}
else
{
Expand All @@ -108,17 +155,31 @@ public void addEqualsArg(String name, String value)
}

/**
* Add a simple argument to the command line.
* <p>
* Will <b>NOT</b> quote/escape arguments that have a space in them.
*
* @param arg the simple argument to add
* @param option the option
* @param name the name
* @param value the value
*/
public void addRawArg(String arg)
public void addArg(String option, String name, String value)
{
if (arg != null)
if (commandLine.length() > 0)
commandLine.append(separator);
commandLine.append(option);
if (name == null || name.length() == 0)
{
args.add(arg);
args.add(option);
}
else
{
commandLine.append(shellQuoteIfNeeded(name));
if ((value != null) && (value.length() > 0))
{
args.add(option + name + "=" + value);
commandLine.append(shellQuoteIfNeeded(name)).append('=').append(shellQuoteIfNeeded(value));
}
else
{
args.add(option + name);
}
}
}

Expand All @@ -129,20 +190,12 @@ public List<String> getArgs()

@Override
public String toString()
{
return toString(" ");
}

public String toString(String delim)
{
StringBuilder buf = new StringBuilder();

for (String arg : args)
{
if (buf.length() > 0)
{
buf.append(delim);
}
buf.append(' ');
buf.append(arg); // we assume escaping has occurred during addArg
}

Expand All @@ -154,23 +207,9 @@ public String toString(String delim)
*
* @return the toString but each arg that has spaces is surrounded by {@code '} (single-quote tick)
*/
public String toQuotedString()
public String toCommandLine()
{
StringBuilder buf = new StringBuilder();

for (String arg : args)
{
if (buf.length() > 0)
buf.append(' ');
boolean needsQuotes = (arg.contains(" "));
if (needsQuotes)
buf.append("'");
buf.append(arg);
if (needsQuotes)
buf.append("'");
}

return buf.toString();
return commandLine.toString();
}

public void debug()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -436,7 +436,7 @@ public void start(StartArgs args) throws IOException, InterruptedException
{
CommandLineBuilder cmd = args.getMainArgs(args.getDryRunParts());
cmd.debug();
System.out.println(cmd.toQuotedString());
System.out.println(cmd.toCommandLine());
}

if (args.isStopCommand())
Expand Down
Loading