Discussion:
realpath quoting
jeremy ardley
2024-05-02 23:10:01 UTC
I have a need to get the full path of a file that has spaces in its
name to use as a program argument

e.g.

***@client:~$ ls -l name\ with\ spaces
-rw-r--r-- 1 jeremy jeremy 0 May  3 06:51 'name with spaces'
***@client:~$ realpath name\ with\ spaces
/home/jeremy/name with spaces


The spaces without quotes cause problems with subsequent processing.

Can realpath or other utility return a quoted pathname?
Greg Wooledge
2024-05-02 23:40:01 UTC
Post by jeremy ardley
I have a need to get the full path of a file that has spaces in its name to
use as a program argument
e.g.
-rw-r--r-- 1 jeremy jeremy 0 May  3 06:51 'name with spaces'
/home/jeremy/name with spaces
Looks good to me.
Post by jeremy ardley
The spaces without quotes cause problems with subsequent processing.
Then the subsequent processing has bugs in it. Fix them.
Post by jeremy ardley
Can realpath or other utility return a quoted pathname?
That would be extremely counterproductive. Do not look for kludges to
work around your script's bugs. Fix the bugs instead.

Start with <https://mywiki.wooledge.org/Quotes>.
jeremy ardley
2024-05-02 23:50:01 UTC
Post by Greg Wooledge
Post by jeremy ardley
The spaces without quotes cause problems with subsequent processing.
Then the subsequent processing has bugs in it. Fix them.
Post by jeremy ardley
Can realpath or other utility return a quoted pathname?
That would be extremely counterproductive. Do not look for kludges to
work around your script's bugs. Fix the bugs instead.
Start with <https://mywiki.wooledge.org/Quotes>.
You don't see a problem that ls produces quoted filenames and realpath
doesn't?
Greg Wooledge
2024-05-02 23:50:01 UTC
Post by jeremy ardley
Post by Greg Wooledge
Post by jeremy ardley
The spaces without quotes cause problems with subsequent processing.
Then the subsequent processing has bugs in it. Fix them.
Post by jeremy ardley
Can realpath or other utility return a quoted pathname?
That would be extremely counterproductive. Do not look for kludges to
work around your script's bugs. Fix the bugs instead.
Start with <https://mywiki.wooledge.org/Quotes>.
You don't see a problem that ls produces quoted filenames and realpath
doesn't?
Oh Jesus, don't get me started on ls.

We have a whole page on that. <https://mywiki.wooledge.org/ParsingLs>
David Christensen
2024-05-03 02:20:01 UTC
Post by jeremy ardley
I have a need to get the full path of a file that has spaces in its
name to use as a program argument
e.g.
-rw-r--r-- 1 jeremy jeremy 0 May  3 06:51 'name with spaces'
/home/jeremy/name with spaces
The spaces without quotes cause problems with subsequent processing.
Can realpath or other utility return a quoted pathname?
Perhaps Perl and the module String::ShellQuote ?

2024-05-02 18:50:28 ***@laalaa ~
$ touch "name with spaces"

2024-05-02 18:50:45 ***@laalaa ~
$ touch "name with\nnewline"

2024-05-02 19:06:01 ***@laalaa ~
$ perl -MString::ShellQuote -e 'print shell_quote(@ARGV), "\n"' name*
'name with spaces' 'name with\nnewline'


David
Greg Wooledge
2024-05-03 02:20:01 UTC
Post by David Christensen
Perhaps Perl and the module String::ShellQuote ?
$ touch "name with spaces"
$ touch "name with\nnewline"
You didn't create a name with a newline in it here. You created a name
with a backslash in it. If you wanted a newline, you would have to use
the $'...' quoting form (in bash).

touch $'name with\nnewline'
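The difference is easy to check. A minimal bash sketch (the variable names are only for illustration) that compares the two quoting forms by string length:

```bash
#!/bin/bash
# Double quotes keep the backslash and the letter n as two literal characters.
literal="name with\nnewline"
# $'...' turns \n into one real newline character (bash syntax).
ansi=$'name with\nnewline'

printf 'double-quoted length: %d\n' "${#literal}"   # prints 18
printf 'ANSI-C quoted length: %d\n' "${#ansi}"      # prints 17
```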
Post by David Christensen
'name with spaces' 'name with\nnewline'
I still insist that this is a workaround that should *not* be used
to try to cancel out quoting bugs in one's shell scripts. Just write
the shell scripts correctly in the first place.
Max Nikulin
2024-05-03 03:00:01 UTC
Post by Greg Wooledge
I still insist that this is a workaround that should *not* be used
to try to cancel out quoting bugs in one's shell scripts.
There are still specific cases when quoting is necessary, e.g. ssh
remote command (however you have to be sure concerning shell on the
remote host).

In BASH printf has %q format. GNU coreutils supports it as well, but
dash does not, so be careful.
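As a rough illustration of %q, here is a minimal sketch that assumes the script runs under bash (not dash). The path is a made-up example, and the eval round trip is only there to show that the quoted text parses back into a single argument:

```bash
#!/bin/bash
# Hypothetical file name, chosen only for illustration.
path='/home/jeremy/name with spaces'

# %q emits the string in a form that bash can read back as a single word.
quoted=$(printf '%q' "$path")
printf '%s\n' "$quoted"          # prints /home/jeremy/name\ with\ spaces

# Round trip: eval re-parses the quoted text; the result is one argument.
eval "set -- $quoted"
printf 'argument count: %s\n' "$#"
printf 'argument 1: |%s|\n' "$1"
```

The quoted form is only meaningful to a shell that understands bash-style escaping, which is why the shell on the receiving side matters.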

Likely Jeremy's case does not really require this kind of quoting.

While "ls -l" output is for humans, realpath is often used in scripts.
Certainly it should not return quoted output by default. I am unsure
whether a dedicated option should be added to realpath.
jeremy ardley
2024-05-03 04:40:01 UTC
Post by Max Nikulin
I still insist that this is a workaround that should *not* be used
to try to cancel out quoting bugs in one's shell scripts.
There are still specific cases when quoting is necessary, e.g. ssh
remote command (however you have to be sure concerning shell on the
remote host).
In BASH printf has %q format. GNU coreutils supports it as well, but
dash does not, so be careful.
Likely Jeremy's case does not really require this kind of quoting.
While "ls -l" output is for humans, realpath is often used in scripts.
Certainly it should not return quoted output by default. I am unsure
whether a dedicated option should be added to realpath.
My use case is very simple. Give an argument to a program that expects a
single filename/path.

If you give it an unquoted and unescaped filename it will break parsing
the args thinking there are many.

When invoking from bash with auto completion the filename will get
escaped as required. When cutting and pasting into a debugger prompt for
args, not so.

The easy workaround for me is a few lines of python that emits a quoted
filepath.
Greg Wooledge
2024-05-03 11:10:01 UTC
Post by jeremy ardley
My use case is very simple. Give an argument to a program that expects a
single filename/path.
Then you need to use "$1" with quotes when you reference it. Simple!
Post by jeremy ardley
If you give it an unquoted and unescaped filename it will break parsing the
args thinking there are many.
Ahhh! You're not even in the script yet. You're having trouble *passing
the filename as an argument* from your interactive shell.

The best way to do this is to use tab completion. The shell should
automatically quote the filename for you, using backslashes.
Post by jeremy ardley
When invoking from bash with auto completion the filename will get escaped
as required. When cutting and pasting into a debugger prompt for args, not
so.
Ahhh! The question changed a second time!

I have no idea what you think the shell, or the debugger, should do
about this case.

I would suggest that if you need to use a debugger to track down a bug
in your program, you should use filenames that don't require quoting
when you set up your tests.
jeremy ardley
2024-05-03 11:40:01 UTC
Post by Greg Wooledge
I would suggest that if you need to use a debugger to track down a bug
in your program, you should use filenames that don't require quoting
when you set up your tests.
1970's style static test cases are not relevant here.

In the real world...  I download files generated by another system that
are constantly changing content and with names I don't control.

My workflow is to download a new file from a remote source and then run
my processor over it.

As a necessary consequence I need the fully quoted or escaped file name
of the new file to feed to the processor/debugger.

I can obviously add an extra step to the process to convert the new file
name to something acceptable before processing. However, my question was
how to avoid that extra step by getting fully quoted filenames to process.
Sirius
2024-05-03 11:50:01 UTC
Post by jeremy ardley
Post by Greg Wooledge
I would suggest that if you need to use a debugger to track down a bug
in your program, you should use filenames that don't require quoting
when you set up your tests.
1970's style static test cases are not relevant here.
In the real world...  I download files generated by another system that
are constantly changing content and with names I don't control.
My workflow is to download a new file from a remote source and then run my
processor over it.
As a necessary consequence I need the fully quoted or escaped file name of
the new file to feed to the processor/debugger.
I can obviously add an extra step to the process to convert the new file
name to something acceptable before processing. However, my question was
how to avoid that extra step by getting fully quoted filenames to process.
Encase the file name in single or double quotes. If it contains any kind
of construct that the shell could expand, use single quotes.

Consider this example:

$ file=abc
$ echo "$file"
abc
$ echo '$file'
$file

If you copy and paste anywhere, slap single quotes around it by habit and
you will not get taken by surprise by spaces or by anything the shell
decides looks like something it can evaluate or expand.
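One case the habit does not cover is a name that itself contains a single quote. A minimal POSIX sh sketch of the usual workaround follows; quote_sh is a hypothetical helper name, and the sample name is invented. Note that the command substitution inside it drops trailing newlines, so it is not suitable for names that end in one.

```sh
#!/bin/sh
# quote_sh (hypothetical helper): wrap the argument in single quotes and
# rewrite each embedded single quote as '\'' (close, escaped quote, reopen).
quote_sh() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Invented sample name with a quote, a dollar sign and spaces.
name="it's a \$file with spaces"
quoted=$(quote_sh "$name")
printf '%s\n' "$quoted"

# Round trip: the quoted text parses back into exactly one argument.
eval "set -- $quoted"
printf 'argument count: %s\n' "$#"
```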
--
Kind regards,

/S
David Christensen
2024-05-03 19:20:01 UTC
Post by jeremy ardley
Post by Greg Wooledge
I would suggest that if you need to use a debugger to track down a bug
in your program, you should use filenames that don't require quoting
when you set up your tests.
1970's style static test cases are not relevant here.
In the real world...  I download files generated by another system that
are constantly changing content and with names I don't control.
My workflow is to download a new file from a remote source and then run
my processor over it.
As a necessary consequence I need the fully quoted or escaped file name
of the new file to feed to the processor/debugger.
I can obviously add an extra step to the process to convert the new file
name to something acceptable before processing. However, my question was
how to avoid that extra step by getting fully quoted filenames to process.
So, you are copying and pasting file names via some clipboard? emacs(1)
might have a way to put a filter into that process, but I am unaware of
a similar feature using Xfce and Terminal (my platform).


I have tried renaming files in similar situations, but you will want to
rename them everywhere if you use rsync(1).


What if you downloaded files to a directory with a well-formed name and
added a feature to your script to process files that appear in that
directory?
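A minimal sketch of that idea in POSIX sh, under stated assumptions: the directory is a temporary stand-in for the real download location, and process_file is a hypothetical placeholder for the actual processor.

```sh
#!/bin/sh
# Stand-in for the download directory (an assumption for this sketch).
incoming=$(mktemp -d)
touch "$incoming/name with spaces" "$incoming/plain"

# Hypothetical placeholder for the real processing program.
process_file() {
    printf 'processing: |%s|\n' "$1"
}

processed=0
# The glob yields each name as one word; no quoting of the names is needed.
for f in "$incoming"/*; do
    [ -f "$f" ] || continue      # skip non-files and an unmatched glob
    process_file "$f"
    processed=$((processed + 1))
done

rm -rf -- "$incoming"
```

Because the names never pass through a clipboard or a prompt, spaces and other special characters in them do not need escaping.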


David
Max Nikulin
2024-05-03 15:20:01 UTC
Post by jeremy ardley
My use case is very simple. Give an argument to a program that expects a
single filename/path.
The role of realpath in your workflow is not clear to me yet.

If you need to copy its result to the clipboard, you can use xsel,
xclip, etc.

realpath --zero "$file" |
    { IFS= read -r -d '' path ; printf '%q' "$path" ; } | xsel -bi

You can also bind a key sequence that pastes the PRIMARY or CLIPBOARD
content into the BASH prompt, optionally quoted:

_bind_x_yank() {
    local buffer head tail
    # No numeric argument: insert the selection verbatim.
    if [ -z "$READLINE_ARGUMENT" ]
    then
        buffer="$(xsel --output "${1:---primary}")"
    else
        # Numeric argument given: insert the selection shell-quoted.
        buffer="$(xsel --output "${1:---primary}" |
            xargs --null printf '%q')"
    fi
    [ -n "$buffer" ] || return
    # Splice the text into the command line at the cursor position.
    head="${READLINE_LINE:0:$READLINE_POINT}${buffer}"
    tail="${READLINE_LINE:$READLINE_POINT}"
    READLINE_LINE="${head}${tail}"
    READLINE_POINT="${#head}"
}
bind -m emacs -x '"\C-xY": _bind_x_yank'
bind -m emacs -x '"\C-xy": _bind_x_yank --clipboard'

With a numeric argument the text is pasted quoted:
[Esc] [1] [Ctrl+x] [y]
pastes from the clipboard; use [Y] as the last key to paste from the
PRIMARY selection.

You even may define a desktop-wide shortcut that replaces selection
content with its quoted variant. Neither task requires quoted output
from realpath directly.

I am unsure what kind of debugger you use and what kind of escaping it
needs.

P.S.
A corner case is a file path having trailing newlines
https://mywiki.wooledge.org/BashPitfalls#content.3D.24.28.3Cfile.29
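A minimal POSIX sh sketch of that corner case, using an invented path and printf as a stand-in for whatever command produces the name. It shows the common sentinel workaround; the realpath --zero pipeline above is another route.

```sh
#!/bin/sh
# A hypothetical path whose last character is a newline.
path='/tmp/x/odd name
'

# Plain command substitution drops the trailing newline.
lossy=$(printf '%s' "$path")

# Guard: append a sentinel character, then strip it after the substitution.
guarded=$(printf '%s' "$path"; printf x)
guarded=${guarded%x}

if [ "$lossy" = "$path" ]; then echo "plain copy intact"; else echo "plain copy lost the newline"; fi
if [ "$guarded" = "$path" ]; then echo "guarded copy intact"; else echo "guarded copy lost the newline"; fi
```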
David Christensen
2024-05-03 05:40:02 UTC
Post by Max Nikulin
I still insist that this is a workaround that should *not* be used
to try to cancel out quoting bugs in one's shell scripts.
There are still specific cases when quoting is necessary, e.g. ssh
remote command
+1
Post by Max Nikulin
(however you have to be sure concerning shell on the
remote host).
+1
Post by Max Nikulin
In BASH printf has %q format. GNU coreutils supports it as well, but
dash does not, so be careful.
My practice is to start with '#!/bin/sh' and migrate to '#!/usr/bin/env
perl' as complexity increases.
Post by Max Nikulin
Likely Jeremy's case does not really require this kind of quoting.
We need to see the full range of file names the OP is trying to deal with.
Post by Max Nikulin
While "ls -l" output is for humans, realpath is often used in scripts.
Certainly it should not return quoted output by default. I am unsure
whether a dedicated option should be added to realpath.
Thank you for helping me realize that my solution fails to print the
resolved absolute file name. Here is the updated solution:

2024-05-02 21:57:56 ***@laalaa ~
$ touch 'name with spaces'

2024-05-02 22:23:01 ***@laalaa ~
$ touch 'name with
> newline'

2024-05-02 22:28:36 ***@laalaa ~
$ perl -MString::ShellQuote '-MFile::Spec::Functions qw(rel2abs)' -e 'print shell_quote(map { rel2abs $_ } @ARGV), "\n"' name*
'/home/dpchrist/name with
newline' '/home/dpchrist/name with spaces'


David
Tom Browder
2024-05-04 13:30:01 UTC
On Fri, May 3, 2024 at 21:43 David Christensen
<***@holgerdanske.com> wrote:
...
Post by David Christensen
My practice is to start with '#!/bin/sh' and migrate to '#!/usr/bin/env
perl' as complexity increases.
I agree with David's direction, but ending with Raku instead of Perl.
I don't think golfing is the way to illustrate a practical solution,
so I show a short Raku script:

$ cat read.raku
#!/usr/bin/env raku
my $a = "name with spaces";
my $b = "name\nwith newline";
say "file 1: |$a|";
say "file 2: |$b|";

And executing it:

$ ./read.raku
file 1: |name with spaces|
file 2: |name
with newline|

With Raku, it's easy to search the directory for the weird file names,
open them, and use their contents. Raku also has many built-in quoting
constructs to suit any situation.

I'll be happy to demo any of that here.

Best regards,

-Tom
Greg Wooledge
2024-05-04 14:00:02 UTC
Post by Tom Browder
$ cat read.raku
#!/usr/bin/env raku
my $a = "name with spaces";
my $b = "name\nwith newline";
say "file 1: |$a|";
say "file 2: |$b|";
$ ./read.raku
file 1: |name with spaces|
file 2: |name
with newline|
With Raku, it's easy to search the directory for the weird file names,
open them, and use their contents.
You've not really demonstrated anything that can't be done in every other
scripting language.

hobbit:~$ cat foo
#!/bin/bash
a='name with spaces'
b=$'name\nwith newline'
printf 'file 1: |%s|\n' "$a"
printf 'file 2: |%s|\n' "$b"
hobbit:~$ ./foo
file 1: |name with spaces|
file 2: |name
with newline|

hobbit:~$ cat bar
#!/usr/bin/tclsh8.6
set a "name with spaces"
set b "name\nwith newline"
puts "file 1: |$a|"
puts "file 2: |$b|"
hobbit:~$ ./bar
file 1: |name with spaces|
file 2: |name
with newline|

hobbit:~$ cat baz
#!/bin/sh
a='name with spaces'
b='name
with newline'
printf 'file 1: |%s|\n' "$a"
printf 'file 2: |%s|\n' "$b"
hobbit:~$ ./baz
file 1: |name with spaces|
file 2: |name
with newline|


The only part of this that's even *slightly* awkward is loading a literal
newline into a variable in /bin/sh. And that part drops away and ceases
to be a problem when you read the filename from some kind of input
source (such as a directory).

In real life:

hobbit:~$ mkdir /tmp/x && cd /tmp/x
hobbit:/tmp/x$ touch 'name with spaces' $'name\nwith newline'
hobbit:/tmp/x$ vi foo
hobbit:/tmp/x$ chmod +x foo
hobbit:/tmp/x$ cat foo
#!/bin/sh
for f in *; do
    printf 'Next file: |%s|\n' "$f"
done
hobbit:/tmp/x$ ./foo
Next file: |foo|
Next file: |name
with newline|
Next file: |name with spaces|

There's nothing in here that requires an advanced language. /bin/sh can
do it all perfectly well. In fact, we haven't even reached the limits
of what /bin/sh can do yet.

hobbit:/tmp/x$ vi foo
hobbit:/tmp/x$ cat foo
#!/bin/sh
printf 'Next file: |%s|\n' *
hobbit:/tmp/x$ ./foo
Next file: |foo|
Next file: |name
with newline|
Next file: |name with spaces|

Is that useful in real life? Maybe. Maybe not. But it's available.

Correct use of quotes and globs solves most of the problems that people
have with sh.

Can it solve "I have to manually paste filenames containing spaces and
punctuation out of a spreadsheet into a shell"? No, probably not.
But then, what can? Sometimes, the workflow is what has to change.
David Christensen
2024-05-03 05:20:01 UTC
Post by Greg Wooledge
Post by David Christensen
Perhaps Perl and the module String::ShellQuote ?
$ touch "name with spaces"
$ touch "name with\nnewline"
You didn't create a name with a newline in it here. You created a name
with a backslash in it. If you wanted a newline, you would have to use
the $'...' quoting form (in bash).
touch $'name with\nnewline'
Thank you for the clarification.


RTFM bash(1):

QUOTING
...
Enclosing characters in double quotes preserves the literal
value of all characters within the quotes, with the exception of
$, `, \, and, when history expansion is enabled, !. ...
The backslash retains its special meaning only when followed
by one of the following characters: $, `, ", \, or <newline>.
...
Words of the form $'string' are treated specially. The word
expands to string, with backslash-escaped characters replaced
as specified by the ANSI C standard.


I found another way to obtain a file name containing a newline -- by
pressing <Enter> when typing a double-quoted string literal:

2024-05-02 21:52:23 ***@laalaa ~
$ touch "foo
> bar"

2024-05-02 21:52:29 ***@laalaa ~
$ ls -l foo*
-rw-r--r-- 1 dpchrist dpchrist 0 May 2 21:52 'foo'$'\n''bar'

2024-05-02 21:52:36 ***@laalaa ~
$ perl -MString::ShellQuote -e 'print shell_quote(@ARGV), "\n"' foo*
'foo
bar'


It also seems to work for single-quoted string literals:

2024-05-02 21:57:08 ***@laalaa ~
$ touch 'foo
> bar'

2024-05-02 21:57:14 ***@laalaa ~
$ ls -l foo*
-rw-r--r-- 1 dpchrist dpchrist 0 May 2 21:57 'foo'$'\n''bar'

2024-05-02 21:57:18 ***@laalaa ~
$ perl -MString::ShellQuote -e 'print shell_quote(@ARGV), "\n"' foo*
'foo
bar'


I am unable to find $'string' in the dash(1) man page (?). As I
typically write "#!/bin/sh" shell scripts, writing such to deal with
file names containing non-printing characters is going to baffle me.
Post by Greg Wooledge
I still insist that this is a workaround that should *not* be used
to try to cancel out quoting bugs in one's shell scripts. Just write
the shell scripts correctly in the first place.
I would if I could.


While I am also unable to write Perl scripts correctly in the first
place, the quoting rules are easier.


David
Greg Wooledge
2024-05-03 11:10:01 UTC
Post by David Christensen
I am unable to find $'string' in the dash(1) man page (?). As I typically
write "#!/bin/sh" shell scripts, writing such to deal with file names
containing non-printing characters is going to baffle me.
Currently, $' quoting is a bash extension. It's supposed to appear in
some future edition of POSIX, at which point shells like dash will be
required to adopt it (whenever they get around to it). For now, though,
you should consider it bash only.
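In the meantime, a /bin/sh script can build such a name with printf. A minimal sketch (the name itself is just an example); it relies on command substitution stripping only trailing newlines, so the embedded one survives:

```sh
#!/bin/sh
# printf interprets \n in its format string; the embedded newline is kept
# because command substitution removes newlines only at the very end.
name=$(printf 'name\nwith newline')

printf 'file: |%s|\n' "$name"
```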
David Christensen
2024-05-03 18:50:01 UTC
Post by Greg Wooledge
I am unable to find $'string' in the dash(1) man page (?). As I typically
write "#!/bin/sh" shell scripts, writing such to deal with file names
containing non-printing characters is going to baffle me.
Currently, $' quoting is a bash extension. It's supposed to appear in
some future edition of POSIX, at which point shells like dash will be
required to adopt it (whenever they get around to it). For now, though,
you should consider it bash only.
Thank you for the clarification. :-)


David
Teemu Likonen
2024-05-03 04:50:01 UTC
Post by jeremy ardley
I have a need to get the full path of a file that has spaces in its
name to use as a program argument
/home/jeremy/name with spaces
Can realpath or other utility return a quoted pathname?
Tools don't need to return a shell-quoted pathname because you as a
shell programmer must do it:

file=$(realpath ...) # quotes not needed in variable assignment
do_something_for "$file" # always quote: "$file"


If you don't need variables you can just quote the "realpath" output:

do_something_for "$(realpath ...)"
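To see what the quotes change, here is a minimal POSIX sh sketch. count_args and print_path are hypothetical helpers: the first reports how many arguments it received, the second stands in for realpath by printing a fixed path. It assumes the default IFS.

```sh
#!/bin/sh
# Hypothetical helper: report how many arguments were received.
count_args() {
    printf '%s\n' "$#"
}

# Hypothetical stand-in for realpath: prints a fixed path with spaces.
print_path() {
    printf '%s\n' '/home/jeremy/name with spaces'
}

count_args $(print_path)      # unquoted: word splitting yields 3 arguments
count_args "$(print_path)"    # quoted: the path stays 1 argument
```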
--
/// Teemu Likonen - .-.. https://www.iki.fi/tlikonen/
// OpenPGP: 6965F03973F0D4CA22B9410F0F2CAE0E07608462