FreeRDP/winpr/libwinpr/clipboard/posix.c:397:20: error: conflicting types for ‘basename’
static const char* basename(const char* name)
^
In file included from FreeRDP/winpr/include/winpr/collections.h:25:0,
from FreeRDP/winpr/libwinpr/clipboard/posix.c:37:
/usr/include/string.h:599:14: note: previous declaration of ‘basename’ was here
extern char *basename (const char *__filename) __THROW __nonnull ((1));
This preprocessor definition has been initially intended to disable some
computationally expensive logging, however it turned out that there is
not much computation involved in the resulting implementation of new
wClipboard subsystems. Therefore we do not actually need the compilation
option, the logs can be filtered by "com.winpr.wclipboard.*" tag at
runtime if necessary. So drop the WITH_DEBUG_WCLIPBOARD CMake option and
convert all detailed logs to use WLOG_TRACE level via WLog_VRB macro.
Adds a callback that allows servers to compute NTLM hashes by themselves. The typical
use of this callback is to provide a function that gives precomputed hash values.
Sponsored by: Wheel Systems (http://www.wheelsystems.com)
Another issue revealed during testing is that older Windows systems
cannot handle the reserved file names well. While Windows 8 and 10 are
fine (they silently abort the file transfer), using reserved names with
Windows 7 can flat out crash explorer.exe or result into weird error
messages like "fatal error: 0x00000000 ERROR_SUCCESS".
This is not required by MS-RDPECLIP specification, but we should try to
avoid this issue as not using reserved file names seems to be assumed
a common sense in Windows protocols.
The most convenient way to handle the issue would be on wClipboard level
so that WinPR's clients do not bother with it. We should prohibit the
reserved names from being used in FILEDESCRIPTOR, failing the conversion
if we see such a file.
POSIX subsystem (the only one at the moment) handles remote file names
in two places so move the Unicode conversion and the new validation
check into a separate function.
The reserved file name predicate is placed into <winpr/file.h> so that
it can be used in other places too. For example, other wClipboard local
file subsystems will need it. (It would be really nice to enforce this
check somewhere in the common code, so that the subsystems can't miss
it, but other places can miss some errors thus we're doing it here, as
early as possible.)
The predicate acts on separate file name components rather than full
file names because the backslash is a reserved character too. If we
process full file names this can result in phantom directory entry in
the remote file name. Not to say that handling ready-made components
spares us from splitting the full file name to extract them :)
The implementation is... a bit verbose, but that's fine by me. In the
absence of functions for case-insensitive wide string comparison and
the need to check for the [0-9] at the end of some file names this is
quite readable. Thanks to FAT and NTFS for being case-insensitive and
to MS-DOS for having reserved file names in the first place.
One important point in the cliprdr protocol is that the peers are not
allowed to request file sizes and ranges if the clipboard content
changes. File locking should be used to gain this ability. However, our
file list is still accessible after new data is set into wClipboard.
Catch this error by storing the sequence number of the file list when it
is set and checking that it is still in effect at the time when the
client requests file sizes or ranges. There is a small chance of false
positives when the sequence number overflows, but I guess we can safely
ignore it.
This is another bunch of callbacks which provide the file contents to
the clients. We jump through some extra hoops in order to have more
pleasant user experience.
Simple stuff goes first. The file offset (or position) is split into the
low and high parts because this is the format in which the clients
receive the request from the server. They can simply copy the values as
is into the struct without repackaging them (which we do instead in the
end to get a 64-bit off_t).
Another thing is that we try to minimize the number of lseek() calls and
to keep as few file descriptors open as possible. We cannot open all the
files at once as there could be thousands of them and we'll run out of
the allowed number of the fds. However, the server can (in theory)
request the file ranges randomly so we need to be prepared for that. One
way to do that would be to always open the file before reading and close
it immediately afterwards. A dead simple solution with an acceptable
performance, but... some file systems do not support seeking, such as
FTP directories mounted over FUSE. However, they handle sequential
reading just fine *and* the server requests the data sequentially most
of the time so we can exploit this.
Thus open the file only once, during the first range request and keep
it open until the server reads all the data. In order to know how much
data is left we keep an accurate account of all reads and maintain the
file offset ourselves. This also allows us to avoid calling lseek() when
the file offset will not be effectively changed. However, if the server
requests some weird offset then we have no choice and will attempt
seeking. Unfortunately, we cannot tell whether it is a genuine failure
or the file system just does not support seeking, so we do not handle
the error further. (One workaround would be to reopen the file and keep
reading it until we're at the correct offset.) In this way we can
support sequential-only file systems if the server requests the contents
sequentially (and it does).
Also note that we do an fstat() right after opening the file in order to
have an accurate value of file size, for this exact file descriptor we
will be using. We should have it filled it by now, but just in case...
There is one more thing to explain. The cbRequested field specifies the
maximum number of bytes the server can handle, not the required number.
Usually this is some power-of-two number like 64 KB, based on the size
of the files on the clipboard. This is why posix_file_read_perform()
does not attempt to fill the whole buffer by repeatedly calling read()
if it has read less data than requested. The server can handle underruns
just fine (and this spares us from special-casing the EOF condition).
This is an example of wClipboardDelegate method implementation. POSIX
subsystem uses synchronous methods, but the interface can be used for
asynchronous request processing as well. The client should call a
Client* callback to request some action and the wClipboard will process
the request and report the result by calling an approriate Clipboard*
callback. Usually there will be two callbacks: one for reporting success
and one to report errors.
All callbacks have at least two arguments: the wClipboardDelegate itself
to pass the system context, and the wClipboard*Request structure with
the arguments to pass the call context. The request context is also
passed to the result callbacks by wClipboard so that the client can
match up the result with its previous request.
The fields of wClipboard*Request structures are heavily influenced by
the MS-RDPECLIP spec and mirror the respective fields of
CLIPRDR_FILECONTENTS_REQUEST. wClipboard should not depend on
MS-RDPECLIP, that's the reason we don't use CLIPRDR_FILECONTENTS_REQUEST
directly. However, I believe that we should not have void* fields in the
request structs so that they can be easily copied around if needed.
This is why have the weird 'streamId' field there which has nothing to
do with wClipboard and will be used only by the clients when sending
replies to the server.
Return values of the callbacks are to be used for reporting errors with
processing the request or reply per se, not for errors encountered while
performing the action requested. Thus, for example, we return NO_ERROR
from posix_file_request_size() even when we fail to report the result to
the client, because we have successfully performed the request and do
not care if the client could not handle our reply for some reason.
Also note that setup_delegate() fills in dummy implementations of
Clipboard* reply callbacks so that we do not crash in case the client
does not fill them and do not have to perform paranoid NULL checks
before calling every single callback.
This is the thing which will be used by clients to request file sizes
and ranges from wClipboard and by wClipboard to report the results of
the requests to the clients.
wClipboard and the client will fill in the (currently absent) callbacks
with their implementations of the request-report interface and will be
using them accordingly.
Initially I thought that wClipbardDelegate would be dynamically
allocated by the client and set into wClipboard (as this would be the
case with a delegate interface implementation in OOP langauges), but
after some thought I ended up with storing the delegate in wClipboard
and using the 'void* custom' field for client-private data.
So the idea is for the subsystem to fill in its callbacks during
wClipboard construction and for the client to get access to
wClipboardDelegate with a getter and fill in its callbacks during its
clipboard initialization. The subsystem will use wClipboard* pointer to
access its data and the client will have its void* pointer to store its
context.
text/uri-list contains only the files which were immediately selected by
the user. However, we need to enumerate *all* files and directories to
be pasted in CLIPRDR_FILELIST. Thus we need to walk through the
directories and add their content to the file list as well.
We use readdir() function to traverse the directory entries. It has more
sane interface than readdir_r(), but lacks (standardized) thread-safety
guarantees. However, most C liraries guarantee that so we can use it.
There is no compile-time check because it cannot be made robust. You
deserve a crash here if you are using a C library developed by people
who cannot keep their unhealthy addiction to global state under control.
Note that recursive traversal is also a good opportunity to maintain
good remote names. We just need to concatenate the directory paths and
file names correctly.
However, this recursion has one caveat: it is not bounded, so if the
file system contains a loop then we will crash due to a stack overflow.
We could track symlink loops (and hardlinks too if we try hard) to avoid
the crash, but I think it's not a common thing to do so we can ignore
this possibility.
Finally we can add a file to the file list once we have got its local
file name decoded. The interesting part here is what we use for the
remote name.
Suppose the user has selected two files in different directories. In
this case we end up receiving a text/uri-list like this:
file:///home/bob/foo/a
file:///home/bob/bar/b
We'd expect to see "a" and "b" pasted into the remote session, so that's
what we should use for the remote names: the base names of the files.
These are the parts from the end up to the last directory delimiter.
One tricky point here is that Windows expects the file names to be
encoded in Unicode, but POSIX does not specify any particular encoding
for file names. Operating systems and file systems generally handle the
file names as mostly opaque bytes strings and do not really care what
encoding is used there. There is no portable API to get the encoding,
it's entirely up to the users and the software they use to correctly
interpret the file names. But we need to do something here.
As of 2017, the most widely used encoding for file names is UTF-8. While
there are marginal communities which stick to codepages for legacy
reasons, we can safely assume that most of the time the file names will
be encoded in UTF-8. In fact, popular desktop environments like GNOME
also assume this. So that's what we will do here as well.
Nothing really interesting here, it's exactly what it says on the tin.
The percent-encoding is specified by RFC 3986. And we take care to
detect invalid encodings.
Now we start handling the actual format data. As the first step we need
to convert the text/uri-list data into the list of file names. Each file
or directory the user selects to copy is represented with a URI, and the
whole list looks like this:
file:///home/bob/text-file
file:///home/bob/a-directory
file:///home/bob/white%20space
The MIME format is actually specified by RFC 2483. As said in the
comments, we allow some slack for other applications: they can use
singular LF and CR as line terminators, and we also will handle missing
terminator for the last line (some applications actually do this, but I
can't recall which ones at the moment).
We will handle only the file:// URI scheme because these refer to local
filesystem paths. It is possible for text/uri-list to contain URIs with
other schemes when the user selects files from remote filesystems (like
an FTP server or an SMB share connect to from a file manager). We cannot
pass such paths to open() and for some reason the file managers use the
remote URIs even when the remote filesystems are actually mapped to the
local filesystem via FUSE. Therefore we restrict ourselves to handling
only file://.
Now we do the actual conversion of a list of struct posix_files into an
array of FILEDESCRIPTORs. In order to correctly fill in the fields we
need to know the size of the file and whether it is a directory. This
can be looked up by stat() call. Do this during struct posix_file
construction and cache the values for later use (we will need them).
Define _FILE_OFFSET_BITS to make off_t a 64-bit value and to call
appropriate functions when we write stat() in the code. FILEDESCRIPTOR
and cliprdr protocol expect the file sizes to be 64-bit so we can
provide accurate information with that. Take care to define it before
including any system headers ("config.h" contains only defines).
Also take care to not overrun the file name buffer. Windows has a hard
cap of 260 Unicode characters for the full file name, including the
terminating null character.
Here you can see an outline of our approach to handling file lists put
on the clipboard. Typical usage of wClipboard by the clients sums up to
doing a ClipboardSetData() call followed by a ClipboardGetData() call
with appropriate format ID passed to them. Thus for files we would
expect the clients to first set the local format (like "text/uri-list")
and then to get the remote format (the "FileGroupDescriptorW").
MS-RDPECLIP has a concept of locally-stored list of files on the
clipboard. This is modeled by clipboard->localFiles ArrayList. We need
to populate this list before we serialize it into CLIPRDR_FILELIST and
send it to the server. The easiest way to achieve this is a bit hacky,
but it works: we populate the file list from inside the synthesizer
callback registered for text/uri-list -> FileGroupDescriptorW
conversion.
So the client would first set the data it received from local clipboard
as "text/uri-list" format, then it gets a "FileGroupDescriptorW" format,
during that conversion we will prepare to serve file content requests,
and in the end we provide a FILEDESCRIPTOR array to the client as the
conversion result. The client will then serialize the array into
CLIPRDR_FILELIST and sent it to the server. (We cannot do serialization
in WinPR as WinPR should not know about cliprdr and its data formats.)
The subsystems are expected to store their private structures in the
clipboard->localFiles array. POSIX subsystem uses struct posix_file
which currently has bare minimum of fields: the local file name (for
open() and the like) and the remote file name (the one to put into
FILEDESCRIPTOR).
This adds some initial skeleton for local file subsystems of wClipboard.
The idea is to delegate handling of local file formats to dedicated
subsystems selected at runtime based on the compiled-in support code.
This is somewhat similar to the approach used by audin, rdpsnd, rdpgfx
channels in FreeRDP.
Only one subsystem is actually used by wClipboard during runtime. It is
selected by the ClipboardInitLocalFileSubsystem() function which will
try initializing the compiled-in subsystems in the preferred order. Thus
when adding new subsystems one must make sure to 1) return as soon as
any initialization succeeds, 2) leave wClipboard in usable state if the
initialization fails.
A POSIX file subsystem is added as a pioneer. It will handle local file
format "text/uri-list" and will use POSIX API to access the files. This
is the combination one would expect to be supported by Linux systems
which can run the XFreeRDP client.
The POSIX subsystem is enabled only when CMake detects <unistd.h> as
available. This is the core POSIX include file so we can reasonably
expect the rest of the POSIX API to be available along with that file.
We also define a new configuration option WITH_DEBUG_WCLIPBOARD which
will be used to guard some debug-only verbose logging in wClipboad.
Unify error handling in ClipboardInitFormats() and actually handle the
return value of ClipboardInitSynthesizers(). Currently it always returns
TRUE, but this may change, so we'd better be clean.
Declare 'formatName' in wClipboardFormat as non-const. It is customary
in C to declare owned pointers as non-const because various deallocation
functions like free() take non-const pointers as arguments. Furthermore,
const char* is tightly associated with "string literals" which must not
be freed. Thus declaring this field as non-const is more accurate, and
removes that ugly void* cast from ClipboardInitFormats().
Unify error handling in ClipboardCreate(). The cleanup snippet should
not be repeated as it's prone to errors, like leaking the allocation of
clipboard->formats when ClipboardInitFormats() fails. Unified error
handling makes it much harder to forget resource cleanup on errors.