When using 'local' connections, privilege escalation would fail if
ansible_ssh_user was in the current context to the same value as
become_user.
This commit ensures that for 'local' connections we reset remote_user to
the local username.
This fixes#12782.
I inadvertently introduced it in
ca826508d9 and didn't notice, because
there are no unit tests for playbook_executor.py. Sorry!
(The "from ansible.errors import *" was used *only* to get the 'os'
module, which makes go "what?")
If you convert the error string to bytes and embed it inside another
error string, you get
Prefix:
b'Embedded\nerror\nstring'
which is not what we want.
But we also don't want Unicode in error messages causing unexpected
UnicodeEncodeErrors when on Python 2.
So let's convert the error message into the native string type (bytes on
Python 2, unicode on Python 3).
* Don't throw away the full path of the module code being loaded,
as this can cause conflicts when files of the same name are being
instantiated
* Generalize the module loading code
Fixes#12738
* corrupt/invalid file causes tracebacks
* incorrect initialization of display/_display in BaseCacheModule class
* tweaking the way errors in get() on jsonfile caches work, to raise
a proper AnsibleError in that situation so the playbook/task is stopped
Fixes#12708
The first call to persisting facts would work due to the assignment of a
MutableMapping calling __setitem__ but subsequent module fact data would
not be propogated to the fact cache plugins because update() doesn't
invoke __setitem__. This changes the behavior a little bit and ensures
set() is called on cache plugins.
Also does some reorganization/cleanup on the magic vars/delegated
variable generation portions of VariableManager to make the above
possible.
Fixes#12633
better error reporting on fetching errors
use scm if it exists over src
unified functions in requirements
simplified logic
added verbose to tests
cleanup code refs, unused options and dead code
moved get_opt to base class
fixes#11920fixes#12612fixes#10454
This is because we pass the whole dd command string into the shell
that's running on the contained environment rather than running it
directly from python via subprocess without a shell.
corrected output from default callback
added new tests for no_log loops
updated makefile test to check for both positive and negative occurrences of no_log
The earlier code behaved exactly as though this default had been set,
but it was actually handled as a(n unnecessary) special case inside the
connection plugin, rather than set as an explicit default.
If the default is overriden either in ansible.cfg or the environment,
the new code will continue to work (in fact, it won't know or care,
since it just uses the value set in the PlayContext).
This is submitted as a separate commit for easier review to address
backwards-compatibility concerns.
Using set_host_overrides() in the connection plugin to access the ssh
argument variables from the inventory didn't see group_vars/host_vars
settings, as noted earlier. Instead, we can set the correct values in
the PlayContext, which has access to all command-line options, task
settings, and variables.
The only downside of doing so is that the source of the settings is no
longer available in ssh.py, and therefore can't be logged. But the code
is simpler, and it actually works.
This change was suggested by @jimi-c in response to the FIXME in the
earlier commit.
Now we have the following ways to set additional arguments:
1. [ssh_connection]ssh_args in ansible.cfg: global setting, prepended to
every command line for ssh/scp/sftp. Overrides default ControlPersist
settings.
2. ansible_ssh_common_args inventory variable. Appended to every command
line for ssh/scp/sftp. Used in addition to ssh_args, if set above, or
the default settings.
3. ansible_{sftp,scp,ssh}_extra_args inventory variables. Appended to
every command line for the relevant binary only. Used in addition to
#1 and #2, if set above, or the default settings.
3. Using the --ssh-common-args or --{sftp,scp,ssh}-extra-args command
line options (which are overriden by #2 and #3 above).
This preserves backwards compatibility (for ssh_args in ansible.cfg),
but also permits global settings (e.g. ProxyCommand via _common_args) or
ssh-specific options (e.g. -R via ssh_extra_args).
Fixes#12576
<crab> jimi|ansible: do you think it should be possible to add both
foo:22 and foo:23 to the inventory?
<jimi|ansible> no
…so we don't want an invitation to FIXME.
CLI already provides a pager() method that feeds $PAGER on stdin, so we
just feed that the plaintext from the vault file. We can also eliminate
the redundant and now-unused shell_pager_command method in VaultEditor.
(Reminder: cannot use six here, module_utils get shipped to remote
machines that may not have six installed -- besides six doens't support
Python 2.4.)
Since c8f2483d, ini.py expects to always be passed in a pre-created list
of groups, and can no longer deal sensibly with an empty list; this just
makes that expectation clear.
This fixes a corner case where ini files live in a subdir
of the main inventory directory.
Reproducing the original error:
mkdir -p inventory/ini
cat > inventory/ini/hosts << EOF
[www]
www1
EOF
$ ansible -i inventory/ all -m ping
ERROR! 'all'
(or without the [www] group, it would complain about 'ungrouped')
Fixes another failing test.
(I don't want to do a global search/replace for 'basestring' because I
want to have unit tests covering each occurrence. When I run out of
existing failing tests, I'll try to write new ones.)
* Remove extraneous imports
* Fix some error handling
* Enable pipelining
* Disable su since it doesn't work
* Add error message when installed docker is not recent enough to
support this plugin
* Move nested functions to class level
* Make transport a class attribute
* Make exec_command, put_file and fetch_file more robust
Removed deletion of salt param from lookup file by 'password' lookup_filter.
Old behaviour leads to constant changed status when two tasks uses same lookup,
one with 'encrypt' parameter, and other without.
For example:
tasks:
- name: Create user
user:
password: "{{ lookup('password', inventory_dir + '/creds/user/pass' ncrypt=sha512_crypt) }}"
...
# Lookup file 'creds/user/pass' now contain password with salt
- name: Create htpasswd
htpasswd:
password: "{{ lookup('password', inventory_dir + '/creds/user/pass') }}"
...
# Salt gets deleted from lookup file 'creds/user/pass'
# Next run of "Create user" task will create it again and will have 'changed' status
* Disable su as it's not currently working 100% (and was disabled in v1).
* Move BUFSIZE out of the class to match other conenction plugins
* _connect shouldn't return self.
This is also peripheral to what _build_command needs, can be improved
and tested independently, and so makes more sense in a separate method.
This commit doesn't change any functionality (and I've verified that it
works with the various combinations: control_path set in ansible.cfg,
ssh_args adding or not adding ControlMaster/ControlPersist, etc.).
SSH pipelining can be a significant performance improvement, but it will
not work if sudoers is configured to requiretty. With this change, one
could have pipelining enabled in ansible.cfg, but use sudo to turn off
requiretty in a separate play (or task) where pipelining is disabled:
- hosts: foo
vars:
ansible_pipelining: no
tasks:
- lineinfile: dest=/etc/sudoers line='Defaults requiretty' state=absent
sudo_user: root
(Note that sudoers has a complicated syntax, so the above lineinfile
invocation may be too simplistic for production use; but the point is
that a separate play can do something to disable requiretty.)
Also get pipelining working for people who look to chroot as an example
for their own connection plugins
Note: In the latest v2 API, action handles become but chroot doesn't
reliably handle become. Maybe we need to add a has_become attribute
that the action can display an appropriate error.
* allow global no_log setting, no need to set at play or task level, but can be overriden by them
* allow turning off syslog only on task execution from target host (manage_syslog), overlaps with no_log functionality
* created log function for task modules to use, now we can remove all syslog references, will use systemd journal if present
* added debug flag to modules, so they can make it call new log function conditionally
* added debug logging in module's run_command
Due to the way we're now calculating delegate_to, if that value is based
on a loop variable ('item') we need to calculate all of the possible
delegated_to variables for that loop.
Fixes#12499
There doesn't appear to be anything that actually uses tmp_path in the
connection plugins so we don't need to pass that in to exec_command.
That change also means that we don't need to pass tmp_path around in
many places in the action plugins any more. there may be more cleanup
that can be done there as well (the action plugin's public run() method
takes tmp as a keyword arg but that may not be necessary).
As a sideeffect of this patch, some potential problems with chmod and
the patch, assemble, copy, and template modules has been fixed (those
modules called _remote_chmod() with the wrong order for their
parameters. Removing the tmp parameter fixed them.)
The process is already gone, so there's not going to be any new data
showing up on its stderr; we only want to make sure that we haven't
missed something that was already written. So polling once is enough.
This change is motivated by an ssh oddity: when ControlPersist is
enabled, the first (i.e. master) connection goes into the background; we
see EOF on its stdout and the process exits, but we never see EOF on its
stderr. So if we ran a command like this:
ANSIBLE_SSH_PIPELINING=1 ansible -T 30 -vvv somehost -u someuser -m command -a whoami
We would first do select([stdout,stderr], timeout) and read the command
module output, then select([stdout,stderr], timeout) again and read EOF
on stdout, then select([stderr], timeout) AGAIN (though the process has
exited), and select() would wait for the full timeout before returning
rfd=[], and then we would exit. The use of a very short timeout in the
code masked the underlying problem (that we don't see EOF on stderr).
It's always preferable to call select() with a long timeout so that the
process doesn't use any CPU until one of the events it's interested in
happens (and then select will return independent of elapsed time).
(A long timeout value means "if nothing happens, sleep for up to <x>";
omitting the timeout value means "if nothing happens, sleep forever";
specifying a zero timeout means "don't sleep at all", i.e. poll for
events and return immediately.)
This commit uses a long timeout, but explicitly detects the condition
where we've seen EOF on stdout and the process has exited, but we have
not seen EOF on stderr. If and only if that happens, it reruns select()
with a short timeout (in practice it could just exit at that point, but
I chose to be extra cautious). As a result, we end up calling select()
far less often, and use less CPU while waiting, but don't sleep for a
long time waiting for something that will never happen.
Note that we don't omit the timeout to select() altogether because if
we're waiting for an escalation prompt, we DO want to give up with an
error after some time. We also don't set exceptfds, because we're not
actually acting on any notifications of exceptional conditions.
On Python 2, shlex.split() raises if you pass it a unicode object with
non-ASCII characters in it. The Ansible codebase copes by explicitly
converting the string using to_bytes() before passing it to
shlex.split().
On Python 3, shlex.split() raises ('bytes' object has no attribute 'read')
if you pass a bytes object. Oops.
This commit introduces a new wrapper function, shlex_split, that
transparently performs the to_bytes/to_unicode conversions only on
Python 2.
Currently I've only converted one call site (the one that was causing a
unit test to fail on Python 3). If this approach is deemed suitable,
I'll convert them all.
Without this, we could execute «ssh -q ...» and call select(), which
would timeout after the default 10s, and only then send initial data.
(This is a relic of the earlier change where we always ran ssh with
-vvv, so the situation where it would sit quietly never happened in
practice; but this would have been the right thing to do even then.)
Make the code compatible with Pythons 2.4 through 3.5 by using
sys.exc_info()[1] instead.
This is necessary but not sufficient for Python 3 compatibility.
The event loop (even after it was brought into one place in _run in the
previous commit) was hard to follow. The states and transitions weren't
clear or documented, and the privilege escalation code was non-blocking
while the rest was blocking.
Now we have a state machine with four states: awaiting_prompt,
awaiting_escalation, ready_to_send (initial data), and awaiting_exit.
The actions in each state and the transitions between then are clearly
documented.
The check_incorrect_password() method no longer checks for empty strings
(since they will always match), and check_become_success() uses equality
rather than a substring match to avoid thinking an echoed command is an
indication of successful escalation. Also adds a check_missing_password
connection method to detect the error from sudo -n/doas -n.
The main exec_command/put_file/fetch_file methods now _build_command and
call _run to handle input from/output to the ssh process. The purpose is
to bring connection handling together in one place so that the locking
doesn't have to be split across functions.
Note that this doesn't change the privilege escalation and connection IO
code at all—just puts it all into one function.
Most of the changes are just moving code from one place to another (e.g.
from _connect to _build_command, from _exec_command and _communicate to
_run), but there are some other notable changes:
1. We test for the existence of sshpass the first time we need to use
password authentication, and remember the result.
2. We set _persistent in _build_command if we're using ControlPersist,
for later use in close(). (The detection could be smarter.)
3. Some apparently inadvertent inconsistencies between put_file and
fetch_file (e.g. argument quoting, sftp -b use) have been removed.
Also reorders functions into a logical sequence, removes unused imports
and functions, etc.
Aside: the high-level EXEC/PUT/FETCH description should really be logged
from ConnectionBase, while individual subclasses log transport-specific
details.
* Make LookupBase an abc with required methods (run()) marked as an
abstractmethod
* Mark methods that don't use self as @staticmethod
* Document how to implement the run method of a lookup plugin.