Several Critical Vulnerabilities Patched in AI ChatBot Plugin for WordPress

Marco Wotschka
October 25, 2023

On September 28, 2023, the Wordfence Threat Intelligence team initiated the responsible disclosure process for multiple vulnerabilities in AI ChatBot, a WordPress plugin with over 4,000 active installations.

After making our initial contact attempt on September 28th, 2023, we received a response on September 29, 2023 and sent over our full disclosure details. Receipt of the disclosure by the vendor was acknowledged the same day and a fully patched version of the plugin was released on October 19, 2023.

We issued a firewall rule to protect Wordfence Premium, Wordfence Care, and Wordfence Response customers on September 29, 2023. Sites still running the free version of Wordfence will receive the same protection on October 29, 2023.

Please note that these vulnerabilities were originally fixed in 4.9.1 (released October 10, 2023). However, some of them were reintroduced in 4.9.2 and then subsequently patched again in 4.9.3. We recommend that all Wordfence users update to version 4.9.3 or higher immediately.

A complete list of the vulnerabilities we reported is below. Links to Wordfence Intelligence are included where you can find full details:

In this post we will focus on the most impactful vulnerabilities.

Vulnerability Details and Technical Analysis

The AI ChatBot plugin provides website owners with a plug and play chat solution that can be expanded upon with customizable FAQs and custom text responses. It provides website users with an interface that allows them to look up order information, leave contact information for later callbacks and can be integrated with OpenAI’s ChatGPT or Google’s DialogFlow.

A lot of the interactions with the chatbot happen via AJAX actions. Many of these actions were made available to unauthenticated users in order to allow them to interact with the chatbot. Other actions required at least subscriber-level access.

Unauthenticated SQL Injection – CVE-2023-5204

Description: Unauthenticated SQL Injection via qc_wpbo_search_response
Affected Plugin: AI ChatBot
Plugin slug: chatbot
Vendor: QuantumCloud
Affected versions: <= 4.8.9
CVE ID: CVE-2023-5204
CVSS score: 9.8 (Critical)
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
Researcher: Marco Wotschka
Fully Patched Version: 4.9.1

One of the many vulnerabilities we discovered was an unauthenticated SQL Injection. The following two AJAX actions are used for searches during interactions with the chatbot:

add_action( 'wp_ajax_nopriv_wpbo_search_response', 'qc_wpbo_search_response' );

add_action( 'wp_ajax_wpbo_search_response', 'qc_wpbo_search_response' );

The wp_ajax_nopriv_wpbo_search_response AJAX action can be used by users who are not authenticated to WordPress due to the hook utilizing ‘nopriv’. On the other hand, the standard wp_ajax_wpbo_search_response AJAX action can only be used by authenticated users due to the inherent functionality of AJAX actions.

function qc_wpbo_search_response (shortened for brevity)

The qc_wpbo_search_response function hooked by the aforementioned AJAX actions is used to search within the database for responses containing certain keywords. If the $_POST[‘strid’] parameter is set, a record is retrieved from the wpbot_response table by ID. The $strid variable supplied by the POST parameter can be leveraged for SQL Injection, despite being sanitized using the sanitize_text_field function.

According to the WordPress Developer Resources, the sanitize_text_field function checks for invalid UTF-8; converts single < characters to entities; strips all tags; removes line breaks, tabs, and extra whitespace; strips percent-encoded characters. This does not provide sufficient protection against SQL Injection attempts, and is only intended for Cross-Site Scripting protection. Furthermore, the get_results function used in the above function call does not perform any preparation, nor is there any escaping of the user supplied input passed to the SQL Query. We always recommend the use of the prepare function on SQL queries as it provides adequate escaping on the user-supplied values, which prevents SQL injection from being successful. In addition, ensuring that the $strid is an integer would help prevent a SQL Injection attack from being successful.

The lack of a UNION operation in the above SQL query makes exploiting this vulnerability more difficult, but a time-based blind injection approach using the SLEEP() function and CASE statements can still be used to extract information from the database by observing the duration of individual queries. While tedious, this technique can be used to extract sensitive information from the database. This includes hashed passwords.

Arbitrary File Deletion – CVE-2023-5212

Description: Authenticated (Subscriber+) Arbitrary File Deletion via qcld_openai_delete_training_file
Affected Plugin: AI ChatBot
Plugin slug: chatbot
Vendor: QuantumCloud
Affected versions: <= 4.8.9
CVE ID: CVE-2023-5212
CVSS score: 9.6 (Critical)
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
Researcher: Marco Wotschka
Fully Patched Version: 4.9.1

The plugin offers the ability to upload training files to OpenAI. An arbitrary file deletion vulnerability existed in the qcld_openai_delete_training_file function invoked via the following AJAX action:

add_action('wp_ajax_qcld_openai_delete_training_file',[$this,'qcld_openai_delete_training_file']);

function qcld_openai_delete_training_file

This vulnerable function accepts a file path via the $_POST[‘file’] parameter and checks whether the file exists. If it does, the function adjusts permissions on the file in such a way that it can be removed and proceeds to delete it. This function misses a capability check to ensure that the user performing the action has proper privileges, as well as a nonce check to ensure that the action is performed intentionally. and is thus vulnerable to Missing Authorization and Cross-Site Request Forgery.

Furthermore, no check is performed ensuring that the file is an OpenAI training file and that it resides in a location or directory where training files are expected to be located. This could allow an authenticated attacker with subscriber-level privileges or higher to remove the wp-config.php file of an affected site, which would invoke the WordPress installation script on the next site visit and could lead to a complete site takeover.

The file path passed via the $_POST[‘file’] parameter could also point to a file outside of the affected website, thus enabling the deletion of wp-config.php files of other sites in shared hosting environments. Deleting wp-config.php forces the site into a setup state, at which point an attacker can take over the site by pointing it to a database under their control. Of course, attackers are not limited to deleting PHP files either as long as the web server can change file permissions and delete the file.

Version 4.9.1 removed this function as well as the corresponding AJAX action. Version 4.9.2 reintroduced the vulnerable function and action hook, which were both again removed in version 4.9.3.

Directory Traversal to Arbitrary File Write – CVE-2023-5241

Description: Authenticated (Subscriber+) Directory Traversal to Arbitrary File Write via qcld_openai_upload_pagetraining_file
Affected Plugin: AI ChatBot
Plugin slug: chatbot
Vendor: QuantumCloud
Affected versions: <= 4.8.9
CVE ID: CVE-2023-5241
CVSS score: 9.6 (Critical)
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:H
Researcher: Marco Wotschka
Fully Patched Version: 4.9.1

We also discovered an arbitrary file write vulnerability which exists in the qcld_openai_upload_pagetraining_file function. The entire function is rather long which is why we won’t display it here in its entirety.

function qcld_openai_upload_pagetraining_file (shortened for brevity)

The function expects a filename to be passed as a $_POST[‘filename’] parameter, which is sanitized using the sanitize_text_field function. The $file variable is used to determine the location of a file in the wp-content/uploads/qcldopenai_site_training/ directory. If the file exists, the function proceeds to declare a variable called $split_file, creates a file handle $qcld_openai_json_file and opens the file in append mode. This means that the file is not overwritten but anything written to the file is instead appended.

It is not immediately clear what the purpose of this part of the function is since it simply appends the contents that are already in the file to the end of the file until the length of the content that is added exceeds $this->wpaicg_max_file_size or the entire file has been duplicated.

The corresponding if-statement that determines when to terminate writing to the file looks as follows:

if(mb_strlen($qcld_openai_content, '8bit') >$this->wpaicg_max_file_size)

In a default installation $this->wpaicg_max_file_size is not defined and therefore NULL. Hence, in such scenarios the function adds the first line of the file specified by the user to the end of the file. Since NULL is interpreted as zero in a comparison statement like this, any positive file size will suffice to break out of this part of the function.

Unfortunately, this code is vulnerable to Directory Traversal via the filename parameter. If the filename that is passed is a relative path to wp-config.php, the file handle will ultimately point to the site’s wp-config.php file. An authenticated attacker with subscriber-privileges or higher could utilize this fact to append the first line of its content to the file wp-config.php, which would be <?php.

While an attacker does not have any influence on the data that is written, in most cases a <?php could be written to the end of a targeted PHP file, which can lead to catastrophic consequences as the added PHP tag may result in an error such as

Parse error: syntax error, unexpected token "<", expecting end of file

This prevents the site from loading properly and can be used to append to any PHP file (or other files) including those in shared hosting environments leading to Denial of Service (DoS). One way to prevent Directory Traversal is to use the sanitize_file_name function, which removes special characters including slashes and leading dots from the file name.

Version 4.9.1 removed this function as well as the corresponding AJAX action. Version 4.9.2 reintroduced the vulnerable function and action hook, which were both again removed in version 4.9.3.

Numerous Other Missing Authorization and Cross-Site Request Forgery Vulnerabilities

In addition to the vulnerabilities outlined above, we discovered several AJAX actions without proper capability checks, which made it possible for authenticated attackers with minimal access, such as subscribers, to invoke those actions. Several of the functions were also missing nonce verification, which would make it possible for attackers to forge requests on behalf of a site administrator, or any other authenticated user considering capability checks were also missing.

However, these vulnerabilities had minimal impact and led to the exposure of information such as user order details and user names, the download and extraction of a zip used by the plugin (not arbitrary zip files), cache deletion, as well as starting and stopping of search indexing jobs to name a few. The severity of those actions is lower than the ones we detailed above.

Timeline

September 25-28, 2023 – The Wordfence Threat Intelligence team discovers several vulnerabilities in the AI ChatBot plugin.
September 28, 2023 – We initiate contact with the plugin developer.
September 29, 2023 – We release a firewall rule to protect Wordfence Premium, Wordfence Care, and Wordfence Response customers and send the full disclosure to the plugin developer. Receipt of the disclosure is acknowledged.
October 10, 2023 – A fixed version (4.9.1) of the plugin that patches all reported vulnerabilities is released.
October 18, 2023 – Several of the vulnerabilities are reintroduced in version 4.9.2. We inform the vendor about this.
October 19, 2023 – Version 4.9.3 patches the vulnerabilities again.
October 29, 2023 – The firewall rule becomes available to free Wordfence users

Conclusion

In this blog post we covered an Unauthenticated SQL Injection vulnerability (affecting versions <= 4.8.9), as well as an Arbitrary File Write vulnerability and an Arbitrary File Deletion vulnerability (affecting versions <= 4.8.9 and 4.9.2). The SQL Injection vulnerability allows unauthenticated attackers to extract sensitive information from the database using a time-based blind injection approach, which could ultimately lead to exposure of admin credentials and site takeover.

The Arbitrary File Write vulnerability can be utilized by authenticated attackers to append opening PHP tags (in default configurations) to any file including the wp-config.php file, which can lead to Denial of Service (DoS). The Arbitrary File Deletion vulnerability can be used by authenticated attackers to delete any file on the web server offering the possibility of complete site takeovers.

All Wordfence running Wordfence Premium, Wordfence Care, and Wordfence Response, have been protected against these vulnerabilities as of September 29, 2023. Users still using the free version of Wordfence will receive the same protection on October 29, 2023.

If you know someone who uses this plugin on their site, we recommend sharing this advisory with them to ensure their site remains secure, as these vulnerabilities pose a significant risk.

For security researchers looking to disclose vulnerabilities responsibly and obtain a CVE ID, you can submit your findings to Wordfence Intelligence and potentially earn a spot on our leaderboard.

Did you enjoy this post? Share it!

Source :
https://www.wordfence.com/blog/2023/10/several-critical-vulnerabilities-patched-in-ai-chatbot-plugin-for-wordpress/

Announcing Vulnerability Scanning in Wordfence CLI 2.0.1 “Voodoo Child”

Matt Barry
October 31, 2023

Note: If you’re a WordPress user, we recommend the Wordfence Security Plugin which provides a robust and complete set of security controls for WordPress websites. If you host WordPress servers and need high performance malware and vulnerability scanning on the command line, read on!

Our mission at Defiant Inc, makers of Wordfence, is to Secure the Web. We made the Web safer today with the release of completely free WordPress server vulnerability scanning at a massive scale for both personal and commercial use with the release of Wordfence CLI 2.0.1, codename “Voodoo Child”.

Wordfence CLI is a high performance Linux command line application that we launched at WordCamp US two months ago with robust malware scanning. Wordfence CLI is designed for technical server administrators working on the command line to host individual WordPress sites, or to provide WordPress hosting at scale. With today’s release of Wordfence CLI 2.0.1, Wordfence CLI will now scan your WordPress server, or your entire network, for WordPress vulnerabilities with a single command. This feature is in addition to the powerful malware scanning capability that Wordfence CLI already provides.

Wordfence CLI created a lot of excitement at Wordcamp US and the one resounding question that we were asked while there was “will it scan my website for vulnerabilities”. Today we are incredibly excited to introduce WordPress vulnerability scanning at scale in Wordfence CLI.

Vulnerability Scanning is Completely Free

Vulnerability scanning in Wordfence CLI is completely free for personal AND commercial use. Wordfence CLI uses our open vulnerability database which is also freely available for you to use, including our vulnerability APIs and vulnerability Web Hooks that will alert you in real-time when we add a new vulnerability. Wordfence CLI is open source, licensed under GPLv3.

Wordfence CLI 2.0.1 “Voodoo Child” also has simplified installation. You no longer have to come to our site to get an API key to run Wordfence CLI. You can simply launch CLI, agree to our terms, and start scanning. Wordfence CLI now fetches a free API key behind the scenes, which enables fetching our vulnerability data and our free malware signatures. We made this change to get you up and running fast!

Malware scanning in the free version of Wordfence CLI uses our Free Malware Signature Set and a paid version of Wordfence CLI is available which includes our expanded Commercial Signature Set.

Powering Hosts, Agencies, Developers and The WordPress Economy

The release of vulnerability and malware scanning at scale with Wordfence CLI enables the creation of a vibrant economy built around WordPress security. It is our hope that we will see businesses of all sizes, including individual developers, get familiar with the power of Wordfence CLI, and begin to provide new or add-on security services to their customers using Wordfence CLI. Here are a few examples:

  • Wordfence CLI can be used by site cleaners and incident responders to quickly and effectively find malware on an already infected website and scan for vulnerabilities to determine potential intrusion vectors, along with providing post-clean remediation.
  • Developers and operations teams can scan a single site, or an entire server for vulnerabilities to prevent a hack before it occurs.
  • Agencies can scan thousands of WordPress sites on a server with a single command to find vulnerabilities or locate malware.
  • Hosting Providers can use a dedicated server with many CPU cores to launch a multi-process malware scan that accesses their entire server fleet in read-only mode via the network to scan for malware at massive scale. It’s quite feasible to scale this up to 15 million websites or more for the mega-hosts out there.
  • Hosting Providers can perform fast vulnerability scans at scale across an entire network to alert and provide remediation options to customers.

All of the above can be scheduled as a regularly run cron job. Wordfence CLI accepts piped input and supports piping its output. You can configure Wordfence CLI to use as many CPU cores as you’d like when conducting a malware scan, so that you’re able to efficiently use your computational resources.

Powered by Wordfence Intelligence

The Wordfence CLI vulnerability scan is powered by the Wordfence Intelligence Vulnerability API feed, which is also 100% free for personal and commercial use. This feed contains over 12,250 unique vulnerability records that affect over 7,600 plugins and themes, and is constantly updated by our Threat Intelligence team. Typically, our team adds anywhere from 20 to 150 new vulnerabilities per week with a rough average of 82 per week, based on our data from the past 12 months.

We monitor various sources such as plugin change-logs, the CVE list, vulnerability databases, and other sources while also issuing CVE IDs to independent researchers and conducting our own in-house research. This is all to ensure we have the most up-to-date and accurate vulnerability information in our database that users can trust. All vulnerability records have extensive detailed information such as a concise title, description, CWE, CVSS Score, affected version ranges, patched version, and more that is usable as output with the Wordfence CLI vulnerability scanner. This should help make alerting and prioritization easier than ever for site owners and hosting providers.

It’s often hard to believe that such a high-quality vulnerability database is completely free to access via the Web and via API, but we keep looking for more ways to provide the data for free. We believe that vulnerabilities belong to the community because they are created by the security community, and that is why we’ve taken the same approach with vulnerability scanning in Wordfence CLI as we have with our Vulnerability Database. Vulnerability Scanning with Wordfence CLI, and use of our vulnerability database is completely free for commercial and personal use. So we would like to encourage hosting providers, enterprises, and site owners to implement this data and use Wordfence CLI to help make the Web more secure.

Running Your First Vulnerability Scan

If you do not already have CLI installed, follow these installation instructions to get up and running. If you have Wordfence CLI, follow these upgrading instructions to update your installation to the latest version.

To perform a basic vulnerability scan from the command line, simply invoke:

wordfence vuln-scan /path/to/scan

If you’d like to run a malware scan, use this command to get started:

wordfence malware-scan /path/to/scan

Malware scans are a bit more CPU intensive, so we provide the ability to use multiple CPU cores when conducting a malware scan. This is not available for vulnerability scans because they run very quickly. To use 8 CPU cores for a malware scan, and to see progress in real-time, run this command:

wordfence malware-scan /path/to/scan --progress --workers 8

To scan the /var/www/wordpress directory for vulnerabilities and write the results to /home/username/wordfence-cli-vuln-scan.csv.

wordfence vuln-scan --output-path /home/username/wordfence-cli-vuln-scan.csv /var/www/wordpress

If you have multiple WordPress installations you want to scan, you can supply a path to each as a command line argument:

wordfence vuln-scan --output-path /home/username/wordfence-cli-vuln-scan.csv /var/www/wordpress1 /var/www/wordpress2 /var/www/wordpress3

To run a daily scan of your WordPress installation, you define a cron entry like this one:

0 0 * * *  username /usr/local/bin/wordfence vuln-scan --output-path /home/username/wordfence-cli-vuln-scan.csv /var/www/wordpress

This example scans the directory /var/www/wordpress and writes the results to /home/username/wordfence-cli-vuln-scan.csv as the username user. This would be similar to how a scheduled scan works within the Wordfence plugin. The cronjob uses a lock file at /tmp/wordfence-cli-vuln-scan.lock to prevent duplicate vulnerability scans from running at the same time.

Go Forth And Secure The Web!

Wordfence CLI is one of those projects where the product roadmap writes itself because there is such an obvious need for a powerful tool like this in the WordPress server administration space. We’re in this for the long haul and will continue to invest heavily in Wordfence CLI, with your guidance. Once you’ve tried CLI, we’d love to hear your feedback in the comments.

Did you enjoy this post? Share it!

Source :
https://www.wordfence.com/blog/2023/10/wordpress-vulnerability-scanning/

Sonicwall License Manager discontinues support of TLS 1.0 and TLS 1.1 encryption protocols

Last Updated:10/09/2023

Overview

As part of our ongoing commitment to security, we are discontinuing support for the TLS 1.0 and TLS 1.1 encryption protocols in our license manager. This will cause GEN 5 and GEN 6 firewalls running older firmware to not communicate with the license manager leading to licensing issues. The firewall will not be able to validate its license or obtain necessary updates resulting in a licensing failure. The firewall will operate with reduced functionality or disable certain advanced features that require periodic license validation or updates.

Examples of services that will no longer function properly include:

  • Capture Advanced Threat Protection
  • Gateway Anti-Malware
  • Intrusion Prevention
  • Application Control

These licensed features will no longer work as expected. Critical security updates, threat intelligence updates, or other important updates may not be applied, potentially leaving the network vulnerable to emerging threats.

Discontinuing TLS 1.0 and TLS 1.1 is in line with industry best practices and aims to enhance the security of data transmission between your systems and our servers. To ensure uninterrupted service and data security, it’s important to upgrade your firewall to use TLS 1.2 or higher.

Product Impact

All GEN 5 and GEN 6 appliances running an older firmware will be affected when support for TLS 1.0 and TLS 1.1 encryption protocols are disabled.  

Remediation

SonicWall strongly recommends upgrading the firewall’s firmware to the latest version to maintain a secure connection with our license manager, as the latest firmware has the support of encryption protocols TLS 1.2 or later. 

Impacted PlatformsRecommended Firmware VersionMinimum Firmware Version
Gen 5SonicOS 5.9.2.13SonicOS 5.9.2.x
Gen 6SonicOS 6.5.4.12-101nSonicOS 6.2.9.x

Timeline

  • The support for TLS 1.0 and TLS 1.1 will be disabled on October 31, 2023. After this date, connections using these protocols will be rejected by our license manager.
  • Ensure that your systems are upgraded and ready to use TLS 1.2 or higher before the mentioned date to prevent disruption of service.

Product Migration Option

SonicWall has released the GEN 7 hardware which has been designed to take your experience to new heights, offering unparalleled performance, stunning visuals, and cutting-edge features that will revolutionize the way you interact with our products. As SonicWall customers you are eligible for the secure upgrade to the next generation of hardware. For details, please review our Secure upgrade program

Related information

 NOTE: If you do not have an active support contract or need assistance on GEN 7 upgrade options, please contact renewals@sonicwall.com. If you have any questions about the process, please do not hesitate to contact SonicWall support for assistance. Our support team is available to provide guidance and address any concerns you may have. 

Source :
https://www.sonicwall.com/support/product-notification/license-manager-discontinues-support-of-tls-1-0-and-tls-1-1-encryption-protocols/230907124013593/

Attacks on 5G Infrastructure From Users’ Devices

By: Salim S.I.
September 20, 2023
Read time: 8 min (2105 words)

Crafted packets from cellular devices such as mobile phones can exploit faulty state machines in the 5G core to attack cellular infrastructure. Smart devices that critical industries such as defense, utilities, and the medical sectors use for their daily operations depend on the speed, efficiency, and productivity brought by 5G. This entry describes CVE-2021-45462 as a potential use case to deploy a denial-of-service (DoS) attack to private 5G networks.

5G unlocks unprecedented applications previously unreachable with conventional wireless connectivity to help enterprises accelerate digital transformation, reduce operational costs, and maximize productivity for the best return on investments. To achieve its goals, 5G relies on key service categories: massive machine-type communications (mMTC), enhanced mobile broadband (eMBB), and ultra-reliable low-latency communication (uRLLC).

With the growing spectrum for commercial use, usage and popularization of private 5G networks are on the rise. The manufacturing, defense, ports, energy, logistics, and mining industries are just some of the earliest adopters of these private networks, especially for companies rapidly leaning on the internet of things (IoT) for digitizing production systems and supply chains. Unlike public grids, the cellular infrastructure equipment in private 5G might be owned and operated by the user-enterprise themselves, system integrators, or by carriers. However, given the growing study and exploration of the use of 5G for the development of various technologies, cybercriminals are also looking into exploiting the threats and risks that can be used to intrude into the systems and networks of both users and organizations via this new communication standard. This entry explores how normal user devices can be abused in relation to 5G’s network infrastructure and use cases.

5G topology

In an end-to-end 5G cellular system, user equipment (aka UE, such as mobile phones and internet-of-things [IoT] devices), connect to a base station via radio waves. The base station is connected to the 5G core through a wired IP network.

Functionally, the 5G core can be split into two: the control plane and the user plane. In the network, the control plane carries the signals and facilitates the traffic based on how it is exchanged from one endpoint to another. Meanwhile, the user plane functions to connect and process the user data that comes over the radio area network (RAN).

The base station sends control signals related to device attachment and establishes the connection to the control plane via NGAP (Next-Generation Application Protocol). The user traffic from devices is sent to the user plane using GTP-U (GPRS tunneling protocol user plane). From the user plane, the data traffic is routed to the external network. 

fig1-attacks-on-5g-infrastructure-from-users-devices
Figure 1. The basic 5G network infrastructure

The UE subnet and infrastructure network are separate and isolated from each other; user equipment is not allowed to access infrastructure components. This isolation helps protect the 5G core from CT (Cellular Technology) protocol attacks generated from users’ equipment.

Is there a way to get past this isolation and attack the 5G core? The next sections elaborate on the how cybercriminals could abuse components of the 5G infrastructure, particularly the GTP-U.

GTP-U

GTP-U is a tunneling protocol that exists between the base station and 5G user plane using port 2152. The following is the structure of a user data packet encapsulated in GTP-U.

fig2-attacks-on-5g-infrastructure-from-users-devices
Figure 2. GTP-U data packet

A GTP-U tunnel packet is created by attaching a header to the original data packet. The added header consists of a UDP (User Datagram Protocol) transport header plus a GTP-U specific header. The GTP-U header consists of the following fields:

  • Flags: This contains the version and other information (such as an indication of whether optional header fields are present, among others).
  • Message type: For GTP-U packet carrying user data, the message type is 0xFF.
  • Length: This is the length in bytes of everything that comes after the Tunnel Endpoint Identifier (TEID) field.
  • TEID: Unique value for a tunnel that maps the tunnel to user devices

The GTP-U header is added by the GTP-U nodes (the base station and User Plane Function or UPF). However, the user cannot see the header on the user interface of the device. Therefore, user devices cannot manipulate the header fields.

Although GTP-U is a standard tunneling technique, its use is mostly restricted to CT environments between the base station and the UPF or between UPFs. Assuming the best scenario, the backhaul between the base station and the UPF is encrypted, protected by a firewall, and closed to outside access. Here is a breakdown of the ideal scenario: GSMA recommends IP security (IPsec) between the base station and the UPF. In such a scenario, packets going to the GTP-U nodes come from authorized devices only. If these devices follow specifications and implement them well, none of them will send anomalous packets. Besides, robust systems are expected to have strong sanity checks to handle received anomalies, especially obvious ones such as invalid lengths, types, and extensions, among others.

In reality, however, the scenario could often be different and would require a different analysis altogether. Operators are reluctant to deploy IPsec on the N3 interface because it is CPU-intensive and reduces the throughput of user traffic. Also, since the user data is perceived to be protected at the application layer (with additional protocols such as TLS or Transport Layer Security), some consider IP security redundant. One might think that for as long as the base station and packet-core conform to the specific, there will be no anomalies. Besides, one might also think that for all robust systems require sanity checks to catch any obvious anomalies. However, previous studies have shown that many N3 nodes (such as UPF) around the world, although they should not be, are exposed to the internet. This is shown in the following sections.

fig3-attacks-on-5g-infrastructure-from-users-devices

Figure 3. Exposed UPF interfaces due to misconfigurations or lack of firewalls; screenshot taken from Shodan and used in a previously published research

We discuss two concepts that can exploit the GTP-U using CVE-2021-45462. In Open5GS, a C-language open-source implementation for 5G Core and Evolved Packet Core (EPC), sending a zero-length, type=255 GTP-U packet from the user device resulted in a denial of service (DoS) of the UPF. This is CVE-2021-45462, a security gap in the packet core that can crash the UPF (in 5G) or Serving Gateway User Plane Function (SGW-U in 4G/LTE) via an anomalous GTP-U packet crafted from the UE and by sending this anomalous GTP-U packet in the GTP-U. Given that the exploit affects a critical component of the infrastructure and cannot be resolved as easily, the vulnerability has received a Medium to High severity rating.

GTP-U nodes: Base station and UPF

GTP-U nodes are endpoints that encapsulate and decapsulate GTP-U packets. The base station is the GTP-U node on the user device side. As the base station receives user data from the UE, it converts the data to IP packets and encapsulates it in the GTP-U tunnel.

The UPF is the GTP-U node on the 5G core (5GC) side. When it receives a GTP-U packet from the base station, the UPF decapsulates the outer GTP-U header and takes out the inner packet. The UPF looks up the destination IP address in a routing table (also maintained by the UPF) without checking the content of the inner packet, after which the packet is sent on its way.

GTP-U in GTP-U

What if a user device crafts an anomalous GTP-U packet and sends it to a packet core?

fig4-attacks-on-5g-infrastructure-from-users-devices
Figure 4. A specially crafted anomalous GTP-U packet
fig5-attacks-on-5g-infrastructure-from-users-devices
Figure 5. Sending an anomalous GTP-U packet from the user device

As intended, the base station will tunnel this packet inside its GTP-U tunnel and send to the UPF. This results in a GTP-U in the GTP-U packet arriving at the UPF. There are now two GTP-U packets in the UPF: The outer GTP-U packet header is created by the base station to encapsulate the data packet from the user device. This outer GTP-U packet has 0xFF as its message type and a length of 44. This header is normal. The inner GTP-U header is crafted and sent by the user device as a data packet. Like the outer one, this inner GTP-U has 0xFF as message type, but a length of 0 is not normal.

The source IP address of the inner packet belongs to the user device, while the source IP address of the outer packet belongs to the base station. Both inner and outer packets have the same destination IP address: that of the UPF.

The UPF decapsulates the outer GTP-U and passes the functional checks. The inner GTP-U packet’s destination is again the same UPF. What happens next is implementation-specific:

  • Some implementations maintain a state machine for packet traversal. Improper implementation of the state machine might result in processing this inner GTP-U packet. This packet might have passed the checks phase already since it shares the same packet-context with the outer packet. This leads to having an anomalous packet inside the system, past sanity checks.
  • Since the inner packet’s destination is the IP address of UPF itself, the packet might get sent to the UPF. In this case, the packet is likely to hit the functional checks and therefore becomes less problematic than the previous case.

Attack vector

Some 5G core vendors leverage Open5GS code. For example, NextEPC (4G system, rebranded as Open5GS in 2019 to add 5G, with remaining products from the old brand) has an enterprise offer for LTE/5G, which draws from Open5GS’ code. No attacks or indications of threats in the wild have been observed, but our tests indicate potential risks using the identified scenarios.

The importance of the attack is in the attack vector: the cellular infrastructure attacks from the UE. The exploit only requires a mobile phone (or a computer connected via a cellular dongle) and a few lines of Python code to abuse the opening and mount this class of attack. The GTP-U in GTP-U attacks is a well-known technique, and backhaul IP security and encryption do not prevent this attack. In fact, these security measures might hinder the firewall from inspecting the content.

Remediation and insights

Critical industries such as the medical and utility sectors are just some of the early adopters of private 5G systems, and its breadth and depth of popular use are only expected to grow further. Reliability for continuous, uninterrupted operations is critical for these industries as there are lives and real-world implications at stake. The foundational function of these sectors are the reason that they choose to use a private 5G system over Wi-Fi. It is imperative that private 5G systems offer unfailing connectivity as a successful attack on any 5G infrastructure could bring the entire network down.

In this entry, the abuse of CVE-2021-45462 can result in a DoS attack. The root cause of CVE-2021-45462 (and most GTP-U-in-GTP-U attacks) is the improper error checking and error handling in the packet core. While GTP-U-in-GTP-U itself is harmless, the proper fix for the gap has to come from the packet-core vendor, and infrastructure admins must use the latest versions of the software.

A GTP-U-in-GTP-U attack can also be used to leak sensitive information such as the IP addresses of infrastructure nodes. GTP-U peers should therefore be prepared to handle GTP-U-in-GTP-U packets. In CT environments, they should use an intrusion prevention system (IPS) or firewalls that can understand CT protocols. Since GTP-U is not normal user traffic, especially in private 5G, security teams can prioritize and drop GTP-U-in-GTP-U traffic.

As a general rule, the registration and use of SIM cards must be strictly regulated and managed. An attacker with a stolen SIM card could insert it to an attacker’s device to connect to a network for malicious deployments. Moreover, the responsibility of security might be ambiguous to some in a shared operating model, such as end-devices and the edge of the infrastructure chain owned by the enterprise. Meanwhile, the cellular infrastructure is owned by the integrator or carrier. This presents a hard task for security operation centers (SOCs) to bring relevant information together from different domains and solutions.

In addition, due to the downtime and tests required, updating critical infrastructure software regularly to keep up with vendor’s patches is not easy, nor will it ever be. Virtual patching with IPS or layered firewalls is thus strongly recommended. Fortunately, GTP-in-GTP is rarely used in real-world applications, so it might be safe to completely block all GTP-in-GTP traffic. We recommend using layered security solutions that combine IT and communications technology (CT) security and visibility. Implementing zero-trust solutions, such as Trend Micro™ Mobile Network Security, powered by CTOne, adds another security layer for enterprises and critical industries to prevent the unauthorized use of their respective private networks for a continuous and undisrupted industrial ecosystem, and by ensuring that the SIM is used only from an authorized device. Mobile Network Security also brings CT and IT security into a unified visibility and management console.

Source :
https://www.trendmicro.com/it_it/research/23/i/attacks-on-5g-infrastructure-from-users-devices.html

Electric Power System Cybersecurity Vulnerabilities

By: Mayumi Nishimura
October 06, 2023
Read time: 4 min (1096 words)

Digitalization has changed the business environment of the electric power industry, exposing it to various threats. This webinar will help you uncover previously unnoticed threats and develop countermeasures and solutions.

The Electric utility industry is constantly exposed to various threats, including physical threats and sophisticated national-level cyber attacks. It has been an industry that has focused on security measures. But in the last few years, power system changes have occurred. As OT becomes more networked and connected to IT, the number of interfaces between IT and OT increases, and various cyber threats that have not surfaced until now have emerged.

Trend Micro held a webinar to discuss these changes in the situation, what strategies to protect your company’s assets from the latest cyber threats, and the challenges and solutions in implementing these strategies.

This blog will provide highlights from the webinar and share common challenges in the power industry that emerged from the survey. We hope that this will be helpful to cybersecurity directors in the Electric utility industry who recognize the need for consistent security measures for IT and OT but are faced with challenges in implementing them.

Click here to watch the On-demand Webinar

Vulnerabilities in Electric Power Systems

In the webinar, we introduced examples of threats from “Critical Infrastructures Exposed and at Risk: Energy and Water Industries” conducted by Trend Micro. The main objective of this study was to demonstrate how easy it is to discover and exploit OT assets in the water and energy sector using basic open-source intelligence (OSINT) techniques. As a result of the investigation, it was possible to access the HMI remotely, view the database containing customer data, and control the start and stop of the turbine.

Figure1. Interoperability and Connected Resources
Figure1. Interoperability and Connected Resources

Cyberattacks Against Electric Power

Explained the current cyberattacks on electric power companies. Figure 2 shows the attack surface (attack x digital assets), attack flow, and ultimately, possible damage in IT and OT of energy systems.

An example of an attack surface in an IT network is an office PC that exploits VPN vulnerabilities. If attackers infiltrate the monitoring system through a VPN, they can seize privileges and gain unauthorized access to OT assets such as the HMI. There is also the possibility of ransomware being installed.

A typical attack surface in an OT network is a PC for maintenance. If this terminal is infected with a virus and a maintenance person connects to the OT network, the virus may infect the OT network and cause problems such as stopping the operation of the OT equipment.

To protect these interconnected systems, it is necessary to review cybersecurity strategies across IT, OT, and different technology domains.

Figure 2. Attack Surface Includes Both IT and OT
Figure 2. Attack Surface Includes Both IT and OT

Solutions

We have organized the issues from People, Process, and Technology perspectives when reviewing security strategies across different technology domains such as IT and OT.

One example of people-related issues is labor shortages and skills gaps. The reason for the lack of skills is that IT security personnel are not familiar with the operations side, and vice versa. In the webinar, we introduced three ideas for approaches to solving these people’s problems.

The first is improving employee security awareness and training. From management to employees, we must recognize the need for security and work together. Second, to understand the work of the IT and OT departments, we recommend job rotation and workshops for mutual understanding. The third is documentation and automation of incident response. Be careful not to aim for automation. First, it is important to identify unnecessary tasks and reduce work. After that, we recommend automating the necessary tasks. We also provide examples of solutions for Process and Technology issues in the webinar.

 Figure 3. Need Consistent Cybersecurity across IT and OT
Figure 3. Need Consistent Cybersecurity across IT and OT

Finally, we introduced Unified Kill Chain as an effective approach. It extends and combines existing models such as Lockheed Martin’s Cyber KillChain® and MITER’s ATT&CK™ to show an attacker’s steps from initiation to completion of a cyber attack. The attacker will not be able to reach their goal unless all of these steps are completed successfully, but the defender will need to break this chain at some point, which will serve as a reference for the defender’s strategy. Even when attacks cross IT and OT, it is possible to use this approach as a reference to evaluate the expected attacks and the current security situation and take appropriate security measures in response.

Figure 4. Unified Kill Chain
Figure 4. Unified Kill Chain

The Webinar’s notes

To understand the situation and thoughts of security leaders in the Electric utility industry, we have included some of the survey results regarding this webinar. The webinar, held on June 29th, was attended in real time by nearly 100 people working in the energy sector and engaged in cybersecurity-related work.

When asked what information they found most helpful, the majority of survey respondents selected “consistent cybersecurity issues and solutions across IT and OT,” indicating that “consistent cybersecurity issues and solutions across IT and OT.” I am glad that I was able to help those who feel that there are issues in implementing countermeasures.

 Chart 1. Question What was the most useful information for you from today's seminar? (N=28)
Chart 1. Question What was the most useful information for you from today’s seminar? (N=28)

Also, over 90% of respondents answered “Agree” when asked if they needed consistent cybersecurity across IT and OT. Among those who chose “Agree”, 39% answered that they have already started some kind of action, indicating the consistent importance of cybersecurity in IT and OT.

 Chart 2. Do you agree with Need Consistent Cybersecurity across IT and OT? (N=28)
Chart 2. Do you agree with Need Consistent Cybersecurity across IT and OT? (N=28)

Lastly, I would like to share the results of a question asked during webinar registration about what issues people in this industry think about OT security. Number one was siloed risk and threat visibility, and number two was legacy system support. The tie for 3rd place was due to a lack of preparation for attacks across different NWs and lack of staff personnel/skills. There is a strong sense of challenges in the visualization of risks and threats, other organizational efforts, and technical countermeasures.

 Chart 3. Please select all of the challenges you face in thinking about OT cybersecurity. (N=55 multiple choice)
Chart 3. Please select all of the challenges you face in thinking about OT cybersecurity. (N=55 multiple choice)

Resources

The above is a small excerpt from the webinar. We recommend watching the full webinar video below if you are interested in the power industry’s future cybersecurity strategy.

Click here to watch the On-demand Webinar

In addition, Trend Micro reports introduced in the webinar and other reference links related to the energy industry can be accessed below.

Tech Paper
ICS/OT Security for the Electric Utility
Critical Infrastructures Exposed and at Risk: Energy and Water Industries

VIDEO
Electric utilities need to know cross domain attack

Blog
Electricity/Energy Cybersecurity: Trends & Survey Response

Source :
https://www.trendmicro.com/it_it/research/23/j/electric-power-system-cybersecurity-vulnerabilities.html

(Non-US) D-Link Corporation Provides Details about an Information Disclosure Security Incident

On October 2, 2023, (Non-US) D-Link Corporation was notified of a claim of data breach from an online forum by an unauthorized third party, indicating the theft of certain data. Upon becoming aware of this claim, the company promptly initiated a comprehensive investigation into the situation and immediately took precautionary measures. Currently, there is no impact on any of the D-Link operations.

Through internal and external investigations by experts from Trend Mirco, the company identified numerous inaccuracies and exaggerations in the claim that were intentionally misleading and did not align with facts. The data was confirmed not from the cloud but likely originated from an old D-View 6 system, which reached its end of life as early as 2015. The data was used for registration purposes back then. So far, no evidence suggests the archaic data contained any user IDs or financial information. However, some low-sensitivity and semi-public information, such as contact names or office email addresses, were indicated.

The incident is believed to have been triggered by an employee unintentionally falling victim to a phishing attack, resulting in unauthorized access to long-unused and outdated data. Despite the company’s systems meeting the information security standards of that era, it profoundly regrets this occurrence. D-Link is fully dedicated to addressing this incident and implementing measures to enhance the security of its business operations. After the incident, the company promptly terminated the services of the test lab and conducted a thorough review of the access control. Further steps will continue to be taken as necessary to safeguard the rights of all users in the future.

D-Link believes current customers are unlikely to be affected by this incident. However, please get in touch with local customer service for more information if anyone has concerns. D-Link takes information security seriously and has a dedicated task force and product management team on call to address evolving security issues and implement appropriate security measures. D-Link shall always endeavor to provide the best services to its customers.

l   What happened?

On October 1, 2023, someone posted an article in an online forum and claimed that the D-View system, a software monitoring tool for local networking devices and network administrators, was breached, and millions of users’ data were stolen.

l   Was there credibility in this claim?

There were numerous inaccuracies and exaggerations in this claim that did not align with the facts, including but not limited to:

–       The amount of data: Believed to be approximately 700 records

–       Active user: Presumably none

–       Registration information for some of the data

–       The latest login timestamps for some of the data

We have reasons to believe the latest login timestamps were intentionally tampered with to make the archaic data look recent.

l   When did the company take the necessary actions?

We initiated a comprehensive investigation into the claim and immediately took preventive measures on the same day we were informed.

l   What measures has the company currently taken?

We immediately shut down presumably relevant servers after being informed of this incident. We blocked user accounts on the live systems, retaining only two maintenance accounts to investigate any signs of intrusion further. Simultaneously, we conducted multiple examinations to determine if any leaked backup data remained in the test lab environment and disconnected the test lab from the company’s internal network.

Subsequently, we will audit outdated user and backup data and proceed with their deletion to prevent a recurrence of similar incidents.

l   What is the impact of this incident?

The post claimed to have millions of user data. Based on the investigations, however, it only contained approximately 700 outdated and fragmented records that had been inactive for at least seven years. These records originated from a product registration system that reached its end of life in 2015. Furthermore, the majority of the data consisted of low-sensitivity and semi-public information.

Judging by the facts, we have good reasons to believe that most of D-Link’s current customers are unlikely to be affected by this incident.

l   What was the cause of this incident?

The incident may have been caused by an employee falling victim to a phishing attack, resulting in unauthorized access to the long-unused and outdated data.

l   Has there been any significant vulnerability in the company’s information security?

D-Link’s information security systems adhere to the most stringent contemporary standards to ensure user rights.

Global concepts and technologies related to information security have made significant progress in recent years, and we have kept pace with these advancements, continually enhancing the depth and breadth of our information security measures.

The D-View 6 system identified in this investigation had reached its end of life in 2015. Our current product offering is D-View 8, which differs significantly from its predecessor two generations before regarding the rigor of information security measures and the simplification of registration data.

l   What is the suggestion for users?

We will never request users to provide passwords or personal financial information (such as bank or credit card details) through any means, including phone calls, text messages, or emails. If people receive such calls or letters, please get in touch with local authorities immediately to protect your rights.

If anyone has concerns, we recommend that users consider changing shared passwords on other websites or take necessary precautions.

Source :
https://supportannouncement.us.dlink.com/announcement/publication.aspx?name=SAP10359

Enable Single Sign-On (SSO) Authentication on RDS Windows Server

May 23, 2023

Single Sign-On (SSO) allows an authenticated (signed-on) user to access other domain services without having to re-authenticate (re-entering a password) and without using saved credentials (including RDP). SSO can be used when connecting to Remote Desktop Services (terminal) servers. This prevents a user logged on to a domain computer from entering their account name and password multiple times in the RDP client window when connecting to different RDS hosts or running published RemoteApps.

This article shows how to configure transparent SSO (Single Sign-On) for users of RDS servers running Windows Server 2022/2019/2016.

Contents:

System requirements:

  • The Connection Broker server and all RDS hosts must be running Windows Server 2012 or newer;
  • You can use Windows 11,10,8.1 with Pro/Enterprise editions as client workstations.
  • SSO works only in the domain environment: Active Directory user accounts must be used, the RDS servers and user’s workstations must be joined to the same AD domain;
  • The RDP 8.0 or later must be used on the RDP clients;
  • SSO works only with password authentication (smart cards are not supported);
  • The RDP Security Layer in the connection settings should be set to Negotiate or SSL (TLS 1.0), and the encryption mode to High or FIPS Compliant.

The single sign-on setup process consists of the following steps:

  • You need to issue and assign an SSL certificate on RD Gateway, RD Web, and RD Connection Broker servers;
  • Web SSO has to be enabled on the RDWeb server;
  • Configure credential delegation group policy;
  • Add the RDS certificate thumbprint to the trusted .rdp publishers using GPO.

Enable SSO Authentication on RDS Host with Windows Server 2022/2019/2016

First, you need to issue and assign an SSL certificate to your RDS deployment. The certificate’s Enhanced Key Usage (EKU) must contain the Server Authentication identifier.  The procedure for obtaining an SSL certificate for RDS deployment is not covered. This is outside the scope of this article (you can generate a self-signed SSL certificate yourself, but you will have to deploy it to the trusted cert on all clients using the group policy).

Learn more about using SST/TLS certificates to secure RDP connections.

The certificate is assigned in the Certificates section of RDS Deployment properties.

RDS certificates

Then, on all servers with the RD Web Access role, enable Windows Authentication for the IIS RDWeb directory and disable Anonymous Authentication.

IIS Windows Authentication

After you have saved the changes, restart the IIS:

iisreset /noforce

If Remote Desktop Gateway is used, ensure that it is not used to connect internal clients (the Bypass RD Gateway server for local address option should be checked).

RD Gateway deployment

Now you need to obtain the SSL certificate thumbprint of the RD Connection Broker and add it to the list of trusted RDP publishers. For that, run the following PowerShell command on the RDS Connection Broker host:

Get-Childitem CERT:\LocalMachine\My

Get-Childitem CERT:\LocalMachine\My

Copy the value of the certificate’s thumbprint and add it to the Specify SHA1 thumbprints of certificates representing RDP publishers policy (Computer Configuration -> Administrative Templates -> Windows Desktop Services -> Remote Desktop Connection Client).

Specify SHA1 thumbprints of certificates representing RDP publishers

Configure Remote Desktop Single Sign-on on Windows Clients

The next step is to configure the credential delegation policy for user computers.

  1. Open the domain Group Policy Management Console (gpmc.msc);
  2. Create a new domain GPO and link it to an OU with users (computers) that need to be allowed to use SSO to access the RDS server;
  3. Enable the policy Allow delegation defaults credential under Computer Configuration -> Administrative Templates -> System -> Credential Delegation
  4. Add the names of RDS hosts to which the client can automatically send user credentials to perform SSO authentication. Use the following format for RDS hosts: TERMSRV/rd.contoso.com (all TERMSRV characters must be in upper case). If you need to allow credentials to be sent to all terminals in the domain (less secure), you can use this construction: TERMSRV/*.contoso.com .

The above policy will work if you are using Kerberos authentication. If the NTLM authentication protocol is not disabled in the domain, you must configure the Allow delegation default credentials with NTLM-only server authentication policy in the same way.

Then, to prevent a window warning that the remote application publisher is untrusted, add the address of the server running the RD Connection Broker role to the trusted zone on the client computers using the policy “Site to Zone Assignment List” (similar to the article How to disable Open File security warning on Windows 10):

  1. Go to the GPO section User/Computer Configuration -> Administrative Tools -> Windows Components -> Internet Explorer -> Internet Control Panel -> Security Page.
  2. Enable the policy Site to Zone Assignment List
  3. Specify the FQDN of the RD Connection Broker hostname and set Zone 2 (Trusted sites).
Site to Zone assignment : trusted zone

Next, you need to enable the Logon options policy under User/Computer Configuration -> Administrative Tools -> Windows Components -> Internet Explorer -> Internet Control Panel -> Security -> Trusted Sites Zone. Select ‘Automatic logon with current username and password’ from the dropdown list.

Then navigate to the Computer Configuration -> Policies -> Administrative Templates ->Windows Components ->Remote Desktop Services ->Remote Desktop Connection Client and disable the policy Prompt for credentials on the client computer.

GPO: dont prompt for credentials on the client computer

After updating the Group Policy settings on the client, open the mstsc.exe (Remote Desktop Connection) client and specify the FQDN of the RDS host. The UserName field automatically displays your name in the format user@domain.com:

Your Windows logon credentials will be used to connect.

Now, when you start a RemoteApp or connect directly to a Remote Desktop Services host, you will not be prompted for your password.

Your Windows logon credentials will be used to connect.

To use the RD Gateway with SSO, enable the policy Set RD Gateway Authentication Method User Configuration -> Policies -> Administrative Templates -> Windows Components -> Remote Desktop Services -> RD Gateway) and set its value to Use Locally Logged-On Credentials.

Active X component Microsoft Remote Desktop Services Web Access Control (MsRdpClientShell)

To use Web SSO on RD Web Access, please note that it is recommended to use Internet Explorer with enabled Active X component named Microsoft Remote Desktop Services Web Access Control (MsRdpClientShell, MsRdpWebAccess.dll).

On modern versions of Windows, Internet Explorer is disabled by default and you will need to use Microsoft Edge instead. You must open this URL in Microsoft Edge in compatibility mode to use RD Web with SSO (Edge won’t run Active-X components without compatibility mode).

In order for all the client computers to be able to open RDWeb in compatibility mode, you will need to install the MS Edge Administrative Templates GPO and configure policy settings under Computer Configuration -> Administrative Templates -> Microsoft:

Possible errors:

  • In our case, RDP SSO stopped working on an RDS farm with User Profile Disks profiles after installing security updates KB5018410 (Windows 10) or KB5018418 (Windows 11) in Autumn 2022. To solve the problem, edit the *.rdp connection file and change the following line:use redirection server name:i:1
  • If you have not added the SSL thumbprint of the RDCB certificate to the Trusted RDP Publishers, a warning will appear when trying to connect:Do you trust the publisher of this RemoteApp program?
  • An authentication error has occured (Code: 0x607)Check that you have assigned the correct certificate to the RDS roles.

    Source :
    https://woshub.com/sso-single-sign-on-authentication-on-rds/

Remote Desktop Licensing Mode is not Configured

August 24, 2023

When configuring a new RDS farm node on Windows Server 2022/2019/2016/2012 R2, you may see the following tray warning pop-up:Licensing mode for the Remote Desktop Session Host is not configured. Remote Desktop Service will stop working in 104 days. On the RD Connection Broker server, use Server Manager to specify the Remote Desktop licensing mode and the license server.

WinServer 2012 R2 - Licensing mode for the RDSH is not configured

At the same time, there will be warnings with an Event ID 18 in the Event Viewer:Log Name: System Source: Microsoft-Windows-TerminalServices-Licensing Level: Warning Description: The Remote Desktop license server UK-RDS01 has not been activated and therefore will only issue temporary licenses. To issue permanent licenses, the Remote Desktop license server must be activated.

This problem will also occur if there are no Remote Desktop Licensing (RDS) servers available on your network to provide a license.

These errors are an indication that your RDS is running in the License grace period mode. You can use Remote Desktop Session Host for 120 days without activating RDS licenses during the grace period. When the grace period expires, users won’t be able to connect to RDSH with an error:Remote Desktop Services will stop working because this computer is past grace period and has not contacted at least a valid Windows Server 2012 license server. Click this message to open RD Session Host Server Configuration to use Licensing Diagnosis.

The number of days remaining before the RDS grace period expires can be displayed using the command:

wmic /namespace:\\root\CIMV2\TerminalServices PATH Win32_TerminalServiceSetting WHERE (__CLASS !="") CALL GetGracePeriodDays

Check the Licensing Settings on the Remote Desktop Server

To diagnose the problem, run the Remote Desktop Licensing Diagnoser tool (lsdiag.msc, or  Administrative Tools -> Remote Desktop Services -> RD Licensing Diagnoser). The tool should display the following error:Licenses are not available for the Remote Desktop Session Host server, and RD Licensing Diagnoser has identified licensing problem for the RD Session Host server. Licensing mode for the Remote Desktop Session Host is not configured. Number of licenses available for clients: 0 Set the licensing mode on the Remote Desktop Session Host server to either Per User or Per Device. Use RD Licensing Manager to install the corresponding licenses on the license server The Remote Desktop Session Host server is within its grace period, but the Session Host server has not been configured with any license server.

As you can see, there are no licenses available to Clients on the RDS Host because the Licensing Mode is not set.

Remote Desktop Licensing Diagnoser: Licensing mode for the Remote Desktop Session Host is not configured

The most likely problem is that the administrator has not set the RDS Licensing Server and/or the licensing mode. This should be done even if the license type was already specified when the RDS Host was deployed (Configure the deployment -> RD Licensing -> Select the Remote Desktop licensing mode).

set rd licensing mode during deployment

Configuring the RDS Licensing Mode on Windows Server

There are several ways to configure host RDS licensing settings:

  • Using PowerShell
  • Via the Windows Registry
  • Using the Group Policy (preferred)

Set the Remote Desktop licensing mode via GPO

To configure the license server settings on the RDS host, you must use the domain GPO management console (gpmc.msc) or the local Group Policy editor (gpedit.msc).

On a standalone RDSH host (in a domain and workgroup), it’s easiest to use local policy. Go to Computer Configuration -> Administrative Templates -> Windows Components -> Remote Desktop Services -> Remote Desktop Session Host -> Licensing.

We need two GPO options:

  • Use the specified Remote Desktop license servers – enable the policy and specify the RDS license server addresses. If the RD license server is running on the same host, type 127.0.0.1You can specify the addresses of multiple hosts with the RDS Licensing role separated by commas; Policy - Use the specified Remote Desktop license servers
  • Set the Remote Desktop licensing mode – select the licensing mode. In our case, it is Per UserPOlicy - Set the Remote Desktop licensing mode - Per User

If you have deployed an RDS host without an AD domain (in a workgroup), you can only use Per Device RDS CALs. Otherwise, a message is displayed when a user logs in to the RDSH server in the workgroup:Remote Desktop Issue.There is a problem with your Remote Desktop license, and your session will be disconnected in 60 minutes. Contact your system administrator to fix the problem.

Set RDS licensing mode from the PowerShell prompt

Open a PowerShell console and check that the RDS licensing server address is configured on your RDSH:

$obj = gwmi -namespace "Root/CIMV2/TerminalServices" Win32_TerminalServiceSetting
$obj.GetSpecifiedLicenseServerList()

GetSpecifiedLicenseServerList

NoteIn this case, the data that the Get-RDLicenseConfiguration cmdlet returns may be completely different and incorrect.

If the RDS license server is not configured, you can set it using the command:

$obj.SetSpecifiedLicenseServerList("uk-rdslic1.woshub.com")

You can also set the licensing mode (4 — Per User, or 2 — Per Device):

$obj.ChangeMode(4)

powershell: change RDS licensing mode

You can use the Get-ADObject cmdlet from the ActiveDirectory PowerShell module to list servers with the RDS Licensing role in an Active Directory domain:

Get-ADObject -Filter {objectClass -eq 'serviceConnectionPoint' -and Name -eq 'TermServLicensing'}

You can also configure the licensing parameters of the RDS host via a host with the RD Connection Broker role:

Set-RDLicenseConfiguration -LicenseServer @("uk-rdslic1.woshub.com","uk-rdslic2.woshub.com") -Mode PerDevice -ConnectionBroker "uk-rdcb1.woshub.com"

Configuring RDS licensing settings via the registry

In the HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\RCM\Licensing Core key, you will need to change the DWORD value of the parameter LicensingMode from a value of (license mode not set):

  • 2 – if Per Device RDS licensing mode is used;
  • 4 – if Per User licensing is used.
rds licensing mode - LicensingMode registry parameter

You can change the registry setting manually by using regedit.exe or following PowerShell commands that allow you to change the values of registry items:

# Specify the RDS licensing mode: 2 - Per Device CAL, 4 - Per User CAL
$RDSCALMode = 2
# RDS Licensing hostname
$RDSlicServer = "uk-rdslic1.woshub.com"
# Set the server name and licensing mode in the registry
New-Item "HKLM:\SYSTEM\CurrentControlSet\Services\TermService\Parameters\LicenseServers"
New-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\TermService\Parameters\LicenseServers" -Name SpecifiedLicenseServers -Value $RDSlicServer -PropertyType "MultiString"
Set-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Control\Terminal Server\RCM\Licensing Core\" -Name "LicensingMode" -Value $RDSCALMode

Once you have made the changes, restart your RDSH server. Then open the RDS Licensing Diagnoser console.  If you have configured everything correctly, you should see the number of licenses available for clients and the licensing mode you have set (Licensing mode: Per device).RD Licensing Diagnoser did not identify any licensing problems for the Remote Desktop Session Host.

number of available rds licenses

If a firewall is used on your network, you must open the following ports from the RDSH host to the RDS licensing server – TCP:135, UDP:137, UDP:138, TCP:139, TCP:445, TCP:49152–65535 (RPC range).

You can use the Test-NetConnection cmdlet to check for open and closed ports. If ports are closed in the local Windows Defender firewall, you can use PowerShell or GPO to manage firewall rules.

Also note that if the RD Licensing Server has, for example, Windows Server 2016 OS and CALs for RDS 2016 installed, you will not be able to install RDS CAL licenses for Windows Server 2019 or 2022. The 'Remote Desktop Licensing mode is not configured'error persists even when you specify the correct license type and RDS license server name. Older versions of Windows Server simply don’t support RDS CALs for newer versions of WS.

In this case, the following message will be displayed in the RD License Diagnoser window:The Remote Desktop Session Host is in Per User licensing mode and no Redirector Mode, but license server does not have any installed license with the following attributes: Product version: Windows Server 2016 Use RD Licensing Manager to install the appropriate licenses on the license server.

The Remote Desktop Session Host is in Per User licensing mode and no Redirector Mode, but license server does not have any installed appropriate license with the

You must first upgrade the version of Windows Server on the license server or deploy a new RD License host. A newer version of Windows Server (for example, WS 2022) has support for RDS CALs for all previous versions of Windows Server.

Note. Licensing report not generated if RDS host is in a workgroup. Although the terminal RDS licenses themselves are correctly issued to clients/devices. You will need to keep track of the number of RDS CALs you have left. You must monitor the number of RDS CALs remaining.

Source :
https://woshub.com/licensing-mode-rds-host-not-configured/

HTTP/2 Rapid Reset: deconstructing the record-breaking attack

10/10/2023

Lucas Pardue
Julien Desgats

Starting on Aug 25, 2023, we started to notice some unusually big HTTP attacks hitting many of our customers. These attacks were detected and mitigated by our automated DDoS system. It was not long however, before they started to reach record breaking sizes — and eventually peaked just above 201 million requests per second. This was nearly 3x bigger than our previous biggest attack on record.Under attack or need additional protection? Click here to get help.

Concerning is the fact that the attacker was able to generate such an attack with a botnet of merely 20,000 machines. There are botnets today that are made up of hundreds of thousands or millions of machines. Given that the entire web typically sees only between 1–3 billion requests per second, it’s not inconceivable that using this method could focus an entire web’s worth of requests on a small number of targets.

Detecting and Mitigating

This was a novel attack vector at an unprecedented scale, but Cloudflare’s existing protections were largely able to absorb the brunt of the attacks. While initially we saw some impact to customer traffic — affecting roughly 1% of requests during the initial wave of attacks — today we’ve been able to refine our mitigation methods to stop the attack for any Cloudflare customer without it impacting our systems.

We noticed these attacks at the same time two other major industry players — Google and AWS — were seeing the same. We worked to harden Cloudflare’s systems to ensure that, today, all our customers are protected from this new DDoS attack method without any customer impact. We’ve also participated with Google and AWS in a coordinated disclosure of the attack to impacted vendors and critical infrastructure providers.

This attack was made possible by abusing some features of the HTTP/2 protocol and server implementation details (see  CVE-2023-44487 for details). Because the attack abuses an underlying weakness in the HTTP/2 protocol, we believe any vendor that has implemented HTTP/2 will be subject to the attack. This included every modern web server. We, along with Google and AWS, have disclosed the attack method to web server vendors who we expect will implement patches. In the meantime, the best defense is using a DDoS mitigation service like Cloudflare’s in front of any web-facing web or API server.

This post dives into the details of the HTTP/2 protocol, the feature that attackers exploited to generate these massive attacks, and the mitigation strategies we took to ensure all our customers are protected. Our hope is that by publishing these details other impacted web servers and services will have the information they need to implement mitigation strategies. And, moreover, the HTTP/2 protocol standards team, as well as teams working on future web standards, can better design them to prevent such attacks.

RST attack details

HTTP is the application protocol that powers the Web. HTTP Semantics are common to all versions of HTTP — the overall architecture, terminology, and protocol aspects such as request and response messages, methods, status codes, header and trailer fields, message content, and much more. Each individual HTTP version defines how semantics are transformed into a “wire format” for exchange over the Internet. For example, a client has to serialize a request message into binary data and send it, then the server parses that back into a message it can process.

HTTP/1.1 uses a textual form of serialization. Request and response messages are exchanged as a stream of ASCII characters, sent over a reliable transport layer like TCP, using the following format (where CRLF means carriage-return and linefeed):

 HTTP-message   = start-line CRLF
                   *( field-line CRLF )
                   CRLF
                   [ message-body ]

For example, a very simple GET request for https://blog.cloudflare.com/ would look like this on the wire:

GET / HTTP/1.1 CRLFHost: blog.cloudflare.comCRLFCRLF

And the response would look like:

HTTP/1.1 200 OK CRLFServer: cloudflareCRLFContent-Length: 100CRLFtext/html; charset=UTF-8CRLFCRLF<100 bytes of data>

This format frames messages on the wire, meaning that it is possible to use a single TCP connection to exchange multiple requests and responses. However, the format requires that each message is sent whole. Furthermore, in order to correctly correlate requests with responses, strict ordering is required; meaning that messages are exchanged serially and can not be multiplexed. Two GET requests, for https://blog.cloudflare.com/ and https://blog.cloudflare.com/page/2/, would be:

GET / HTTP/1.1 CRLFHost: blog.cloudflare.comCRLFCRLFGET /page/2/ HTTP/1.1 CRLFHost: blog.cloudflare.comCRLFCRLF

With the responses:

HTTP/1.1 200 OK CRLFServer: cloudflareCRLFContent-Length: 100CRLFtext/html; charset=UTF-8CRLFCRLF<100 bytes of data>CRLFHTTP/1.1 200 OK CRLFServer: cloudflareCRLFContent-Length: 100CRLFtext/html; charset=UTF-8CRLFCRLF<100 bytes of data>

Web pages require more complicated HTTP interactions than these examples. When visiting the Cloudflare blog, your browser will load multiple scripts, styles and media assets. If you visit the front page using HTTP/1.1 and decide quickly to navigate to page 2, your browser can pick from two options. Either wait for all of the queued up responses for the page that you no longer want before page 2 can even start, or cancel in-flight requests by closing the TCP connection and opening a new connection. Neither of these is very practical. Browsers tend to work around these limitations by managing a pool of TCP connections (up to 6 per host) and implementing complex request dispatch logic over the pool.

HTTP/2 addresses many of the issues with HTTP/1.1. Each HTTP message is serialized into a set of HTTP/2 frames that have type, length, flags, stream identifier (ID) and payload. The stream ID makes it clear which bytes on the wire apply to which message, allowing safe multiplexing and concurrency. Streams are bidirectional. Clients send frames and servers reply with frames using the same ID.

In HTTP/2 our GET request for https://blog.cloudflare.com would be exchanged across stream ID 1, with the client sending one HEADERS frame, and the server responding with one HEADERS frame, followed by one or more DATA frames. Client requests always use odd-numbered stream IDs, so subsequent requests would use stream ID 3, 5, and so on. Responses can be served in any order, and frames from different streams can be interleaved.

Stream multiplexing and concurrency are powerful features of HTTP/2. They enable more efficient usage of a single TCP connection. HTTP/2 optimizes resources fetching especially when coupled with prioritization. On the flip side, making it easy for clients to launch large amounts of parallel work can increase the peak demand for server resources when compared to HTTP/1.1. This is an obvious vector for denial-of-service.

In order to provide some guardrails, HTTP/2 provides a notion of maximum active concurrent streams. The SETTINGS_MAX_CONCURRENT_STREAMS parameter allows a server to advertise its limit of concurrency. For example, if the server states a limit of 100, then only 100 requests can be active at any time. If a client attempts to open a stream above this limit, it must be rejected by the server using a RST_STREAM frame. Stream rejection does not affect the other in-flight streams on the connection.

The true story is a little more complicated. Streams have a lifecycle. Below is a diagram of the HTTP/2 stream state machine. Client and server manage their own views of the state of a stream. HEADERS, DATA and RST_STREAM frames trigger transitions when they are sent or received. Although the views of the stream state are independent, they are synchronized.

HEADERS and DATA frames include an END_STREAM flag, that when set to the value 1 (true), can trigger a state transition.

Let’s work through this with an example of a GET request that has no message content. The client sends the request as a HEADERS frame with the END_STREAM flag set to 1. The client first transitions the stream from idle to open state, then immediately transitions into half-closed state. The client half-closed state means that it can no longer send HEADERS or DATA, only WINDOW_UPDATEPRIORITY or RST_STREAM frames. It can receive any frame however.

Once the server receives and parses the HEADERS frame, it transitions the stream state from idle to open and then half-closed, so it matches the client. The server half-closed state means it can send any frame but receive only WINDOW_UPDATE, PRIORITY or RST_STREAM frames.

The response to the GET contains message content, so the server sends HEADERS with END_STREAM flag set to 0, then DATA with END_STREAM flag set to 1. The DATA frame triggers the transition of the stream from half-closed to closed on the server. When the client receives it, it also transitions to closed. Once a stream is closed, no frames can be sent or received.

Applying this lifecycle back into the context of concurrency, HTTP/2 states:

Streams that are in the “open” state or in either of the “half-closed” states count toward the maximum number of streams that an endpoint is permitted to open. Streams in any of these three states count toward the limit advertised in the SETTINGS_MAX_CONCURRENT_STREAMS setting.

In theory, the concurrency limit is useful. However, there are practical factors that hamper its effectiveness— which we will cover later in the blog.

HTTP/2 request cancellation

Earlier, we talked about client cancellation of in-flight requests. HTTP/2 supports this in a much more efficient way than HTTP/1.1. Rather than needing to tear down the whole connection, a client can send a RST_STREAM frame for a single stream. This instructs the server to stop processing the request and to abort the response, which frees up server resources and avoids wasting bandwidth.

Let’s consider our previous example of 3 requests. This time the client cancels the request on stream 1 after all of the HEADERS have been sent. The server parses this RST_STREAM frame before it is ready to serve the response and instead only responds to stream 3 and 5:

Request cancellation is a useful feature. For example, when scrolling a webpage with multiple images, a web browser can cancel images that fall outside the viewport, meaning that images entering it can load faster. HTTP/2 makes this behaviour a lot more efficient compared to HTTP/1.1.

A request stream that is canceled, rapidly transitions through the stream lifecycle. The client’s HEADERS with END_STREAM flag set to 1 transitions the state from idle to open to half-closed, then RST_STREAM immediately causes a transition from half-closed to closed.

Recall that only streams that are in the open or half-closed state contribute to the stream concurrency limit. When a client cancels a stream, it instantly gets the ability to open another stream in its place and can send another request immediately. This is the crux of what makes CVE-2023-44487 work.

Rapid resets leading to denial of service

HTTP/2 request cancellation can be abused to rapidly reset an unbounded number of streams. When an HTTP/2 server is able to process client-sent RST_STREAM frames and tear down state quickly enough, such rapid resets do not cause a problem. Where issues start to crop up is when there is any kind of delay or lag in tidying up. The client can churn through so many requests that a backlog of work accumulates, resulting in excess consumption of resources on the server.

A common HTTP deployment architecture is to run an HTTP/2 proxy or load-balancer in front of other components. When a client request arrives it is quickly dispatched and the actual work is done as an asynchronous activity somewhere else. This allows the proxy to handle client traffic very efficiently. However, this separation of concerns can make it hard for the proxy to tidy up the in-process jobs. Therefore, these deployments are more likely to encounter issues from rapid resets.

When Cloudflare’s reverse proxies process incoming HTTP/2 client traffic, they copy the data from the connection’s socket into a buffer and process that buffered data in order. As each request is read (HEADERS and DATA frames) it is dispatched to an upstream service. When RST_STREAM frames are read, the local state for the request is torn down and the upstream is notified that the request has been canceled. Rinse and repeat until the entire buffer is consumed. However this logic can be abused: when a malicious client started sending an enormous chain of requests and resets at the start of a connection, our servers would eagerly read them all and create stress on the upstream servers to the point of being unable to process any new incoming request.

Something that is important to highlight is that stream concurrency on its own cannot mitigate rapid reset. The client can churn requests to create high request rates no matter the server’s chosen value of SETTINGS_MAX_CONCURRENT_STREAMS.

Rapid Reset dissected

Here’s an example of rapid reset reproduced using a proof-of-concept client attempting to make a total of 1000 requests. I’ve used an off-the-shelf server without any mitigations; listening on port 443 in a test environment. The traffic is dissected using Wireshark and filtered to show only HTTP/2 traffic for clarity. Download the pcap to follow along.

It’s a bit difficult to see, because there are a lot of frames. We can get a quick summary via Wireshark’s Statistics > HTTP2 tool:

The first frame in this trace, in packet 14, is the server’s SETTINGS frame, which advertises a maximum stream concurrency of 100. In packet 15, the client sends a few control frames and then starts making requests that are rapidly reset. The first HEADERS frame is 26 bytes long, all subsequent HEADERS are only 9 bytes. This size difference is due to a compression technology called HPACK. In total, packet 15 contains 525 requests, going up to stream 1051.

Interestingly, the RST_STREAM for stream 1051 doesn’t fit in packet 15, so in packet 16 we see the server respond with a 404 response.  Then in packet 17 the client does send the RST_STREAM, before moving on to sending the remaining 475 requests.

Note that although the server advertised 100 concurrent streams, both packets sent by the client sent a lot more HEADERS frames than that. The client did not have to wait for any return traffic from the server, it was only limited by the size of the packets it could send. No server RST_STREAM frames are seen in this trace, indicating that the server did not observe a concurrent stream violation.

Impact on customers

As mentioned above, as requests are canceled, upstream services are notified and can abort requests before wasting too many resources on it. This was the case with this attack, where most malicious requests were never forwarded to the origin servers. However, the sheer size of these attacks did cause some impact.

First, as the rate of incoming requests reached peaks never seen before, we had reports of increased levels of 502 errors seen by clients. This happened on our most impacted data centers as they were struggling to process all the requests. While our network is meant to deal with large attacks, this particular vulnerability exposed a weakness in our infrastructure. Let’s dig a little deeper into the details, focusing on how incoming requests are handled when they hit one of our data centers:

We can see that our infrastructure is composed of a chain of different proxy servers with different responsibilities. In particular, when a client connects to Cloudflare to send HTTPS traffic, it first hits our TLS decryption proxy: it decrypts TLS traffic, processes HTTP 1, 2 or 3 traffic, then forwards it to our “business logic” proxy. This one is responsible for loading all the settings for each customer, then routing the requests correctly to other upstream services — and more importantly in our case, it is also responsible for security features. This is where L7 attack mitigation is processed.

The problem with this attack vector is that it manages to send a lot of requests very quickly in every single connection. Each of them had to be forwarded to the business logic proxy before we had a chance to block it. As the request throughput became higher than our proxy capacity, the pipe connecting these two services reached its saturation level in some of our servers.

When this happens, the TLS proxy cannot connect anymore to its upstream proxy, this is why some clients saw a bare “502 Bad Gateway” error during the most serious attacks. It is important to note that, as of today, the logs used to create HTTP analytics are also emitted by our business logic proxy. The consequence of that is that these errors are not visible in the Cloudflare dashboard. Our internal dashboards show that about 1% of requests were impacted during the initial wave of attacks (before we implemented mitigations), with peaks at around 12% for a few seconds during the most serious one on August 29th. The following graph shows the ratio of these errors over a two hours while this was happening:

We worked to reduce this number dramatically in the following days, as detailed later on in this post. Both thanks to changes in our stack and to our mitigation that reduce the size of these attacks considerably, this number today is effectively zero.

499 errors and the challenges for HTTP/2 stream concurrency

Another symptom reported by some customers is an increase in 499 errors. The reason for this is a bit different and is related to the maximum stream concurrency in a HTTP/2 connection detailed earlier in this post.

HTTP/2 settings are exchanged at the start of a connection using SETTINGS frames. In the absence of receiving an explicit parameter, default values apply. Once a client establishes an HTTP/2 connection, it can wait for a server’s SETTINGS (slow) or it can assume the default values and start making requests (fast). For SETTINGS_MAX_CONCURRENT_STREAMS, the default is effectively unlimited (stream IDs use a 31-bit number space, and requests use odd numbers, so the actual limit is 1073741824). The specification recommends that a server offer no fewer than 100 streams. Clients are generally biased towards speed, so don’t tend to wait for server settings, which creates a bit of a race condition. Clients are taking a gamble on what limit the server might pick; if they pick wrong the request will be rejected and will have to be retried. Gambling on 1073741824 streams is a bit silly. Instead, a lot of clients decide to limit themselves to issuing 100 concurrent streams, with the hope that servers followed the specification recommendation. Where servers pick something below 100, this client gamble fails and streams are reset.

There are many reasons a server might reset a stream beyond concurrency limit overstepping. HTTP/2 is strict and requires a stream to be closed when there are parsing or logic errors. In 2019, Cloudflare developed several mitigations in response to HTTP/2 DoS vulnerabilities. Several of those vulnerabilities were caused by a client misbehaving, leading the server to reset a stream. A very effective strategy to clamp down on such clients is to count the number of server resets during a connection, and when that exceeds some threshold value, close the connection with a GOAWAY frame. Legitimate clients might make one or two mistakes in a connection and that is acceptable. A client that makes too many mistakes is probably either broken or malicious and closing the connection addresses both cases.

While responding to DoS attacks enabled by CVE-2023-44487, Cloudflare reduced maximum stream concurrency to 64. Before making this change, we were unaware that clients don’t wait for SETTINGS and instead assume a concurrency of 100. Some web pages, such as an image gallery, do indeed cause a browser to send 100 requests immediately at the start of a connection. Unfortunately, the 36 streams above our limit all needed to be reset, which triggered our counting mitigations. This meant that we closed connections on legitimate clients, leading to a complete page load failure. As soon as we realized this interoperability issue, we changed the maximum stream concurrency to 100.

Actions from the Cloudflare side

In 2019 several DoS vulnerabilities were uncovered related to implementations of HTTP/2. Cloudflare developed and deployed a series of detections and mitigations in response.  CVE-2023-44487 is a different manifestation of HTTP/2 vulnerability. However, to mitigate it we were able to extend the existing protections to monitor client-sent RST_STREAM frames and close connections when they are being used for abuse. Legitimate client uses for RST_STREAM are unaffected.

In addition to a direct fix, we have implemented several improvements to the server’s HTTP/2 frame processing and request dispatch code. Furthermore, the business logic server has received improvements to queuing and scheduling that reduce unnecessary work and improve cancellation responsiveness. Together these lessen the impact of various potential abuse patterns as well as giving more room to the server to process requests before saturating.

Mitigate attacks earlier

Cloudflare already had systems in place to efficiently mitigate very large attacks with less expensive methods. One of them is named “IP Jail”. For hyper volumetric attacks, this system collects the client IPs participating in the attack and stops them from connecting to the attacked property, either at the IP level, or in our TLS proxy. This system however needs a few seconds to be fully effective; during these precious seconds, the origins are already protected but our infrastructure still needs to absorb all HTTP requests. As this new botnet has effectively no ramp-up period, we need to be able to neutralize attacks before they can become a problem.

To achieve this we expanded the IP Jail system to protect our entire infrastructure: once an IP is “jailed”, not only it is blocked from connecting to the attacked property, we also forbid the corresponding IPs from using HTTP/2 to any other domain on Cloudflare for some time. As such protocol abuses are not possible using HTTP/1.x, this limits the attacker’s ability to run large attacks, while any legitimate client sharing the same IP would only see a very small performance decrease during that time. IP based mitigations are a very blunt tool — this is why we have to be extremely careful when using them at that scale and seek to avoid false positives as much as possible. Moreover, the lifespan of a given IP in a botnet is usually short so any long term mitigation is likely to do more harm than good. The following graph shows the churn of IPs in the attacks we witnessed:

As we can see, many new IPs spotted on a given day disappear very quickly afterwards.

As all these actions happen in our TLS proxy at the beginning of our HTTPS pipeline, this saves considerable resources compared to our regular L7 mitigation system. This allowed us to weather these attacks much more smoothly and now the number of random 502 errors caused by these botnets is down to zero.

Observability improvements

Another front on which we are making change is observability. Returning errors to clients without being visible in customer analytics is unsatisfactory. Fortunately, a project has been underway to overhaul these systems since long before the recent attacks. It will eventually allow each service within our infrastructure to log its own data, instead of relying on our business logic proxy to consolidate and emit log data. This incident underscored the importance of this work, and we are redoubling our efforts.

We are also working on better connection-level logging, allowing us to spot such protocol abuses much more quickly to improve our DDoS mitigation capabilities.

Conclusion

While this was the latest record-breaking attack, we know it won’t be the last. As attacks continue to become more sophisticated, Cloudflare works relentlessly to proactively identify new threats — deploying countermeasures to our global network so that our millions of customers are immediately and automatically protected.

Cloudflare has provided free, unmetered and unlimited DDoS protection to all of our customers since 2017. In addition, we offer a range of additional security features to suit the needs of organizations of all sizes. Contact us if you’re unsure whether you’re protected or want to understand how you can be.

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet applicationward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

Source :
https://blog.cloudflare.com/technical-breakdown-http2-rapid-reset-ddos-attack/

How it works: The novel HTTP/2 ‘Rapid Reset’ DDoS attack

October 10, 2023

Juho Snellman
Staff Software Engineer

Daniele Iamartino
Staff Site Reliability Engineer

A number of Google services and Cloud customers have been targeted with a novel HTTP/2-based DDoS attack which peaked in August. These attacks were significantly larger than any previously-reported Layer 7 attacks, with the largest attack surpassing 398 million requests per second.

The attacks were largely stopped at the edge of our network by Google’s global load balancing infrastructure and did not lead to any outages. While the impact was minimal, Google’s DDoS Response Team reviewed the attacks and added additional protections to further mitigate similar attacks. In addition to Google’s internal response, we helped lead a coordinated disclosure process with industry partners to address the new HTTP/2 vector across the ecosystem.

Hear monthly from our Cloud CISO in your inbox

Get security updates, musings, and more from Google Cloud CISO Phil Venables direct to your inbox every month.

Subscribe today

https://storage.googleapis.com/gweb-cloudblog-publish/images/gcat_small_1.max-300x168.jpg

Below, we explain the predominant methodology for Layer 7 attacks over the last few years, what changed in these new attacks to make them so much larger, and the mitigation strategies we believe are effective against this attack type. This article is written from the perspective of a reverse proxy architecture, where the HTTP request is terminated by a reverse proxy that forwards requests to other services. The same concepts apply to HTTP servers that are integrated into the application server, but with slightly different considerations which potentially lead to different mitigation strategies.

A primer on HTTP/2 for DDoS

Since late 2021, the majority of Layer 7 DDoS attacks we’ve observed across Google first-party services and Google Cloud projects protected by Cloud Armor have been based on HTTP/2, both by number of attacks and by peak request rates.

A primary design goal of HTTP/2 was efficiency, and unfortunately the features that make HTTP/2 more efficient for legitimate clients can also be used to make DDoS attacks more efficient.

Stream multiplexing

HTTP/2 uses “streams”, bidirectional abstractions used to transmit various messages, or “frames”, between the endpoints. “Stream multiplexing” is the core HTTP/2 feature which allows higher utilization of each TCP connection. Streams are multiplexed in a way that can be tracked by both sides of the connection while only using one Layer 4 connection. Stream multiplexing enables clients to have multiple in-flight requests without managing multiple individual connections.

One of the main constraints when mounting a Layer 7 DoS attack is the number of concurrent transport connections. Each connection carries a cost, including operating system memory for socket records and buffers, CPU time for the TLS handshake, as well as each connection needing a unique four-tuple, the IP address and port pair for each side of the connection, constraining the number of concurrent connections between two IP addresses.

In HTTP/1.1, each request is processed serially. The server will read a request, process it, write a response, and only then read and process the next request. In practice, this means that the rate of requests that can be sent over a single connection is one request per round trip, where a round trip includes the network latency, proxy processing time and backend request processing time. While HTTP/1.1 pipelining is available in some clients and servers to increase a connection’s throughput, it is not prevalent amongst legitimate clients.

With HTTP/2, the client can open multiple concurrent streams on a single TCP connection, each stream corresponding to one HTTP request. The maximum number of concurrent open streams is, in theory, controllable by the server, but in practice clients may open 100 streams per request and the servers process these requests in parallel. It’s important to note that server limits can not be unilaterally adjusted.

For example, the client can open 100 streams and send a request on each of them in a single round trip; the proxy will read and process each stream serially, but the requests to the backend servers can again be parallelized. The client can then open new streams as it receives responses to the previous ones. This gives an effective throughput for a single connection of 100 requests per round trip, with similar round trip timing constants to HTTP/1.1 requests. This will typically lead to almost 100 times higher utilization of each connection.

The HTTP/2 Rapid Reset attack

The HTTP/2 protocol allows clients to indicate to the server that a previous stream should be canceled by sending a RST_STREAM frame. The protocol does not require the client and server to coordinate the cancellation in any way, the client may do it unilaterally. The client may also assume that the cancellation will take effect immediately when the server receives the RST_STREAM frame, before any other data from that TCP connection is processed.

This attack is called Rapid Reset because it relies on the ability for an endpoint to send a RST_STREAM frame immediately after sending a request frame, which makes the other endpoint start working and then rapidly resets the request. The request is canceled, but leaves the HTTP/2 connection open.

https://storage.googleapis.com/gweb-cloudblog-publish/images/2023_worlds_largest_rapid_reset_diagram.max-1616x909.png

HTTP/1.1 and HTTP/2 request and response pattern

The HTTP/2 Rapid Reset attack built on this capability is simple: The client opens a large number of streams at once as in the standard HTTP/2 attack, but rather than waiting for a response to each request stream from the server or proxy, the client cancels each request immediately.

The ability to reset streams immediately allows each connection to have an indefinite number of requests in flight. By explicitly canceling the requests, the attacker never exceeds the limit on the number of concurrent open streams. The number of in-flight requests is no longer dependent on the round-trip time (RTT), but only on the available network bandwidth.

In a typical HTTP/2 server implementation, the server will still have to do significant amounts of work for canceled requests, such as allocating new stream data structures, parsing the query and doing header decompression, and mapping the URL to a resource. For reverse proxy implementations, the request may be proxied to the backend server before the RST_STREAM frame is processed. The client on the other hand paid almost no costs for sending the requests. This creates an exploitable cost asymmetry between the server and the client.

Another advantage the attacker gains is that the explicit cancellation of requests immediately after creation means that a reverse proxy server won’t send a response to any of the requests. Canceling the requests before a response is written reduces downlink (server/proxy to attacker) bandwidth.

HTTP/2 Rapid Reset attack variants

In the weeks after the initial DDoS attacks, we have seen some Rapid Reset attack variants. These variants are generally not as efficient as the initial version was, but might still be more efficient than standard HTTP/2 DDoS attacks.

The first variant does not immediately cancel the streams, but instead opens a batch of streams at once, waits for some time, and then cancels those streams and then immediately opens another large batch of new streams. This attack may bypass mitigations that are based on just the rate of inbound RST_STREAM frames (such as allow at most 100 RST_STREAMs per second on a connection before closing it).

These attacks lose the main advantage of the canceling attacks by not maximizing connection utilization, but still have some implementation efficiencies over standard HTTP/2 DDoS attacks. But this variant does mean that any mitigation based on rate-limiting stream cancellations should set fairly strict limits to be effective.

The second variant does away with canceling streams entirely, and instead optimistically tries to open more concurrent streams than the server advertised. The benefit of this approach over the standard HTTP/2 DDoS attack is that the client can keep the request pipeline full at all times, and eliminate client-proxy RTT as a bottleneck. It can also eliminate the proxy-server RTT as a bottleneck if the request is to a resource that the HTTP/2 server responds to immediately.

RFC 9113, the current HTTP/2 RFC, suggests that an attempt to open too many streams should invalidate only the streams that exceeded the limit, not the entire connection. We believe that most HTTP/2 servers will not process those streams, and is what enables the non-cancelling attack variant by almost immediately accepting and processing a new stream after responding to a previous stream.

A multifaceted approach to mitigations

We don’t expect that simply blocking individual requests is a viable mitigation against this class of attacks — instead the entire TCP connection needs to be closed when abuse is detected. HTTP/2 provides built-in support for closing connections, using the GOAWAY frame type. The RFC defines a process for gracefully closing a connection that involves first sending an informational GOAWAY that does not set a limit on opening new streams, and one round trip later sending another that forbids opening additional streams.

However, this graceful GOAWAY process is usually not implemented in a way which is robust against malicious clients. This form of mitigation leaves the connection vulnerable to Rapid Reset attacks for too long, and should not be used for building mitigations as it does not stop the inbound requests. Instead, the GOAWAY should be set up to limit stream creation immediately.

This leaves the question of deciding which connections are abusive. The client canceling requests is not inherently abusive, the feature exists in the HTTP/2 protocol to help better manage request processing. Typical situations are when a browser no longer needs a resource it had requested due to the user navigating away from the page, or applications using a long polling approach with a client-side timeout.

Mitigations for this attack vector can take multiple forms, but mostly center around tracking connection statistics and using various signals and business logic to determine how useful each connection is. For example, if a connection has more than 100 requests with more than 50% of the given requests canceled, it could be a candidate for a mitigation response. The magnitude and type of response depends on the risk to each platform, but responses can range from forceful GOAWAY frames as discussed before to closing the TCP connection immediately.

To mitigate against the non-cancelling variant of this attack, we recommend that HTTP/2 servers should close connections that exceed the concurrent stream limit. This can be either immediately or after some small number of repeat offenses.

Applicability to other protocols

We do not believe these attack methods translate directly to HTTP/3 (QUIC) due to protocol differences, and Google does not currently see HTTP/3 used as a DDoS attack vector at scale. Despite that, our recommendation is for HTTP/3 server implementations to proactively implement mechanisms to limit the amount of work done by a single transport connection, similar to the HTTP/2 mitigations discussed above.

Industry coordination

Early in our DDoS Response Team’s investigation and in coordination with industry partners, it was apparent that this new attack type could have a broad impact on any entity offering the HTTP/2 protocol for their services. Google helped lead a coordinated vulnerability disclosure process taking advantage of a pre-existing coordinated vulnerability disclosure group, which has been used for a number of other efforts in the past.

During the disclosure process, the team focused on notifying large-scale implementers of HTTP/2 including infrastructure companies and server software providers. The goal of these prior notifications was to develop and prepare mitigations for a coordinated release. In the past, this approach has enabled widespread protections to be enabled for service providers or available via software updates for many packages and solutions.

During the coordinated disclosure process, we reserved CVE-2023-44487 to track fixes to the various HTTP/2 implementations.

Next steps

The novel attacks discussed in this post can have significant impact on services of any scale. All providers who have HTTP/2 services should assess their exposure to this issue. Software patches and updates for common web servers and programming languages may be available to apply now or in the near future. We recommend applying those fixes as soon as possible.

For our customers, we recommend patching software and enabling the Application Load Balancer and Google Cloud Armor, which has been protecting Google and existing Google Cloud Application Load Balancing users.

Source :
https://cloud.google.com/blog/products/identity-security/how-it-works-the-novel-http2-rapid-reset-ddos-attack