The Journey to Zero Trust with Industry Frameworks

By: Alifiya Sadikali – Trendmicro
August 09, 2023
Read time: 4 min (1179 words)

Discover the core principles and frameworks of Zero Trust, NIST 800-207 guidelines, and best practices when implementing CISA’s Zero Trust Maturity Model.

With the growing number of devices connected to the internet, traditional security measures are no longer enough to keep your digital assets safe. To protect your organization from digital threats, it’s crucial to establish strong security protocols and take proactive measures to stay vigilant.

What is Zero Trust?

Zero Trust is a cybersecurity philosophy based on the premise that threats can arise internally and externally. With Zero Trust, no user, system, or service should automatically be trusted, regardless of its location within or outside the network. Providing an added layer of security to protect sensitive data and applications, Zero Trust only grants access to authenticated and authorized users and devices. And in the event of a data breach, compartmentalizing access to individual resources limits potential damage.

Your organization should consider Zero Trust as a proactive security strategy to protect its data and assets better.

The pillars of Zero Trust

At its core, the basis for Zero Trust is comprised of a few fundamental principles:

  • Verify explicitly. Only grant access once the user or device has been explicitly authenticated and verified. By doing so, you can ensure that only those with a legitimate need to access your organization’s resources can do so.
  • Least privilege access. Only give users access to the resources they need to do their job and nothing more. Limiting access in this way prevents unauthorized access to your organization’s data and applications.
  • Assume breach. Act as if a compromise to your organization’s security has occurred. Take steps to minimize the damage, including monitoring for unusual activity, limiting access to sensitive data, and ensuring that backups are up-to-date and secure.
  • Microsegmentation. Divide your organization’s network into smaller, more manageable segments and apply security controls to each segment individually. This reduces the risk of a breach spreading from one part of your network to another.
  • Security automation. Use tools and technologies to automate the process of monitoring, detecting, and responding to security threats. This ensures that your organization’s security is always up-to-date and can react quickly to new threats and vulnerabilities.

A Zero Trust approach is a proactive and effective way to protect your organization’s data and assets from cyber-attacks and data breaches. By following these core principles, your organization can minimize the risk of unauthorized access, reduce the impact of a breach, and ensure that your organization’s security is always up-to-date and effective.

The role of NIST 800-207 in Zero Trust

NIST 800-207 is a cybersecurity framework developed by the National Institute of Standards and Technology. It provides guidelines and best practices for organizations to manage and mitigate cybersecurity risks.

Designed to be flexible and adaptable for a variety of organizations and industries, the framework supports the customization of cybersecurity plans to meet their specific needs. Its implementation can help organizations improve their cybersecurity posture and protect against cyber threats.

One of the most important recommendations of NIST 800-207 is to establish a policy engine, policy administrator, and policy enforcement point. This will help ensure consistent policy enforcement and that access is granted only to those who need it.

Another critical recommendation is conducting continuous monitoring and having real-time risk-based decision-making capabilities. This can help you quickly identify and respond to potential threats.

Additionally, it is essential to understand and map dependencies among assets and resources. This will help you ensure your security measures are appropriately targeted based on potential vulnerabilities.

Finally, NIST recommends replacing traditional paradigms, such as implicit trust in assets or entities, with a “trust but verify” methodology. Adopting this approach can better protect your organization’s assets and resources from internal and external threats.

CISA’s Zero Trust Maturity Model

The Zero Trust Maturity Model (ZMM), developed by CISA, provides a comprehensive framework for assessing an organization’s Zero Trust posture. This model covers critical areas including:

  • Identity management: To implement a Zero Trust strategy, it is important to begin with identity. This involves continuously verifying, authenticating, and authorizing any entity before granting access to corporate resources. To achieve this, comprehensive visibility is necessary.
  • Devices, networks, applications: To maintain Zero Trust, use endpoint detection and response capabilities to detect threats and keep track of device assets, network connections, application configurations, and vulnerabilities. Continuously assess and score device security posture and implement risk-informed authentication protocols to ensure only trusted devices, networks and applications can access sensitive data and enterprise systems.
  • Data and governance: To maximize security, implement prevention, detection, and response measures for identity, devices, networks, IoT, and cloud. Monitor legacy protocols and device encryption status. Apply Data Loss Prevention and access control policies based on risk profiles.
  • Visibility and analytics: Zero Trust strategies cannot succeed within silos. By collecting data from various sources within an organization, organizations can gain a complete view of all entities and resources. This data can be analyzed through threat intelligence, generating reliable and contextualized alerts. By tracking broader incidents connected to the same root cause, organizations can make informed policy decisions and take appropriate response actions.
  • Automation and orchestration: To effectively automate security responses, it is important to have access to comprehensive data that can inform the orchestration of systems and manage permissions. This includes identifying the types of data being protected and the entities that are accessing it. By doing so, it ensures that there is proper oversight and security throughout the development process of functions, products, and services.

By thoroughly evaluating these areas, your organization can identify potential vulnerabilities in its security measures and take prompt action to improve your overall cybersecurity posture. CISA’s ZMM offers a holistic approach to security that will enable your organization to remain vigilant against potential threats.

Implementing Zero Trust with Trend Vision One

Trend Vision One seamlessly integrates with third-party partner ecosystems and aligns to industry frameworks and best practices, including NIST and CISA, offering coverage from prevention to extended detection and response across all pillars of zero trust.

Trend Vision One is an innovative solution that empowers organizations to identify their vulnerabilities, monitor potential threats, and evaluate risks in real-time, enabling them to make informed decisions regarding access control. With its open platform approach, Trend enables seamless integration with third-party partner ecosystems, including IAM, Vulnerability Management, Firewall, BAS, and SIEM/SOAR vendors, providing a comprehensive and unified source of truth for risk assessment within your current security framework. Additionally, Trend Vision One is interoperable with SWG, CASB, and ZTNA and includes Attack Surface Management and XDR, all within a single console.

Conclusion

CISOs today understand that the journey towards achieving Zero Trust is a gradual process that requires careful planning, step-by-step implementation, and a shift in mindset towards proactive security and cyber risk management. By understanding the core principles of Zero Trust and utilizing the guidelines provided by NIST and CISA to operationalize Zero Trust with Trend Vision One, you can ensure that your organization’s cybersecurity measures are strong and can adapt to the constantly changing threat landscape.

To read more thought leadership and research about Zero Trust, click here.

Source :
https://www.trendmicro.com/en_us/research/23/h/industry-zero-trust-frameworks.html

ChatGPT Highlights a Flaw in the Educational System

By: William Malik – Trendmicro
August 14, 2023
Read time: 4 min (1014 words)

Rethinking learning metrics and fostering critical thinking in the era of generative AI and LLMs

I recently participated in a conversation about artificial intelligence, specifically ChatGPT and its kin, with a group of educators in South Africa. They were concerned that the software would help students cheat.

We discussed two possible alternatives to ChatGPT: First, teachers could require that students submit handwritten homework. This would force students to at least read the material once before submitting it; Second, teachers could grade the paper submissions no higher than 89 percent (or a “B”), but that to get an “A,” the student would have to stand in front of the class and verbally discuss the material, their research, their conclusion, and answer any questions the teacher or other classmates might ask. (With that verbal defense of the ideas, the teacher might even waive the requirement for paper submission at all!)

The fundamental problem is that the grading system depends on homework. If education aims to teach an individual both a) a body of knowledge and b) the techniques of reasoning with that knowledge, then the metrics proving that achievement is misaligned.

One of the most quoted management scientists is Fredrick W. Taylor. He is most known for saying, “If you can’t measure it, you can’t manage it.” Interestingly, he never said that – which is fortunate because it is entirely wrong. People always manage things without metrics – from driving a car to raising children. He said: “If you measure it, you’ll manage it” – and he intended that as a warning. Whenever you adopt a metric, you will adjust your assessment of the underlying process in terms of your chosen metric. His warning is to be very careful about which metrics you choose.

Sometime in the past forty years, we decided that the purpose of education is to do well on tests. Unfortunately, that is also wrong. The purpose of education is to teach people to gather evidence and to think clearly about it. Students should learn how to judge various forms of evidence. They should understand rhetorical techniques (in the classical sense – how to render ideas clearly). They should be aware of common errors in thinking – the cognitive pitfalls we all fall into when rushed or distracted and logical fallacies which rob our arguments of their validity.

Large Language Models (LLMs) aggregate vast troves of text. Those data sources are not curated, so LLMs reflect the biases, logical limitations, and cognitive distortions in so much of what’s online. We are all familiar with early chatbots that were easily corrupted – the Microsoft chatbot Tay was perverted into being a racist resonator. (See “Twitter taught Microsoft’s AI Chatbot to be a Racist A**hole in Less than a Day” from The Verge, March 24, 2016, at https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist accessed Aug 2023.)

LLMs do not think. They scan as much material as possible, then build a set of probabilities about which word is most likely to follow another word. If the word “pterodactyl” occurs in a text, then the next most likely word might be “soaring,” and “flying” might be in second place. If ChatGPT gets the word “pterodactyl” as input, it will put “soaring” next to it. This may look plausible to a person reading the output, but it cannot be correct. Correctness implies some kind of comprehension and judgment. ChatGPT does neither. It merely arranges words based on their statistical likelihood in the LLM’s database. We are now learning that LLMs that ingest computer-generated content become even more skewed – augmenting the likelihood of one word following another by rescanning the previous output. Over time, LLMs fed AI-generated content will drift farther and farther from actual human writing. The oft-mentioned hallucinations that LLMs generate will become more common as the distillation and amplification of the more likely subset of words leads to a contracted pool of possible machine-generated responses. Eventually – if we are not able to prevent LLMs from ingesting already-processed content – the output of ChatGPT will become more and more constrained, which, taken to the extreme, will yield one plot, one answer, one painting, and one outcome regardless of the specific input. Long before then, people will have abandoned LLM-based efforts for any activity that requires creativity.

Where can LLMs help? By sorting through bounded sets of information. That means an LLM trained on protein sequences could rapidly develop a most likely model for a protein that could attack a particular disease or interrupt an allergic reaction. In that case, the issue isn’t seeking creativity but rapidly scanning a set of nearly identical data overreactions to find the few that stand out enough to make a difference. A human doing this kind of work would quickly grow bored and likely make errors. LLMs can help science move quickly through vast quantities of data in closed domains. But when looking at an unbounded domain (art, poetry, fiction, movies, music, and the like), LLMs can only build average content, filling in the space between works. Artists seek to reach beyond the space their prior work defined.

The core problem with LLMs may be unsolvable. At this point, various organizations are exploring ways to tag AI-generated content (written and graphic) so humans can spend a moment assessing the accuracy and validity of the material. Of course, message digests can be corrupted and watermarks forged. A bad actor might maliciously tag authentic content as AI-generated. Recent developments include malicious ChatGPT variants designed to create BEC and phishing email content,

Students will always look for a shortcut, and that habit is difficult to overcome. In business, it will also be tempting for bureaucrats to use tools to simplify their tasks. How will your firm incorporate LLMs safely into your business processes? Organizations should consider how they will audit their internal procedures to ensure that LLM outputs are incorporated appropriately into communications. Imagine the potential for harm if some publicly traded company was found to have used an LLM to develop its annual financial report!

What do you think? Let me know in the comments below, or contact me @wjmalik@noc.social

Source :
https://www.trendmicro.com/en_us/research/23/h/chatgpt-flaw.html

OT Security is Less Mature but Progressing Rapidly

By: Kazuhisa Tagaya – Trendmicro
August 14, 2023
Read time: 2 min (638 words)

The latest study said that OT security is less mature in several capabilities than IT security, but most organizations are improving it.

e asked participants whether OT security for cybersecurity capabilities is less mature or more mature than IT in their organizations with reference to the NIST CSF.

As an average of all items, 39.5% answered that OT has a lower level of maturity. (18% answered OT security is more mature, and 36.4% at the same level)

Categorizing security capabilities into the five cores of the NIST CSF and aggregating them for each core, the most was that Detect is lower maturity in OT security than in IT. (42%)

figure1
Figure1: What security capabilities in OT are lower than IT (NIST CSF 5 Core)

Furthermore, looking at the specific security capabilities, the score of “Cyber event detection” is the most(45.7%).

figure2
Figure2: What security capabilities in OT are lower than IT (detail)

The OT environment has more diverse legacy assets, and protocol stacks dedicated to ICS/OT, making it difficult to implement sensors to detect malicious behavior or apply the patches on the assets. The inability to implement uniform measures in the same way as IT security is an obstacle to increasing the maturity level.

Detection in OT: Endpoint and Network

The survey asked respondents about their Endpoint Detection and Response (EDR) and Network Security Monitoring (NSM) implementations to measure their visibility in their OT environments. They answered whether EDR (including antivirus) was implemented in the following three places.

  • Server assets running commercial OS (Windows, Linux, Unix): 41%
  • Engineering (engineering workstations, instrumentation laptops, calibration and test equipment) assets running commercial OS (Windows, Unix, Linux): 34%
  • Operator assets (HMI, workstations) running commercial OS (Windows, Linux, Unix): 33% 

In addition, 76% of organizations that have already deployed EDR said they plan to expand their deployment within 24 months.

figure3
Figure3: EDR deployment

We also asked whether NSM (including IDS) was implemented at the following levels referring to the Purdue model.

  • Purdue Level 4 (Enterprise): 30%
  • Purdue Level 3.5 (DMZ): 36%
  • Purdue Level 3 (Site or SCADA-wide): 38%
  • Purdue Level 2 (Control): 20%
  • Purdue Levels 1/0 (Sensors and Actuators): 8%

Like EDR, 70% of organizations that have already implemented NSM said they have plans to expand implementation within 24 months.

figure4
Figure4: NSM deployment

In this survey, EDR implementation rates tended to vary depending on the respondent’s industry and size of organization. The implementation rate of NSM was relatively high in DMZ and Level 3, and the implementation rate decreased according to the lower layers. But I think it is not appropriate to conclude the decisive trend from the average value in the questions, because there are variations in the places where they are implemented EDR and NSM depending on the organization. The implementation rate shown here is just a rough standard. Where and how much to invest depends on the environment and decision-making of the organization. Asset owners can use the result as a reference to see where to implement EDR and NSM and evaluate their implementation plans.

To learn about how to assess risk in your OT environment to invest appropriately, please refer to our practices of risk assessment in smart factories.

Reference:
Breaking IT/OT Silos with ICS/OT Visibility – 2023 SANS ICS/OT visibility survey

Source :
https://www.trendmicro.com/en_us/research/23/h/ot-security-2023.html

Top 10 AI Security Risks According to OWASP

By: Trend Micro
August 15, 2023
Read time: 4 min (1157 words)

The unveiling of the first-ever Open Worldwide Application Security Project (OWASP) risk list for large language model AI chatbots was yet another sign of generative AI’s rush into the mainstream—and a crucial step toward protecting enterprises from AI-related threats.

For more than 20 years, the Open Worldwide Application Security Project (OWASP) top 10 risk list has been a go-to reference in the fight to make software more secure. So it’s no surprise developers and cybersecurity professionals paid close attention earlier this spring when OWASP published an all-new list focused on large language model AI vulnerabilities.

OWASP’s move is yet more proof of how quickly AI chatbots have swept into the mainstream. Nearly half (48%) of corporate respondents to one survey said that by February 2023 they had already replaced workers with ChatGPT—just three months after its public launch. With many observers expressing concern that AI adoption has rushed ahead without understanding of the risks involved, the OWASP top 10 AI risk list is both timely and essential.

Large language model vulnerabilities at a glance

OWASP has released two draft versions of its AI vulnerability list so far: one in May 2023 and a July 1 update with refined classifications and definitions, examples, scenarios, and links to additional references. The most recent is labeled ‘version 0.5’, and a formal version 1 is reported to be in the works.

We did some analysis and found the vulnerabilities identified by OWASP fall broadly into three categories:

  1. Access risks associated with exploited privileges and unauthorized actions.
  2. Data risks such as data manipulation or loss of services.
  3. Reputational and business risks resulting from bad AI outputs or actions.

In this blog, we take a closer look at the specific risks in each case and offer some suggestions about how to handle them.

1. Access risks

Of the 10 vulnerabilities listed by OWASP, four are specific to access and misuse of privileges: insecure plugins, insecure output handling, permissions issues, and excessive agency.

According to OWASP, any large language model that uses insecure plugins to receive “free-form text” inputs could be exposed to malicious requests, resulting in unwanted behaviors or the execution of unauthorized remote code. On the flipside, plugins or applications that handle large language model outputs insecurely—without evaluating them—could be susceptible to cross-site and server-side request forgeries, unauthorized privilege escalations, hijack attacks, and more.

Similarly, when authorizations aren’t tracked between plugins, permissions issues can arise that open the way for indirect prompt injections or malicious plugin usage.

Finally, because AI chatbots are ‘actors’ able to make and implement decisions, it matters how much free reign (i.e., agency) they’re given. As OWASP explains, “When LLMs interface with other systems, unrestricted agency may lead to undesirable operations and actions.” Examples include personal mail reader assistants being exploited to propagate spam or customer service AI chatbots manipulated into issuing undeserved refunds.

In all of these cases, the large language model becomes a conduit for bad actors to infiltrate systems.

2. Data risks

Poisoned training data, supply chain vulnerabilities, prompt injection vulnerabilities and denials of serviceare all data-specific AI risks.

Data can be poisoned deliberately by bad actors who want to harm an organization. It can also be distorted inadvertently when an AI system learns from unreliable or unvetted sources. Both types of poisoning can occur within an active AI chatbot application or emerge from the large language model supply chain, where reliance on pre-trained models, crowdsourced data, and insecure plugin extensions may produce biased data outputs, security breaches, or system failures.

With prompt injections, ill-meaning inputs may cause a large language model AI chatbot to expose data that should be kept private or perform other actions that lead to data compromises.

AI denial of service attacks are similar to classic DOS attacks. They may aim to overwhelm a large language model and deprive users of access to data and apps, or—because many AI chatbots rely on pay-as-you-go IT infrastructure—force the system to consume excessive resources and rack up massive costs.

3. Reputational and business risks

The final OWASP vulnerability (according to our buckets) is already reaping consequences around the world today:overreliance on AI. There’s no shortage of stories about large language models generating false or inappropriate outputs from fabricated citations and legal precedents to racist and sexist language.

OWASP points out that depending on AI chatbots without proper oversight can make organizations vulnerable to publishing misinformation or offensive content that results in reputational damage or even legal action.
Given all these various risks, the question becomes, “What can we do about it?” Fortunately, there are some protective steps organizations can take. 

What enterprises can do about large language model vulnerabilities

From our perspective at Trend Micro, defending against AI access risks requires a zero-trust security stance with disciplined separation of systems (sandboxing). Even though generative AI has the ability to challenge zero-trust defenses in ways that other IT systems don’t—because it can mimic trusted entities—a zero-trust posture still adds checks and balances that make it easier to identify and contain unwanted activity. OWASP also advises that large language models “should not self-police” and calls for controls to be embedded in application programming interfaces (APIs).

Sandboxing is also key to protecting data privacy and integrity: keeping confidential information fully separated from shareable data and making it inaccessible to AI chatbots and other public-facing systems. (See our recent blog on AI cybersecurity policies for more.)

Good separation of data prevents large language models from including private or personally identifiable information in public outputs, and from being publicly prompted to interact with secure applications such as payment systems in inappropriate ways.

On the reputational front, the simplest remedies are to not rely solely on AI-generated content or code, and to never publish or use AI outputs without first verifying they are true, accurate, and reliable.

Many of these defensive measures can—and should—be embedded in corporate policies. Once an appropriate policy foundation is in place, security technologies such as endpoint detection and response (EDR), extended detection and response (XDR), and security information and event management (SIEM) can be used for enforcement and to monitor for potentially harmful activity.

Large language model AI chatbots are here to stay

OWASP’s initial work cataloguing AI risks proves that concerns about the rush to embrace AI are well justified. At the same time, AI clearly isn’t going anywhere, so understanding the risks and taking responsible steps to mitigate them is critically important.

Setting up the right policies to manage AI use and implementing those policies with the help of cybersecurity solutions is a good first step. So is staying informed. The way we see it at Trend Micro, OWASP’s top 10 AI risk list is bound to become as much of an annual must-read as its original application security list has been since 2003.

Next steps

For more Trend Micro thought leadership on AI chatbot security, check out these resources:

Source :
https://www.trendmicro.com/en_us/research/23/h/top-ai-risks.html

8 Essential Tips for Data Protection and Cybersecurity in Small Businesses

Michelle Quill — June 6, 2023

Small businesses are often targeted by cybercriminals due to their lack of resources and security measures. Protecting your business from cyber threats is crucial to avoid data breaches and financial losses.

Why is cyber security so important for small businesses?

Small businesses are particularly in danger of cyberattacks, which can result in financial loss, data breaches, and damage to IT equipment. To protect your business, it’s important to implement strong cybersecurity measures.

Here are some tips to help you get started:

One important aspect of data protection and cybersecurity for small businesses is controlling access to customer lists. It’s important to limit access to this sensitive information to only those employees who need it to perform their job duties. Additionally, implementing strong password policies and regularly updating software and security measures can help prevent unauthorized access and protect against cyber attacks. Regular employee training on cybersecurity best practices can also help ensure that everyone in the organization is aware of potential threats and knows how to respond in the event of a breach.

When it comes to protecting customer credit card information in small businesses, there are a few key tips to keep in mind. First and foremost, it’s important to use secure payment processing systems that encrypt sensitive data. Additionally, it’s crucial to regularly update software and security measures to stay ahead of potential threats. Employee training and education on cybersecurity best practices can also go a long way in preventing data breaches. Finally, having a plan in place for responding to a breach can help minimize the damage and protect both your business and your customers.

Small businesses are often exposed to cyber attacks, making data protection and cybersecurity crucial. One area of particular concern is your company’s banking details. To protect this sensitive information, consider implementing strong passwords, two-factor authentication, and regular monitoring of your accounts. Additionally, educate your employees on safe online practices and limit access to financial information to only those who need it. Regularly backing up your data and investing in cybersecurity software can also help prevent data breaches.

Small businesses are often at high risk of cyber attacks due to their limited resources and lack of expertise in cybersecurity. To protect sensitive data, it is important to implement strong passwords, regularly update software and antivirus programs, and limit access to confidential information.

It is also important to have a plan in place in case of a security breach, including steps to contain the breach and notify affected parties. By taking these steps, small businesses can better protect themselves from cyber threats and ensure the safety of their data.

Tips for protecting your small business from cyber threats and data breaches are crucial in today’s digital age. One of the most important steps is to educate your employees on cybersecurity best practices, such as using strong passwords and avoiding suspicious emails or links.

It’s also important to regularly update your software and systems to ensure they are secure and protected against the latest threats. Additionally, implementing multi-factor authentication and encrypting sensitive data can add an extra layer of protection. Finally, having a plan in place for responding to a cyber-attack or data breach can help minimize the damage and get your business back on track as quickly as possible.

Small businesses are attackable to cyber-attacks and data breaches, which can have devastating consequences. To protect your business, it’s important to implement strong cybersecurity measures. This includes using strong passwords, regularly updating software and systems, and training employees on how to identify and avoid phishing scams.

It’s also important to have a data backup plan in place and to regularly test your security measures to ensure they are effective. By taking these steps, you can help protect your business from cyber threats and safeguard your valuable data.

To protect against cyber threats, it’s important to implement strong data protection and cybersecurity measures. This can include regularly updating software and passwords, using firewalls and antivirus software, and providing employee training on safe online practices. Additionally, it’s important to have a plan in place for responding to a cyber attack, including backing up data and having a designated point person for handling the situation.

In today’s digital age, small businesses must prioritize data protection and cybersecurity to safeguard their operations and reputation. With the rise of remote work and cloud-based technology, businesses are more vulnerable to cyber attacks than ever before. To mitigate these risks, it’s crucial to implement strong security measures for online meetings, advertising, transactions, and communication with customers and suppliers. By prioritizing cybersecurity, small businesses can protect their data and prevent unauthorized access or breaches.

Here are 8 essential tips for data protection and cybersecurity in small businesses.

8 Essential Tips for Data Protection and Cybersecurity in Small Businesses

1. Train Your Employees on Cybersecurity Best Practices

Your employees are the first line of defense against cyber threats. It’s important to train them on cybersecurity best practices to ensure they understand the risks and how to prevent them. This includes creating strong passwords, avoiding suspicious emails and links, and regularly updating software and security systems. Consider providing regular training sessions and resources to keep your employees informed and prepared.

2. Use Strong Passwords and Two-Factor Authentication

One of the most basic yet effective ways to protect your business from cyber threats is to use strong passwords and two-factor authentication. Encourage your employees to use complex passwords that include a mix of letters, numbers, and symbols, and to avoid using the same password for multiple accounts. Two-factor authentication adds an extra layer of security by requiring a second form of verification, such as a code sent to a mobile device, before granting access to an account. This can help prevent unauthorized access even if a password is compromised.

3. Keep Your Software and Systems Up to Date

One of the easiest ways for cybercriminals to gain access to your business’s data is through outdated software and systems. Hackers are constantly looking for vulnerabilities in software and operating systems, and if they find one, they can exploit it to gain access to your data. To prevent this, make sure all software and systems are kept up-to-date with the latest security patches and updates. This includes not only your computers and servers but also any mobile devices and other connected devices used in your business. Set up automatic updates whenever possible to ensure that you don’t miss any critical security updates.

4. Use Antivirus and Anti-Malware Software

Antivirus and anti-malware software are essential tools for protecting your small business from cyber threats. These programs can detect and remove malicious software, such as viruses, spyware, and ransomware before they can cause damage to your systems or steal your data. Make sure to install reputable antivirus and anti-malware software on all devices used in your business, including computers, servers, and mobile devices. Keep the software up-to-date and run regular scans to ensure that your systems are free from malware.

5. Backup Your Data Regularly

One of the most important steps you can take to protect your small business from data loss is to back up your data regularly. This means creating copies of your important files and storing them in a secure location, such as an external hard drive or cloud storage service. In the event of a cyber-attack or other disaster, having a backup of your data can help you quickly recover and minimize the impact on your business. Make sure to test your backups regularly to ensure that they are working properly and that you can restore your data if needed.

6. Carry out a risk assessment

Small businesses are especially in peril of cyber attacks, making it crucial to prioritize data protection and cybersecurity. One important step is to assess potential risks that could compromise your company’s networks, systems, and information. By identifying and analyzing possible threats, you can develop a plan to address security gaps and protect your business from harm.

For Small businesses making data protection and cybersecurity is a crucial part. To start, conduct a thorough risk assessment to identify where and how your data is stored, who has access to it, and potential threats. If you use cloud storage, consult with your provider to assess risks. Determine the potential impact of breaches and establish risk levels for different events. By taking these steps, you can better protect your business from cyber threats

7. Limit access to sensitive data

One effective strategy is to limit access to critical data to only those who need it. This reduces the risk of a data breach and makes it harder for malicious insiders to gain unauthorized access. To ensure accountability and clarity, create a plan that outlines who has access to what information and what their roles and responsibilities are. By taking these steps, you can help safeguard your business against cyber threats.

8. Use a firewall

For Small businesses, it’s important to protect the system from cyber attacks by making data protection and reducing cybersecurity risk. One effective measure is implementing a firewall, which not only protects hardware but also software. By blocking or deterring viruses from entering the network, a firewall provides an added layer of security. It’s important to note that a firewall differs from an antivirus, which targets software affected by a virus that has already infiltrated the system.

Small businesses can take steps to protect their data and ensure cybersecurity. One important step is to install a firewall and keep it updated with the latest software or firmware. Regularly checking for updates can help prevent potential security breaches.

Conclusion

Small businesses are particularly vulnerable to cyber attacks, so it’s important to take steps to protect your data. One key tip is to be cautious when granting access to your systems, especially to partners or suppliers. Before granting access, make sure they have similar cybersecurity practices in place. Don’t hesitate to ask for proof or to conduct a security audit to ensure your data is safe.

Source :
https://onlinecomputertips.com/support-categories/networking/tips-for-cybersecurity-in-small-businesses/

Export Your Google Sites Website to Your Computer

Preston Mason — January 12, 2023

Everyone who uses a computer knows that Google has a service or an app for just about anything you think of. And for those who want an easy way to create websites, they also have their free online website creation tool called Google Sites which doesn’t require any coding skills or HTML knowledge to use. But if you want to backup your website, you will notice that there is no option to do so. In this article, we will show you how to export your google sites website to your computer.

If you are using traditional website development tools or software, you would generally be working on your individual HTML files and also be backing them up as needed. But with Google Sites, you are working within the Sites interface online and your website it stored in your Google Drive account.

The Backup Process

When you go to your Google Drive account and find your website, you will see that it appears as a single file just like a document or spreadsheet would appear. Then if you right click on it, there is no download option like you would see for other files.

Google Drive right click options

To backup your Sites website, you will first need to make a copy of it within your Google Drive account and move to a different folder. Within Drive, click on the New button and then choose New folder. Name this folder something like Website Backup or whatever you would like.

Next, go to your website file in Drive, right click it and then choose the Make a copy option. This will make a copy of your website in the same location and will add copy of in front of the existing name.

Google Drive file right click options make a copy

You can then drag the copy to your new folder or right click on the copy and choose the Move to option and then choose your new folder. You can also move your original website file to this new folder if you don’t want to make the copy for the backup.

Using Google Takeout

After this copy is created, you can go to the Google Takeout website and just make sure you are logged in with the same Google account that you use for your website. Then the first thing you want to do is click on Deselect all since all of the checkboxes will be selected by default.

Google Takeout website

Then scroll down the list and look for Drive (not Classic Sites) and check the box and then click on All Drive data included.

Google Takeout website

Next you will uncheck the box that says Include all files and folders in Drive and check only the box next to the folder name that contains your website file and click OK.

Export Your Google Sites Website to Your Computer

Then you will need to scroll to the bottom of the page and click on the button that says Next step. Now you will be able to configure the export to either send a download link to your email address or add your backup to your Drive, OneDrive, Dropbox or Box account. I prefer the email option.

You can also setup this backup as a one time export or have it be exported every 2 months for the next year. For the export file type you can choose between zip and tgz file formats. If you have a really large website then the export will be split into 2GB files but your website is most likely much smaller than 2GB.

Once you have everything set correctly, you can click on the Create export button.

Export Your Google Sites Website to Your Computer

You will then be shown an export progress screen telling you that your will be sent an email when the export is complete. Even though it says it can possibly take days to complete, its usually fairly quick and might only take a few minutes.

Export Your Google Sites Website to Your Computer

Once you receive the email, you can click on the Download your files button or the Mange exports button to be taken back to the Google Takeout website to download your files.

Export Your Google Sites Website to Your Computer
Google Takeout Mange Exports

After you download the zip file, you can then extract it and navigate to the location of your website files.

Export Your Google Sites Website to Your Computer

Source :
https://onlinecomputertips.com/support-categories/software/export-your-google-sites-website-to-your-computer/

Part 2: Rethinking cache purge with a new architecture

21/06/2023

In Part 1: Rethinking Cache Purge, Fast and Scalable Global Cache Invalidation, we outlined the importance of cache invalidation and the difficulties of purging caches, how our existing purge system was designed and performed, and we gave a high level overview of what we wanted our new Cache Purge system to look like.

It’s been a while since we published the first blog post and it’s time for an update on what we’ve been working on. In this post we’ll be talking about some of the architecture improvements we’ve made so far and what we’re working on now.

Cache Purge end to end

We touched on the high level design of what we called the “coreless” purge system in part 1, but let’s dive deeper into what that design encompasses by following a purge request from end to end:

Step 1: Request received locally

An API request to Cloudflare is routed to the nearest Cloudflare data center and passed to an API Gateway worker. This worker looks at the request URL to see which service it should be sent to and forwards the request to the appropriate upstream backend. Most endpoints of the Cloudflare API are currently handled by centralized services, so the API Gateway worker is often just proxying requests to the nearest “core” data center which have their own gateway services to handle authentication, authorization, and further routing. But for endpoints which aren’t handled centrally the API Gateway worker must handle authentication and route authorization, and then proxy to an appropriate upstream. For cache purge requests that upstream is a Purge Ingest worker in the same data center.

Step 2: Purges tested locally

The Purge Ingest worker evaluates the purge request to make sure it is processible. It scans the URLs in the body of the request to see if they’re valid, then attempts to purge the URLs from the local data center’s cache. This concept of local purging was a new step introduced with the coreless purge system allowing us to capitalize on existing logic already used in every data center.

By leveraging the same ownership checks our data centers use to serve a zone’s normal traffic on the URLs being purged, we can determine if those URLs are even cacheable by the zone. Currently more than 50% of the URLs we’re asked to purge can’t be cached by the requesting zones, either because they don’t own the URLs (e.g. a customer asking us to purge https://cloudflare.com) or because the zone’s settings for the URL prevent caching (e.g. the zone has a “bypass” cache rule that matches the URL). All such purges are superfluous and shouldn’t be processed further, so we filter them out and avoid broadcasting them to other data centers freeing up resources to process more legitimate purges.

On top of that, generating the cache key for a file isn’t free; we need to load zone configuration options that might affect the cache key, apply various transformations, et cetera. The cache key for a given file is the same in every data center though, so when we purge the file locally we now return the generated cache key to the Purge Ingest worker and broadcast that key to other data centers instead of making each data center generate it themselves.

Step 3: Purges queued for broadcasting

purge request to small colo, ingest worker sends to queue worker in T1

Once the local purge is done the Purge Ingest worker forwards the purge request with the cache key obtained from the local cache to a Purge Queue worker. The queue worker is a Durable Object worker using its persistent state to hold a queue of purges it receives and pointers to how far along in the queue each data center in our network is in processing purges.

The queue is important because it allows us to automatically recover from a number of scenarios such as connectivity issues or data centers coming back online after maintenance. Having a record of all purges since an issue arose lets us replay those purges to a data center and “catch up”.

But Durable Objects are globally unique, so having one manage all global purges would have just moved our centrality problem from a core data center to wherever that Durable Object was provisioned. Instead we have dozens of Durable Objects in each region, and the Purge Ingest worker looks at the load balancing pool of Durable Objects for its region and picks one (often in the same data center) to forward the request to. The Durable Object will write the purge request to its queue and immediately loop through all the data center pointers and attempt to push any outstanding purges to each.

While benchmarking our performance we found our particular workload exhibited a “goldilocks zone” of throughput to a given Durable Object. On script startup we have to load all sorts of data like network topology and data center health–then refresh it continuously in the background–and as long as the Durable Object sees steady traffic it stays active and we amortize those startup costs. But if you ask a single Durable Object to do too much at once like send or receive too many requests, the single-threaded runtime won’t keep up. Regional purge traffic fluctuates a lot depending on local time of day, so there wasn’t a static quantity of Durable Objects per region that would let us stay within the goldilocks zone of enough requests to each to keep them active but not too many to keep them efficient. So we built load monitoring into our Durable Objects, and a Regional Autoscaler worker to aggregate that data and adjust load balancing pools when we start approaching the upper or lower edges of our efficiency goldilocks zone.

Step 4: Purges broadcast globally

multiple regions, durable object sends purges to fanouts in other regions, fanout sends to small colos in their region

Once a purge request is queued by a Purge Queue worker it needs to be broadcast to the rest of Cloudflare’s data centers to be carried out by their caches. The Durable Objects will broadcast purges directly to all data centers in their region, but when broadcasting to other regions they pick a Purge Fanout worker per region to take care of their region’s distribution. The fanout workers manage queues of their own as well as pointers for all of their region’s data centers, and in fact they share a lot of the same logic as the Purge Queue workers in order to do so. One key difference is fanout workers aren’t Durable Objects; they’re normal worker scripts, and their queues are purely in memory as opposed to being backed by Durable Object state. This means not all queue worker Durable Objects are talking to the same fanout worker in each region. Fanout workers can be dropped and spun up again quickly by any metal in the data center because they aren’t canonical sources of state. They maintain queues and pointers for their region but all of that info is also sent back downstream to the Durable Objects who persist that data themselves, reliably.

But what does the fanout worker get us? Cloudflare has hundreds of data centers all over the world, and as we mentioned above we benefit from keeping the number of incoming and outgoing requests for a Durable Object fairly low. Sending purges to a fanout worker per region means each Durable Object only has to make a fraction of the requests it would if it were broadcasting to every data center directly, which means it can process purges faster.

On top of that, occasionally a request will fail to get where it was going and require retransmission. When this happens between data centers in the same region it’s largely unnoticeable, but when a Durable Object in Canada has to retry a request to a data center in rural South Africa the cost of traversing that whole distance again is steep. The data centers elected to host fanout workers have the most reliable connections in their regions to the rest of our network. This minimizes the chance of inter-regional retries and limits the latency imposed by retries to regional timescales.

The introduction of the Purge Fanout worker was a massive improvement to our distribution system, reducing our end-to-end purge latency by 50% on its own and increasing our throughput threefold.

Current status of coreless purge

We are proud to say our new purge system has been in production serving purge by URL requests since July 2022, and the results in terms of latency improvements are dramatic. In addition, flexible purge requests (purge by tag/prefix/host and purge everything) share and benefit from the new coreless purge system’s entrypoint workers before heading to a core data center for fulfillment.

The reason flexible purge isn’t also fully coreless yet is because it’s a more complex task than “purge this object”; flexible purge requests can end up purging multiple objects–or even entire zones–from cache. They do this through an entirely different process that isn’t coreless compatible, so to make flexible purge fully coreless we would have needed to come up with an entirely new multi-purge mechanism on top of redesigning distribution. We chose instead to start with just purge by URL so we could focus purely on the most impactful improvements, revamping distribution, without reworking the logic a data center uses to actually remove an object from cache.

This is not to say that the flexible purges haven’t benefited from the coreless purge project. Our cache purge API lets users bundle single file and flexible purges in one request, so the API Gateway worker and Purge Ingest worker handle authorization, authentication and payload validation for flexible purges too. Those flexible purges get forwarded directly to our services in core data centers pre-authorized and validated which reduces load on those core data center auth services. As an added benefit, because authorization and validity checks all happen at the edge for all purge types users get much faster feedback when their requests are malformed.

Next steps

While coreless cache purge has come a long way since the part 1 blog post, we’re not done. We continue to work on reducing end-to-end latency even more for purge by URL because we can do better. Alongside improvements to our new distribution system, we’ve also been working on the redesign of flexible purge to make it fully coreless, and we’re really excited to share the results we’re seeing soon. Flexible cache purge is an incredibly popular API and we’re giving its refresh the care and attention it deserves.

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet applicationward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

Source :
https://blog.cloudflare.com/rethinking-cache-purge-architecture/

Part 1: Rethinking Cache Purge, Fast and Scalable Global Cache Invalidation

14/05/2022

There is a famous quote attributed to a Netscape engineer: “There are only two difficult problems in computer science: cache invalidation and naming things.” While naming things does oddly take up an inordinate amount of time, cache invalidation shouldn’t.

In the past we’ve written about Cloudflare’s incredibly fast response times, whether content is cached on our global network or not. If content is cached, it can be served from a Cloudflare cache server, which are distributed across the globe and are generally a lot closer in physical proximity to the visitor. This saves the visitor’s request from needing to go all the way back to an origin server for a response. But what happens when a webmaster updates something on their origin and would like these caches to be updated as well? This is where cache “purging” (also known as “invalidation”) comes in.

Customers thinking about setting up a CDN and caching infrastructure consider questions like:

  • How do different caching invalidation/purge mechanisms compare?
  • How many times a day/hour/minute do I expect to purge content?
  • How quickly can the cache be purged when needed?

This blog will discuss why invalidating cached assets is hard, what Cloudflare has done to make it easy (because we care about your experience as a developer), and the engineering work we’re putting in this year to make the performance and scalability of our purge services the best in the industry.

What makes purging difficult also makes it useful

(i) Scale
The first thing that complicates cache invalidation is doing it at scale. With data centers in over 270 cities around the globe, our most popular users’ assets can be replicated at every corner of our network. This also means that a purge request needs to be distributed to all data centers where that content is cached. When a data center receives a purge request, it needs to locate the cached content to ensure that subsequent visitor requests for that content are not served stale/outdated data. Requests for the purged content should be forwarded to the origin for a fresh copy, which is then re-cached on its way back to the user.

This process repeats for every data center in Cloudflare’s fleet. And due to Cloudflare’s massive network, maintaining this consistency when certain data centers may be unreachable or go offline, is what makes purging at scale difficult.

Making sure that every data center gets the purge command and remains up-to-date with its content logs is only part of the problem. Getting the purge request to data centers quickly so that content is updated uniformly is the next reason why cache invalidation is hard.  

(ii) Speed
When purging an asset, race conditions abound. Requests for an asset can happen at any time, and may not follow a pattern of predictability. Content can also change unpredictably. Therefore, when content changes and a purge request is sent, it must be distributed across the globe quickly. If purging an individual asset, say an image, takes too long, some visitors will be served the new version, while others are served outdated content. This data inconsistency degrades user experience, and can lead to confusion as to which version is the “right” version. Websites can sometimes even break in their entirety due to this purge latency (e.g. by upgrading versions of a non-backwards compatible JavaScript library).

Purging at speed is also difficult when combined with Cloudflare’s massive global footprint. For example, if a purge request is traveling at the speed of light between Tokyo and Cape Town (both cities where Cloudflare has data centers), just the transit alone (no authorization of the purge request or execution) would take over 180ms on average based on submarine cable placement. Purging a smaller network footprint may reduce these speed concerns while making purge times appear faster, but does so at the expense of worse performance for customers who want to make sure that their cached content is fast for everyone.

(iii) Scope
The final thing that makes purge difficult is making sure that only the unneeded web assets are invalidated. Maintaining a cache is important for egress cost savings and response speed. Webmasters’ origins could be knocked over by a thundering herd of requests, if they choose to purge all content needlessly. It’s a delicate balance of purging just enough: too much can result in both monetary and downtime costs, and too little will result in visitors receiving outdated content.

At Cloudflare, what to invalidate in a data center is often dictated by the type of purge. Purge everything, as you could probably guess, purges all cached content associated with a website. Purge by prefix purges content based on a URL prefix. Purge by hostname can invalidate content based on a hostname. Purge by URL or single file purge focuses on purging specified URLs. Finally, Purge by tag purges assets that are marked with Cache-Tag headers. These markers offer webmasters flexibility in grouping assets together. When a purge request for a tag comes into a data center, all assets marked with that tag will be invalidated.

With that overview in mind, the remainder of this blog will focus on putting each element of invalidation together to benchmark the performance of Cloudflare’s purge pipeline and provide context for what performance means in the real-world. We’ll be reviewing how fast Cloudflare can invalidate cached content across the world. This will provide a baseline analysis for how quick our purge systems are presently, which we will use to show how much we will improve by the time we launch our new purge system later this year.

How does purge work currently?

In general, purge takes the following route through Cloudflare’s data centers.

  • A purge request is initiated via the API or UI. This request specifies how our data centers should identify the assets to be purged. This can be accomplished via cache-tag header(s), URL(s), entire hostnames, and much more.
  • The request is received by any Cloudflare data center and is identified to be a purge request. It is then routed to a Cloudflare core data center (a set of a few data centers responsible for network management activities).
  • When a core data center receives it, the request is processed by a number of internal services that (for example) make sure the request is being sent from an account with the appropriate authorization to purge the asset. Following this, the request gets fanned out globally to all Cloudflare data centers using our distribution service.
  • When received by a data center, the purge request is processed and all assets with the matching identification criteria are either located and removed, or marked as stale. These stale assets are not served in response to requests and are instead re-pulled from the origin.
  • After being pulled from the origin, the response is written to cache again, replacing the purged version.

Now let’s look at this process in practice. Below we describe Cloudflare’s purge benchmarking that uses real-world performance data from our purge pipeline.

Benchmarking purge performance design

In order to understand how performant Cloudflare’s purge system is, we measured the time it took from sending the purge request to the moment that the purge is complete and the asset is no longer served from cache.  

In general, the process of measuring purge speeds involves: (i) ensuring that a particular piece of content is cached, (ii) sending the command to invalidate the cache, (iii) simultaneously checking our internal system logs for how the purge request is routed through our infrastructure, and (iv) measuring when the asset is removed from cache (first miss).

This process measures how quickly cache is invalidated from the perspective of an average user.

  • Clock starts
    As noted above, in this experiment we’re using sampled RUM data from our purge systems. The goal of this experiment is to benchmark current data for how long it can take to purge an asset on Cloudflare across different regions. Once the asset was cached in a region on Cloudflare, we identify when a purge request is received for that asset. At that same instant, the clock started for this experiment. We include in this time any retrys that we needed to make (due to data centers missing the initial purge request) to ensure that the purge was done consistently across our network. The clock continues as the request transits our purge pipeline  (data center > core > fanout > purge from all data centers).  
  • Clock stops
    What caused the clock to stop was the purged asset being removed from cache, meaning that the data center is no longer serving the asset from cache to visitor’s requests. Our internal logging measures the precise moment that the cache content has been removed or expired and from that data we were able to determine the following benchmarks for our purge types in various regions.  

Results

We’ve divided our benchmarks in two ways: by purge type and by region.

We singled out Purge by URL because it identifies a single target asset to be purged. While that asset can be stored in multiple locations, the amount of data to be purged is strictly defined.

We’ve combined all other types of purge (everything, tag, prefix, hostname) together because the amount of data to be removed is highly variable. Purging a whole website or by assets identified with cache tags could mean we need to find and remove a multitude of content from many different data centers in our network.

Secondly, we have segmented our benchmark measurements by regions and specifically we confined the benchmarks to specific data center servers in the region because we were concerned about clock skews between different data centers. This is the reason why we limited the test to the same cache servers so that even if there was skew, they’d all be skewed in the same way.  

We took the latency from the representative data centers in each of the following regions and the global latency. Data centers were not evenly distributed in each region, but in total represent about 90 different cities around the world:  

  • Africa
  • Asia Pacific Region (APAC)
  • Eastern Europe (EEUR)
  • Eastern North America (ENAM)
  • Oceania
  • South America (SA)
  • Western Europe (WEUR)
  • Western North America (WNAM)

The global latency numbers represent the purge data from all Cloudflare data centers in over 270 cities globally. In the results below, global latency numbers may be larger than the regional numbers because it represents all of our data centers instead of only a regional portion so outliers and retries might have an outsized effect.

Below are the results for how quickly our current purge pipeline was able to invalidate content by purge type and region. All times are represented in seconds and divided into P50, P75, and P99 quantiles. Meaning for “P50” that 50% of the purges were at the indicated latency or faster.  

Purge By URL

P50P75P99
AFRICA0.95s1.94s6.42s
APAC0.91s1.87s6.34s
EEUR0.84s1.66s6.30s
ENAM0.85s1.71s6.27s
OCEANIA0.95s1.96s6.40s
SA0.91s1.86s6.33s
WEUR0.84s1.68s6.30s
WNAM0.87s1.74s6.25s
GLOBAL1.31s1.80s6.35s

Purge Everything, by Tag, by Prefix, by Hostname

P50P75P99
AFRICA1.42s1.93s4.24s
APAC1.30s2.00s5.11s
EEUR1.24s1.77s4.07s
ENAM1.08s1.62s3.92s
OCEANIA1.16s1.70s4.01s
SA1.25s1.79s4.106s
WEUR1.19s1.73s4.04s
WNAM0.9995s1.53s3.83s
GLOBAL1.57s2.32s5.97s

A general note about these benchmarks — the data represented here was taken from over 48 hours (two days) of RUM purge latency data in May 2022. If you are interested in how quickly your content can be invalidated on Cloudflare, we suggest you test our platform with your website.

Those numbers are good and much faster than most of our competitors. Even in the worst case, we see the time from when you tell us to purge an item to when it is removed globally is less than seven seconds. In most cases, it’s less than a second. That’s great for most applications, but we want to be even faster. Our goal is to get cache purge to as close as theoretically possible to the speed of light limit for a network our size, which is 200ms.

Intriguingly, LEO satellite networks may be able to provide even lower global latency than fiber optics because of the straightness of the paths between satellites that use laser links. We’ve done calculations of latency between LEO satellites that suggest that there are situations in which going to space will be the fastest path between two points on Earth. We’ll let you know if we end up using laser-space-purge.

Just as we have with network performance, we are going to relentlessly measure our cache performance as well as the cache performance of our competitors. We won’t be satisfied until we verifiably are the fastest everywhere. To do that, we’ve built a new cache purge architecture which we’re confident will make us the fastest cache purge in the industry.

Our new architecture

Through the end of 2022, we will continue this blog series incrementally showing how we will become the fastest, most-scalable purge system in the industry. We will continue to update you with how our purge system is developing  and benchmark our data along the way.

Getting there will involve rearchitecting and optimizing our purge service, which hasn’t received a systematic redesign in over a decade. We’re excited to do our development in the open, and bring you along on our journey.

So what do we plan on updating?

Introducing Coreless Purge

The first version of our cache purge system was designed on top of a set of central core services including authorization, authentication, request distribution, and filtering among other features that made it a high-reliability service. These core components had ultimately become a bottleneck in terms of scale and performance as our network continues to expand globally. While most of our purge dependencies have been containerized, the message queue used was still running on bare metals, which led to increased operational overhead when our system needed to scale.

Last summer, we built a proof of concept for a completely decentralized cache invalidation system using in-house tech – Cloudflare Workers and Durable Objects. Using Durable Objects as a queuing mechanism gives us the flexibility to scale horizontally by adding more Durable Objects as needed and can reduce time to purge with quick regional fanouts of purge requests.

In the new purge system we’re ripping out the reliance on core data centers and moving all that functionality to every data center, we’re calling it coreless purge.

Here’s a general overview of how coreless purge will work:

  • A purge request will be initiated via the API or UI. This request will specify how we should identify the assets to be purged.
  • The request will be routed to the nearest Cloudflare data center where it is identified to be a purge request and be passed to a Worker that will perform several of the key functions that currently occur in the core (like authorization, filtering, etc).
  • From there, the Worker will pass the purge request to a Durable Object in the data center. The Durable Object will queue all the requests and broadcast them to every data center when they are ready to be processed.
  • When the Durable Object broadcasts the purge request to every data center, another Worker will pass the request to the service in the data center that will invalidate the content in cache (executes the purge).

We believe this re-architecture of our system built by stringing together multiple services from the Workers platform will help improve both the speed and scalability of the purge requests we will be able to handle.

Conclusion

We’re going to spend a lot of time building and optimizing purge because, if there’s one thing we learned here today, it’s that cache invalidation is a difficult problem but those are exactly the types of problems that get us out of bed in the morning.

If you want to help us optimize our purge pipeline, we’re hiring.

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet applicationward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

Source :
https://blog.cloudflare.com/part1-coreless-purge/

All the way up to 11: Serve Brotli from origin and Introducing Compression Rules

23/06/2023

Throughout Speed Week, we have talked about the importance of optimizing performance. Compression plays a crucial role by reducing file sizes transmitted over the Internet. Smaller file sizes lead to faster downloads, quicker website loading, and an improved user experience.

Take household cleaning products as a real world example. It is estimated “a typical bottle of cleaner is 90% water and less than 10% actual valuable ingredients”. Removing 90% of a typical 500ml bottle of household cleaner reduces the weight from 600g to 60g. This reduction means only a 60g parcel, with instructions to rehydrate on receipt, needs to be sent. Extrapolated into the gallons, this weight reduction soon becomes a huge shipping saving for businesses. Not to mention the environmental impact.

This is how compression works. The sender compresses the file to its smallest possible size, and then sends the smaller file with instructions on how to handle it when received. By reducing the size of the files sent, compression ensures the amount of bandwidth needed to send files over the Internet is a lot less. Where files are stored in expensive cloud providers like AWS, reducing the size of files sent can directly equate to significant cost savings on bandwidth.

Smaller file sizes are also particularly beneficial for end users with limited Internet connections, such as mobile devices on cellular networks or users in areas with slow network speeds.

Cloudflare has always supported compression in the form of Gzip. Gzip is a widely used compression algorithm that has been around since 1992 and provides file compression for all Cloudflare users. However, in 2013 Google introduced Brotli which supports higher compression levels and better performance overall. Switching from gzip to Brotli results in smaller file sizes and faster load times for web pages. We have supported Brotli since 2017 for the connection between Cloudflare and client browsers. Today we are announcing end-to-end Brotli support for web content: support for Brotli compression, at the highest possible levels, from the origin server to the client.

If your origin server supports Brotli, turn it on, crank up the compression level, and enjoy the performance boost.

Brotli compression to 11

Brotli has 12 levels of compression ranging from 0 to 11, with 0 providing the fastest compression speed but the lowest compression ratio, and 11 offering the highest compression ratio but requiring more computational resources and time. During our initial implementation of Brotli five years ago, we identified that compression level 4 offered the balance between bytes saved and compression time without compromising performance.

Since 2017, Cloudflare has been using a maximum compression of Brotli level 4 for all compressible assets based on the end user’s “accept-encoding” header. However, one issue was that Cloudflare only requested Gzip compression from the origin, even if the origin supported Brotli. Furthermore, Cloudflare would always decompress the content received from the origin before compressing and sending it to the end user, resulting in additional processing time. As a result, customers were unable to fully leverage the benefits offered by Brotli compression.

Old world

With Cloudflare now fully supporting Brotli end to end, customers will start seeing our updated accept-encoding header arriving at their origins. Once available customers can transfer, cache and serve heavily compressed Brotli files directly to us, all the way up to the maximum level of 11. This will help reduce latency and bandwidth consumption. If the end user device does not support Brotli compression, we will automatically decompress the file and serve it either in its decompressed format or as a Gzip-compressed file, depending on the Accept-Encoding header.

Full end-to-end Brotli compression support

End user cannot support Brotli compression

Customers can implement Brotli compression at their origin by referring to the appropriate online materials. For example, customers that are using NGINX, can implement Brotli by following this tutorial and setting compression at level 11 within the nginx.conf configuration file as follows:

brotli on;
brotli_comp_level 11;
brotli_static on;
brotli_types text/plain text/css application/javascript application/x-javascript text/xml 
application/xml application/xml+rss text/javascript image/x-icon 
image/vnd.microsoft.icon image/bmp image/svg+xml;

Cloudflare will then serve these assets to the client at the exact same compression level (11) for the matching file brotli_types. This means any SVG or BMP images will be sent to the client compressed at Brotli level 11.

Testing

We applied compression against a simple CSS file, measuring the impact of various compression algorithms and levels. Our goal was to identify potential improvements that users could experience by optimizing compression techniques. These results can be seen in the following table:

TestSize (bytes)% Reduction of original file (Higher % better)
Uncompressed response (no compression used)2,747
Cloudflare default Gzip compression (level 8)1,12159.21%
Cloudflare default Brotli compression (level 4)1,11059.58%
Compressed with max Gzip level (level 9)1,12159.21%
Compressed with max Brotli level (level 11)90966.94%

By compressing Brotli at level 11 users are able to reduce their file sizes by 19% compared to the best Gzip compression level. Additionally, the strongest Brotli compression level is around 18% smaller than the default level used by Cloudflare. This highlights a significant size reduction achieved by utilizing Brotli compression, particularly at its highest levels, which can lead to improved website performance, faster page load times and an overall reduction in egress fees.

To take advantage of higher end to end compression rates the following Cloudflare proxy features need to be disabled.

  • Email Obfuscation
  • Rocket Loader
  • Server Side Excludes (SSE)
  • Mirage
  • HTML Minification – JavaScript and CSS can be left enabled.
  • Automatic HTTPS Rewrites

This is due to Cloudflare needing to decompress and access the body to apply the requested settings. Alternatively a customer can disable these features for specific paths using Configuration Rules.

If any of these rewrite features are enabled, your origin can still send Brotli compression at higher levels. However, we will decompress, apply the Cloudflare feature(s) enabled, and recompress on the fly using Cloudflare’s default Brotli level 4 or Gzip level 8 depending on the user’s accept-encoding header.

For browsers that do not accept Brotli compression, we will continue to decompress and send Gzipped responses or uncompressed.

Implementation

The initial step towards implementing Brotli from the origin involved constructing a decompression module that could be integrated into Cloudflare software stack. It allows us to efficiently convert the compressed bits received from the origin into the original, uncompressed file. This step was crucial as numerous features such as Email Obfuscation and Cloudflare Workers Customers, rely on accessing the body of a response to apply customizations.

We integrated the decompressor into  the core reverse web proxy of Cloudflare. This integration ensured that all Cloudflare products and features could access Brotli decompression effortlessly. This also allowed our Cloudflare Workers team to incorporate Brotli Directly into Cloudflare Workers allowing our Workers customers to be able to interact with responses returned in Brotli or pass through to the end user unmodified.

Introducing Compression rules – Granular control of compression to end users

By default Cloudflare compresses certain content types based on the Content-Type header of the file. Today we are also announcing Compression Rules for our Enterprise Customers to allow you even more control on how and what Cloudflare will compress.

Today we are also announcing the introduction of Compression Rules for our Enterprise Customers. With Compression Rules, you gain enhanced control over Cloudflare’s compression capabilities, enabling you to customize how and which content Cloudflare compresses to optimize your website’s performance.

For example, by using Cloudflare’s Compression Rules for .ktx files, customers can optimize the delivery of textures in webGL applications, enhancing the overall user experience. Enabling compression minimizes the bandwidth usage and ensures that webGL applications load quickly and smoothly, even when dealing with large and detailed textures.

Alternatively customers can disable compression or specify a preference of how we compress. Another example could be an Infrastructure company only wanting to support Gzip for their IoT devices but allow Brotli compression for all other hostnames.

Compression rules use the filters that our other rules products are built on top of with the added fields of Media Type and Extension type. Allowing users to easily specify the content you wish to compress.

Deprecating the Brotli toggle

Brotli has been long supported by some web browsers since 2016 and Cloudflare offered Brotli Support in 2017. As with all new web technologies Brotli was unknown and we gave customers the ability to selectively enable or disable BrotlI via the API and our UI.

Now that Brotli has evolved and is supported by all browsers, we plan to enable Brotli on all zones by default in the coming months. Mirroring the Gzip behavior we currently support and removing the toggle from our dashboard. If browsers do not support Brotli, Cloudflare will continue to support their accepted encoding types such as Gzip or uncompressed and Enterprise customers will still be able to use Compression rules to granularly control how we compress data towards their users.

The future of web compression

We’ve seen great adoption and great performance for Brotli as the new compression technique for the web. Looking forward, we are closely following trends and new compression algorithms such as zstd as a possible next-generation compression algorithm.

At the same time, we’re looking to improve Brotli directly where we can. One development that we’re particularly focused on is shared dictionaries with Brotli. Whenever you compress an asset, you use a “dictionary” that helps the compression to be more efficient. A simple analogy of this is typing OMW into an iPhone message. The iPhone will automatically translate it into On My Way using its own internal dictionary.

OMW
OnMyWay

This internal dictionary has taken three characters and morphed this into nine characters (including spaces) The internal dictionary has saved six characters which equals performance benefits for users.

By default, the Brotli RFC defines a static dictionary that both clients and the origin servers use. The static dictionary was designed to be general purpose and apply to everyone. Optimizing the size of the dictionary as to not be too large whilst able to generate best compression results. However, what if an origin could generate a bespoke dictionary tailored to a specific website? For example a Cloudflare-specific dictionary would allow us to compress the words and phrases that appear repeatedly on our site such as the word “Cloudflare”. The bespoke dictionary would be designed to compress this as heavily as possible and the browser using the same dictionary would be able to translate this back.

new proposal by the Web Incubator CG aims to do just that, allowing you to specify your own dictionaries that browsers can use to allow websites to optimize compression further. We’re excited about contributing to this proposal and plan on publishing our research soon.

Try it now

Compression Rules are available now! With End to End Brotli being rolled out over the coming weeks. Allowing you to improve performance, reduce bandwidth and granularly control how Cloudflare handles compression to your end users.

Watch on Cloudflare TV

https://customer-rhnwzxvb3mg4wz3v.cloudflarestream.com/f1c71fdb05263b5ec3077e6e7acdb7e2/iframe?preload=true&poster=https%3A%2F%2Fcustomer-rhnwzxvb3mg4wz3v.cloudflarestream.com%2Ff1c71fdb05263b5ec3077e6e7acdb7e2%2Fthumbnails%2Fthumbnail.jpg%3Ftime%3D1s%26height%3D600

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet applicationward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

Source :
https://blog.cloudflare.com/this-is-brotli-from-origin/

Speeding up your (WordPress) website is a few clicks away

22/06/2023

Every day, website visitors spend far too much time waiting for websites to load in their browsers. This waiting is partially due to browsers not knowing which resources are critically important so they can prioritize them ahead of less-critical resources. In this blog we will outline how millions of websites across the Internet can improve their performance by specifying which critical content loads first with Cloudflare Workers and what Cloudflare will do to make this easier by default in the future.

Popular Content Management Systems (CMS) like WordPress have made attempts to influence website resource priority, for example through techniques like lazy loading images. When done correctly, the results are magical. Performance is optimized between the CMS and browser without needing to implement any changes or coding new prioritization strategies. However, we’ve seen that these default priorities have opportunities to improve greatly.

In this co-authored blog with Google’s Patrick Meenan we will explain where the opportunities exist to improve website performance, how to check if a specific site can improve performance, and provide a small JavaScript snippet which can be used with Cloudflare Workers to do this optimization for you.

What happens when a browser receives the response?

Before we dive into where the opportunities are to improve website performance, let’s take a step back to understand how browsers load website assets by default.

After the browser sends a HTTP request to a server, it receives a HTTP response containing information like status codes, headers, and the requested content. The browser carefully analyzes the response’s status code and response headers to ensure proper handling of the content.

Next, the browser processes the content itself. For HTML responses, the browser extracts important information from the <head> section of the HTML, such as the page title, stylesheets, and scripts. Once this information is parsed, the browser moves on to the response <body> which has the actual page content. During this stage, the browser begins to present the webpage to the visitor.

If the response includes additional 3rd party resources like CSS, JavaScript, or other content, the browser may need to fetch and integrate them into the webpage. Typically, browsers like Google Chrome delay loading images until after the resources in the HTML <head> have loaded. This is also known as “blocking” the render of the webpage. However, developers can override this blocking behavior using fetch priority or other methods to boost other content’s priority in the browser. By adjusting an important image’s fetch priority, it can be loaded earlier, which can lead to significant improvements in crucial performance metrics like LCP (Largest Contentful Paint).

Images are so central to web pages that they have become an essential element in measuring website performance from Core Web Vitals. LCP measures the time it takes for the largest visible element, often an image, to be fully rendered on the screen. Optimizing the loading of critical images (like LCP images) can greatly enhance performance, improving the overall user experience and page performance.

But here’s the challenge – a browser may not know which images are the most important for the visitor experience (like the LCP image) until rendering begins. If the developer can identify the LCP image or critical elements before it reaches the browser, its priority can be increased at the server to boost website performance instead of waiting for the browser to naturally discover the critical images.

In our Smart Hints blog, we describe how Cloudflare will soon be able to automatically prioritize content on behalf of website developers, but what happens if there’s a need to optimize the priority of the images right now? How do you know if a website is in a suboptimal state and what can you do to improve?

Using Cloudflare, developers should be able to improve image performance with heuristics that identify likely-important images before the browser parses them so these images can have increased priority and be loaded sooner.

Identifying Image Priority opportunities

Just increasing the fetch priority of all images won’t help if they are lazy-loaded or not critical/LCP images. Lazy-loading is a method that developers use to generally improve the initial load of a webpage if it includes numerous out-of-view elements. For example, on Instagram, when you continually scroll down the application to see more images, it would only make sense to load those images when the user arrives at them otherwise the performance of the page load would be needlessly delayed by the browser eagerly loading these out-of-view images. Instead the highest priority should be given to the LCP image in the viewport to improve performance.

So developers are left in a situation where they need to know which images are on users’ screens/viewports to increase their priority and which are off their screens to lazy-load them.

Recently, we’ve seen attempts to influence image priority on behalf of developers. For example, by default, in WordPress 5.5 all images with an IMG tag and aspect ratios were directed to be lazy-loaded. While there are plugins and other methods WordPress developers can use to boost the priority of LCP images, lazy-loading all images in a default manner and not knowing which are LCP images can cause artificial performance delays in website performance (they’re working on this though, and have partially resolved this for block themes).

So how do we identify the LCP image and other critical assets before they get to the browser?

To evaluate the opportunity to improve image performance, we turned to the HTTP Archive. Out of the approximately 22 million desktop pages tested in February 2023, 46% had an LCP element with an IMG tag. Meaning that for page load metrics, LCP had an image included about half the time. Though, among these desktop pages, 8.5 million had the image in the static HTML delivered with the page, indicating a total potential improvement opportunity of approximately 39% of the desktop pages within the dataset.

In the case of mobile pages, out of the ~28.5 million tested, 40% had an LCP element as an IMG tag. Among these mobile pages, 10.3 million had the image in the static HTML delivered with the page, suggesting a potential improvement opportunity in around 36% of the mobile pages within the dataset.

However, as previously discussed, prioritizing an image won’t be effective if the image is lazy-loaded because the directives are contradictory. In the dataset,  approximately 1.8 million LCP desktop images and 2.4 million LCP mobile images were lazy-loaded.

Therefore, across the Internet, the opportunity to improve image performance would be about ~30% of pages that have an LCP image in the original HTML markup that weren’t lazy-loaded, but with a more advanced Cloudflare Worker, the additional 9% of lazy-loaded LCP images can also be improved improved by removing the lazy-load attribute.

If you’d like to determine which element on your website serves as the LCP element so you can increase the priority or remove any lazy-loading, you can use browser developer tools, or speed tests like Webpagetest or Cloudflare Observatory.

39% of desktop images seems like a lot of opportunity to improve image performance. So the next question is how can Cloudflare determine the LCP image across our network and automatically prioritize them?

Image Index

We thought that how soon the LCP image showed up in the HTML would serve as a useful indicator. So we analyzed the HTTP Archive dataset to see where the cumulative percentage of LCP images are discovered based on their position in the HTML, including lazy-loaded images.

We found that approximately 25% of the pages had the LCP image as the first image in the HTML (around 10% of all pages). Another 25% had the LCP image as the second image. WordPress seemed to arrive at a similar conclusion and recently released a development to remove the default lazy-load attribute from the first image on block themes, but there are opportunities to go further.

Our analysis revealed that implementing a straightforward rule like “do not lazy-load the first four images,” either through the browser, a content management system (CMS), or a Cloudflare Worker could address approximately 75% of the issue of lazy-loading LCP images (example Worker below).

Ignoring small images

In trying to find other ways to identify likely LCP images we next turned to the size of the image. To increase the likelihood of getting the LCP image early in the HTML, we looked into ignoring “small” images as they are unlikely to be big enough to be a LCP element. We explored several sizes and 10,000 pixels (less than 100×100) was a pretty reliable threshold that didn’t skip many LCP images and avoided a good chunk of the non-LCP images.

By ignoring small images (<10,000px), we found that the first image became the LCP image in approximately 30-34% of cases. Adding the second image increased this percentage to 56-60% of pages.

Therefore, to improve image priority, a potential approach could involve assigning a higher priority to the first four “not-small” images.

Chrome 114 Image Prioritization Experiment

An experiment running in Chrome 114 does exactly what we described above. Within the browser there are a few different prioritization knobs to play with that aren’t web-exposed so we have the opportunity to assign a “medium” priority to images that we want to boost automatically (directly controlling priority with “fetch priority” lets you set high or low). This will let us move the images ahead of other images, async scripts and parser-blocking scripts late in the body but still keep the boosted image priority below any high-priority requests, particularly dynamically-injected blocking scripts.

We are experimenting with boosting the priority of varying numbers of images (2, 5 and 10) and with allowing one of those medium-priority images to load at a time during Chromes “tight” mode (when it is loading the render-blocking resources in the head) to increase the likelihood that the LCP image will be available when the first paint is done.

The data is still coming in and no “ship” decisions have been made yet but the early results are very promising, improving the LCP time across the entire web for all arms of the experiment (not by massive amounts but moving the metrics of the whole web is notoriously difficult).

How to use Cloudflare Workers to boost performance

Now that we’ve seen that there is a large opportunity across the Internet for helping prioritize images for performance and how to identify images on individual pages that are likely LCP images, the question becomes, what would the results be of implementing a network-wide rule that could boost image priority from this study?

We built a test worker and deployed it on some WordPress test sites with our friends at Rocket.net, a WordPress hosting platform focused on performance. This worker boosts the priority of the first four images while removing the lazy-load attribute, if present. When deployed we saw good performance results and the expected image prioritization.

export default {
  async fetch(request) {
    const response = await fetch(request);
 
    // Check if the response is HTML
    const contentType = response.headers.get('Content-Type');
    if (!contentType || !contentType.includes('text/html')) {
      return response;
    }
 
    const transformedResponse = transformResponse(response);
 
    // Return the transformed response with streaming enabled
    return transformedResponse;
  },
};
 
async function transformResponse(response) {
  // Create an HTMLRewriter instance and define the image transformation logic
  const rewriter = new HTMLRewriter()
    .on('img', new ImageElementHandler());
 
  const transformedBody = await rewriter.transform(response).text()
 
  const transformresponse = new Response(transformedBody, response)
 
  // Return the transformed response with streaming enabled
  return transformresponse
}
 
class ImageElementHandler {
  constructor() {
    this.imageCount = 0;
    this.processedImages = new Set();
  }
 
  element(element) {
    const imgSrc = element.getAttribute('src');
 
    // Check if the image is small based on Chrome's criteria
    if (imgSrc && this.imageCount < 4 && !this.processedImages.has(imgSrc) && !isImageSmall(element)) {
      element.removeAttribute('loading');
      element.setAttribute('fetchpriority', 'high');
      this.processedImages.add(imgSrc);
      this.imageCount++;
    }
  }
}
 
function isImageSmall(element) {
  // Check if the element has width and height attributes
  const width = element.getAttribute('width');
  const height = element.getAttribute('height');
 
  // If width or height is 0, or width * height < 10000, consider the image as small
  if ((width && parseInt(width, 10) === 0) || (height && parseInt(height, 10) === 0)) {
    return true;
  }
 
  if (width && height) {
    const area = parseInt(width, 10) * parseInt(height, 10);
    if (area < 10000) {
      return true;
    }
  }
 
  return false;
}

When testing the Worker, we saw that default image priority was boosted into “high” for the first four images and the fifth image remained “low.” This resulted in an LCP range of “good” from a speed test. While this initial test is not a dispositive indicator that the Worker will boost performance in every situation, the results are promising and we look forward to continuing to experiment with this idea.

While we’ve experimented with WordPress sites to illustrate the issues and potential performance benefits, this issue is present across the Internet.

Website owners can help us experiment with the Worker above to improve the priority of images on their websites or edit it to be more specific by targeting likely LCP elements. Cloudflare will continue experimenting using a very similar process to understand how to safely implement a network-wide rule to ensure that images are correctly prioritized across the Internet and performance is boosted without the need to configure a specific Worker.

Automatic Platform Optimization

Cloudflare’s Automatic Platform Optimization (APO) is a plugin for WordPress which allows Cloudflare to deliver your entire WordPress site from our network ensuring consistent, fast performance for visitors. By serving cached sites, APO can improve performance metrics. APO does not currently have a way to prioritize images over other assets to improve browser render metrics or dynamically rewrite HTML, techniques we’ve discussed in this post. Although this presents a potential opportunity for future development, it requires thorough testing to ensure safe and reliable support.

In the future we’ll look to include the techniques discussed today as part of APO, however in the meantime we recommend using Snippets (and Experiments) to test with the code example above to see the performance impact on your website.

Get in touch!

If you are interested in using the JavaScript above, we recommended testing with Workers or using Cloudflare Snippets. We’d love to hear from you on what your results were. Get in touch via social media and share your experiences.

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet applicationward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

Source :
https://blog.cloudflare.com/speeding-up-your-website-in-a-few-clicks/