About two weeks ago, the Google Project Zero team found a buffer overflow in a Cloudflare product that helps its customers protect against content-scraping programs that pull copyrighted material into aggregators.
According to published reports, a comparison in the code should have used “>=” instead of “==.” While seemingly innocuous, this single misstep allowed the contents of memory to leak into responses. The same reports pointed out that less than 0.000003 percent of Cloudflare’s customer traffic was potentially exposed. However, with more than 5.5 million websites hosted behind Cloudflare’s CDN service, that still amounts to significant exposure for many applications that most of us use every day.
To the Google Project Zero team’s credit, it actively hunts for major security issues and has many success stories demonstrating its acumen for identifying vulnerabilities. However, the short duration between discovery and notice to the world could impact Cloudflare’s ability to conduct a responsible incident response process: determining the blast radius and notifying customers.
In reaction, shortly after the Cloudbleed revelation, Armor threat researchers were quick to pull the list of potentially impacted domains posted to GitHub and parse it against our customer list. Fortunately, only a single Armor customer was impacted. However, not every Cloudflare customer has a threat intel team at the ready with the capacity to help them be proactive like our Armor Threat Resistance Unit (TRU).
For now, there have been no public statements from Cloudflare on how a product with such a significant security flaw made it to a production environment. In a security-focused Software Development Lifecycle, there should have been at least two opportunities during the Quality Assurance process to catch a mistake like this.
Software Development Best Practices
Before code is compiled, it is a security best practice to conduct a static code/program analysis scan to find common mistakes, such as the faulty bounds check behind this buffer overflow. Later in the development process, after the code is compiled for execution, a second test, called dynamic code/program analysis, is performed to find operational weaknesses that might not be caught during the static testing.
The way most software is engineered today, developers stitch many code libraries together to create software functionality, and many of those libraries carry flawed code unknown to the developer. Once these libraries are combined into a running program, dynamic testing must take place to understand which ports and protocols are used between the application, the database and the web servers/load balancers, and to penetration-test the platform.
It also looks for common mistakes such as the potential for SQL injection. There are many automated tools (both open source and paid products) that work well together to perform static and dynamic code/program analysis. While I have no insight into Cloudflare’s engineering processes, this commonly used process does prevent mistakes like this from reaching production and is essential to any secure software development lifecycle.
While this story has dominated the cyber security news cycle in recent days, luckily for Cloudflare it will most likely soon be replaced by another global vulnerability that sets off a scramble to mitigate and patch. The last two years have been filled with *.bleed incidents that have posed great risk to everyone required to use the internet to conduct business. The root cause of these global security risks seems to follow the same story: “a simple mistake was missed during the development process…again.”
For the most part, these issues stem from “fast-track” software development that leverages the efficiencies of pre-existing code without holistically evaluating how those pieces fit together, or anticipating the gaps that could lead to vulnerabilities. While speed is an important aspect of software development, DevOps must keep security in mind as a baseline to keep systems safe.
We are lucky this time that a friendly threat research team discovered this flaw before a criminal or nation-state actor capitalized on it first.