Heat Map Love – R Style

January 20, 2012

Over the last several years not a month has gone by where I have not heard someone mention R – with regards to risk analysis or risk modeling – either in discussion or on a mailing list. If you do not know what R is, take a few minutes to read about it at the project’s main site. Simply put, R is a free software environment for statistical computing and graphics. Most of my quantitative modeling and analysis has been strictly Excel-based, which to date has been more then sufficient for my needs. However, Excel is not the ‘end-all-be-all’ tool. Excel does not contain every statistical distribution that risk practitioners may need to work with, there is no native Monte Carlo engine and it does have graphing limitations short of purchasing third party add-ons (advanced charts, granular configuration of graphs, etc…).

Thanks to some industry peer prodding (Jay Jacobs of Verizon’s Risk Intelligence team and Alex Hutton suggesting that ‘Numbers’ is a superior tool for visualizations). I finally bit the bullet, downloaded and then installed R.  For those completely new to R you have to realize that R is a platform to build amazing things upon. It is very command-line like in nature. You type in instructions and it executes. I like this approach because you are forced to learn the R language and syntax. Thus, in the end you will probably understand your data and resulting analysis much better.

One of the first graphics I wanted to explore with R was heat maps. At first, as I was thinking a standard risk management heat map; a 5×5 matrix with issues plotted on the matrix relative to frequency and magnitude. However, when I started searching Google for ‘R heat map’, a similar yet different style of heat map – referred to as a cluster heat map – was first returned in the search results. A cluster heat map is useful for comparing data elements in a matrix against each other depending on how your data is laid out. It is very visual in nature and allows the reader to quickly zero in on data elements or visual information of importance. From an information risk management perspective, if we have quantitative risk information and some metadata, we can begin a discussion with management by leveraging a heat map visualization. If additional information is needed as to why there are dark areas, then we can have the discussion about the underlying quantitative data. Thus, I decided to build a cluster heat map in R.

I referenced three blogs to guide my efforts – they can be found here, here and here. What I am writing here is in no way a complete copy and paste of their content because I provide some additional details on some steps that generated errors for me that in some cases took hours to figure out. This is not unexpected given the difference in data sets.

Let’s do it.

1.    Download and install R. After installation, start an R session. The version of R used for this post is 2.14.0. You can check your version by typing version at the command prompt and pressing ENTER.

2.    You will need to download and install the ggplot2 package / library. Do this through the R gui by referencing an online CRAN repository (packages -> install packages …). This method seems to be cleaner then downloading a package to your hard disk and then telling R to install it. In addition, if you reference an online repository, it will also grab any dependent packages at the same time. You can learn more about ggplot2 here.

3.    Once you have installed the ggplot2 package, we have to load it into our current R workspace.

> library(ggplot2)

4.    Next, we are going to import data to work with in R. Download ‘risktical_csv1.csv’ to your hard disk and execute the following command. Change the file path to match the file path for where you saved the file to.

risk <- read.csv(“C:/temph/risktical_csv1.csv”, sep=”,”, check.names= FALSE)

a.    We are telling R to import a Comma Separated Value file and assign it to a variable called ‘risk’.
b.    Read.csv is the method or function type of import.
c.    Notice that the slashes in the file name are opposite of what they normally would be when working with other common Windows-based applications.
d.    ‘sep=”,”’ tells R what character is used to separate values within the data set.
e.    ‘check.names=FALSE’ tells R not to check the column headers for correctness. R expects to see only letters, if it sees numbers, it will prepend an X to the column headers – we don’t want that based off the data set we are using.
f.    Once you hit enter, you can type ‘risk’ and hit enter again. The data from the file will be displayed on the screen.

5.    Now we need to ‘shape’ the data. The ggplot graphing function we want to use cannot consume the data as it currently is, so we are going to reformat the data first. The ‘melt’ function helps us accomplish this.

risk.m <- melt(risk)

a.    We are telling R to use the melt function against the ‘risk’ variable. Then we are going to take the output from melt and create a new variable called risk.m.
b.    Melt rearranges the data elements. Type ‘help(melt)’ for more information.
c.    After you hit enter, you can type ‘risk.m’ and hit enter again. Notice the way the data is displayed compared to the data prior to ‘melting’ (variable ‘risk’).

6.    Next, we have to rescale our numerical values so we can know how to shade any given section of our heat map. The higher the numerical value within a series of data, the darker the color or shade that tile of the heat map should be. The ‘ddply’ function helps us accomplish the rescaling; type ‘help(ddply)’ for more information.

risk.m <- ddply(risk.m, .(variable), transform, rescale = rescale(value), reorder=FALSE)

a.    We are telling R to execute the ‘ddply’ function against the risk.m variable.
b.    We are also passing some arguments to ‘ddply’ telling it to transform and reshape the numerical values. The result of this command produces a new column of values between 0 and 1.
c.    Finally, we pass an argument to ‘ddply’ not to reorder any rows.
d.    After you hit enter, you can type ‘risk.m’ and hit enter again and observe changes to the data elements; there should be two new columns of data.

7.    We are now ready to plot our heat map.

(p <- ggplot(risk.m, aes(variable, BU.Name)) + geom_tile(aes(fill = rescale), colour = “grey20″) + scale_fill_gradient(low = “white”, high = “red”))

a.    This command will produce a very crude looking heat map plot.
b.    The plot itself is assigned to a variable called p
c.    ‘scale_fill_gradient’ is the argument that associates color shading to the numerical values we rescaled in step 6. The higher the rescaling value – the darker the shading.
d.    The ‘aes’ function of ggplot is related to aesthetics. You can type in ‘help(aes)’ to learn about the various ‘aes’ arguments.

8.    Before we tidy up the plot, let’s set a variable that we will use in formatting axis values in step 9.

base_size <- 9

9.    Now we are going to tidy up the plot. There is a lot going on here.

p + theme_grey(base_size = base_size) + labs(x = “”, y = “”) + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0, 0)) + opts(legend.position = “none”, axis.ticks = theme_blank(), axis.text.x = theme_text(size = base_size * 0.8, angle = -90, hjust = 0, colour = “black”), axis.text.y = theme_text(size = base_size * 0.8, angle = 0, hjust = 0, colour = “black”))

a.    ‘labs(x = “”, y = “”)’ removes the axis labels.
b.    ‘opts(legend.position = “none”’ gets rid of the scaling legend.
c.    ‘axis.text.x = theme_text(size = base_size * 0.8, angle = -90’ sets the X axis text size as well as orientation.
d.    The heat map should look like the image below.

A few final notes:

1.    The color shading is performed within series of data, vertically. Thus, in the heat map we have generated, the color for any given tile is relative to the tile above and below it –IN THE SAME COLUMN – or in our case for a given ISO 2700X policy section.

2.    If we transposed our original data set – risktical_cvs2 – and applied the same commands with the exception of replacing BU.Name with Policy in our initial ggplot command (step 7), you should get a heat map that looks like the one below.

3.    In this heat map, we can quickly determine key areas of exposure for all 36 of our fictional business units relative to ISO 2700X. For example, most of BU3’s exposure is related to Compliance, followed by Organizational Security Policy and Access Control. If the executive in that business unit wanted more granular information in terms of dollar value exposure, we could share that information with them.

So there you have it! A quick R tutorial on developing a cluster heat map for information risk management purposes. I look forward to learning more about R and leveraging it to analyze and visualize data in unique and thought-provoking ways. As always, feel free to leave comments!


OpenPERT – A FREE Add-In for Microsoft Office Excel

August 15, 2011

INTRODUCTION. In early June of this year, Jay Jacobs and I started having a long email / phone call discussion about risk modeling, model comparisons, descriptive statistics, and risk management in general. At some point in our conversation the topic of Excel add-ins came up and how nice it would be to NOT have to rely upon 3rd party add-ins that cost between hundreds and thousands of dollars to acquire. You can sort of think of the 80-20 rule when it comes to out of the box Excel functionality – though it is probably more like 95-5 depending on your profession – most of the functionality you need to perform analysis is there. However, there are at least two capabilities not included in Excel that are useful for risk modeling and analysis: the betaPERT distribution and Monte Carlo simulation. Thus,  the need for costly 3rd-party add-ins or a free alternative, the OpenPERT add-in.

ABOUT BETAPERT. You can get very thorough explanations about  the betaPERT distribution here, here, and here. What follows is the ‘cliff notes’ version. The betaPERT distribution is often used for modeling subject matter expert estimates in scenarios where there is no data or not enough of it. The underlying distribution is the beta distribution (which is included in Microsoft Office Excel).  If we can over-simply and define a distribution as a collection or range of values – the betaPERT distribution when initially used with three values, such as minimum, most likely (think mode) and maximum values will create a distribution of values (output) that can then be used for statistical analysis and modeling. By introducing a fourth parameter – which I will refer to as confidence, regarding the ‘most likely’ estimate – we can account for the kurtosis – or peakedness – of the distribution.

WHO USES BETAPERT? There are a few professions and disciplines that leverage the betaPERT distribution:

Project Management – The project management profession is most often associated with betaPERT. PERT stands for Program (or Project) Evaluation and Review Technique. PERT was developed by the Navy and Booz-Allen-Hamilton back in the 1950’s (ref.1; see below ) – as part of the Polaris missile program. Anyway, it is often used today in project management for project / task planning and I believe it is covered as part of the PMP certification curriculum.

Risk Analysis / Modeling – There are some risk analysis scenarios where due to a lack of data, estimates are used to bring form to components of scenarios that factor into risk. The FAIR methodology – specifically some tools that leverage the FAIR methodology as applied to IT risk – is such an example of using betaPERT for risk analysis and risk modeling.

Ad-Hoc Analysis – There are many times where having access to a distribution like betaPERT is useful outside the disciplines listed above. For example, if a baker is looking to compare the price of her/his product with the rest of the market – data could be collected, a distribution created, and analysis could occur. Or, maybe a church is analyzing its year to year growth and wants to create a dynamic model that accounts for both probable growth and shrinkage – betaPERT can help with that as well.

OPENPERT ADD-IN FOR MICROSOFT OFFICE EXCEL. Jay and I developed the OpenPERT add-in as an alternative to paying money to leverage the betaPERT distribution. Of course, we underestimated the complexity of not only creating an Excel add-in but also working with the distribution itself and specific cases where divide by zero errors can occur. That said, we are very pleased with version 1.0 of OpenPERT and are excited about future enhancements as well as releasing examples of problem scenarios that are better understood with betaPERT analysis. Version 1.0 has been tested on Microsoft Office Excel 2007 and 2010; on both 32 bit and 64 bit Microsoft Windows operating systems. Version 1.0 of OpenPERT is not supported on ANY Microsoft Office for Mac products.

The project home of OpenPERT is here.

The downloads page is here. Even if you are familiar with the betaPERT distribution, please read the reference guide before installing and using the OpenPERT add-in.

Your feedback is welcome via support@openpert.org

Finally – On behalf of Jay and myself – a special thank you to members of the Society of Information Risk Analysts (SIRA) that helped test and provided feedback on the OpenPERT add-in. Find out more about SIRA here.

Ref. 1 – Malcolm, D. G., J. H. Roseboom, C. E. Clark, W. Fazar Application of a Technique for Research and Development Program Evaluation OPERATIONS RESEARCH Vol. 7, No. 5, September-October 1959, pp. 646-669


Deconstructing Some HITECH Hype

February 23, 2011

A few days ago I began analyzing some model output and noticed that the amount of loss exposure for ISO 27002 section “Communications and Operations Management” had increased by 600% in a five week time frame. It only took a few seconds to zero-in on an issue that was responsible for the increase.

The issue was related to a gap with a 3rd party of which there was some Health Information Technology for Economic and Clinical Health Act (HITECH) fine exposure. The estimated HITECH fines were really LARGE. Large in the sense that the estimates:

a.    Did not pass the sniff test
b.    Could not be justified based off any documented fines / or statutes.
c.    From a simulation perspective were completely skewing the average simulated expected loss value for the scenario itself.

I reached out to better understand the rationale of the practitioner who performed the analysis and after some discussion we were in agreement that some additional analysis was warranted to accurately represent assumptions as well as refine loss magnitude estimates – especially for potential HITECH fines. About 10 minutes of additional information gathering yielded valuable information.

In a nutshell, the HITECH penalty structure is a tiered system that takes into consideration the nature of the data breach, the fine per violation and maximum amounts of fines for a given year. See below (the tier summary is from link # 2 at the bottom of this post; supported by links # 1 and 3):

Tier A is for violations in which the offender didn’t realize he or she violated the Act and would have handled the matter differently if he or she had. This results in a $100 fine for each violation, and the total imposed for such violations cannot exceed $25,000 for the calendar year.

Tier B is for violations due to reasonable cause, but not “willful neglect.” The result is a $1,000 fine for each violation, and the fines cannot exceed $100,000 for the calendar year.

Tier C is for violations due to willful neglect that the organization ultimately corrected. The result is a $10,000 fine for each violation, and the fines cannot exceed $250,000 for the calendar year.

Tier D is for violations of willful neglect that the organization did not correct. The result is a $50,000 fine for each violation, and the fines cannot exceed $1,500,000 for the calendar year.

Given this information and the nature of the control gap – one can quickly determine the penalty tier as well as estimate fine amounts. The opportunity cost to gather this additional information was minimal and the benefits of the additional analysis will result  in not only more accurate and defendable analysis – but also spare the risk practitioner from what would have been certain scrutiny from other IT risk leaders and possibly business partner allegations of IT Risk Management once again “crying wolf”.

Key Take-Away(s)

1.    Perform sniff tests on your analysis; have others review your analysis.
2.    There is probably more information then you realize about the problem space you are dealing with.
3.    Be able to defend assumptions and estimates that you make.
4.    Become the “expert” about the problem space not the repeater of information that may not be valid to begin with.

Links / References associated with this post:

1.    HIPAA Enforcement Rule ref. HITECH <- lots of legalese
2.    HITECH Summary <- less legalese
3.    HITECH Act scroll down to section 13410 for fine information <-lots of legalese
4.    Actual instance of a HITECH-related fine
5.    Interesting Record Loss Threshold Observation; Is 500 records the magic number?


[BOOK REVIEW] The Communicators: Leadership in the Age of Crisis

December 23, 2010

I just finished reading The Communicators: Leadership in the Age of Crisis by Richard Levick and Charles Slack. For regular readers of this blog – you may recall a two part series back in 2009 on this blog – here and here – where Mr. Levick participated in a question and answer format on the topic of reputation risk. I have a lot of respect for the work Mr. Levick and his firm Levick Strategic Communications performs for their clients. “Why?” you might ask; the answer is risk management and leadership management.

***

RISK MANAGEMENT

The majority of the readers of this blog have information risk management backgrounds. So I will speak to risk management first. I am going to define risk as the probable frequency and probable magnitude of future loss. For those familiar with the FAIR risk analysis methodology – specifically the taxonomy – you will recall that in the “loss magnitude” side of the taxonomy there are concepts such as “duration of loss”, “effect of loss” and “secondary stakeholders” that can inflict secondary loss against our company when a bad event occurs.

The Communicators is filled with examples about how an individual, business leaders, or organizations as a whole – can impact (both good and bad) the duration and effect of loss as well as effectively manage the perceptions of secondary stakeholders – when a bad event (or crisis) occurs. As risk practitioners, it is no longer acceptable to just know that a big loss event can impact our employer’s reputation or other more-tangible loss forms. We have to be able give real –yet practical – scenarios and examples of loss forms. Better yet, we need to offer additional value by asking tough questions that could shed light on a systemic weakness in existing plans to deal with a crisis when it does occur.

For the information risk practitioner, the following sections stood out to me:

Section 1: The Blind Spot. While this section is more about courage and leadership; there are time honored nuggets of wisdom in this section that we should embrace no matter what your role or title in the organization is.

Section 6: Leadership in the Digital Era. Social media is a double-edged sword – every information risk practitioner knows it. While social media can enable our company it can also be an information distribution mechanism that can damage our company’s reputation and ability to minimize loss in minutes compared to days, weeks or even months. Read this section to get great perspective on social media and the risks associated with it.

Note: With regards to the subject of risk management and its relationship with “bad” events. A crisis does not need to be initiated by something “bad” or an actual loss event. The Communicators gives a few examples of these scenarios (Rule #35; When Facts Don’t Matter, Forget The Facts).

***

LEADERSHIP MANAGEMENT

As a former Marine, I cringe when I hear the words manager and leader used synonymously. Some organizations now even call all their managers “people leaders”. Philosophically, I can appreciate what is trying to be accomplished. But let’s face it there are managers out there that could not lead their teams out of an open door. I make such analogies to convey that leadership means something special to me. Thus, when I pick up a book that contains advice or examples of leadership – it better be good. The Communicators far exceeded my expectations.

If I was mentoring someone on the topic of leadership, using The Communicators as a mentoring aid and only had time to discuss one section; that section would be…

Section 9: Internal Leadership. The concept of ‘servant leadership’ is not necessarily new. Levick writes “Servant leadership defines the supervisory missions in terms of helping subordinates succeed and achieve through appreciation and reinforcement, not intimidation” (206). Just imagine a company where this approach was really embedded into its culture – not just a talking point on a PowerPoint slide deck that is helping your co-worker catch up on sleep and drool on him or herself. Better yet – forget about the manager / subordinate or corporate training aspect – what if everyone applied the concept of “servant leadership” in some or all aspects of their lives? Imagine how much more different our relationships and quality of life could be.

Leadership is not just about you and something you do relative to others. It is a mindset that can be leveraged at various levels of abstraction (personal, social, professional…) for those willing to embrace it.

In summary, I really enjoyed The Communicators and highly recommend it to anyone in the information risk management profession or anyone else that is serious about managing their career – regardless of your role or title.

Bene valete ac me vobis prodesse spero (“I bid farewell and hope I may help you”)


Simple Risk Model (Part 1 of 5): Simulate Loss Frequency #1

October 25, 2010

Let’s start this series by defining risk. I am going to use the FAIR definition of risk which is: the probable frequency and probable magnitude of future loss. From a modeling perspective, I need at least two variables to model the risk for any given risk issue: a loss frequency variable and a loss magnitude variable. Hopefully, you are using a risk analysis methodology that deconstructs risk into these two variables…

The examples I am sharing in this blog series are an example of stochastic modeling. The use of random values as an input to a probability distribution ensures there is variation in the output; thus making it stochastic. The variable output allows for analysis through many different lenses; especially when there are additional (meaningful) attributes associated with any given risk issue (policy section, business unit, risk type, etc…).

Part 1 and 2 of this series will focus on “probable or expected [loss] frequency”. Frequency implies a number of occurrences over a given period of time. Loss events are discrete in nature; there are no “partial” loss events. So, when we see probable loss frequency values like 0.10 or 0.25 – and our time period is a year – we interpret that to mean that there is a 10% or 25% chance of a loss event in any given year. Another way of thinking about it is in terms of time; we expect a loss event once every ten years (0.10) or once every four years (0.25). Make sense?

You may want to download this Excel spreadsheet to reference for the rest of the post (it should work in Excel 2003, Excel 2007 and Excel 2010; I have not tested it on Office for Mac).

Make sure you view it in Excel and NOT Google Apps.

In a simulation, how would we randomly draw loss frequency values for a risk issue whose expected loss frequency is 0.10, or once every ten years? I will share two ways; the first of which is the remainder of this post.

For any simulation iteration, we would generate a random value between 0 and 1; and compare the result to the expected loss value

a.    The stated expected loss frequency is 0.10 (cell B1; tab “loss 1”)

b.    For purposes of demonstration, we will number some cells to reflect the number of iterations (A6:A1005; A6=1; A7=A6+1; drag A7 down to you get to 1000).

c.    In Excel, we would use the =RAND() function to generate the random values in cells B6:B1005.

d.    We would then compare the randomly generated value to the expected loss frequency value in cell B1; with this code in C6 dragged down to C1005:

=IF(B6<=$B$1,1,0)

i.    If the generated random value in cell B6 is equal to or less then 0.1000 (cell B1), then the number of loss events for that iteration is 1.
ii.    If the generated random value in B6 is greater then 0.1000, then the number of loss events for that iteration is 0

e.    Once you have values in all the cells, you can now look at how many iterations resulted in a loss and how many did not. Cell B2 counts the number of iterations you had a loss and cell B3 counts the number of iterations you did not have a simulated loss; their corresponding percentages are next to each other.

f.    The pie chart shows the percentage and count for each loss versus no loss.

g.    Press the F9 key; new random values will be generated. Every time you press F9 think of it as a new simulation with 1000 iterations. Press F9 lots of times and you will notice that in some simulations loss events occur greater then 10% of the time and in some simulations less then 10% of the time.

h.    What you are observing is the effect of randomness. Over a large number of iterations and/or simulations we would expect the loss frequency to converge to 10%.

i.    Another thing worth mentioning, is that output from the RAND() function is uniform in nature. Thus, there is equal probability of all values between 0 and 1 being drawn for any given iteration.

j.    Since our expected loss frequency is 0.1000 and the RAND() functions output is uniform in nature – we would expect to see 10% of our iterations result in loss; some were more and some were less.

There are some limitations with this method for simulating the loss frequency portion of our risk model:

1.    If the expected loss frequency is greater then 1 then using RAND() is not viable, because RAND() only generates values between 0 and 1.

2.    In iterations where you had a loss event; this method does not reflect the actual number of loss events for that iteration. In reality, there could be some iterations (or years) where you have more then one loss event.

Some of the first models I built used this approach for generating loss frequency values. There is usefulness regardless of its simplicity. However, there are other methods to simulate loss frequency that are more appropriate for modeling and overcome the limitations listed above. In the next post, we will use random values, a discreet probability distribution and the expected loss frequency value to randomly generate loss frequency values.

NOTES / DISCLAIMERS: I am intentionally over-simplifying these modeling examples for a few reasons:
1.    To demonstrate that IT Risk modeling is achievable; even to someone that is not an actuarial or modeling professional.
2.    To give a glimpse of the larger forest past some of the trees blocking our view within the information risk management profession.
3.    As with any model – simple or complex – there is diligence involved to ensure that the right probability distributions and calculations are being used; reflective of the data being modeled.
4.    In cases where assumptions are being made in a model; they would be documented.


ANALYTICAL THINKING or F.U.D?

August 28, 2010

Image Source; fudsec.com

SHORT BLOG POST VERSION:
I get a lot of satisfaction from teaching others the FAIR methodology. But equally satisfying is me knowing that I am helping build a culture of analytical thinking for both the class participant and our employer.

LONG BLOG POST VERSION:
This past week I had the privilege of teaching a three-day BASIC FAIR course at my employer. This is the second FAIR course I have taught and I can honestly state that I learned a lot about my company and the course participants; most of which I will be interacting with in the coming months in a consulting capacity.

Teaching the FAIR methodology is very challenging and rewarding. Because people’s preconceived notions of risk are challenged within minutes of being introduced to FAIR – there is no shortage of AH-HAH moments for them as well as no shortage of the instructor being stretched to unimaginable limits to take their examples and questions and view them through the lens of FAIR. I have walked away from both classes feeling like I learned more then they did.

I am currently reading “The Flaw of Averages” by Sam L. Savage. I highly recommend this book for a seasoned information risk practitioner. I will probably reference the book may times in future posts but for this post I want to talk about a sentence or two from Chapter 11; page 85 (hardcover). Savage references Well Fargo in 1997 and how they ‘maintained a culture of analytical thinking’.

So ask yourself this: Does my information risk management program instill a culture of analytical thinking or one of F.U.D. (Fear, Uncertainty & Doubt)?

The FAIR methodology when used correctly will force the practitioner to be analytical. But for an entire information risk management program to require all of its members to go through this training is telling of the culture we are creating. And guess what? This analytical thinking is not limited to our information risk management program. Our practitioners have to be able to explain their risk analysis to those individuals (IT & Business) accountable for the risk and responsible for the mitigation activity.

In summary, I get a lot of satisfaction from teaching others the FAIR methodology. But equally satisfying is me knowing that I am helping build a culture of analytical thinking for both the class participant and our employer.


More Heat Map Love

May 11, 2010

In my previous post “Heat Map Love” I attempted to illustrate the relationship between plots on a heat map and a loss distribution. In this post I am going to illustrate another method to show the relationship – hopefully in simpler terms.

In the heat map above I have plotted five example risk issues:

I: Application security; cross-site scripting; external facing application(s); actual loss events between 2-10 times a year; low magnitude per event – less then $10,000.

II: Data confidentiality; lost / stolen mobile device or computer; no hard disk encryption; simulated or actual one loss event per year, low to moderate magnitude per event.

III:  PCI-DSS Compliance; level 2 Visa merchant; not compliant with numerous PCI-DSS standards; merchant self-reports not being in compliance this year; merchant expects monthly fines of $5,000 for a one year total of $60,000.

IV: Malware outbreak; large malware outbreak (greater then 10% of your protected endpoints). Less then once every ten years; magnitude per event could range between $100,000 and $1,000,000; productivity hit, external consulting, etc.

V: Availability; loss of data center; very low frequency; very large magnitude per event.

Since there is a frequency and magnitude of loss associated with each of these issues we can conceptually associate these issues with a loss distribution (assuming that our loss distribution is a normal-like or log normal).

Step 1: Hold a piece of paper with the heat map looking like the image below:

Step 2: Flip the paper towards you so the heat map looks like image below (flip vertical):


Step 3: Rotate the paper counter-clockwise 90 degrees; it should like the image below.


For ease of illustration; let’s overlay a log normal distribution.

What we see is in line with what we discussed in the “Heat Map Love” post:

Risk V – Loss of data center; is driving the tail; very low frequency; very large magnitude.
Risk IV – Malware outbreak; low frequency; but significant or high magnitude.
Risk III – Annual PCI fines from Visa via acquirer / processor; once per year; $60K.
Risk II – Lost or stolen laptop that had confidential information on it; response and notification costs not expected to be significant.
Risk I – Lots of small application security issues; for example cross site scripting; numerous detected and reported instances per year; low cost per event.

There you have it – a less technical way to perform a sniff test on your heat map plots and / or validate against a loss distribution.

Once you have taught everyone how to perform this artwork paper rotation trick. You can have a paper airplane flying contest.


Rainbow Risk

April 1, 2010

One of the benefits of quantifying risk for an information risk issue or finding is that it gives us an additional dimension for aggregate risk analysis. Instead of just simple counts of issues categorized by ISO 27002 section – we can now analyze risk (dollar values) in numerous contexts; limited primarily by the data you are collecting and your ability to associate some data elements to various frameworks. (ISO 27002, BASEL II, COBIT IT Processes, etc…).

Loss simulation and aggregate risk analysis is a big part of my world right now. However, one of the most complex challenges I face is visually representing risk in a meaningful manner to management. So, over the next few posts, I want to share some visualizations of risk and how they could possibly be used for decision making.

The chart above is often referred to as a “rainbow chart”. Technically speaking though, it is a 100% Stacked Bar Chart. What this chart is depicting is the breakdown of risk (percentage of loss and ISO 27002 section) of a simulated aggregate loss distribution; by loss distribution percentile. Say what…?

Let’s break it down…

Warning – I am going to oversimplify some statistical concepts. And yes – all of the percentages and dollar values are from a dataset with fictitious data for the purposes of illustration.

Output from simulations are often analyzed by percentiles. We often zero in on the 50th percentile; sometimes an indicator of the mean or expected loss for a defined time frame. For illustration purposes let’s associate some dollar values to loss distribution percentiles:

5th percentile: $100K
50th percentile: $1M
80th percentile: $2M
99th percentile: $5M

Given the dollar values above as well as the chart we can infer the following:

5th percentile:
a.    5% of the iterations resulted in simulated loss of around $100K.
b.    Roughly 30% of the simulated loss at the 5th percentile is related to Compliance issues (~33%; $33K).
c.    Another 30% of the simulated loss at the 5th percentile is related to Access Control issues (~30%; $30K).
d.    About 20% of the simulated loss at the 5th percentile is related to Systems Development & Maintenance issues (~20%; $20K).
e.    About 8% of the simulated loss at the 5th percentile is related to Business Continuity Mgmt issues (~8%; $8K).

50th percentile:
a.    50% of the iterations resulted in simulated loss of around $1M.
b.    Roughly 28% of the simulated loss at the 50th percentile is related to Compliance issues (~28%; $280K).
c.    Another 30% of the simulated loss at the 50th percentile is related to Access Control issues (~30%; $300K).
d.    About 22% of the simulated loss at the 50th percentile is related to Systems Development & Maintenance issues (~22%; $220K).
e.    About 12% of the simulated loss at the 50th percentile is related to Business Continuity Mgmt issues (~12%; $120K).

99th percentile:
a.    1% of the iterations resulted in simulated loss of around $5M (100% -99% = 1%).
b.    Roughly 12% of the simulated loss at the 99th percentile is related to Compliance issues (~12%; $600K).
c.    Another 15% of the simulated loss at the 99th percentile is related to Access Control issues (~15%; $750K).
d.    About 12% of the simulated loss at the 99th percentile is related to Systems Development & Maintenance issues (~12%; $600K).
e.    About 55% of the simulated loss at the 99th percentile is related to Business Continuity Mgmt issues (~55%; $2.75M).

PRACTICAL APPLICATION

This type of risk visualization allows for the following:

1.    BETTER VISIBILITY. Gives us visibility to risk by loss severity and by ISO 27002 sections.
2.    INFORMED DECISIONS. Allows for tactical and strategic decision making.
a.    Tactical. Risk around the 50th percentile is loss we would expect 50% of the time; time being annually. Thus, we can actively manage (reduce or maintain) expected annual loss by targeting categories of risk at this or surrounding percentiles.
b.    Strategic. Risk around the 80th (1-in-5) or 90th (1-in-10) percentiles is greater in severity but less likely to occur. However when you think of this in terms of years and given the volatility of threats in our profession – we have to be mindful of low frequency / high magnitude loss events. Thus, to address the risk at these percentiles – we could be more strategic in our planning and investments. It could take a few years to mitigate the risk down to an acceptable level – but we can spread those mitigation costs over time.
3.    COST / BENEFIT ANALYSIS. Above I listed risk at the 80th percentile to be about $2M dollars. We can quickly see that 30% of the risk at the 80th percentile is related to Access Control (~30%, $600K). If I want to reduce the access control related risk at this percentile and I estimate it’s going to cost me $100K – I can demonstrate a risk reduction both mathematically and by re-running my simulations absent issues that would be mitigated by my investment.
4.    DRILL DOWN. Depending on how your issues are tagged, we could drill down into ISO 27002 sections to see which sub-sections are driving the most risk for its parent section.

FINAL THOUGHT. There is uncertainty with risk. Performing aggregate analysis and simulations on your entire risk repository highlights the variability of loss and can make it easier to explain uncertainty to others outside the information security profession. Take for example the “compliance risk” values above. Annual expected loss associated with Compliance-related issues could be as little as $33K, most likely $280K and possibly $600K or worse. Do not underestimate management’s ability to understand this uncertainty and their appetite to make effective decisions around it.


Application Security Risk Assessments

March 16, 2009

I have so many topics and thoughts that I want to communicate on this blog. I could write for days on PCI-DSS; especially an exercise I recently lead to select a QSA for a professional services consulting engagement; not to be confused with a PCI-DSS compliance assessment. I have a load of thoughts about the current book I am reading by David Vose titled “Risk Analysis, A Quantitative Guide”. I still have not transferred my notes regarding Securosis’ “Business Justification for Data Security” paper to a blog post. BTW, Rich Mogull – congrats on the new born!

For this post I want to share some information about a professional development project I lead back in 2007 and 2008 for my employer that bridges application security and risk assessments.

When I first started working for my employer, I knew that I was going to be performing risk assessments. However, I thought most of my risk assessments were going to be more in the area of networking information security and data information security. Within a year of starting the job – I knew that application security and risk assessments were going to be a significant part of my job. So, because I was weak in the application security discipline, I added application security to my personal development plan in 2007; it is still there today. My quest for learning more about application security resulted in co-developing an application security assessment methodology for our employer as well as lead me to reviving the Columbus, OH OWASP Chapter.

The problem I faced in 2007 was that I was being assigned to more and more complex application development projects. Because I was not very skilled in the application security discipline, I could not perform effective risk assessments. For example, I struggled identifying meaningful vulnerabilities let alone being able to validate vulnerabilities without wasting another team member’s time. So, my manager allowed me and two higher skilled individuals on our team to sit down and document how they approach an application security risk assessment. Eventually, one of the two individuals left our company and went to go work for Amazon as one of their head application security professionals. The vacancy was short. We added another person to our team who came up through the application development ranks and provided the extra spark we needed to get the methodology from a straw-man to a usable product in late 2008.

Disclaimer 1: Our application security assessment methodology is not limited to just web applications. In large and/or complex environments – you do not have the luxury of dealing with just web platforms and simple backend databases. You will most likely be dealing with multiple applications, use cases, and business processes that span numerous technologies, languages and protocols.

There six high level concepts to this methodology. The first three concepts are straightforward and usually give you 80% of the information you need to facilitate a risk assessment. The last three concepts usually require more time and understanding. You will also notice how they all build upon one another – which should also reduce the number of information gathering activities. Let’s jump into them.

1. Information classification.
The very first thing we want to understand is what type of information is handled or exposed as part of this development effort. Is it public information or confidential information? Are we sharing information with a new application? Are we sharing information with an external business partner? Does this development effort allow for data to be modified? What is the business purpose of this information?

Let’s face it, for most vulnerabilities that can be exploited AND result in data loss – if we do not understand the value of the information – we cannot appropriately communicate the severity of the exposure (or lack thereof) to our business partners. I want to give a quick plug to the Securosis team on the information value concept. They dedicate a portion of their “Business Justification” white paper to information value and it is worth reading if you need a refresher on or are new to information value.

2. Use Cases.
The next area we want to have visibility into is use cases. Understanding the use cases is probably one of the most efficient ways to learn information about an application. Properly documented use cases should let you know who is using the application, what other applications the application under review interfaces with, data flow, business rules, and a ton of other information.

Do not underestimate the value of use cases. BTW, documented use cases should be a requirement within any project delivery methodology. If your organization does not have a defined project delivery methodology – you can find use case templates on the Internet. Make it one of *your* requirements in order to complete an application security assessment. Not only will you benefit – but more then likely the whole project team will benefit from it as well.

3. Application Architecture.
Application architecture can mean different things to different people to different organizations. To keep this high level – let’s stick with application architecture in the context of web applications and interfaces with other applications. If a web application: is the architecture appropriately tiered? Does the architecture result in the application interfacing with other applications or business partners where it could be inheriting risk of those applications? Does the data being handled have special architecture security requirements? There are dozens of other questions but you should get the point.

A simple example of where application architecture comes into play is PCI-DSS. In appendix F of the PCI-DSS requirements – you will notice that one of the first questions the QSA (auditor) is asked is whether scope of the assessment can be reduced due to network segmentation. I am sure you can think of some other business processes or information types where security requirements can significantly impact the application architecture (or find vulnerabilities within the existing application architecture).

4. Access management.
Simply put, how do we control access to the application and how does this application interface with other applications, databases, or services. I consider this security 101 stuff – but in an application security context. Also, even in companies with well defined security policies and mature security practices – do not think for even a quarter of a second that there is not someone trying to do more with less when it comes to shared IDs and passwords or using unauthorized user repositories for access management (local DB versus LDAP integration).

An additional thought that I would share on this topic is not taking for granted excessive rights a user might have for public data. An example that one of the co-developers of our methodology uses is that of an intranet application that allows all employees to view the cafeteria menu. This is essentially public information. However, we still need to have more stringent controls around who can modify the data lest we find ourselves reading that the main lunch course is “possum belly and grits” instead of a “muffaletta sandwich with olive salad”.

5. Code implementation.
This is a huge and important concept of which I can only scratch the surface as part of this blog post. For the OWASP folks out there – you know what this is. Do we have sound code development practices? Do we have security controls at all tiers of the application; including the most forgotten tier – the client tier? Do I have special data security requirements that need to be accounted for in my application code?

While this concept may be the most complex it is also a concept that separates the wheat from the chaff amongst application security professionals, information security professionals as well as security-minded developers. If you ask a self-proclaimed security professional or a self-proclaimed security-minded application developer what input validation or buffer overflows are and they miss the answer by a mile – red flag. Inspecting code and validating vulnerabilities is another topic worth mentioning in this post. I have been in the position where I ran a web application vulnerability scanner and raised red flags on SQL injection findings to the development team; only to find out that it was a false positive. I was personally embarrassed and professionally I felt that it made our profession look bad. It was a valuable lesson learned and continues to be something I reflect on from time to time to make sure my personal development plans are hitting the mark.

6. Operational Plans.
This concept is more about how an application team manages their application. Whether more in the context of the software development life cycle (SDLC) or day-to day operations – there are often points of vulnerability in this area that get overlooked. Some questions related to this concept may be: How do I control my source code, let alone access to it? Do I have adequate and appropriate logging within the application? Am I logging information I should not be? How do I provision and de-provision access to the application? How is production support handled?

It is too easy to forget about the day-to-day operations of an application especially for projects where the application is brand new. The way I look at it, this is the perfect time to make sure the requirements are established up front as well as implemented – versus being reactive post implementation. For existing applications – especially application that never received a security review or a recent security review – here is the reality: The threat landscape and regulatory landscape is ever changing. Just because there were not security requirements five years ago or even 20+ years ago (yes, we have an application that old) does not mean security controls should not be implemented today.

So there you have it folks, a very general approach to performing an application security assessment. Of course, once you flush out some risk issues, you can assess them for risk using your favorite risk assessment methodology.

If this application security assessment methodology appeals to you – let me know. Better yet – if what general information I have shared is lacking – please let me know. Either point me in the direction of a better methodology or give me some insight to help make it better. One final note, I do have intentions of getting my employer’s permission to give a more detailed presentation of our application security assessment methodology at a future Columbus, OH OWASP Chapter meeting.

Thanks for reading my blog – have an awesome day!


Follow

Get every new post delivered to your Inbox.