Monday, November 7, 2016

Swift and massive data classification advances score a win for better securing sensitive information

The next BriefingsDirect Voice of the Customer digital transformation case study explores how -- in an era when cybersecurity attacks are on the rise and enterprises and governments are increasingly vulnerable -- new data intelligence capabilities are being brought to the edge to provide better data loss prevention (DLP).

We'll learn how Digital Guardian in Waltham, Massachusetts analyzes both structured and unstructured data to predict and prevent loss of data and intellectual property (IP) with increased accuracy.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or  download a copy.
 
To learn how data recognition technology supports network and endpoint forensic insights for enhanced security and control, we're joined by Marcus Brown, Vice President of Corporate Business Development for Digital Guardian. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What are some of the major trends making DLP even more important, and even more effective?

Brown: Data protection has very much come to the forefront in the last couple of years. Unfortunately, we wake up every morning and read in the newspapers, see on television, and hear on the radio a lot about data breaches. It’s pretty much every type of company, every type of organization, government organizations, etc., that’s being hit by this phenomenon at the moment.

So, awareness is very high, and apart from the frequency, a couple of key points are changing. First of all, you have a lot of very skilled adversaries coming into this, criminals, nation-state actors, hacktivists, and many others. All these people are well-trained and very well resourced to come after your data. That means that companies have a pretty big challenge in front of them. The threat has never been bigger.

In terms of data protection, there are a couple of key trends at the cyber-security level. People have been aware of the so-called insider threat for a long time. This could be a disgruntled employee or it could be someone who has been recruited for monetary gain to help some organization get to your data. That’s a difficult one, because the insider has all the privilege and the visibility and knows where the data is. So, that’s not a good thing.

Then, you have employees, well-meaning employees, who just make mistakes. It happens to all of us. We touch something in Outlook, and we have a different email address than the one we were intending, and it goes out. The well-meaning employees, as well, are part of the insider threat.

Outside threats

What’s really escalated over the last couple of years are the advanced external attackers or the outside threat, as we call it. These are well-resourced, well-trained people from nation-states or criminal organizations trying to break in from the outside. They do that with malware or phishing campaigns.

About 70 percent of the attacks start with the phishing campaign, when someone clicks on something that looked normal. Then, there's just general hacking, a lot of people getting in without malware at all. They just hack straight in using different techniques that don’t rely on malware.

People have become so good at developing malware and targeting malware at particular organizations, at particular types of data, that a lot of tools like antivirus and intrusion prevention just don’t work very well. The success rate is very low. So, there are new technologies that are better at detecting stuff at the perimeter and on the endpoint, but it’s a tough time.

There are internal and external attackers. A lot of people outside are ultimately after the two main types of data that companies have. One is customer data, which is credit card numbers, healthcare information, and all that stuff. All of this can be sold on the black market per record for so-and-so many dollars. It’s a billion-dollar business. People are very motivated to do this.
Most companies don’t want to lose their customers’ data. That’s seen as a pretty bad thing, a bad breach of trust, and people don’t like that. Then, obviously, for any company that has a product where you have IP, you spent lots of money developing that, whether it’s the new model of a car or some piece of electronics. It could be a movie, some new clothing, or whatever. It’s something that you have developed and it’s a secret IP. You don’t want that to get out, as well as all of your other internal information, whether it’s your financials, your plans, or your pricing. There are a lot of people going after both of those things, and that’s really the challenge.

In general, the world has become more mobile and spread out. There is no more perimeter to stop people from getting in. Everyone is everywhere, private life and work life is mixed, and you can access anything from anywhere. It’s a pretty big challenge.

Gardner: Even though there are so many different types of threats, internal, external, and so forth, one of the common things that we can do nowadays is get data to learn more about what we have as part of our inventory of important assets.

While we might not be able to seal off that perimeter, maybe we can limit the damage that takes place by early detection of problems. The earlier that an organization can detect that something is going on that shouldn’t be, the quicker they can come to the rescue. How does the instant analysis of data play a role in limiting negative outcomes?

Can't protect everything

Brown: If you want to protect something, you have to know it’s sensitive and that you want to protect it. You can’t protect everything. You're going to find which data is sensitive, and we're able to do that on-the-fly to recognize sensitive data and nonsensitive data. That’s a key part of the DLP puzzle, the data protection puzzle.

We work for some pretty large organizations, some of the largest companies and government organizations in the world, as well as a lot of medium- and smaller-sized customers. Whatever it is we're trying to protect, personal information or indeed the IP, we need to be in the right place to see what people are doing with that data.

Our solution consists of two main types of agents. Some agents are on endpoint computers, which could be desktops or servers, Windows, Linux, and Macintosh. It’s a good place to be on the endpoint computer, because that’s where people, particularly the insider, come into play and start doing something with data. That’s where people work. That’s how they come into the network and it’s how they handle a business process.

So the challenge in DLP is to support the business process. Let people do with data what they need to do, but don’t let that data get out. The way to do that is to be in the right place. I already mentioned the endpoint agent, but we also have network agents, sensors, and appliances in the network that can look at data moving around.

The endpoint is really in the middle of the business process. Someone is working, they're working with different applications, getting data out of those applications, and they're doing whatever they need to do in their daily work. That’s where we sit, right in the middle of that, and we can see who the user is and what application they're working with. It could be an engineer working with the computer-aided design (CAD) or the product lifecycle management (PLM) system developing some new automobile or whatever, and that’s a great place to be.

We rely very heavily on the HPE IDOL technology for helping us classify data. We use it particularly for structured data, anything like a credit card number, or alphanumeric data. It could be also free text about healthcare, patient information, and all this sort of stuff.

We use IDOL to help us scan documents. We can recognize regular expressions, that’s a credit card number type of thing, or Social Security. We can also recognize terminology. We rely on the fact that IDOL supports hundreds of languages and many different subject areas. So, using IDOL, we're able to recognize a whole lot of anything that’s written in textual language.
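As a rough illustration of the pattern-based recognition described here, the following Python sketch flags text that appears to contain a credit card number or a US Social Security number. It is not IDOL's API or Digital Guardian's implementation; the two patterns, the Luhn checksum filter, and the names `luhn_valid` and `classify` are assumptions made for this example, and a production rule set would be far broader.

```python
import re

# Hypothetical patterns for illustration only.
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
# US Social Security number in the common 3-2-4 layout.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the set of sensitive-data labels found in the text."""
    labels = set()
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # The checksum filters out digit runs that merely look like card numbers.
        if luhn_valid(digits):
            labels.add("credit_card")
    if SSN_RE.search(text):
        labels.add("ssn")
    return labels
```

Calling `classify("Card: 4111 1111 1111 1111")` labels the text as containing a credit card number, while a 16-digit order number that fails the checksum is left unlabeled. Regular expressions alone cover only the structured cases; recognizing terminology in free text needs a different kind of analysis.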

Our endpoint agent also has some of its own intelligence built in that we put on top of what we call contextual recognition or contextual classification. As I said, we see the customer list coming out of Salesforce.com or we see the jet fighter design coming out of the PLM system and we then tag that as well. We're using IDOL, we're using some of our technology, and we're using our vantage point on the endpoint being in the business process to figure out what the data is.

We call that data-in-use monitoring and, once we see something is sensitive, we put a tag on it, and that tag travels with the data no matter where it goes.

An interesting thing is that if you have someone making a mistake, an unintentional, good-willed employee, accidentally attaching the wrong doc to something that goes out, obviously we will warn the user of that.

We can stop that

If you have someone who is very, very malicious and is trying to obfuscate what they're doing, we can see that as well. For example, taking a screenshot of some top-secret diagram, embedding that in a PowerPoint and then encrypting the PowerPoint, we're tagging those docs. Anything that results from IP or top-secret information, we keep tagging that. When the guy then goes to put it on a thumb drive, put it on Dropbox, or whatever, we see that and stop that.
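The idea that a tag follows the data through screenshots, embedding, and encryption can be sketched as simple tag inheritance. This is a toy model under stated assumptions, not Digital Guardian's agent: the `Artifact` class, the `derive` and `may_egress` functions, the `top_secret` tag, and the destination names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A file or other piece of data, plus the sensitivity tags attached to it."""
    name: str
    tags: set = field(default_factory=set)


def derive(name: str, *sources: Artifact) -> Artifact:
    """Create a new artifact that inherits the union of its sources' tags."""
    inherited = set()
    for source in sources:
        inherited |= source.tags
    return Artifact(name, inherited)


# Hypothetical destinations that tagged data must not reach.
BLOCKED_DESTINATIONS = {"usb", "dropbox"}


def may_egress(artifact: Artifact, destination: str) -> bool:
    """Return False if a top-secret artifact is headed to a blocked destination."""
    return not ("top_secret" in artifact.tags and destination in BLOCKED_DESTINATIONS)


# Screenshot -> slide deck -> encrypted deck: the tag survives every step.
diagram = Artifact("diagram.cad", {"top_secret"})
screenshot = derive("screenshot.png", diagram)
deck = derive("deck.pptx", screenshot)
encrypted = derive("deck.pptx.enc", deck)
```

Here `may_egress(encrypted, "usb")` returns `False` even though the encrypted file's contents can no longer be scanned, because the decision rests on the inherited tag rather than on re-inspecting the content.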

So that’s still a part of the problem, but the two points are classify it, that’s what we rely on IDOL a lot for, and then stop it from going out, that’s what our agent is responsible for.

Gardner: Let’s talk a little bit about the results here, when behaviors, people and the organization are brought to bear together with technology, because it’s people, process and technology. When it becomes known in the organization that you can do this, I should think that that must be a fairly important step. How do we measure effectiveness when you start using a technology like Digital Guardian? Where does that become explained and known in the organization and what impact does that have?

Brown: Our whole approach is a risk-based approach and it’s based on visibility. You’ve got to be able to see the problem and then you can take steps and exercise control to stop the problems.
When you deploy our solution, you immediately gain a lot of visibility. I mentioned the endpoints and I mentioned the network. Basically, you get a snapshot without deploying any rules or configuring in any complex way. You just turn this on and you suddenly get this rich visibility, which is manifested in reports, trends, and all this stuff. What you get, after a very short period of time, is a set of reports that tell you what your risks are, and some of those risks may be that your HR information is being put on Dropbox.

You have engineers putting the source code onto thumb drives. It could all be well-meaning, they want to work on it at home or whatever, or it could be some bad guy.

One of the biggest points of risk in any company is when an employee resigns and decides to move on. A lot of our customers use the monitoring and the reporting we have at that time to actually sit down with the employee and say, "We noticed that you downloaded 2,000 files and put them on a thumb drive. We’d like you to sign this saying that you're going to give us that data back."

That’s a typical use case, and that’s the visibility you get. You turn it on and you suddenly see all these risks, hopefully, not too many, but a certain number of risks and then you decide what you're going to do about it. In some areas you might want to be very draconian and say, "I'm not going to allow this. I'm going to completely block this. There is no reason why you should put the jet fighter design up on Dropbox."

Gardner: That’s where the epoxy in the USB drives comes in.

Warning people

Brown: Pretty much. On the other hand, you don’t want to stop people using USB, because it’s about their productivity, etc. So, you might want to warn people, if you're putting some financial data on to a thumb drive, we're going to encrypt that so nothing can happen to it, but do you really want to do this? Is this approach appropriate? People get a feeling that they're being monitored and that the way they are acting maybe isn't according to company policy. So, they'll back out of it.
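The graduated responses described here (block outright in some cases, warn and encrypt in others, allow the rest) can be pictured as an ordered rule table. The sketch below is illustrative only; the `Rule` type, the tag and destination names, and the action strings are assumptions, not the product's actual policy language.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One policy rule: data carrying `tag` going to `destination` gets `action`."""
    tag: str
    destination: str
    action: str


# Hypothetical rules, checked in order; the first match wins.
RULES = [
    Rule("jet_fighter_design", "dropbox", "block"),
    Rule("financial", "usb", "warn_and_encrypt"),
]


def decide(tags: set, destination: str, rules=RULES) -> str:
    """Return the action for data with the given tags and destination."""
    for rule in rules:
        if rule.destination == destination and rule.tag in tags:
            return rule.action
    # Anything no rule covers is allowed, so normal work is not interrupted.
    return "allow"
```

With these rules, `decide({"financial"}, "usb")` yields `"warn_and_encrypt"` and `decide({"jet_fighter_design"}, "dropbox")` yields `"block"`. Defaulting to `"allow"` reflects the stated goal of supporting the business process rather than stopping all use of USB drives.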

In a nutshell, you look at the status quo, you put some controls in place, and after those controls are in place, within the space of a week, you suddenly see the risk posture changing, getting better, and the incidence of these dangerous actions dropping dramatically.

Very quickly, you can measure the security return on investment (ROI) in terms of people’s behavior and what’s happening. Our customers use that a lot internally to justify what they're doing.

Generally, you can get rid of a very large amount of the risk, say 90 percent, with an initial pass, or initial first two passes of rules to say, we don’t want this, we don’t want that. Then, you're monitoring the status, and suddenly, new things will happen. People discover new ways of doing things, and then you’ve got to put some controls in place, but you're pretty quickly up into the 90 percent and then you're fine-tuning to get those last little bits of risk out.

Gardner: Because organizations are becoming increasingly data-driven, they're getting information and insight across their systems and their applications. Now, you're providing them with another data set that they could use. Is there some way that organizations are beginning to assimilate and analyze multiple data sets including what Digital Guardian’s agents are providing them in order to have even better analytics on what’s going on or how to prevent unpleasant activities?

Brown: In this security world, you have the security operations center (SOC), which is kind of the nerve center where everything to do with security comes into play. The main piece of technology in that area is the security information and event management (SIEM) technology. The market leader is HPE’s ArcSight, and that’s really where all of the many tools that security organizations use come together in one console, where all of that information can be looked at in a central place and can also be correlated.

We provide a lot of really interesting information for the SIEM for the SOC. I already mentioned we're on the endpoint and the network, particularly on the endpoint. That’s a bit of a blind spot for a lot of security organizations. They're traditionally looking at firewalls, other network devices, and this kind of stuff.

We provide rich information about the user, about the data, what’s going on with the data, and what’s going on with the system on the endpoint. That’s key for detecting malware, etc. We have all this rich visibility on the endpoint and also from the network. We actually pre-correlate that. We have our own correlation rules. On the endpoint computer in real time, we're correlating stuff. All of that gets populated into ArcSight.

At the recent HPE Protect Show in National Harbor in September, we showed the latest generation of our integration, which we're very excited about. We have a lot of ArcSight content, which helps people in the SOC leverage our data, and we gave a couple of presentations at the show on that.

Gardner: And is there a way to make this even more protected? I believe encryption could be brought to bear and it plays a role in how the SIEM can react and behave.

Seamless experience

Brown: We actually have a new partnership, related to HPE's acquisition of Voltage, which is a real leader in the e-mail security space. It’s all about applying encryption to messages and managing the keys and making that user experience very seamless and easy to use.

Adding to that, we're bundling up some of the classification functionality that we have in our network sensors. What we have is a combination between Digital Guardian Network DLP and the HPE Data Security Encryption solution, where an enterprise can define a whole bunch of rules based on templates.

We can say, "I need to comply with HIPAA," "I need to comply with PCI," or whatever standard it is. Digital Guardian on the network will automatically scan all the e-mail going out and automatically classify according to our rules which e-mails are sensitive and which attachments are sensitive. It then goes on to the HPE Data Security Solution where it gets encrypted automatically and then sent out.

It’s basically allowing corporations to apply a standard set of policies, not relying on the user to say they need to encrypt this, not leaving it to the user’s judgment, but actually applying standard policies across the enterprise for all e-mail, making sure they get encrypted. We are very excited about it.
Gardner: That sounds key -- using encryption to the best of its potential, being smart about it, not just across the waterfront, and then not depending on a voluntary encryption, but doing it based on need and intelligence.
 
Brown: Exactly.

Gardner: For those organizations that are increasingly trying to be data-driven, intelligent, taking advantage of the technologies and doing analysis in new interesting ways, what advice might you offer in the realm of security? Clearly, we’ve heard at various conferences and other places that security is, in a sense, the killer application of big-data analytics. If you're an organization seeking to be more data-driven, how can you best use that to improve your security posture?

Brown: The key, as far as we’re concerned, is that you have to watch your data, you have to understand your data, you need to collect information, and you need visibility of your data.

The other key point is that the security market has been shifting pretty dramatically from more of a network view much more toward the endpoint. I mentioned earlier that antivirus and some of these standard technologies on the endpoint aren't really cutting it anymore. So, it’s very important that you get visibility down at the endpoint and you need to see what users are doing, you need to understand what your systems are running, and you need to understand where your data is.

So collect that, get that visibility, and then leverage that visibility with analytics and tools so that you can profit from an automated kind of intelligence.

Sponsor: Hewlett Packard Enterprise.

Tuesday, November 1, 2016

2016 election campaigners look to big data analysis to gain an edge in intelligently reaching voters

The next BriefingsDirect Voice of the Customer digital transformation case study explores how data-analysis services startup BlueLabs in Washington, DC helps presidential election campaigns better know and engage with potential voters.

We'll learn how BlueLabs relies on high-performing analytics platforms that allow a democratization of querying, of opening the value of vast data resources to discretely identify more of those in the need to know.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to describe how big data is being used creatively by contemporary political organizations for two-way voter engagement, we're joined by Erek Dyskant, Co-Founder and Vice President of Impact at BlueLabs Analytics in Washington. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Obviously, this is a busy season for the analytics people who are focused on politics and campaigns. What are some of the trends that are different in 2016 from just four years ago? It’s a fast-changing technology set, it's also a fast-changing methodology. And of course, the trends about how voters think, react, use social, and engage are also dynamic. So what's different this cycle?

Dyskant: From a voter-engagement perspective, in 2012, we could reach most of our voters online through a relatively small set of social media channels -- Facebook, Twitter, and a little bit on the Instagram side. Moving into 2016, we see a fragmentation of the online and offline media consumption landscape and many more folks moving toward purpose-built social media platforms.

If I'm at the HPE Conference and I want my colleagues back in D.C. to see what I'm seeing, then maybe I'll use Periscope, maybe Facebook Live, but probably Periscope. If I see something that I think one of my friends will think is really funny, I'll send that to them on Snapchat.
Where political campaigns have traditionally broadcast messages out through the news-feed style social-media strategies, now we need to consider how it is that one-to-one social media is acting as a force multiplier for our events and for the ideas of our candidates, filtered through our campaign’s champions.

Gardner: So, perhaps a way to look at that is that you're no longer focused on precincts physically and you're no longer able to use broadcast through social media. It’s much more of an influence within communities and identifying those communities in a new way through these apps, perhaps more than platforms.

Social media

Dyskant: That's exactly right. Campaigns have always organized voters at the door and on the phone. Now, we think of one more way. If you want to be a champion for a candidate, you can be a champion by knocking on doors for us, by making phone calls, or by making phone calls through online platforms.

You can also use one-to-one social media channels to let your friends know why the election matters so much to you and why they should turn out and vote, or vote for the issues that really matter to you.

Gardner: So, we're talking about retail campaigning, but it's a bit more virtual. What’s interesting though is that you can get a lot more data through the interaction than you might if you were physically knocking on someone's door.

Dyskant: The data is different. We're starting to see a shift from demographic targeting. In 2000, we were targeting on precincts. A little bit later, we were targeting on combinations of demographics, on soccer moms, on single women, on single men, on rural, urban, or suburban communities separately.

Moving to 2012, we looked at everything that we knew about a person and built individual-level predictive models, so that we knew how each person's individual set of characteristics made that person more or less likely to be someone with whom our candidate would have an engaging conversation through a volunteer.

Now, what we're starting to see is behavioral characteristics trumping demographic or even consumer data. You can put whiskey drinkers in your model, you can put cat owners in your model, but isn't it a lot more interesting to put in your model the fact that this person has an online profile on our website and this is their clickstream? Isn't it much more interesting to put into a model that this person is likely to consume media via TV, is likely to be a cord-cutter, is likely to be a social media trendsetter, is likely to view multiple channels, or to use both Facebook and media on TV?

That lets us have a really broad reach or really broad set of interested voters, rather than just creating an echo chamber where we're talking to the same voters across different platforms.

Gardner: So, over time, the analytics tools have gone from semi-blunt instruments to much more precise, and you're also able to better target what you think would be the right voter for you to get the right message out to.

One of the things you mentioned that struck me is the word "predictive." I suppose I think of campaigning as looking to influence people, and that polling then tries to predict what will happen as a result. Is there somewhat less daylight between these two than I am thinking, that being predictive and campaigning are much more closely associated, and how would that work?

Predictive modeling

Dyskant: When I think of predictive modeling, what I think of is predicting something that the campaign doesn't know. That may be something that will happen in the future or it may be something that already exists today, but that we don't have an observation for.

In the case of the role of polling, what I really see about that is understanding what issues matter the most to voters and how it is that we can craft messages that resonate with those issues. When I think of predictive analytics, I think of how is it that we allocate our resources to persuade and activate voters.

Over the course of elections, what we've seen is an exponential trajectory of the amount of data that is considered by predictive models. Even more important than that is an exponential set of the use cases of models. Today, we see every time a predictive model is used, it’s used in a million and one ways, whereas in 2012 it might have been used in 50, 20, or 100 sessions about each voter contact.

Gardner: It’s a fascinating use case to see how analytics and data can be brought to bear on the democratic process and to help you get messages out, probably in a way that's better received by the voter or the prospective voter, like in a retail or commercial environment. You don’t want to hear things that aren’t relevant to you, and when people do make an effort to provide you with information that's useful or that helps you make a decision, you benefit and you respect and even admire and enjoy it.

Dyskant: What I really want is for the voter experience to be as transparent and easy as possible, that campaigns reach out to me around the same time that I'm seeking information about who I'm going to vote for in November. I know who I'm voting for in 2016, but in some local elections, I may not have made that decision yet. So, I want a steady stream of information to be reaching voters, as they're in those key decision points, with messaging that really is relevant to their lives.

I also want to listen to what voters tell me. If a voter has a conversation with a volunteer at the door, that should inform future communications. If somebody has told me that they're definitely voting for the candidate, then the next conversation should be different from someone who says, "I work in energy. I really want to know more about the Secretary’s energy policies."

Gardner: Just as if a salesperson is engaging with process, they use customer relationship management (CRM), and that data is captured, analyzed, and shared. That becomes a much better process for both the buyer and the seller. It's the same thing in a campaign, right? The better information you have, the more likely you're going to be able to serve that user, that voter.

Dyskant: There definitely are parallels to marketing, and that’s how we at BlueLabs decided to found the company and work across industries. We work with Fortune 100 retail organizations that are interested in how, once someone buys one item, we can bring them back into the store to buy the follow-on item or maybe to buy the follow-on item through that same store’s online portal. We also look at how we can provide relevant messaging as users engage in complex processes online. All those things are driven from our lessons in politics.

Politics is fundamentally different from retail, though. It's a civic decision, rather than an individual-level decision. I always want to be mindful that I have a duty to voters to provide extremely relevant information to them, so that they can be engaged in the civic decision that they need to make.

Gardner: Suffice it to say that good quality comparison shopping is still good quality comparison decision-making.

Dyskant: Yes, I would agree with you.

Relevant and speedy

Gardner: Now that we've established how really relevant, important, and powerful this type of analysis can be in the context of the 2016 campaign, I'd like to learn more about how you go about getting that analysis and making it relevant and speedy across a large variety of data sets and content sets. But first, let’s hear more about BlueLabs. Tell me about your company, how it started, why you started it, maybe a bit about yourself as well.

Dyskant: Of the four of us who started BlueLabs, some of us met in the 2008 elections and some of us met during the 2010 midterms working at the Democratic National Committee (DNC). Throughout that pre-2012 experience, we had the opportunity as practitioners to try a lot of things, sometimes just once or twice, sometimes things that we operationalized within those cycles.

Jumping forward to 2012, we had the opportunity to scale all that research and development, to say that we did this one thing that was a different way of building models, and it worked in this congressional race. We decided to make this three people’s full-time jobs and scale that up.

Moving past 2012, we got to build potentially one of the fastest-growing startups, one of the most data-driven organizations, and we knew that we built a special team. We wanted to continue working together with ourselves and the folks who we worked with and who made all this possible. We also wanted to apply the same types of techniques to other areas of social impact and other areas of commerce. This individual-level approach to identifying conversations is something that we found unique in the marketplace. We wanted to expand on that.
Increasingly, what we're working on is this segmentation-of-media problem. It's this idea that some people watch only TV, and you can't ignore a TV. It has lots of eyeballs. Some people watch only digital and some people consume a mix of media. How is it that you can build media plans that are aware of people's cross-channel media preferences and reach the right audience with their preferred means of communications?

Gardner: That’s fascinating. You start with the rigors of the demands of a political campaign, but then you can apply that in so many ways, answering and anticipating the types of questions that more verticals, more sectors, and charitable organizations would want to be involved with. That’s very cool.

Let’s go back to the data science. You have this vast pool of data. You have a snappy analytics platform to work with. But, one of the things that I am interested in is how you get more people whether it's in your organization or a campaign, like the Hillary Clinton campaign, or the DNC to then be able to utilize that data to get to these inferences, get to these insights that you want.

What is it that you look for and what is it that you've been able to do in that form of getting more people able to query and utilize the data?

Dyskant: Data science happens when individuals have direct access to ask complex questions of a large, gnarly, but well-integrated data set. If I have 30 terabytes of data across online contacts, off-line contacts, and maybe a sample of clickstream data, and I want to ask things like of all the people who went to my online platform and clicked the password reset because they couldn't remember their password, then never followed up with an e-mail, how many of them showed up at a retail location within the next five days? They tried to engage online, and it didn't work out for them. I want to know whether we're losing them or are they showing up in person.
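To make the shape of such a question concrete, here is a minimal sketch that expresses it as a single SQL query, run from Python against an in-memory SQLite database as a stand-in for a real analytics platform. The table names (`password_resets`, `followup_emails`, `store_visits`), their columns, and the sample rows are all hypothetical; the transcript does not describe the actual schema.

```python
import sqlite3

# Count users who clicked "reset password", never followed up by e-mail,
# and then appeared at a retail location within five days of the click.
QUERY = """
SELECT COUNT(DISTINCT r.user_id)
FROM password_resets AS r
JOIN store_visits AS v
  ON v.user_id = r.user_id
 AND julianday(v.visited_at) - julianday(r.clicked_at) BETWEEN 0 AND 5
WHERE NOT EXISTS (
    SELECT 1
    FROM followup_emails AS e
    WHERE e.user_id = r.user_id
      AND e.sent_at >= r.clicked_at
);
"""


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory data set with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE password_resets (user_id INTEGER, clicked_at TEXT);
        CREATE TABLE followup_emails (user_id INTEGER, sent_at TEXT);
        CREATE TABLE store_visits (user_id INTEGER, visited_at TEXT);
        -- User 1: reset, no e-mail, store visit two days later (counted).
        -- User 2: reset, then followed up by e-mail (excluded).
        -- User 3: reset, no e-mail, store visit 19 days later (excluded).
        INSERT INTO password_resets VALUES
            (1, '2016-10-01'), (2, '2016-10-01'), (3, '2016-10-01');
        INSERT INTO followup_emails VALUES (2, '2016-10-02');
        INSERT INTO store_visits VALUES
            (1, '2016-10-03'), (2, '2016-10-03'), (3, '2016-10-20');
    """)
    return conn


def count_walk_ins(conn: sqlite3.Connection) -> int:
    """Run the query and return the number of matching users."""
    return conn.execute(QUERY).fetchone()[0]
```

On the sample rows, `count_walk_ins(demo_connection())` counts only user 1. The query itself is short; what makes it answerable is that the online and offline contacts share a user identifier in one well-integrated data set.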

That type of question might make it into a business-intelligence (BI) report a few months from then, but people who are thinking about what we do every day would say, "I wonder about this," turn it into a query, and say, "I think I found something." If we give these customers phone calls, maybe we can reset their passwords over the phone and reengage them.
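
As a rough illustration of the kind of question described above, the sketch below counts users who requested a password reset, have no follow-up e-mail on record, and then appeared at a retail location within five days. The table names, columns, and sample rows are hypothetical, and SQLite stands in for the analytics database purely so the example is self-contained.

```python
import sqlite3

# Hypothetical, heavily simplified tables; the data set discussed in the
# interview would be far larger and live in an analytics database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE password_resets (user_id INTEGER, reset_date TEXT);
CREATE TABLE followup_emails (user_id INTEGER, sent_date TEXT);
CREATE TABLE store_visits    (user_id INTEGER, visit_date TEXT);

INSERT INTO password_resets VALUES (1, '2016-10-01'), (2, '2016-10-01'), (3, '2016-10-02');
INSERT INTO followup_emails VALUES (2, '2016-10-02');
INSERT INTO store_visits    VALUES (1, '2016-10-04'), (3, '2016-10-20');
""")


def lost_online_seen_in_store(conn, days=5):
    """Count users who clicked password reset, have no follow-up e-mail,
    and visited a retail location within `days` days of the reset."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT r.user_id)
        FROM password_resets r
        JOIN store_visits v
          ON v.user_id = r.user_id
         AND julianday(v.visit_date) - julianday(r.reset_date) BETWEEN 0 AND ?
        WHERE NOT EXISTS (
            SELECT 1 FROM followup_emails e WHERE e.user_id = r.user_id
        )
        """,
        (days,),
    ).fetchone()
    return row[0]


print(lost_online_seen_in_store(conn))
```

In the sample rows, only user 1 qualifies: user 2 has a follow-up e-mail, and user 3 visited a store 18 days after the reset.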

Human intensive

That's just one tiny, micro example, which is why data science is truly a human-intensive exercise. You get 50-100 people working at an enterprise solving problems like that and what you ultimately get is a positive feedback loop of self-correcting systems. Every time there's a problem, somebody is thinking about how that problem is represented in the data. How do I quantify that? If it’s significant enough, then how is it that the organization can improve in this one specific area?

All that can be done with business logic is the interesting piece. You need very granular data that's accessible via query and you need reasonably fast query time, because you can’t ask questions like that when you're going to get coffee every time you run a query.

Layering predictive modeling allows you to understand the opportunity for impact if you fix that problem. That one hypothesis with those users who cannot reset their passwords is that maybe those users aren't that engaged in the first place. You fix their password but it doesn’t move the needle.

The other hypothesis is that it's people who are actively trying to engage with your server and are unsuccessful because of this one very specific barrier. If you have a model of user engagement at an individual level, you can say that these are really high-value users that are having this problem, or maybe they aren’t. So you take data science, align it with really smart individual-level business analysis, and what you get is an organization that continues to improve without needing an executive-level decision for each one of those things.
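
One way to tell the two hypotheses apart is to compare the modeled engagement of the affected users with that of the whole user base. The sketch below is only an illustration: the per-user scores, the 0-100 scale, and the user IDs are made up, and in practice the scores would come from a trained engagement model.

```python
from statistics import mean


def affected_vs_overall(scores, affected_ids):
    """Return (mean score of affected users, mean score of all users).

    `scores` maps user ID -> engagement score from some model (hypothetical
    0-100 scale here); `affected_ids` are the users hitting the barrier.
    """
    affected = [scores[u] for u in affected_ids if u in scores]
    return mean(affected), mean(scores.values())


# Made-up scores for six users; "a" and "b" could not reset their passwords.
scores = {"a": 90, "b": 80, "c": 20, "d": 10, "e": 30, "f": 10}
affected_mean, overall_mean = affected_vs_overall(scores, {"a", "b"})

# If the affected group scores well above the base, fixing the barrier is
# more likely to move the needle; if it scores below, it probably is not.
print(affected_mean, overall_mean)
```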

Gardner: So a great deal of inquiry experimentation, iterative improvement, and feedback loops can all come together very powerfully. I'm all for the data scientist full-employment movement, but we need to do more than have people go through a data scientist to use, access, and develop these feedback insights. What is it about the SQL, natural language, or APIs? What is it that you like to see that allows for more people to be able to directly relate and engage with these powerful data sets?

Dyskant: One of the things is the product management of data schemas. So whenever we build an analytics database for a large-scale organization I think a lot about an analyst who is 22, knows VLOOKUP, took some statistics classes in college, and has some personal stories about the industry that they're working in. They know, "My grandmother isn't a native English speaker, and this is how she would use this website."

So it's taking that hypothesis that’s driven from personal stories, and being able to, through a relatively simple query, translate that into a database query, and find out if that hypothesis proves true at scale.

Then, potentially take the results of that query, dump them into a statistical-analysis language, or use database analytics to answer that in a more robust way. What that means is that our schemas favor very wide schemas, because I want someone to be able to write a three-line SQL statement, no joins, that answers a business question that I wouldn't have thought to put in a report. So that’s the first line -- is analyst-friendly schemas that are accessed via SQL.
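
To make the "three-line SQL statement, no joins" idea concrete, here is a minimal sketch against a hypothetical wide table in which attributes such as preferred language are already appended to every contact row. The table name, columns, and rows are invented for illustration, and SQLite is used only to keep the example runnable on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical wide table: one row per contact, attributes pre-joined.
CREATE TABLE contacts_wide (
    person_id INTEGER, channel TEXT, contact_date TEXT,
    preferred_language TEXT, region TEXT, responded INTEGER
);
INSERT INTO contacts_wide VALUES
    (1, 'web',   '2016-09-01', 'es', 'northeast', 0),
    (2, 'web',   '2016-09-01', 'en', 'northeast', 1),
    (3, 'web',   '2016-09-02', 'es', 'south',     0),
    (4, 'phone', '2016-09-02', 'es', 'south',     1);
""")

# The analyst's hunch: the website works less well for non-English speakers.
# Because the table is wide, the check is three lines and needs no joins.
QUERY = """
SELECT preferred_language, AVG(responded) AS response_rate
FROM contacts_wide WHERE channel = 'web'
GROUP BY preferred_language ORDER BY preferred_language
"""

rows = conn.execute(QUERY).fetchall()
for language, rate in rows:
    print(language, rate)
```

The design trade-off is that the table repeats attributes on every row, which the interview later argues is acceptable in a columnar store.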

The next line is deep key performance indicators (KPIs). Once we step out of the analytics database, consumers drop into the wider organization that’s consuming data at a different level. I always want reporting to report on opportunity for impact, to report on whether we're reaching our most valuable customers, not how many customers are we reaching.

"Are we reaching our most valuable customers" is much more easily addressable; you just talk to different people. Whereas, when you ask, "Are we reaching enough customers," I don’t know how to find out. I can go over to the sales team and yell at them to work harder, but ultimately, I want our reporting to facilitate smarter working, which means incorporating model scores and predictive analytics into our KPIs.

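The difference between the two KPIs can be sketched in a few lines. This is a hypothetical example: the customer tuples, the model scores, and the choice of "top 20 percent by score" as the definition of most valuable are all assumptions made for illustration.

```python
def reach_kpis(customers, top_fraction=0.2):
    """Compute a raw reach count and a model-aware reach rate.

    `customers` is a list of (customer_id, model_score, reached) tuples,
    where `model_score` is a hypothetical predicted value of the customer.
    """
    ranked = sorted(customers, key=lambda c: c[1], reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    top = ranked[:top_n]
    return {
        "customers_reached": sum(1 for c in customers if c[2]),
        "top_value_reach_rate": sum(1 for c in top if c[2]) / top_n,
    }


# Ten made-up customers: six were reached, but only one of the top two.
customers = [
    ("c1", 95, True), ("c2", 90, False), ("c3", 60, True), ("c4", 55, True),
    ("c5", 50, True), ("c6", 40, True), ("c7", 30, True), ("c8", 20, False),
    ("c9", 10, False), ("c10", 5, False),
]
kpis = reach_kpis(customers)
print(kpis)
```

The raw count looks healthy here, while the score-aware rate shows that half of the highest-value customers were missed, which is the more actionable number.
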
Getting to the core

Gardner: Let’s step back from the edge, where we engage the analysts, to the core, where we need to provide the ability for them to do what they want and which gets them those great results.

It seems to me that when you're dealing in a campaign cycle that is very spiky, you have a short period of time where there's a need for a tremendous amount of data, but that could quickly go down between cycles of an election, or in a retail environment, be very intensive leading up to a holiday season.

Do you therefore take advantage of the cloud models for your analytics that make a fit-for-purpose approach to data and analytics pay as you go? Tell us a little bit about your strategy for the data and the analytics engine.

Dyskant: All of our customers have a cyclical nature to them. I think that almost every business is cyclical, just some more than others. Horizontal scaling is incredibly important to us. It would be very difficult for us to do what we do without using a cloud model such as Amazon Web Services (AWS).

Also, one of the things that works well for us with HPE Vertica is the licensing model where we can add additional performance with only the cost of hardware or hardware provision through the cloud. That allows us to scale up our cost areas during the busy season. We'll sometimes even scale them back down during slower periods so that we can have those 150 analysts asking their own questions about the areas of the program that they're responsible for during busy cycles, and then during less busy cycles, scale down the footprint of the operation.

Gardner: Is there anything else about the HPE Vertica OnDemand platform that benefits your particular need for analysis? I'm thinking about the scale and the rows. You must have so many variables when it comes to a retail situation, a commercial situation, where you're trying to really understand that consumer?

Dyskant: I do everything I can to avoid aggregation. I want my analysts to be looking at the data at the interaction-by-interaction level. If it’s a website, I want them to be looking at clickstream data. If it's a retail organization, I want them to be looking at point-of-sale data. In order to do that, we build data sets that are very frequently in the billions of rows. They're also very frequently incredibly wide, because we don't just want to know every transaction with this dollar amount. We want to know things like what the variables were, and where that store was located.

Getting back to the idea that we want our queries to be dead-simple, that means that we very frequently append additional columns on to our transaction tables. We’re okay that the table is big, because in a columnar model, we can pick out just the columns that we want for that particular query.
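
The point about columnar storage can be shown with a toy model. The sketch below is not how HPE Vertica is implemented; it simply stores each column as its own list, with invented column names and values, to show that a query over a wide table only has to read the columns it names.

```python
# Toy column-oriented table: one list per column, all the same length.
# Column names and values are hypothetical.
transactions = {
    "amount":       [12.5, 40.0, 7.25],
    "store_region": ["east", "west", "east"],
    "weather":      ["rain", "sun", "sun"],
    "promo_code":   [None, "FALL16", None],
    # ...a real wide table might carry hundreds more columns.
}


def total_by(table, key_col, value_col):
    """Sum `value_col` grouped by `key_col`, touching only those two columns."""
    totals = {}
    for key, value in zip(table[key_col], table[value_col]):
        totals[key] = totals.get(key, 0) + value
    return totals


totals = total_by(transactions, "store_region", "amount")
print(totals)
```

Adding another column to `transactions` leaves the cost of this query unchanged, which is why a very wide table is tolerable in this layout.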
Then, moving into some of the in-database machine-learning algorithms allows us to perform more higher-order computation within the database and have less data shipping.

Gardner: We're almost out of time, but I wanted to do some predictive analysis ourselves. Thinking about the next election cycle, midterms, only two years away, what might change between now and then? We hear so much about machine learning, bots, and advanced algorithms. How do you predict, Erek, the way that big data will come to bear on the next election cycle?

Behavioral targeting

Dyskant: I think that a big piece of the next election will be around moving even more away from demographic targeting, toward even more behavioral targeting. How is it that we reach every voter based on what they're telling us about them and what matters to them, how that matters to them? That will increasingly drive our models.

To do that involves probably another 10X scale in the data, because that type of data is generally at the clickstream level, generally at the interaction-by-interaction level, incorporating things like Twitter feeds, which adds an additional level of complexity and laying in computational necessity to the data.

Gardner: It almost sounds like you're shooting for sentiment analysis on an issue-by-issue basis, a very complex undertaking, but it could be very powerful.

Dyskant: I think that it's heading in that direction, yes.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.


Thursday, October 27, 2016

ServiceMaster's path to an agile development twofer: Better security and DevOps business benefits

The next BriefingsDirect Voice of the Customer security transformation discussion explores how home-maintenance repair and services provider ServiceMaster develops applications with a security-minded focus as a DevOps benefit.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn how security technology leads to posture maturity and DevOps business benefits, we're joined by Jennifer Cole, Chief Information Security Officer and Vice President of IT, Information Security, and Governance for ServiceMaster in Memphis, Tennessee, and Ashish Kuthiala, Senior Director of Marketing and Strategy at Hewlett Packard Enterprise DevOps. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Jennifer, tell me, what are some of the top trends that drive your need for security improvements and that also spurred DevOps benefits?

Cole: When we started our DevOps journey, security was a little bit ahead of the curve for application security and we were able to get in on the front end of our DevOps transformation.


The primary reason for our transformation as a company is that we are an 86-year-old company that has seven brands under one umbrella, and we needed to have one brand, one voice, and be able to talk to our customers in a way that they wanted us to talk to them.

That means enabling IT to get capabilities out there quickly, so that we can interact with our customers "digital first." As a result of that, we were able to see an increase in the way that we looked at security education and process. We were normally doing our penetration tests after the fact of a release. We were able to put tools in place to test prior to a release, and also teach our developers along the way that security is everyone's responsibility.

ServiceMaster has been fortunate that we have a C-suite willing to invest in DevOps and an Agile methodology. We also had developers who were willing to learn, and with the right intent to deliver code that would protect our customers. Those things collided, and we have the perfect storm.

So, we're delivering quicker, but we also fail faster allowing us to go back and fix things quicker. We're seeing an uptick in what we're delivering being a lot more secure.

Gardner: Ashish, it seems obvious, having heard Jennifer describe it, DevOps and security hand-in-hand -- a whole greater than the sum of the parts. Are you seeing this more across various industries?

Stopping defects

Kuthiala: Absolutely. With the adoption of DevOps increasing more across enterprises, security is no different than any other quality-assurance (QA) testing that you do. You can't let a defect reach your customer base; and you cannot let a security flaw reach your customer base as well.

If you look at it from that perspective, and the teams are willing to work together, security is treated no differently than any other QA process. This boils down not just to the vulnerability of the software that you're releasing in the marketplace, but there are so many different regulations and compliance [needs] -- internal, external, your own company policies -- that you have to take a look at. You don't want to go faster and compromise security. So, it's an essential part of DevOps.

Cole: DevOps allows for continuous improvement, too. Security now comes at the front of the process, whereas in the old days, under a traditional SDLC process, security came last. We found problems after they were in production or something had been compromised. Now, we're at the beginning of the process and we're actually getting to train the people that are at the beginning of the process on how and why to deliver things that are safe for our customers.

Gardner: Jennifer, why is security so important? Is this about your brand preservation? Is this about privacy and security of data? Is this about the ability for high performance to maintain its role in the organization? All the above? What did I miss? Why is this so important?

Cole: Depending on the lens that you are looking through, that answer may be different. For me, as a CISO, it's making sure that our data is secure and that our customers have trust in us to take care of their information. The rest of the C-suite, I am sure, feels the same, but they're also very focused on transformation to digital-first, making sure customers can work with us in any way that they want to and that their ServiceMaster experience is healthy.

Our leaders also want to ensure our customers return to do business with us and are happy in the process. Our company helps customers in some of the most difficult times in their lives, or helps them prevent a difficult time in the ownership of their home.

But for me and the rest of our leadership team, it's making sure that we're doing what's right. We're training our teams along the way to do what's right, to just make the overall ServiceMaster experience better and safe. As young people move into different companies, we want to make sure they have that foundation of thinking about security first -- and also the customer.
We tend to put IT people in a back room, and they never see the customer. This methodology allows IT to see what they could have released and correct it if it's wrong, and we get an opportunity to train for the future.
Through my lens, it’s about protecting our data and making sure our customers are getting service that doesn't have vulnerabilities in it and is safe.

Gardner: Now, Ashish, user experience is top of mind for organizations, particularly organizations that are customer focused like ServiceMaster. When we look at security and DevOps coming together, we can put in place the requirements to maintain that data, but it also means we can get at more data and use it more strategically, more tactically, for personalization and customization -- and at the same time, making sure that those customers are protected.

How important is user experience and data gathering now when it comes to QA and making applications as robust as they can be?

Million-dollar question

Kuthiala: It's a million-dollar question. I'll give you an example of a client I work with. I happen to use their app very, very frequently, and I happen to know the team that owns that app. They told me about 12 months ago that they had invested -- let’s just make up this number -- $1 million in improving the user experience. They asked me how I liked it. I said, "Your app is good. I only use 20 percent of the features in your app. I really don’t use the other 80 percent. It's not so useful to me."

That was an eye-opener to them, because the $1 million or so that they had invested in enriching the user experience -- if they knew exactly what I was doing as a user, what I use, what I did not use, where I had problems -- they could have used toward that 20 percent that I use. They could have made it better than anybody else in the marketplace and also gathered information on what it is that the market wants by monitoring the user experience with people like me.

It's not just the availability and health of the application; it’s the user experience. It's having empathy for the user, as an end user. HPE, of course, makes a lot of these tools, like HPE AppPulse, which is very specifically designed to capture that mobile user experience and bring it back before you have a flood of calls and support people screaming at you as to why the application isn’t working.

Security is also one of those things. All is good until something goes wrong. You don't want to be in a situation when something has actually gone wrong and your brand is being dragged through mud in the press, your revenue starts to decline, and then you look at it. It’s one of those things that you can't look at after the fact.

Gardner: Jennifer, this strikes me as an under-appreciated force multiplier, that the better you maintain data integrity, security, and privacy, the more trust you are going to get to get more data about your customers that you can then apply back to a better experience for them. Is that something that you are banking on at ServiceMaster?
Cole: Absolutely. Trust is important, not only with our customers, but also our employees and leaders. We want people to feel like they're in a healthy environment, where they can give us feedback on that user experience. What I would say to what Ashish was saying is that DevOps actually gives us the ability to deliver what the business wants IT to deliver for our customers.

In the past 25 years, IT has decided what the customer would like to see. In this methodology, you're actually working with your business partners who understand their products and their customers, and they're telling you the features that need to be delivered. Then, you're able to pick the minimum viable product and deliver it first, so that you can capture that 20 percent of functionality.

Also, if you're wrapping security in front of that, that means security is not coming back to you later with the penetration test results and saying that you have all of these things to fix, which takes time away from delivering something new for our customers.

This methodology pays off, but the journey is hard. It’s tough because in most companies you have a legacy environment that you have to support. Then, you have this new application environment that you’re creating. There's a healthy balance that you have to find there, and it takes time. But we've seen quicker results and better revenue, our customers are happier, they're enjoying the ServiceMaster experience, instead of our individual brand families, and we've really embraced the methodology.

Gardner: Do you have any examples that you can recall where you've done development projects and you’ve been able to track that data around that particular application? What’s going on with the testing, and then how is that applied back to a DevOps benefit? Maybe you could just walk us through an example of where this has really worked well.

Digital first

Cole: About a year and a half ago, we started with one of our brands, American Home Shield, and looked at where the low hanging fruit -- or minimum viable product -- was in that brand for digital first. Let me describe the business a little bit. Our customers reach out to us, they purchase a policy for their house and we maintain appliances and such in their home, but it is a contractor-based company. We send out a contractor who is not a ServiceMaster associate.

We have to make that work and make our customer feel like they've had a seamless experience with American Home Shield. We had some opportunity in that brand for digital first. We went after it and drastically changed the way that our customers did business with us. Now, it's caught on like wildfire, and we're really trying to focus on one brand and one voice. This is a top-down decision which does help us move faster.

All seven of our brands are home services. We're in 75,000 homes a day and we needed to identify the customers of all the brands, so that we could customize the way that we do business with them. DevOps allows us to move faster into the market and deliver that.

Gardner: Ashish, there aren't that many security vendors that do DevOps, or DevOps vendors that do security. At HPE, how have you made advances in terms of how these two areas come together?

Kuthiala: The strengths of HPE in helping its customers lies with the very fact that we have an end-to-end diverse portfolio. Jennifer talked about taking the security practices and not leaving it toward the end of the cycle, but moving it to the very beginning, which means that you have to get developers to start thinking like security experts and work with the security experts.

Given that we have a portfolio that spans the developers and the security teams, our best practices include building our own customer-facing software products that incorporate security practices, so that when developers are writing code, they can begin to see any immediate security threats as well as whether their code is compliant with any applicable policies or not. Even before code is checked in, the process runs the code through security checks and follows it all the way through the software development lifecycle.

These are security-focused feedback loops. At any point, if there is a problem, the changes are rejected and sent back or feedback is sent back to the developers immediately.
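
The feedback loop described here, in which a change is checked before check-in and rejected with immediate feedback, can be sketched generically. The example below is not an HPE product or API: the two rules and the `gate` function are hypothetical stand-ins for whatever policy and scanning tools an organization actually uses.

```python
import re

# Hypothetical rules; a real scanner would apply far richer analysis.
RULES = [
    (re.compile(r"password\s*=\s*[\"'].+[\"']", re.IGNORECASE), "hard-coded password"),
    (re.compile(r"\beval\("), "use of eval()"),
]


def review_change(source):
    """Return a list of (line_number, message) findings for the given code."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, message in RULES:
            if pattern.search(line):
                findings.append((lineno, message))
    return findings


def gate(source):
    """Reject the change if anything is found, returning feedback at once."""
    findings = review_change(source)
    return ("rejected", findings) if findings else ("accepted", [])


status, feedback = gate('x = 1\npassword = "hunter2"\n')
print(status, feedback)
```

Running the same gate again at later stages of the lifecycle gives the repeated checkpoints the passage describes, with each rejection pointing the developer to the offending line.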

If it makes it through the cycle and a known vulnerability is found before release to production, we have tools such as App Defender that can plug in to protect the code in production until developers can fix it, allowing you to go faster but remain protected.

Cole: It blocks it from the customer until you can fix it.

Kuthiala: Jennifer, can you describe a little bit how you use some of these products?

Strategic partnership

Cole: Sure. We’ve had a great strategic partnership with HPE in this particular space. Application security caught on fire about two years ago at RSA, which is one of the main security conferences for anyone in our profession.

The topic of application security has not been a focus for CISOs, in my opinion. I was fortunate enough that I had a great team member who came back and said that we have to get on board with this. We had some conversations with HPE and ended up in a great strategic partnership. They've really held our hands and helped us get through the process. In turn, that helped make them better, as well as make us better, and that's what a strategic partnership should be about.

Now, we're watching things as they are developed. So, we're teaching the developer in real-time. Then, if something happens to get through, we have App Defender, which will actually contain it until we can fix it before it releases to our customer. If all of those defenses don’t work, we still do the penetration test along with many other controls that are in place. We also try to go back to just grassroots, sit down with the developers, and help them understand why they would want to develop differently next time.

Someone from security is in every one of the development scrum meetings and on all the product teams. We also participate in Big Room Planning. We're trying to move out of that overall governing role and into a peer-to-peer type role, helping each other learn, and explaining to them why we want them to do things.

Gardner: It seems to me that, having gone at this at the methodological level with those collaboration issues solved, bringing people into the scrum who are security minded, puts you in a position to be able to scale this. I imagine that more and more applications are going to be of a mobile nature, where there's going to be continuous development. We're also going to start perhaps using micro-services for development and ultimately Internet of Things (IoT) if you start measuring more and more things in your homes with your contractors.

Cole: We reach 75,000 homes a day. So, you can imagine that all of those things are going to play a big part in our future.

Gardner: Before we sign off, perhaps you have projections as to where you'd like to see things go. How can DevOps and security work better for you as a tag team?
Cole: For me, the next step for ServiceMaster specifically is making solid plans to migrate off of our legacy systems, so that we can truly focus on maturing DevOps and delivering for our customer in a safer, quicker way, and so we're not always having to balance this legacy environment and this new environment.
If we could accelerate that, I think we will deliver to the customer quicker and also more securely.

Gardner: Ashish, last word, what should people who are on the security side of the house be thinking about DevOps that they might not have appreciated?

Higher quality

Kuthiala: This whole approach of adopting DevOps to deliver your software faster to your customers with higher quality says it. DevOps is an opportunity for security teams to get deeply embedded in the mindset of the developers, the business planners, testers, production teams – essentially the whole software development lifecycle, which earlier they didn’t have the opportunity to do.

They would usually come in before code went to production and often would push back the production cycles by a few weeks because they had to do the right thing and ensure release of code that was secure. Now, they’re able to collaborate with and educate developers, sit down with them, tell them exactly what they need to design and therefore deliver secure code right from the design stage. It’s the opportunity to make this a lot better and more secure for their customers.

Cole: The key is security being a strategic partner with the business and the rest of IT, instead of just being a governing body.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.
