The Brandeis GPS blog

Insights on online learning, tips for finding balance, and news and updates from Brandeis GPS

Tag: analyzing data

Learn information technology management online at Brandeis

Did you know that Brandeis GPS offers courses for professional development? Enroll in an online course this fall and network with new colleagues in a 10-week, seminar-style online classroom capped at 20 students. Registration is now open and we’re celebrating by profiling our favorite fall courses.

Get an introduction to the “nuts and bolts” that span all areas of information technology. With this 10-week, graduate-level course, you’ll learn enough foundational information about each key area to assess and evaluate when and how each technology should be appropriately deployed to solve organizational challenges. Topics include:

  • An overview of the history of information technology
  • Telecommunications and networking
  • Data and transactional databases/enterprise systems (ERP)
  • Data warehousing and business intelligence
  • E-commerce and B2B systems
  • Security and compliance


Fall courses run Sept. 14-Nov. 22. Whether you’re looking to complete a full degree or advance your career through professional development, this course is designed to equip you with the necessary skills for making an impact in any industry or organization.

How it works:
Take a part-time, online course this fall without enrolling in one of our graduate programs. If you like what you learn and want to continue your education, you can apply your credits from this fall toward a future degree. Questions? Contact our enrollment team at gps@brandeis.edu or 781-736-8787 or fill out our first-time registration form and we’ll be in touch.

The Opportunities in Big Data Still Ripe for Innovation

– Associate Editor, BostInno Tech

Big data is the “new currency” — an innovation that can boost a business, or bust one that fails to take proper advantage of it. Smart startups have been dipping into the deluge of data to draw out audience analytics, predict maintenance before costly breakdowns or better deliver targeted treatments to their consumers.

With innovation naturally comes a surge of yet-to-be-explored opportunities that other companies should have the foresight to capitalize on.

“More big data disruption is coming,” said Ryan Betts, CTO of Bedford-based VoltDB, in an email to BostInno. “And it will be around real-time, interactive experiences.”

The space is one VoltDB has established itself in by providing an in-memory relational database that combines massive data ingest with real-time analytics and decisioning, so that organizations can act on data at its point of greatest value.

Betts pointed to big-name behemoths, such as Google, Amazon, IBM, Oracle and Microsoft, that are also establishing themselves in the space. He noted “unlimited Internet-attached storage space can be purchased at very cost competitive prices,” which, when combined with “ubiquitous computing,” are creating a network effect that’s become increasingly beneficial to consumers.

“In the same way that social networks become more powerful and offer greater utility as members join and build connections,” Betts explained, “these devices will connect to share data, to cooperate with one another and to interact with us in our environment.”

Betts mentioned Nest, a company reinventing the thermostat and smoke alarm by connecting them to the Internet and syncing them with apps, transforming climate control in the process. The collision Betts described is even more evident in individuals’ “smartphone on the coffee table” or “tablet a family member uses for Facebook.”

He added, “For the consumer, the automation and the disruptive potential of these devices communicating and interacting with one another will create relevant, micro-personalized experiences.”

To Atlas Venture Partner Chris Lynch, co-founder and board member of Kendall Square’s big data hackerspace hack/reduce, the future is, indeed, in “automation, simplification and integration.” Lynch broke each element down in an email to BostInno, saying:

Automation of the process of analyzing data, simplification of the user interface to allow non-data scientists to participate in the big data revolution and integration of next generation analytics into legacy applications people already know how to use.

Lynch acknowledged big data’s downfalls, adding, “Platform and tool companies are largely played out.”

His comment was reminiscent of that of Google Ventures’ Rich Miner, who, at Harvard Business School’s recent Cyberposium, argued, “Big data is a very overused word.” He added that big data is often “a layer, not a startup itself.” Yet, he had formerly singled out Nest for taking “mundane devices” and making them work on users’ behalf, noting there’s “a huge amount of innovation” in the connected devices space — which all circles back to big data.

“From a pure technology perspective, we need to deliver scale, security and simplicity,” Lynch said. “[We need to] make it easy for people to absorb the technology and increase the time to value.”

To Betts, the industry can see immense value from interconnections, as well. As he posited:

Interconnections will impact factory manufacturing plants; impact how predictive maintenance is scheduled and executed on high-end industrial equipment; create connected Internet services that must scale authorization and authentication, detect and prevent financial, telephone and even online-game fraud, and make construction sites better monitored, safer and more efficient. And that’s not all. It will also participate in building a smarter electric grid that is cheaper, less wasteful, more reliable and designed to supply power to electric vehicles while generating power through broadly distributed residential solar panels and other alternative sources.

Now it’s up to innovators to seize the opportunities.

Click here to subscribe to our blog!


How Companies Can Use Big Data to Make Better Decisions

By:  – Associate Editor, BostInno

Big Data has swiftly earned a lasting place in our lexicon, because its potential is real and its impact undeniable. Companies can collectively scoff and brush big data off as just another trend, but dismissing it could lead to worse decisions down the road.

“Every era has a bold new innovation that emerges as a defining advantage for those who get out ahead of the curve,” said Ali Riaz, CEO of enterprise software company Attivio, referencing the industrial revolution and, later, the information age. Giants of industry who took advantage of new machinery or market leaders who learned to leverage relational databases have historically had the upper hand.

“Today’s advantage — the new currency, if you will — is big data,” Riaz added. “Companies that don’t get ahead of this tsunami by using big data to their advantage will be crushed by it.”

Yet, this deluge of data isn’t new; it has just been given a catchy two-word title.

When asked to define big data, Ely Kahn, co-founder and VP of business development for big data start-up Sqrrl, described it as massive amounts — tera- and petabytes’ worth — of unstructured and semi-structured data “organizations have historically been unable to analyze because it was too expensive or difficult.” With technologies like Hadoop and NoSQL databases surfacing, however, Kahn claimed those same organizations can now make sense of this type of data “cost effectively.”

To Marilyn Matz, CEO of fellow big data startup Paradigm4, the revolution goes beyond just high volumes of information, though.

“It is about integrating and analyzing data collected from new sources,” Matz said. “A central capability this enables is hyper-personalization and micro-targeting — including recommendation engines, location-based services and offers, personalized pricing, precision medicine and predictive equipment maintenance schedules.”

No matter the industry, big data has a key role to play in moving the needle for companies, whether large or small. And that goes for companies currently unable to determine what their “big data” is. That untapped data could be customer sentiment in social media, server logs or clickstream data.

“Once you have identified untapped sources of data,” Kahn said, “you can use tools like Hadoop and NoSQL to analyze it.”

Matz broke down, by industry, what that ability to analyze could mean.

In the Commercial Sphere

In the commercial sphere, if a company knows 10 or 100 things about you and your situational context, then that company can do a far better job offering you something relevant to exactly where you are and what you might be interested in, increasing their opportunity to capture your respect, attention and dollars.

In the Industrial World

In the industrial world, if a manufacturing company knows where equipment is operated (hot and harsh climates versus moderate climates), as well as how that equipment is being used (lots of hard-braking) and collects data across a large fleet, then it can predict maintenance before costly breakdowns, saving millions of dollars — and it can price warranties more accurately, as well as improve designs and manufacturing processes.

In Pharma and Healthcare

In pharma and healthcare, evidence-based outcome studies that integrate genomic data, phenotypic data, clinical data, behavioral data, daily sensor data, et al., can lead to more targeted and effective treatment and outcomes for both wellness and illness.

Attivio has been using big data in one of the most vital ways, focusing on identifying military personnel who are at risk for suicide.

But, of course, big data still comes with challenges. Riaz acknowledged the reality, which is that every large organization comprises disconnected silos of information in all different formats, along with various business units, applications, protocols, information repositories, terminologies and schemas that don’t always mesh.

“Just dumping data into these unorganized but separate systems is anarchy and an egregious waste of time and money,” Riaz said. “Yet, this is how many technologies address the problem. It essentially just creates another big silo for the information to live in.”

Moving forward, additional ways to combine structured and unstructured data, as well as to merge data from within an enterprise with data from outside of it, will need to emerge. And when they do, the impact will be glaringly obvious.

As Riaz posited:

The time to solve big problems with extreme information is upon us. Businesses, organizations and governments are putting a lot of faith – and money – into technology solutions to help them make sense of it all. As a technology industry, we owe it to these companies to deliver real products that deliver real results to real problems, not just create more work.

So, let’s start by making that first big decision: Understanding big data’s importance, no matter how big of a buzzword it’s become.



Is an Average of Averages Accurate? (Hint: NO!)

by: Katherine S. Rowell, author of “The Best Boring Book Ever of Select Healthcare Classification Systems and Databases,” available now!

Originally posted: http://ksrowell.com/blog-visualizing-data/2014/05/09/is-an-average-of-averages-accurate-hint-no/

Today a client asked me to add an “average of averages” figure to some of his performance reports. I freely admit that a nervous and audible groan escaped my lips as I felt myself at risk of tumbling helplessly into the fifth dimension of “Simpson’s Paradox” — that is, the mistaken assumption that averaging the averages of different populations produces the average of the combined population. (I encourage you to hang in and keep reading, because ignoring this concept is an all too common and serious hazard of reporting data, and you absolutely need to understand and steer clear of it!)

Imagine that we’re analyzing data for several different physicians in a group. We establish a relation or correlation for each doctor to some outcome of interest (patient mortality, morbidity, client satisfaction). Simpson’s Paradox states that when we combine all of the doctors and their results, and look at the data in aggregate form, we may discover that the relation established by our previous research has reversed itself. Sometimes this results from some lurking variable(s) that we haven’t considered. Sometimes, it may be due simply to the numerical values of the data.

First, the “lurking variable” scenario. Imagine we are analyzing the following data for two surgeons:

  1. Surgeon A operated on 100 patients; 95 survived (95% survival rate).
  2. Surgeon B operated on 80 patients; 72 survived (90% survival rate).

At first glance, it would appear that Surgeon A has a better survival rate — but do these figures really provide an accurate representation of each doctor’s performance?

Deeper analysis reveals the following: of the 100 procedures performed by Surgeon A,

  • 50 were classified as high-risk; 47 of those patients survived (94% survival rate)
  • 50 procedures were classified as routine; 48 patients survived (96% survival rate)

Of the 80 procedures performed by Surgeon B,

  • 40 were classified as high-risk; 32 patients survived (80% survival rate)
  • 40 procedures were classified as routine; 40 patients survived (100% survival rate)

When we include the lurking classification variable (high-risk versus routine surgeries), the results are remarkably transformed.

Now we can see that Surgeon A has a much higher survival rate in the high-risk category (94% v. 80%), while Surgeon B has a better survival rate in the routine category (100% v. 96%).
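The reversal is easy to verify directly. Here is a minimal Python sketch using the surgeon figures above (the dictionary layout and names are just illustrative):

```python
# Survival figures from the example: (survived, total) per risk category.
data = {
    "Surgeon A": {"high-risk": (47, 50), "routine": (48, 50)},
    "Surgeon B": {"high-risk": (32, 40), "routine": (40, 40)},
}

def rate(survived, total):
    """Survival rate as a percentage."""
    return 100 * survived / total

overall = {}
for surgeon, cats in data.items():
    survived = sum(s for s, _ in cats.values())
    total = sum(t for _, t in cats.values())
    overall[surgeon] = rate(survived, total)
    by_cat = ", ".join(f"{c}: {rate(s, t):.0f}%" for c, (s, t) in cats.items())
    print(f"{surgeon}: overall {overall[surgeon]:.0f}% ({by_cat})")
```

Running it shows an overall rate of 95% for Surgeon A versus 90% for Surgeon B, even though the per-category breakdown puts Surgeon B ahead on routine procedures.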

Let’s consider the second scenario, where numerical values can change results.

First, imagine that every month, the results of a patient satisfaction survey are exactly the same (Table 1).

[Table 1: monthly patient-satisfaction survey results; responses and satisfied counts are identical every month]

The table shows that averaging each month’s result produces the same figure (90%) as calculating a weighted average (90%). This congruence exists because each month’s denominator and numerator are exactly the same, so every month contributes equally to the result.

Now consider Table 2, which also displays the number of responses received from a monthly patient-satisfaction survey, but where the number of responses and the number of patients who report being satisfied differ from month to month. In this case, taking an average of each month’s percentage allows some months to contribute to or affect the final result more than others. Here, for example, we are led to believe that 70% of patients are satisfied.

[Table 2: monthly patient-satisfaction survey results; responses and satisfied counts vary from month to month]

All results should in fact be treated as one data set, where the denominator is Total Responses (2,565) and the numerator is Total Satisfied (1,650). This approach correctly accounts for the different number of responses each month, weights every response equally, and produces a correct satisfaction rate of 64%. That is quite a difference from our previous answer of 70% — a gap of six percentage points, or roughly 154 patients!
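The same arithmetic can be sketched in a few lines of Python. The monthly figures below are hypothetical, chosen only so the totals match the ones above (2,565 responses, 1,650 satisfied); the post’s actual monthly breakdown is in Table 2:

```python
# Hypothetical (responses, satisfied) pairs for three months; totals
# match the post's figures: 2,565 responses, 1,650 satisfied.
months = [(900, 520), (165, 140), (1500, 990)]

# Naive approach: average the monthly percentages.
monthly_rates = [100 * sat / resp for resp, sat in months]
avg_of_avgs = sum(monthly_rates) / len(monthly_rates)

# Correct approach: pool all responses into one data set.
total_resp = sum(resp for resp, _ in months)
total_sat = sum(sat for _, sat in months)
weighted = 100 * total_sat / total_resp

print(f"average of averages: {avg_of_avgs:.1f}%")  # about 69.5%
print(f"weighted average:    {weighted:.1f}%")     # about 64.3%
```

The gap appears because the middle month, with only 165 responses, pulls the naive average up just as strongly as the 1,500-response month.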

How we calculate averages really does matter if we are committed to understanding our data and reporting it correctly. It matters if we want to identify opportunities to improve, and are committed to taking action.

As a final thought about averages, here is a wryly amusing bit of wisdom on the topic that also has the virtue of being concise. “No matter how long he lives, a man never becomes as wise as the average woman of 48.” -H. L. Mencken.

I’d say that about sums up lurking variables and weighted averages — wouldn’t you?


