“The world’s entire scientific … heritage … is increasingly being digitized and locked up by a handful of private corporations… The Open Access Movement has fought valiantly to ensure that scientists do not sign their copyrights away but instead ensure their work is published on the Internet, under terms that allow anyone to access it.” – Aaron Swartz
Information, in the context of scholarly articles produced by researchers at universities and think-tanks, is not a zero-sum game. In other words, one person cannot have more without someone else having less. When we start erecting “Berlin walls” in the information arena within the halls of learning, learning itself is compromised. In fact, contributing or granting the intellectual estate to the creative commons serves a higher purpose in society – access to information and, hence, a feedback mechanism that ultimately enhances the value of the end product itself. How? Because the product is now distributed across a broader and more diverse audience, and it is open to further critical analysis.
The universities have built a racket. They have erected a Chinese wall between learning in a cloistered environment and the world outside that is not an immediate participant. The Guardian wrote an interesting article on this matter, and a very apt quote puts it all together.
“Academics not only provide the raw material, but also do the graft of the editing. What’s more, they typically do so without extra pay or even recognition – thanks to blind peer review. The publishers then bill the universities, to the tune of 10% of their block grants, for the privilege of accessing the fruits of their researchers’ toil. The individual academic is denied any hope of reaching an audience beyond university walls, and can even be barred from looking over their own published paper if their university does not stump up for the particular subscription in question.
This extraordinary racket is, at root, about the bewitching power of high-brow brands. Journals that published great research in the past are assumed to publish it still, and – to an extent – this expectation fulfils itself. To climb the career ladder academics must get into big-name publications, where their work will get cited more and be deemed to have more value in the philistine research evaluations which determine the flow of public funds. Thus they keep submitting to these pricey but mightily glorified magazines, and the system rolls on.”
JSTOR is a not-for-profit organization that has invested heavily in providing an online system for archiving, accessing, and searching digitized copies of over 1,000 academic journals. More recently, I noticed some effort on their part to allow public access to only 3 articles over a period of 21 days. This stinks! The policy reflects an intellectual snobbery of Himalayan proportions. The only people with access to these academic journals and studies are professors, researchers affiliated with a university, and university libraries. Aaron Swartz noted the injustice of hoarding such knowledge and tried to distribute a significant proportion of JSTOR’s archive through one or more file-sharing sites. What happened thereafter was perhaps one of the biggest misapplications of justice. The same justice that disallows asymmetry of information on Wall Street is being deployed to preserve the asymmetry of information in the halls of learning.
MSNBC contributor Chris Hayes criticized the prosecutors, saying “at the time of his death Aaron was being prosecuted by the federal government and threatened with up to 35 years in prison and $1 million in fines for the crime of—and I’m not exaggerating here—downloading too many free articles from the online database of scholarly work JSTOR.”
The Associated Press reported that Swartz’s case “highlights society’s uncertain, evolving view of how to treat people who break into computer systems and share data not to enrich themselves, but to make it available to others.”
Chris Soghoian, a technologist and policy analyst with the ACLU, said, “Existing laws don’t recognize the distinction between two types of computer crimes: malicious crimes committed for profit, such as the large-scale theft of bank data or corporate secrets; and cases where hackers break into systems to prove their skillfulness or spread information that they think should be available to the public.”
Kelly Caine, a professor at Clemson University who studies people’s attitudes toward technology and privacy, said Swartz “was doing this not to hurt anybody, not for personal gain, but because he believed that information should be free and open, and he felt it would help a lot of people.”
And then there were some modest reservations, and Swartz’s actions were attributed to reckless judgment. I contend that this does injustice to someone of Swartz’s commitment and intellect … the recklessness was his inability to grasp that an imbecile in the system would pursue 35 years of imprisonment and a $1M fine … it was not that he was unaware of what he was doing, but that he believed, as do many, that scholarly academic research should be freely available to all.
We have a Berlin wall that needs to be taken down. Swartz started that work, but he was unable to see it through. It is important not to rest in this endeavor; everyone ought to actively petition their local congressman to push bills that will allow open access to these academic articles.
John Maynard Keynes warned of the folly of “shutting off the sun and the stars because they do not pay a dividend”, and what is at stake here is the reach of the light of learning. Aaron was at the vanguard of that movement, and we should persevere to become the points of light that will compel JSTOR to disseminate the information it guards so jealously.
Facebook began with a simple thesis: Connect Friends. That was the sine qua non of its existence. From a simple thesis to an effective UI design, Facebook has grown over the years to become the third-largest community in the world. But over the last few years it has had to resort to generating revenue to meet shareholder expectations. Today it is noon at Facebook, but a long shadow of darkness has, I posit, fallen upon perhaps one of the most influential companies in history.
The fact is that leaping from connecting friends to managing the conversations allows Facebook to create a petri dish for understanding social interactions at large scale, eased by its fine technology platform. To that end, it is moving into alternative distribution channels to create broader reach into a global audience and to gather deeper insights into the interaction templates of the participants. The possibilities are immense: this platform can be a collaborative beachhead into discovery, exploration, learning, education, and social and environmental awareness, and can ultimately contribute to an elevated human conscience. But it has faltered – perhaps the shareholders and the analysts are much to blame, given the newfangled market demands – and it has become one global billboard for advertisers to promote their brands. Darkness at noon is the most appropriate metaphor for Facebook as it is now.
Let us take a small turn and briefly look at some other very influential companies that have not been derailed as much as Facebook: Twitter, Google and LinkedIn. Each is the leader in its category, and all of them have moved toward monetization schemes built on their specific user bases. Each has weighed in significantly in its respective category to create movements that have affected, or will affect, the course of the future. We all know how Twitter has contributed super-fast global news feeds that have spontaneously generated mass coalescence around issues that make a difference; Google has been an effective tool allowing the average person to access information; and LinkedIn has created a collaborative environment in the professional space. Thus, all three of these companies, despite fully feeding their appetite for revenue through advertising, have not compromised their quintessence of being. Any of these companies could certainly move their artillery to encompass Facebook’s trajectory, but that would be a steep hill to climb. Furthermore, these companies have an aura associated with their categories: attempts to move out of their category have been feeble at best and, in some instances, unsuccessful. Facebook has a phenomenal chance of putting together what it has to create a communion of knowledge and wisdom. And no company in the market is better suited to do that at this point.
One could counter that Facebook sticks to its original vision and that what we have today is indeed what Facebook had planned all along. I don’t disagree. My point of contention is that though Facebook has created this informal and awesome platform for conversations and communities among friends, it has glossed over the immense positive fallout that could occur as a result of these interactions: the development and enhancement of knowledge, collaboration, cultural play, a diversity of thought, philanthropy, crowdsourcing scientific and artistic breakthroughs, and so on. In other words, the objective has been met for the most part. Thank you, Mark! Now Facebook needs to usher in a renaissance in the courtyard. It needs to find a way out of the advertising morass that has cast darkness over all the product extensions and launches of the last two years: Facebook can force a point of inflection and quadruple its impact on the course of history and knowledge. And the revenue will follow!
“Creativity is just connecting things. When you ask creative people how they did something, they feel a little guilty because they didn’t really do it, they just saw something. It seemed obvious to them after a while. That’s because they were able to connect experiences they’ve had and synthesize new things. And the reason they were able to do that was that they’ve had more experiences or they have thought more about their experiences than other people.”
– Steve Jobs
What is the Medici Effect?
Frans Johansson has written a lovely book on the Medici Effect. The term “Medici” refers to the Medici family of Florence, which made immense contributions to art, architecture and literature. The family was pivotal in catalyzing the Renaissance, and some of the great artists and scientists we revere today – Donatello, Michelangelo, Leonardo da Vinci, and Galileo – were commissioned by the family.
The Renaissance was a resurgence of the old Athenian democracy. It merged the distinctive areas of humanism, philosophy, science, art and literature into a unified body of knowledge that would advance the cause of human civilization. What the Medici Effect speaks to is the outcome of creating a system that incorporates what may at first glance seem distinctive and discrete disciplines into holistic outcomes – a shared simmering of wisdom from which new disciplines, thoughts and implementations emerge.
Supporting the organization to harness the power of the Medici Effect
We are past the industrial era, the Progressive era and the Information era. There are no formative lines that truly distinguish one era from another, but our knowledge has progressed along gray lines that have pushed the limits of human knowledge. We are now wallowing in a crucible wherein distinct disciplines have crisscrossed and merged together. The key thesis in the Medici effect is that the intersections of these distinctive disciplines enable the birth of new breakthrough ideas and leapfrog innovation.
So how do we introduce the Medici Effect in organizations?
Some of the key ways to implement the model are really to provide the support infrastructure for:
1. Connections: Our brains are naturally wired for association. We try to associate a concept with the contextual elements around it to give the concept more meaning, and we learn by connecting concepts, for the most part, to elements we are already conversant in. However, associations can be created within narrow parameters, constrained by the semantic models we have built; organizations can thus channel connections by imposing narrow parameters. On the other hand, connections can be far more free-form, meaning the connector thinks beyond the immediate boundaries of their own domain or of domains that are “pre-ordained”. In that case, we create what is commonly known as divergent thinking: we cull elements from seemingly different areas but thread them around some core to generate new approaches, new metaphors, and new models. Ensuring that employees are able to safely reach out to other nodes of possibility is the primary implementation step for generating the Medici Effect.
2. Collaborations: Connecting different streams of thought across disciplines is a primary and formative step. To advance it further, organizations need to provide additional systems wherein people can collaborate among themselves. In fact, collaboration accentuates the final outcome and brings it about sooner. Enabling connections and collaboration thus work in sync to create what I would call the network impact on a marketplace of ideas.
3. Learning Organization: Organizations need to continuously add fuel to the ecosystem. In other words, they need to bring in speakers, encourage and invest in training programs, allow for exploration by setting aside an internal budget for that purpose, and give people some time and freedom to mull over ideas. This enriches collaboration within a context of diverse learning.
4. Encourage Cultural Diversity: Finally, organizations have to invest in cultural diversity. People from different cultures bring varied viewpoints and information, and view issues from different perspectives. Given that we are more globalized now, an innate understanding of, and immersion in, cultural experience enhances the Medici Effect. It also situates innovation and ground-breaking thought within a broader scope of compassion, humanism, and social and shared responsibility.
Implementing systems that encourage the Medici Effect will enable organizations to break out of legacy behavior and venture into uncharted territories. The charter toward unknown but exciting possibilities opens the gateway for amazing ideas that engage employees and enable them to beat a path to the intersection of new ideas.
We are entering a new age in which we are interested in a finer understanding of the relationships between businesses and customers, organizations and employees, products and how they are used, and how different aspects of the business and the organization connect to produce meaningful, actionable and relevant information. We are seeing a lot of data, and the old tools for managing, processing and gathering insights from it – spreadsheets, SQL databases, etc. – are not scalable to current needs. Thus, Big Data is becoming a framework for approaching how to process, store and cope with the reams of data being collected.
According to IDC, it is imperative that organizations and IT leaders focus on the ever-increasing volume, variety and velocity of information that forms big data.
- Volume. Many factors contribute to the increase in data volume – transaction-based data stored through the years, text data constantly streaming in from social media, increasing amounts of sensor data being collected, etc. In the past, excessive data volume created a storage issue. But with today’s decreasing storage costs, other issues emerge, including how to determine relevance amidst the large volumes of data and how to create value from data that is relevant.
- Variety. Data today comes in all types of formats – from traditional databases to hierarchical data stores created by end users and OLAP systems, to text documents, email, meter-collected data, video, audio, stock ticker data and financial transactions. By some estimates, 80 percent of an organization’s data is not numeric! But it still must be included in analyses and decision making.
- Velocity. According to Gartner, velocity “means both how fast data is being produced and how fast the data must be processed to meet demand.” RFID tags and smart metering are driving an increasing need to deal with torrents of data in near-real time. Reacting quickly enough to deal with velocity is a challenge to most organizations.
SAS has added two additional dimensions:
- Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something big trending in the social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage – especially with social media involved.
- Complexity. When you deal with huge volumes of data, it comes from multiple sources. It is quite an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control. Data governance can help you determine how disparate data relates to common definitions and how to systematically integrate structured and unstructured data assets to produce high-quality information that is useful, appropriate and up-to-date.
So, to reiterate, Big Data is a framework stemming from the realization that data has gathered significant pace and that its growth has exceeded an organization’s capacity to handle, store and analyze it in a manner that offers meaningful insights into the relationships between data points. I call this a framework, unlike other materials that call Big Data a consequence of organizations’ inability to handle mass amounts of data. I refer to Big Data as a framework because it sets the parameters around an organization’s decision as to when, and which, tools must be deployed to address data-scalability issues.
Thus, to put the appropriate parameters around when an organization must consider Big Data as part of its analytics roadmap in order to better understand the patterns in its data, it has to answer the following ten questions:
- What are the different types of data that should be gathered?
- What are the mechanisms that have to be deployed to gather the relevant data?
- How should the data be processed, transformed and stored?
- How do we ensure that there is no single point of failure in data storage and data loss that may compromise data integrity?
- What are the models that have to be used to analyze the data?
- How are the findings of the data to be distributed to relevant parties?
- How do we assure the security of the data that will be distributed?
- What mechanisms do we create to implement feedback against the data to preserve data integrity?
- How do we morph the big data model into new forms that accounts for new patterns to reflect what is meaningful and actionable?
- How do we create a learning path for the big data model framework?
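The ten questions above can be treated as a working checklist. Purely as an illustrative sketch (the function and names below are my own, not from any standard tool), here is how an organization might track which questions remain unanswered before committing to a Big Data roadmap:

```python
# Hypothetical readiness checklist encoding the ten questions from the text.
BIG_DATA_READINESS_QUESTIONS = [
    "What types of data should be gathered?",
    "What mechanisms will gather the relevant data?",
    "How should the data be processed, transformed and stored?",
    "How do we avoid single points of failure and data loss?",
    "What models will be used to analyze the data?",
    "How are findings distributed to relevant parties?",
    "How do we secure the data that is distributed?",
    "What feedback mechanisms preserve data integrity?",
    "How does the model evolve to reflect new patterns?",
    "How do we create a learning path for the framework?",
]

def readiness_gaps(answers):
    """Return the questions that still lack an answer."""
    return [q for q in BIG_DATA_READINESS_QUESTIONS if not answers.get(q)]

# With no answers recorded, all ten questions remain open.
print(len(readiness_gaps({})))  # prints 10
```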
Some of the existing literature has commingled the Big Data framework with analytics. In fact, the literature has gone on to make a rather assertive claim: that Big Data and predictive analytics should be looked upon in the same vein. Nothing could be further from the truth!
There are several tools on the market for doing predictive analytics against data sets that may not qualify for the Big Data framework. While I was the CFO at Atari, we deployed business intelligence tools using MicroStrategy, which had predictive modules. More recently, we explored SAS and Minitab for predictive analytics. In fact, even Excel can do multivariate analysis, ANOVA, regression and best-curve-fit analysis. These techniques have been part of the analytics arsenal for a long time. Different data sizes may need different tools to instantiate relevant predictive analysis. This is a very important point, because companies that do not have Big Data ought to seriously reconsider their strategy of which tools and frameworks to use to gather insights. I have known companies to go the Big Data route even though all data points (excuse the pun), after incorporating capacity and forecasts, suggest that alternative tools are more cost-effective than implementing Big Data solutions. Big Data is not a one-size-fits-all model; it is an expensive implementation. However, for the right data size – in this case, very large – a Big Data implementation would be extremely beneficial and cost-effective in terms of total cost of ownership.
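To make the point concrete, here is a minimal sketch of the kind of predictive exercise that needs no Big Data infrastructure at all: an ordinary least-squares trend fit on a small monthly sales series. The numbers are invented for illustration.

```python
import statistics

# Hypothetical monthly revenue figures (made-up data).
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 108.0, 115.0, 121.0, 130.0, 138.0]

# Ordinary least squares for y = a + b*x, computed by hand.
mx, my = statistics.mean(months), statistics.mean(sales)
b = sum((x - mx) * (y - my) for x, y in zip(months, sales)) / \
    sum((x - mx) ** 2 for x in months)
a = my - b * mx

# Forecast month 7 by extrapolating the fitted trend line.
forecast_month_7 = a + b * 7
print(round(b, 2), round(forecast_month_7, 1))  # prints: 7.49 144.9
```

The same fit is a few clicks in Excel or Minitab; data volume, not technique, is what should drive the tooling decision.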
Areas where Big Data Framework can be applied!
Some areas lend themselves to the application of the Big Data Framework. I have identified broadly four key areas:
- Marketing and Sales: Consumer behavior, marketing campaigns, sales pipelines, conversions, marketing funnels and drop-offs, distribution channels are all areas where Big Data can be applied to gather deeper insights.
- Human Resources: Employee engagement, employee hiring, employee retention, organization knowledge base, impact of cross-functional training, reviews, compensation plans are elements that Big Data can surface. After all, generally over 60% of company resources are invested in HR.
- Production and Operational Environments: Data growth, different types of data appended as the business learns about the consumer, concurrent usage patterns, traffic, web analytics are prime examples.
- Financial Planning and Business Operational Analytics: Predictive analytics around bottom-up sales, marketing campaign ROI, customer acquisition costs, earned media and paid media, margins by SKU and distribution channel, operational expenses, portfolio evaluation, risk analysis, etc., are some of the examples in this category.
Hadoop: A Small Note!
Hadoop is becoming a more widely accepted tool for addressing Big Data needs. It was not invented by Google, though it traces back to Google’s work: Google published papers on MapReduce and the Google File System, which it had built to index the structured and text information it was collecting and to present meaningful, actionable results to users quickly. Doug Cutting and Mike Cafarella implemented those ideas in the open-source Hadoop project, and Yahoo then invested heavily in it, tweaking Hadoop for enterprise applications.
Hadoop runs on a large number of machines that don’t share memory or disks, with the Hadoop software running on each of them. Thus if you have, for example, over 10 gigabytes of data, you take that data and spread it across different machines, and Hadoop keeps track of where all of this data resides. The individual servers or machines are called nodes, and a group of nodes working together over a common body of data is called a cluster. Each node operates on its own little piece of the data, and once the pieces are processed, the results are combined and delivered back to the client as a unified whole. This method of mapping work out across nodes and then reducing the partial results into one unified answer is MapReduce, an important mechanism of Hadoop. You will also hear of Hive, which is essentially a data warehouse layer on top of Hadoop: it lets you run SQL-like queries over data stored across the cluster, which Hadoop executes as MapReduce jobs while handling redundancy across the nodes.
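The map/shuffle/reduce flow described above can be sketched in a few lines. This is a toy, in-process simulation of the classic word-count example – real Hadoop distributes each phase across a cluster of nodes, whereas here every phase runs locally so the data flow is easy to follow:

```python
from collections import defaultdict

# Each string stands in for the slice of data held by one node.
documents = ["big data big ideas", "data across many nodes"]

# Map phase: each "node" emits (word, 1) pairs for its slice of the data.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle phase: group values by key, as Hadoop does between map and reduce.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: collapse each group to a single result, then unify the output.
word_counts = {word: sum(counts) for word, counts in groups.items()}
print(word_counts)
# prints: {'big': 2, 'data': 2, 'ideas': 1, 'across': 1, 'many': 1, 'nodes': 1}
```

In Hive, this same aggregation would be a one-line SQL-style `GROUP BY` query that the engine turns into a job like the one above.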
Personally, I have always been interested in Business Intelligence. I have always considered BI as a stepping stone, in the new age, to be a handy tool to truly understand a business and develop financial and operational models that are fairly close to the trending insights that the data generates. So my ear is always to the ground as I follow the developments in this area … and though I have not implemented a Big Data solution, I have always been and will continue to be interested in seeing its applications in certain contexts and against the various use cases in organizations.
Innovation is happening at a rapid pace. Organizations are being pummeled with new pieces of information, internal and external, forcing pivots to accommodate changing customer needs and to incorporate emerging technologies and solutions. Yet traditional organizational structures have been fairly internally focused. Ronald Coase, in his famous paper The Nature of the Firm (1937), argued that organizations emerge to arrest the transaction costs of managing multiple contracts with multiple service providers; the organization represents an efficient organizational unit given all the other possible alternatives for coexisting in an industrial ecosystem.