Cloudera Research Report

Driving Business Transformation

Enterprises of all sizes are looking for new, flexible, and affordable data management solutions to harness new and existing data. Low upfront hardware costs and scalability are attributes atop all shopping lists. The Apache Hadoop open source software framework has emerged as the platform of choice. Cloudera, a pioneer in the Apache Hadoop space, was the first company to develop and commercialize enterprise-grade solutions built on this open source technology. We are initiating coverage with a positive view on the outlook and a fair valuation at the high end of the valuation range of $2.8 billion to $4.7 billion. Open source and distributed solutions are neither new nor few. The experience, reputation and large client traction at Cloudera position this firm competitively for the near-term future.

METHODOLOGY

Our views on Cloudera are derived from our research and from proprietary channel checks with users, competitors, and experts in the technology space.

KEY POINTS

  • Databases are Evolving. Enterprises are hungry for data management solutions that are more scalable, flexible, and capable of handling Big Data. Technological innovations combined with falling prices of infrastructure elements have set the stage for innovation in the consumption, building, storage and delivery of data. Databases are evolving to meet the needs of enterprises.
  • Hadoop Software Framework Driving Transformation. Apache Hadoop is an open source software framework that enables distributed processing of large data sets across clusters of commodity servers. The software allows enterprises to cost-effectively store Big Data and solve business problems through advanced analytics that can be enabled by accessing large data volumes and new data sources. Distribution of tasks to lower-cost, in-place servers allows for rapid scaling and limited upfront commitment of resources. These are the two attributes most commonly sought in today’s market.
  • Cloudera Pioneering Hadoop-based Data-Driven Enterprise Transformations. Nearly every industry has large, emerging data needs that drive interest in Hadoop offerings. Cloudera is the biggest pure-play Hadoop vendor, ahead of its closest peers Hortonworks and MapR. The company’s hybrid business model allows it to service enterprises of all sizes. Additionally, the company has a diverse base of partners. These two factors combined differentiate Cloudera from its competitors.
  • Growing Addressable & Attainable Markets. Gartner estimates the market for data management infrastructure (database management systems including data warehousing, storage management, BI, ECM and data integration, and related systems) at $74 billion in 2014, growing to $94 billion by 2017. Hadoop-related products and services are expected to grow from $4.2 billion in 2015 to $50 billion in 2020.
  • Growth and Revenue, Not Margin. The company is growing top line at a robust pace. We have modeled $107 million in revenues in 2014 (+88% Y/Y), rising to $199 million in 2015 (+86% Y/Y) and $329 million in 2016 (+65% Y/Y). The key drivers include robust customer growth from a base of 525 in 2014. We don’t expect operating breakeven until 2018.

Valuation on the High End of $2.8 – $4.7 Billion. Based on our proprietary valuation methodology, Cloudera should be valued on the high end of its valuation range of $2.8 billion to $4.7 billion. This is a premium to Hortonworks’ valuation of $743 million and MapR’s last private round valuation of $488 million, and above Cloudera’s last private funding round valuation of $4.3 billion.

Executive Summary

The explosive growth of data from multiple sources and its increasing complexity underpin the Big Data phenomenon. Mobile computing, sensor-equipped smart devices, and social media interactions define our lives and contribute exponential increases to data volumes. Sector after sector and firm after firm are embracing the need and power to harness, store, process and draw insight from this data. This creates the opportunity that Cloudera’s proprietary Hadoop solutions are deployed to address.

The volume of data produced globally doubles every two years, according to IDC. In the year 2013, for instance, 4.4 zettabytes (ZB) of data were generated; that number is expected to grow to 44 ZB by 2020, a seven-year CAGR of 39%, according to IDC. Twitter alone generates more than 8 terabytes (TB) of data every day, Facebook 25 TB, and some companies generate terabytes of data every hour of every day of the year. [1 trillion bytes = 1 terabyte; 1000 terabytes = 1 petabyte; 1 million petabytes = 1 zettabyte].

Data sets have become increasingly complex

The quantity and qualities of today’s data drive interest in core product offerings. Data sets have become increasingly complex. Today, roughly 95% of data generated is unstructured, driven by humans and machines – a stark contrast to the data generated from the traditional enterprise business applications (e.g., ERP, CRM, SCM). More importantly, most firms are only analyzing 12% of the data they have, leaving 88% untouched. The key drivers of this change have been a combination of machine data, mobile traffic growth, and social media. Traditional relational databases and IT infrastructure, which were designed for structured data, cannot process the enormous volume and complexity of today’s data. Mountains of data pile up and are missed. Platinum needles hide in these haystacks. Competitive advantage requires a growing number of firms and sectors to pay attention.

New and innovative database technologies are transforming the ability to collect and process data. The objectives are not new (e.g., data warehousing, mining, analytics); they have been around for a long time. What has changed is the technology that is driving this adoption. Open-source software frameworks like Hadoop, and technologies like in-memory databases and NoSQL, have made infrastructure elements affordable without sacrificing their effectiveness.

Hadoop, in particular, has become synonymous with Big Data, given its growing ecosystem and strong validation from marquee users such as Facebook, Google, LinkedIn, and Yahoo. Demand for Hadoop should continue to increase as enterprises demand data management solutions that are more scalable, flexible, and capable of handling Big Data than traditional database management systems (DBMS). More importantly, unlike the Linux market, which took share from UNIX, the Hadoop workload is additive and even complementary to legacy workloads as organizations discover new use cases for data management.

Growing Market Opportunity

Hadoop market opportunity is compelling

The market for Big Data technology and services is growing at a robust pace, as enterprises become more data-centric. Gartner estimates the market for data management infrastructure (database management systems including data warehousing, storage management, BI, ECM and data integration, and related systems) at $74 billion in 2014, growing to $94 billion by 2017. According to IDC, the Big Data technology and services market is expected to grow to $41.5 billion through 2018, a 5-year CAGR of 26.4%, or about six times the growth rate of the overall information technology market. Furthermore, IDC expects the all-encompassing Big Data market (which includes all NoSQL, Hadoop, machine learning software, services, and hardware) to grow at a 21% CAGR, to $100 billion in 2020E from $38 billion in 2015E.

Looking deeper, the Hadoop market opportunity is very compelling. According to Allied Market Research, demand for Hadoop software and total Hadoop-related products and services are expected to grow from $0.9 billion and $4.2 billion, respectively, in 2015, to $10 billion and $50 billion, respectively in 2020. This translates to an impressive 5-year compounded annual growth rate of 62% for Hadoop software and 64% for Hadoop-related products and services.
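The growth rates quoted above follow from the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The short Python sketch below recomputes them from the endpoints given in the text; the function name `cagr` and the variable names are ours, chosen for illustration only.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1.0 / years) - 1.0


# Endpoints quoted in the text, in $ billions (2015 -> 2020, five years).
hadoop_software = cagr(0.9, 10.0, 5)      # Allied Market Research, Hadoop software
hadoop_related = cagr(4.2, 50.0, 5)       # Allied Market Research, Hadoop-related total
big_data_market = cagr(38.0, 100.0, 5)    # IDC, all-encompassing Big Data market

for label, rate in [
    ("Hadoop software", hadoop_software),
    ("Hadoop-related products and services", hadoop_related),
    ("All-encompassing Big Data market", big_data_market),
]:
    print(f"{label}: {rate:.0%}")
```

Rounded to whole percentages, the three results agree with the 62%, 64%, and 21% figures cited in the text.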

Cloudera – A Pioneer in Hadoop Space

Cloudera was founded in 2008 by engineers Christophe Bisciglia, Amr Awadallah, and Jeff Hammerbacher together with former Oracle executive Mike Olson. Doug Cutting, who joined Cloudera in 2009 as the Chief Architect, founded the open source Apache Hadoop project and was a key member of the team that built and deployed a production Hadoop storage and analysis cluster for mission-critical business analytics. Cloudera provides Big Data analytics leveraging the power of Hadoop. The company released its first product in 2009 and is recognized as the leading innovator in, and the largest contributor to, the Hadoop software community. Among competitors and clients, Cloudera is recognized as the standard in enterprise Hadoop solution systems.

Bullish on Cloudera’s Growth Outlook

Intel investment into Cloudera is a win-win for both

Our positive investment thesis centers on Cloudera’s strong competitive position in a growing addressable market with rapid growth in attainable market share. The company has leveraged its first-to-market position in commercial Hadoop software distribution to build a strong and diverse stable of customers and partners. Furthermore, Intel’s strategic investment in Cloudera for an 18% equity stake should provide pole position for Cloudera in the enterprise database market.

Intel Investment & Partnership – A Win-Win

We believe Intel’s investment into Cloudera is a win-win for both companies. Intel commands roughly 95% market share of the chipsets sold into data centers. One of the themes of Brian Krzanich since becoming CEO in 2013 has been to grow Intel’s sales through non-PC areas, with data center sales as the biggest initial driver of growth. The company’s Data Center Group profits are now roughly 65% of earnings, and growth areas include cloud, Big Data analytics, and phone industry servers, among others.

Accordingly, Intel has a vested interest in seeing enterprise adoption of Hadoop continue to accelerate at the current pace. The open source batch processing framework is built from the ground up to run on low-cost servers that incorporate its x86 chip architecture. And the explosive growth of unstructured data is only driving more demand for commodity hardware, with many production clusters already numbering in the thousands of nodes. Added up, that presents a compelling market opportunity for Intel and its partners.

The investment in Cloudera, therefore, is about much more than just equity. In fact, it was announced as part of a “broad strategic technology and business collaboration” between the two companies meant to, among other things, further cement the dominant place of the chip maker’s Xeon server processor family in the data center amid increased competition from ARM.

The partnership is just as significant for Cloudera. While it is not a complete reseller agreement, Intel has discontinued the development of its in-house Hadoop platform and started promoting Cloudera’s platform instead, with the hope that it spurs adoption among traditional enterprises, both in the domestic and international markets.

First Unified Platform for Enterprises

Cloudera is transforming enterprise data management

Enterprises are looking for fundamentally new data management solutions to harness existing and new data types. New demands and opportunities are emerging fast. Budgets are tight, and experienced personnel are expensive and hard to bring in-house. Cloudera is transforming enterprise data management by offering the first unified platform for Big Data, an enterprise data hub built on Apache Hadoop. The company offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data.

Cloudera’s open source Big Data platform is the most widely adopted in the world, and Cloudera is a prolific contributor to the open source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 30,000 individuals worldwide. Furthermore, over 1,450 partners combined with the company’s professional services and support teams provide proactive and predictive support to run an enterprise data hub and ensure its performance. Leading organizations across multiple industry verticals plus premier public sector organizations globally run Cloudera in production.

Strong Top Line Growth & Premium Valuation to Peers

The company’s revenue growth has been impressive. We believe the company exited 2014 with $107 million in revenues, or 88% year/year growth. We are forecasting 86% revenue growth in 2015, to $199 million, and 65% in 2016, to $329 million. At the current run rate, we expect operating profit in 2018 at the earliest.

We are using an arc approach, as opposed to a point approach, to value Cloudera. The fast growth and recently elevated market multiples are considered in our higher valuation, creating a robust range of estimated values.

Cloudera is relatively early-stage in its profit potential. We believe that a comparative valuation methodology based on the Enterprise Value-to-Revenue (EV/Rev) multiples of publicly traded peer companies is appropriate. We approached the comparative valuation from two angles: (1) Deriving an implied valuation based on unadjusted EV/Rev multiple of the peer group; (2) Deriving an implied valuation based on a growth-adjusted EV/Rev multiple of the peer group.

Accordingly, the unadjusted EV/Revenue multiple indicates a valuation of $2.84 billion, or $20.45 per share, and the growth adjusted multiple indicates a value of $4.73 billion, or $34.03 per share. Given Cloudera’s strong competitive position in the Hadoop market and Intel’s last funding round at a $4.3 billion post-money valuation in 2014, we believe Cloudera should be valued closer to the growth adjusted value of $4.73 billion, or $34.03 per share.

Prominent Private Companies in Big Data

The Big Data landscape is populated with a number of legacy software companies and a number of promising, emerging and still private companies. We believe the established vendors, with advanced integration and analytics offerings, will benefit from the demand for new Big Data-related architectures and tools, and maintain a strong presence in the landscape. At the same time, specialized companies with innovative and disruptive technologies will stake a claim in the competitive arena. We have highlighted these private companies in the later part of the report. Larger and legacy companies have been devoting a large and growing portion of their considerable cash hoards to investing in and purchasing emerging companies. We see significant reason to expect this pattern to intensify in the quarters ahead.

Investment Thesis

We are bullish on Cloudera’s growth prospects

We are bullish on Cloudera’s growth outlook given the prominence of Hadoop and Cloudera’s strong competitive position in the Big Data market powered by Hadoop. Leveraging its early lead in the market – ahead of Hortonworks by three years – the company has built a broad customer base across multiple verticals, on the back of a differentiated hybrid product strategy. Additionally, Cloudera’s active leadership in the open source community has positioned it well on the cutting edge of Hadoop developments. We like the company’s growth trajectory, with revenues growing at a strong plus-80% run rate. We see Cloudera continuing as a scalable and affordable force as Big Data becomes smart data. Storage and analysis of sensor, Internet of Things, mobile and social data will grow rapidly for the rest of this decade.

Two risk factors must be noted. First, any slowdown in Hadoop adoption due to a competing software framework emerging is a significant risk factor. While we give a low probability to this happening in the near term, the pace of innovation in software has been very brisk and the risk of obsolescence is always substantial. Second, Cloudera’s hybrid strategy is a double-edged sword: on one hand the company enjoys higher margins due to proprietary elements; but on the other it will shut out potential customers who want to move away from getting locked in with one vendor – something the major commercial software companies do. Having said that, we believe the risk of that happening is low at this point given that the Hadoop market is still fragmented and the framework is still evolving.

The following section elaborates on the key points of our investment thesis.

Investment Positives

Differentiated Offering

Enterprises are adopting and building out their Big Data strategies. They are overwhelmed and unable to manage cluttered, complex and confusing data management infrastructures. They want to be able to manage multiple types of workloads and application types on a central architecture. Cloudera is well positioned for this demand with its differentiated product offering.

Cloudera has adopted a hybrid product strategy

Cloudera has adopted a hybrid product strategy that appeals to a broad range of enterprises across multiple verticals. Using open source Hadoop at the core, Cloudera wraps proprietary software around it, including Cloudera Enterprise (an HDFS) and Cloudera Manager (Cluster Management). Compared to its closest open source peers – Hortonworks and MapR – Cloudera takes a hybrid approach, contributing to the Apache Hadoop project with Hive and HBase and selling Manager, Navigator, and Director products to enterprises.

Cloudera’s Hadoop distribution, called CDH (“Cloudera’s Distribution Including Apache Hadoop”), is 100% Apache Hadoop. The company packages CDH with varying levels of its own proprietary Hadoop management software into two forms of its platform. Cloudera Standard is made up of CDH plus a limited version of Cloudera’s proprietary management software and is available for free download. Cloudera Enterprise is made up of CDH plus the full version of Cloudera’s proprietary management software and requires an annual for-pay subscription.

Additionally, the company has been steadily developing add-on modules to its Hadoop distribution intended to extend the functionality of its platform, which it is marketing as an Enterprise Data Hub. These add-on modules include Impala for SQL-like interactive queries, Search for Google-like search functionality, and Navigator for data lineage, access management and auditing capabilities. Some of these add-on modules are open source and free to use (Impala and Search), but support from Cloudera requires the purchase of additional subscriptions.

In essence, Cloudera, unlike Hortonworks (the closest peer), has differentiated itself by making its Hadoop-based platform a comprehensive proprietary data management and analytics platform. Accordingly, its model requires customers to pay for its fully functional platform and related services, as well as for-pay support subscriptions for add-ons such as Impala, giving the company more diverse and higher margin revenue streams.

Leader in Open Source Software Development

Cloudera has more committer seats across Hadoop ecosystem projects

Cloudera was the first commercial Hadoop vendor to hit the market, roughly three years ahead of Hortonworks, its closest peer. As the market share leader in enterprise Hadoop, Cloudera is regarded by many industry experts as the benchmark for Hadoop adoption and commercial success. A number of reasons underlie the company’s dominant position. Key among them is its leadership role in open source software development.

Cloudera has more committer seats across all the Hadoop ecosystem projects than any other vendor, and continues to attract the industry’s leading open source developers. In 2014, for instance, the open source community focused especially hard on security and enabling real-time, interactive and streaming applications to function elegantly with Hadoop, such that enterprises can execute multi-workload, multi-application analytics seamlessly and securely. Cloudera contributed to these advancements and is dedicated to continued innovation and leadership in Hadoop open source software development, working with the Apache Software Foundation to foster the Hadoop ecosystem.

Driving the importance of security, Cloudera continued to innovate on Apache Sentry, and through its acquisition of Gazzang and its partnership with Intel, led the optimization of Hadoop for chip-level encryption, making it possible for all data in Hadoop to be encrypted with very little performance degradation. This innovation enabled MasterCard to meet Payment Card Industry (PCI) Data Security Standards and achieve PCI compliance on Cloudera’s enterprise data hub, the first Hadoop installation to earn this certification.

Similarly, Cloudera led the introduction of a new version of Impala, a 100% open source interactive SQL engine that allows end users to query data stored in Hadoop using industry standard applications. As an example, Cerner, a leader in healthcare IT, recently implemented an EDH to create a holistic understanding of the healthcare system to improve patient outcomes. Cerner indicated that insight from its Hadoop-powered Big Data initiatives had saved hundreds of lives.

Since 2008, Cloudera has founded 20 Hadoop ecosystem projects and plays a large ongoing role in many others, including Spark and YARN.

Growing Addressable Market Opportunity, Growing Attainable Market

Cloudera has a global presence across geographies and verticals

The market for Big Data technology and services is growing at a robust pace, as enterprise decision making has become increasingly data-centric. Gartner estimates the market for data management infrastructure (database management systems including data warehousing, storage management, BI, ECM and data integration, and related systems) at $74 billion in 2014, growing to $94 billion by 2017. According to IDC, the Big Data technology and services market is expected to grow to $41.5 billion through 2018, a 5-year CAGR of 26.4%, or about six times the growth rate of the overall information technology market. Furthermore, IDC expects the all-encompassing Big Data market (which includes all NoSQL, Hadoop, machine learning software, services, and hardware) to grow at a 21% CAGR, to $100 billion in 2020E from $38 billion in 2015E.

Looking deeper, the Hadoop market opportunity is very compelling. According to Allied Market Research, demand for Hadoop software and total Hadoop-related products and services are expected to grow from $0.9 billion and $4.2 billion, respectively, in 2015, to $10 billion and $50 billion, respectively in 2020. This translates to an impressive 5-year compounded annual growth rate of 62% for Hadoop software and 64% for Hadoop-related products and services.

We see Cloudera as poised to take a rising share of a fast-growing market. The low upfront cost and inbuilt scalability of Cloudera offerings are creating a very rich client pipeline. As results and competitive pressures mount to harness and utilize vast troves of data, Hadoop emerges as the most likely framework. As firms, agencies, and institutions adopt Hadoop frameworks, Cloudera will likely be a first stop. The size, prestige and client list that Cloudera boasts put it in an advantageous position to outgrow even its fast-growing space.

Robust Revenue Growth

Cloudera has a global presence across geographies and verticals. Roughly 550 customers pay a recurring subscription fee for the software and support services, and an additional, albeit smaller, fee for professional services. We believe subscription software and support accounted for 65-70% of total revenue, and professional services accounted for the rest.

Accordingly, based on the revenue drivers in our model, we believe Cloudera exited 2014 with $107 million in revenue, an increase of 88% over 2013’s $57 million, and a customer base of 550 customers. [Our revenue projection was validated by Cloudera CEO’s recent comments: “Annual recurring subscription software revenue growth accelerated year over year to approximately 100%, and preliminary unaudited total revenue surpassed $100 million”].

Looking ahead, we expect the company to maintain its robust growth trajectory into 2015 (+86% Y/Y) and 2016 (+65%), well above its pure-play peer group’s growth rates. This is justifiable given the company’s strong competitive position in growth markets.
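The year-over-year growth rates in this section can be recomputed from the revenue figures stated in the report ($57 million for 2013, $107 million for 2014, and the modeled $199 million and $329 million for 2015 and 2016). The sketch below is illustrative only; the dictionary and function names are ours, not part of the report's model.

```python
# Revenue in $ millions: 2013 and 2014 as stated in the text,
# 2015 and 2016 as forecast in the report.
revenue_musd = {2013: 57.0, 2014: 107.0, 2015: 199.0, 2016: 329.0}


def yoy_growth(revenue: dict[int, float]) -> dict[int, float]:
    """Year-over-year growth for each year that has a prior-year figure."""
    growth = {}
    for year in sorted(revenue):
        prior = revenue.get(year - 1)
        if prior:  # skip the first year, which has no prior-year base
            growth[year] = revenue[year] / prior - 1.0
    return growth


growth = yoy_growth(revenue_musd)
for year, rate in growth.items():
    print(f"{year}: {rate:+.0%} Y/Y")
```

The rounded results match the +88%, +86%, and +65% growth rates quoted for 2014, 2015, and 2016.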

Investment Risks

Slow Enterprise Adoption of Hadoop

Cloudera derives the majority of its revenues and cash flow from its open-source Hadoop distribution, a relatively new and evolving technology platform. Accordingly, the company’s growth and profitability trajectory is a function of the rate of enterprise adoption of Hadoop.  Until recently, Hadoop solved a very specific set of enterprise problems and use cases for managing Big Data, and was not a comprehensive data management solution. Enterprise usage of Hadoop was primarily for MapReduce jobs.

That said, Hadoop is evolving with the introduction of Hadoop 2.0. The enhancement from Hadoop 1.0’s more restricted processing model of batch-oriented MapReduce jobs, to more interactive and specialized processing models of Hadoop 2.0 will only further position the Hadoop ecosystem as the dominant Big Data analysis platform.

Therefore, we believe it is critical that Hadoop continues to mature to solve real business problems. And Cloudera continues to be at the forefront in driving the innovation and applications of Hadoop.

Limited Commercial Success of Open Source Models  

While open source software has been well accepted beyond Linux, Red Hat is the only commercial vendor to make money at scale selling support for free software. Many have failed, while others were acquired before reaching scale. That said, Cloudera

Cloudera’s hybrid product strategy broadens the competitive landscape

The challenge for the Hadoop distributors is to create a profitable business selling subscription for software and support for an open source core that can be downloaded for free. The advantage that Cloudera and the others have at this point is that Hadoop is complicated and very early in its maturation. Many enterprise customers look to vendors such as Cloudera to help define their journey as they reconfigure their data management architecture.

In the long term, however, we believe there are existential risks to all technologies associated with the latest phase of the Big Data revolution. It has been, and remains, compelling to use distributed storage and retrieval to manage vast floods of data and perform basic analytics. Data storage and analysis remains a very dynamic and fast-changing corner of computing. Just as Hadoop has facilitated major change, so too could unforeseen technology changes restrict Cloudera’s growth prospects or cause them to run in reverse.

Competition

Cloudera competes directly with two other pure-play Hadoop distribution vendors – Hortonworks and MapR. Additionally, increased competition from both large legacy peers as well as new and emerging competitors is a looming threat. While open source technology solutions have distinct advantages over proprietary solutions, most notably in terms of cost, new solutions from commercial peers that address key enterprise needs and add value over open source solutions can negatively affect the demand for Cloudera’s support subscriptions and services.

The company’s hybrid approach – open source and commercial licenses – has its own issues. On one hand it offers higher profitability, which is always good, but on the other it exposes the company to competition from the traditional software companies. Indeed, enterprise technology majors such as IBM, Pivotal and Oracle are beginning to offer their own Hadoop distributions. Furthermore, traditional data warehouse vendors such as Teradata, EMC, and SAP, and NoSQL database providers such as DataStax and others, could expand their offerings to compete for data management spending in place of Hadoop. Additionally, new competing technologies that address limitations of Hadoop, e.g. Apache Spark for streaming in-memory analytics on Big Data, are in development and could eventually inhibit enterprise adoption of Hadoop.

Valuation

We are using an arc approach, as opposed to a point approach, to value Cloudera. The fast growth and recently elevated market multiples are considered in our higher valuation, creating a robust range of estimated values.

Cloudera is relatively early-stage in its profit potential. We believe that a comparative valuation methodology based on the Enterprise Value-to-Revenue (EV/Rev) multiples of publicly traded peer companies is appropriate. We approached the comparative valuation from two angles: (1) Deriving an implied valuation based on unadjusted EV/Rev multiple of the peer group; (2) Deriving an implied valuation based on a growth-adjusted EV/Rev multiple of the peer group. To derive an equity value, we deducted the debt outstanding, which is zero, and added back $350 million of cash on hand. [Cloudera raised $530 million in May 2014 and had 139 million shares outstanding as per the last filing in May 2014].

For Cloudera’s peer group, we used a combination of pure-play Big Data companies and high-growth enterprise software companies. In the pure-play group, we believe Hortonworks, Qlik Technologies, Splunk, and Tableau Software are the closest comparables to Cloudera given the focus on Big Data tools and analytics: Hortonworks provides open source Hadoop software and support services; Qlik provides user-driven business intelligence solutions that enable its customers to make data-driven business decisions; Splunk provides software for searching, monitoring, and analyzing machine-generated Big Data; and Tableau provides a family of interactive data visualization products focused on business intelligence. All of these fast-growing and emerging companies are positioning for prominence in an expanding space where no clear leader has emerged. While methodology and area of focus are different, these firms are providing solutions and systems to capture, harness and use Big Data and analytical results derived therefrom.

Our high-growth enterprise software peer group is comprised of companies that are fast growing, completed their initial public offering (IPO) in the last 1 – 3 years, and have proven to be innovative and disruptive. These include Workday, FireEye, Palo Alto Networks, ServiceNow, Veeva Systems and Zendesk.

Accordingly, the unadjusted EV/Rev multiple indicates a valuation of $2.84 billion, or $20.45 per share, and the growth adjusted multiple indicates a value of $4.73 billion, or $34.03 per share. Given Cloudera’s strong competitive position in the Hadoop market and Intel’s last funding round at a $4.3 billion post-money valuation in 2014, we believe Cloudera should be valued closer to $4.73 billion, or $34.03 per share.

Figure 1: Cloudera Valuation Spectrum

Source: Manhattan Venture Research

Below we highlight the two methodologies and the resulting valuation:

Valuation Based on Unadjusted Mean Revenue Multiple

The mean EV/Rev multiple for 2016 in our select universe of companies is 7.6x. This implies a forward Cloudera enterprise value of $2.5 billion on our projected revenue estimate of $329 million, and $2.8 billion, or $20.45/share, of equity value after accounting for cash of $350 million and 139 million shares outstanding.
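The arithmetic behind this estimate can be sketched in a few lines of Python. The inputs are the figures stated in the report (7.6x multiple, $329 million revenue, $350 million cash, zero debt, 139 million shares); the function and variable names are ours. Because the published multiple is rounded to one decimal place, the sketch lands slightly above the report's $2.84 billion and $20.45 per share; we assume the report used an unrounded multiple.

```python
def equity_value_musd(ev_rev_multiple: float, revenue_musd: float,
                      cash_musd: float, debt_musd: float = 0.0) -> float:
    """Equity value = (EV/Rev multiple x revenue) - debt + cash, in $ millions."""
    enterprise_value = ev_rev_multiple * revenue_musd
    return enterprise_value - debt_musd + cash_musd


# Inputs as stated in the report.
unadjusted_multiple = 7.6      # mean 2016 EV/Rev of the peer group (rounded)
revenue_2016_musd = 329.0      # projected 2016 revenue, $ millions
cash_musd = 350.0              # cash on hand, $ millions
shares_outstanding_m = 139.0   # shares outstanding, millions

enterprise_value = unadjusted_multiple * revenue_2016_musd
equity = equity_value_musd(unadjusted_multiple, revenue_2016_musd, cash_musd)
per_share = equity / shares_outstanding_m

print(f"Enterprise value: ${enterprise_value / 1000:.2f} billion")
print(f"Equity value:     ${equity / 1000:.2f} billion")
print(f"Per share:        ${per_share:.2f}")
```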

Figure 2: Cloudera Valuation – Based on Unadjusted Mean Multiple

Cloudera Implied Valuation

Notes: * Stock prices as of April 24, 2015 close; ** Estimates for fiscal year ending January

*** PANW estimates are calendar estimates prorated from fiscal estimates ending July

Source: Company reports, Thomson Reuters, Manhattan Venture Research

 

Valuation Based on Growth Adjusted Mean Revenue Multiple

Recognizing Cloudera’s strong growth profile, we adjusted the EV/Rev ratio for growth. The mean EV/Rev in 2015 for our select universe of companies is 10.2x. Adjusting this for the mean peer growth rate of 42.2% results in an adjusted peer group multiple of 7.2x. Adjusting this to Cloudera’s growth estimate of 86% results in a multiple of 13.3x and a forward enterprise value of $4.4 billion. Factoring in the cash balance of $350 million and 139 million shares outstanding produces an equity value of $4.73 billion, or $34.03 per share.
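The adjustment formula is not spelled out, so the following is a sketch under an assumption: dividing the peer multiple by (1 + peer growth) and multiplying by (1 + Cloudera growth) reproduces the quoted 7.2x and 13.3x to within rounding. The function name is hypothetical, and the $329 million revenue figure is the estimate used in the unadjusted case.

```python
def growth_adjusted_multiple(peer_multiple, peer_growth, target_growth):
    """Rescale a peer EV/Rev multiple to a target company's growth rate.

    Assumed formula (inferred from the figures in the text, not stated by it):
    normalize the peer multiple by (1 + peer growth), then scale it
    by (1 + target growth).
    """
    normalized = peer_multiple / (1.0 + peer_growth)
    return normalized, normalized * (1.0 + target_growth)


normalized, adjusted = growth_adjusted_multiple(10.2, 0.422, 0.86)

# Apply the multiple to the $329M revenue estimate, then add cash and
# divide by the share count (all amounts in millions).
enterprise_value_m = adjusted * 329
equity_value_m = enterprise_value_m + 350
per_share = equity_value_m / 139
print(f"{normalized:.1f}x -> {adjusted:.1f}x, "
      f"equity ${equity_value_m / 1000:.2f}B, ${per_share:.2f}/share")
```

With unrounded intermediate values the per-share result differs from the reported $34.03 by a few cents, which is consistent with the report rounding the multiple to 13.3x before applying it.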

 

Figure 3: Cloudera Valuation – Based on Growth Adjusted Multiple


Notes: * Stock prices as of April 24, 2015 close; ** Estimates for fiscal year ending January

*** PANW estimates are calendar estimates prorated from fiscal estimates ending July

Source: Company reports, Thomson Reuters, Manhattan Venture Research

 

Company Overview

Cloudera was founded in 2008 and released its first product in 2009

Cloudera is a leading provider of software and services based on the Hadoop platform. The company’s Hadoop-based platform lets organizations store all their structured and unstructured data and then view and analyze it. The basic platform is free, while the company charges for the more sophisticated editions. In addition to platform services, Cloudera offers a Hadoop training program and free online resources for learning about the various aspects of the open source framework.

Cloudera has a diversified customer base across multiple industries. Notable names include: Expedia, Experian, J.P. Morgan, Qualcomm, and Samsung. The company also provides services to government entities such as the Australian Department of Defense.

Founded in 2008 by engineers Christophe Bisciglia, Amr Awadallah, and Jeff Hammerbacher together with Oracle executive Mike Olson, the company released its first product in 2009. Cloudera has approximately 750 employees, operates in 15 countries, and is headquartered in Palo Alto, California. The company has raised a total of $672.4 million in primary capital from investors such as Intel Corporation, T. Rowe Price, Accel Partners, and Greylock Partners.

What is Hadoop? A Primer


Apache Hadoop is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. The software framework accomplishes two tasks: massive data storage and faster processing. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software’s ability to detect and handle failures at the application layer.

Apache Hadoop is currently in its second iteration. Hadoop 1.0, for all its useful attributes, was limited in its ability to address a number of pressing needs. Hadoop 2.0 addressed the limitations of 1.0.  The updated version adds support for running non-batch applications through the introduction of YARN (“Yet Another Resource Negotiator”), a redesigned cluster resource manager that eliminates Hadoop’s sole reliance on the MapReduce programming model.

YARN puts resource management and job scheduling functions in a separate layer beneath the data processing one, enabling Hadoop 2.0 to run a variety of applications. Overall, the changes made in Hadoop 2 position the framework for wider use in Big Data analytics and other enterprise applications. For example, it is now possible to run event processing as well as streaming, real-time and operational applications. The capability to support programming frameworks other than MapReduce also means that Hadoop can serve as a platform for a wider variety of analytical applications.
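For readers unfamiliar with the MapReduce programming model mentioned above, the sketch below illustrates its three conceptual steps (map, shuffle, reduce) with a word count. It is plain Python running on one machine, not Hadoop’s API; all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # On a real cluster each document (or data block) would be mapped on a
    # different node; here the "nodes" are simply loop iterations.
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


counts = word_count(["big data big clusters", "data on commodity servers"])
print(counts)
```

Because each map call touches only its own record and each reduce call touches only one key, both phases can be spread across many servers, which is the property that lets the model scale.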

The Apache Software Foundation offers the core Hadoop and periphery modules for free, enabling users to build a complete Hadoop solution on the open source platform. While the core components of Hadoop are free, commercial software vendors often start with the free core and then develop, support, or integrate periphery and customized applications that fill solution gaps for the enterprise.

 

Figure 4: Evolution of Apache Hadoop Architecture – 2005 to Today


Source: Manhattan Venture Research

Hadoop – Key Attributes and Components

Hadoop has the following five distinguishing attributes:

  • Low cost: The open-source framework is free and uses commodity hardware to store large quantities of data;
  • Computing power: Its distributed computing model can quickly process very large volumes of data. The more computing nodes you use, the more processing power you have;
  • Scalability: Users can grow their system simply by adding more nodes. Little administration is required;
  • Storage flexibility: Unlike traditional relational databases, users don’t have to preprocess data before storing it. And that includes unstructured data like text, images and videos. Unlimited data can be stored and used when needed in the future;
  • Inherent data protection and self-healing capabilities: Data and application processing are protected against hardware failure. If a node goes down, jobs are automatically redirected to other nodes to make sure the distributed computing does not fail. And it automatically stores multiple copies of all data.
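The self-healing attribute can be illustrated with a toy model. The sketch below is not Hadoop’s actual block placement policy; the class and method names are hypothetical, and the replication factor of three is an assumption chosen to mirror HDFS’s default. It shows the idea only: every block is kept on several nodes, and when a node fails the lost copies are recreated elsewhere.

```python
class ToyCluster:
    """Toy model of block replication across nodes (illustrative only)."""

    def __init__(self, node_names, replication=3):
        self.replication = replication
        self.nodes = {name: set() for name in node_names}  # node -> block ids

    def holders(self, block_id):
        """Return the set of live nodes that hold a copy of the block."""
        return {n for n, blocks in self.nodes.items() if block_id in blocks}

    def _least_loaded(self, exclude):
        # Pick the live node with the fewest blocks that lacks this block.
        candidates = [n for n in self.nodes if n not in exclude]
        return min(candidates, key=lambda n: (len(self.nodes[n]), n))

    def store(self, block_id):
        """Add copies until the block reaches the replication factor."""
        while len(self.holders(block_id)) < self.replication:
            target = self._least_loaded(exclude=self.holders(block_id))
            self.nodes[target].add(block_id)

    def fail_node(self, name):
        """Remove a node and re-replicate every block it was holding."""
        lost_blocks = self.nodes.pop(name)
        for block_id in lost_blocks:
            self.store(block_id)


cluster = ToyCluster(["node1", "node2", "node3", "node4"])
blocks = ["a", "b", "c", "d"]
for block in blocks:
    cluster.store(block)

cluster.fail_node("node2")
print({block: sorted(cluster.holders(block)) for block in blocks})
```

The sketch needs at least as many live nodes as the replication factor; a real system handles under-replication more gracefully.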

The Hadoop framework download from the Apache Software Foundation comes with three core components – HDFS, MapReduce, and YARN – and a host of other data processing components. The following figure highlights the components:

 

Figure 5: Hadoop Components


Source: Apache Software Foundation, Manhattan Venture Research

 

Products & Services

Cloudera provides a hybrid product suite. Customers have access to basic open source software for free and pay a premium for enterprise-grade features and support. The basic application comes with features such as access to Cloudera’s platform, service management, host monitoring, security management, diagnostics, and access to APIs. Premium versions are available by annual subscription and come with additional features such as rolling updates, additional system integrations, configuration history and rollback capabilities, and premium-level support for Cloudera Search, Hadoop, and Director. Where Cloudera sees opportunities to fill solution gaps and address customer demand, it develops free or proprietary enterprise-class applications and add-on modules.

Cloudera’s Hadoop offering, CDH, is 100% based on Apache Hadoop, but it also includes a full package of proprietary management software, which the company uses as its competitive differentiator. CDH is available as a free standard edition and as an Enterprise edition with annual subscription support. Together, CDH and its add-on modules form Cloudera’s platform offering, the Enterprise Data Hub. Notable add-on modules include Cloudera Impala, a faster and more interactive open source SQL engine for Hadoop that uses a massively parallel processing (MPP) architecture; Cloudera Search, which provides open source, Google-like search functionality; and Cloudera Manager, a licensed tool for managing and monitoring Hadoop.

  • CDH: CDH is a well-known distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. CDH comes with the core elements of Apache Hadoop plus several additional key open source projects that, when coupled with customer support, management, and governance through a Cloudera Enterprise subscription, can deliver an enterprise data hub.
  • Key Capabilities: Like a Linux distribution, which delivers more than Linux itself, CDH delivers the core elements of Hadoop – scalable storage and distributed computing – along with additional components such as a user interface, necessary enterprise capabilities such as security, and integration with a broad range of hardware and software solutions. Additionally, all the integration work is done for the customer, and the entire solution is thoroughly tested and fully documented. By taking the guesswork out of building a Hadoop deployment, CDH allows its users to stay focused on solving real business problems.

 

Figure 6: Cloudera CDH


Source: Cloudera, Manhattan Venture Research

  • Cloudera Express: Cloudera Express is a free download that combines CDH with Cloudera Manager, which provides robust cluster management capabilities like automated deployment, centralized administration, monitoring, and diagnostic tools. Cloudera Express is a fully optimized platform to demonstrate the value of the technology.
  • Cloudera Enterprise: Cloudera Enterprise combines the best of the open source community with the enterprise capabilities to offer a robust Hadoop platform. Designed specifically for mission-critical environments, Cloudera Enterprise includes CDH, as well as advanced system management and data management tools plus dedicated support and community advocacy from a team of Hadoop developers and experts. Enterprise Data Hub now represents more than 60% of new business for Cloudera.

 

Figure 7: Cloudera Enterprise Hub


Source: Cloudera, Manhattan Venture Research

  • Cloudera Director: Cloud environments are becoming increasingly popular for critical Apache Hadoop workloads, given their flexibility and elasticity. However, users are often forced to compromise on key aspects of their data platform – such as vendor neutrality or enterprise-grade capabilities – when looking to move to the cloud. With Cloudera Director, users can unlock the full potential of Hadoop in the cloud, without compromise.
  • Cloudera Support, Professional Services & Training: Cloudera Support offers predictive and proactive support capabilities focused, among other areas, on ensuring peak performance of mission-critical applications and aligning the customer’s architecture to specific use cases to maximize the value of the data. The company also offers training for data professionals, developers, and administrators.

Competitive Landscape

Cloudera competes in an increasingly crowded landscape. The offerings and market presence of the Hadoop vendors, particularly the Big-3 pure-play vendors – Cloudera, Hortonworks, and MapR – are very similar, according to Forrester, highlighting the importance of perfect execution and capturing market share.

 

Figure 8: Forrester Wave: Big Data Hadoop Solutions, Q1 2014


Source: Forrester

The key players in the Hadoop market can be classified into three buckets:

  • Pure-play vendors, who are competing to grab initial market share and to impart their strategies and visions on the marketplace;
  • Large enterprise software vendors, who are developing their own Hadoop solution offerings in order to stay relevant as the platform has become a critical component of the enterprise data architecture;
  • Other cloud, Big Data, and point vendors, who need to make their solutions cost competitive as they address specific functional needs.

Pure-Play Hadoop Distribution Vendors

The three leading pure-play Hadoop distributors are Cloudera, Hortonworks, and MapR. All three share the goal of developing customized Hadoop distributions, add-ons, and services, and all three sell to customers both directly and indirectly through technology partners. They differ in the business strategy and operating model each has chosen to customize the platform. The cores of Hortonworks’ HDP and Cloudera’s CDH are both based on the open Apache Hadoop distribution and related sub-projects. The differentiation between the offerings lies in the installation and administration of their add-on tools.

  • Hortonworks uses an open source business model. The flagship product, the Hortonworks Data Platform, is downloadable for free and 100% based on Apache Hadoop. The company follows the typical open source business model and generates revenue from annual customer fees for support subscriptions and professional services. Where there are solution gaps in Hadoop, Hortonworks initiates projects and commits resources in order to drive innovation and expand the open source platform. For example, Apache Ambari is an open source cluster management console founded and supported by Hortonworks. The company had 332 support subscription customers as of December 31, 2014, up from 233 customers on September 30, 2014. Its open model mitigates the risk of commercial vendor lock-in. On the downside, the model is less defensible without proprietary software license fees. In addition, margins depend on the markup Hortonworks can charge for its employees’ time, which has its limits.
  • Cloudera’s strategy, as noted earlier in the report, is to build out its own proprietary Hadoop-based platform for data management and analytics. We believe that Cloudera’s proprietary product roadmap enhances its competitive position by meeting market demand for commercial licenses while also generating higher margins. Cloudera currently leads in market awareness and size, in part because it has been around the longest and was the first commercial Hadoop vendor to market. It has more than 525 global enterprise and public sector subscription-based customers and 750 employees in 15 countries. We believe subscription software accounted for 65-70% of total revenue, and professional services accounted for 15-25% of total revenue.
  • MapR, similar to Cloudera, stays loyal to core Hadoop while improving the open source core with proprietary value-add components and services. One distinct advantage of the MapR platform is that data can be ingested as a real-time stream, analysis can be performed directly on the data, and automated responses can be executed. Unlike Cloudera, roughly 90% of the company’s revenues come from licensing. Although MapR has a sizable customer base, over 700 as of December 31, 2014, our checks indicate that it consistently ranks third behind Cloudera and Hortonworks in most contract bids.

Large Enterprise Software Vendors 

All of the large incumbent enterprise software vendors are either developing their own Hadoop-related solution or integrating with existing ones as the platform becomes a more integral piece of the enterprise data architecture. IBM, Microsoft, Pivotal, and Teradata are in different phases of deploying their own solutions. Others are partnering with one or more pure-play vendors to offer a Hadoop distribution. For instance, Oracle and Intel partner with Cloudera, and SAP partners with Hortonworks.

Other Hadoop Solution Providers

This group includes cloud, Big Data, and point-solution vendors. Cloud and managed hosting providers offer Hadoop-related services in the cloud, often partnering with at least one of the pure-play vendors. Cloud offerings include Amazon Web Services’ (AWS) Elastic MapReduce (EMR) and Microsoft’s HDInsight in the Azure cloud. Point-solution vendors offer specific tools and functionality within the Hadoop ecosystem, including cluster management, data integration and modeling, predictive analytics, and data visualization. These point solutions help make Hadoop a core component of a broader Big Data solution.

Customers & Partners

Cloudera has a global presence across geographies and verticals. With a classic land-and-expand strategy, the company lands new support subscription customers and then expands within existing accounts. As customers realize value from integrating legacy data silos with Cloudera’s Enterprise Data Hub (EDH), more data can be brought under management, driving the requirement for more nodes and larger subscription deals. When this occurs and additional applications access the data lake, the company routinely sees enterprise customers expand via add-on sales on top of their initial EDH investment.

Cloudera has 550 paying customers for its software across multiple verticals. For comparative reference, Hortonworks has roughly 332 customers and MapR has 700 paying customers.

 

Figure 9: Cloudera Customers – A Small Sample


Source: Cloudera, Manhattan Venture Research

Partner Program Attracting Logos

Cloudera expanded its partner program, Cloudera Connect, by 78% during 2014, ensuring that its customers have access to complete, integrated solutions that interoperate with their existing data management infrastructure. At the end of 2014, the company had 1,450 partners, an increase of 640 over the prior year. In fiscal 2015, Cloudera announced partnerships with Accenture, Capgemini, Dell, EMC Isilon, Informatica, Intel, Microsoft, MongoDB, NEC, Red Hat, SAP, Teradata, and others to accelerate the deployment of Hadoop.

 

Figure 10: Strategic Partnerships


Source: Cloudera, Manhattan Venture Research

 

Funding

Cloudera has raised $672.4 million in eight rounds from a wide range of investors. The largest round was Series F in March-May 2014, when the company raised a total of $530 million, including $160 million from Google Ventures, T. Rowe Price, and MSD Capital, and $370 million from Intel Capital. Intel’s participation comprised a $370 million cash purchase of primary shares and an additional $370 million purchase of secondary shares. In all, the Series F round closed at $900 million ($530 million new cash; $370 million secondary) at a post-money valuation of $4.3 billion.

 

Figure 11: Cloudera Funding Rounds


Source: VC Experts, Manhattan Venture Research

 

Financials

Cloudera is a private company and does not disclose any financial information. We built the company’s revenue model based on a number of factors including our understanding of the revenue opportunity, growth rates of competitors, the number of reported and projected customers and the implied revenue per customer. Additionally, we have incorporated management’s comments on the company’s revenue run rate and trajectory, and product mix. We believe Cloudera is still a revenue story so we are not putting too much weight on the company’s profitability, which we believe is still negative.

The company’s revenue model is driven by sales of proprietary software, support subscriptions, and professional services. We believe subscription software accounted for 65-70% of total revenue, and professional services accounted for 15-25% of total revenue. Accordingly, we believe Cloudera exited 2014 with $107 million in revenue, an increase of 88% over 2013’s $57 million, and roughly 525 customers. Our revenue projection was validated by the Cloudera CEO’s recent comments: “Annual recurring subscription software revenue growth accelerated year over year to approximately 100%, and preliminary unaudited total revenue surpassed $100 million.”

Looking ahead, we expect the company to maintain its robust growth trajectory into 2015 (+86% Y/Y) and 2016 (+65%), slightly above its pure-play peer group’s growth rates. This is justifiable, in our opinion, given the company’s strong competitive position in growth markets.
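The growth assumptions above compound to the revenue figure used in the valuation section. A minimal sketch follows; the function name is hypothetical and the inputs are the rounded estimates quoted in the text, so the output is approximate.

```python
def project_revenue(base_revenue_m, growth_rates):
    """Compound a base revenue figure (in $ millions) through yearly growth rates."""
    path = [base_revenue_m]
    for rate in growth_rates:
        path.append(path[-1] * (1.0 + rate))
    return path


# 2014 estimate of $107M, then +86% (2015E) and +65% (2016E).
rev_2014, rev_2015, rev_2016 = project_revenue(107.0, [0.86, 0.65])

# Implied 2014 growth over the 2013 estimate of $57M.
growth_2014 = 107.0 / 57.0 - 1.0

print(f"2015E ${rev_2015:.0f}M, 2016E ${rev_2016:.0f}M, 2014 growth {growth_2014:.0%}")
```

The 2016 result is within about $1 million of the $329 million estimate applied in the valuation, with the gap attributable to rounding of the growth rates.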

 

Figure 12: Cloudera Revenue: 2012-2016E


Source: Cloudera, Manhattan Venture Research

 

Management Team

Tom Reilly, Chief Executive Officer

Tom has a distinguished 30-year career in the enterprise software market. Prior to Cloudera, his most recent role was as vice president and general manager of enterprise security at HP. Previous to HP, Tom served as CEO of enterprise security company ArcSight, which HP acquired in 2010. Tom led ArcSight through a successful initial public offering and subsequent sale to HP. Before ArcSight, Tom was vice president of business information services for IBM, following the acquisition of Trigo Technologies Inc., a master data management (MDM) software company, where he had served as CEO. Tom currently serves as a Board Member for Jive Software, privately held Ombud Inc., ThreatStream Inc. and Cloudera.

Mike Olson, Chief Strategy Officer

Mike co-founded Cloudera in 2008 and served as its CEO until 2013, when he took on his current role of chief strategy officer (CSO). As CSO, Mike is responsible for Cloudera’s product strategy, open source leadership, engineering alignment, and direct engagement with customers. Prior to Cloudera, Mike was CEO of Sleepycat Software, makers of Berkeley DB, the open source embedded database engine. Mike spent two years at Oracle Corporation as vice president for Embedded Technologies after Oracle’s acquisition of Sleepycat in 2006. Prior to joining Sleepycat, Mike held technical and business positions at database vendors Britton Lee, Illustra Information Technologies, and Informix Software.

Jim Frankola, Chief Financial Officer

Jim has over 25 years of experience in finance and operational leadership with global technology companies. Prior to joining Cloudera, Jim served as Chief Financial Officer at Yodlee, the leading provider of cloud-based personal financial management solutions. Prior to Yodlee, Jim served as CFO of Ariba where he helped lead the company through several strategic initiatives including establishing Ariba as a leader in spend management solutions and transitioning the company’s business model from a traditional enterprise application vendor to a SaaS/cloud company. Earlier in his career, Jim held various financial and executive positions at Avery Dennison Corporation and IBM. A certified public accountant, Jim has a B.S. degree in accounting from Pennsylvania State University and an M.B.A. in international business and finance from New York University Leonard N. Stern School of Business.

Amr Awadallah, Chief Technology Officer and acting Vice President of Engineering

Before co-founding Cloudera in 2008, Amr was an Entrepreneur-in-Residence at Accel Partners. Prior to joining Accel, he served as Vice President of Product Intelligence Engineering at Yahoo!, where he ran one of the very first organizations to use Hadoop for data analysis and business intelligence. Amr joined Yahoo! after it acquired his first startup, VivaSmart, in July 2000.

Big Data Overview

Any discussion of the Big Data phenomenon has to begin with its definition. There are a number of definitions – some focused on the volume of data generated, some on the variety, and others on the speed of data flow. We believe Big Data is a broad term that includes not only the large, multi-structured data sets that comprise 95% of the data generated today, but also all the new technologies and databases required to capture, store, and analyze these complex data sets.

Most accessible data to date has been located in traditional relational databases, which were pioneered by IBM and Oracle, accessed and managed with a certain set of tools, and analyzed and reported with a set of Business Intelligence (BI) software. Proliferation of social networks, real-time consumer behavior, mobility, sensor networks, and others have overwhelmed the current IT infrastructure.

The volume and mix of data generated today have changed. One quintillion bytes of data are created daily. The volume of data doubles every two years, according to IBM. Today unstructured data comprises 77% of all data, up significantly from 36% in 2006, according to IDC.

 

Figure 13: Structured vs. Unstructured Data


Source: IDC

So where is all this data coming from? With the proliferation of smart mobile devices, cloud computing, and social collaboration technologies, the sources of data are ubiquitous. They include not only traditional structured data, but also raw, semi-structured, and unstructured data from web pages, log files, search indexes, social media forums, e-mail, documents, sensor data from active and passive systems, and so on. These vast quantities of data typically contain mostly irrelevant detail, but some hidden gems within them can provide actionable insight.

 

Figure 14: Endless Data Flow


Source: Manhattan Venture Research

Evolving Database Architecture

The majority of the growth of digital data from the 1980s to the early 2000s can be classified as structured data – or data that has a schema. Structured data is well-defined, with columns and rows of information that can be easily maintained, updated, analyzed, and reported. Examples of structured data are employee information and customer sales orders. As digital data files became the industry norm, the underlying architecture transitioned from the then-prevalent hierarchical databases to relational database architecture – heralding the arrival of a more efficient and cost-effective method to capture and store data.

 

Figure 15: Evolving Database Architecture


Source: Manhattan Venture Research

A relational database is essentially a collection of data organized as tables with rows and columns. Each row contains the relevant information (in the columns) about a particular account or customer and is identified by a unique ID, which makes updating, retrieving, and analyzing information about a particular account efficient and fast.

The hierarchical database, on the other hand, organizes data by a certain hierarchy that resembles an employee organization chart. As the volume of structured data grew rapidly, the hierarchical database became increasingly difficult to update and manage. For example, a change in an employee’s address needed to be updated not only in the general employee information section, but also in many other related sections, such as information about an employee’s benefits. As a result, the hierarchical database was eventually replaced by the relational database.
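The employee-address example can be made concrete with a small relational sketch. The code below uses Python’s built-in sqlite3 module with an in-memory database; the table names, column names, and sample values are hypothetical. It shows the address stored once in the employee table and reached from the benefits records through the unique ID, so a single update is reflected everywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    CREATE TABLE benefit (
        benefit_id  INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        plan        TEXT NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ada', '1 Old Street');
    INSERT INTO benefit VALUES (10, 1, 'Health'), (11, 1, 'Pension');
""")

# The address lives in exactly one row, so one UPDATE is enough.
conn.execute(
    "UPDATE employee SET address = ? WHERE employee_id = ?",
    ("2 New Avenue", 1),
)

# Benefit records pick up the new address through the join on the unique ID.
rows = conn.execute("""
    SELECT b.plan, e.address
    FROM benefit AS b
    JOIN employee AS e ON e.employee_id = b.employee_id
    ORDER BY b.plan
""").fetchall()
print(rows)
```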

Over the last decade or so, unstructured data came into prominence and upset the status quo. Driven by machines, mobile and social collaboration technologies, unstructured data is 95% of the data that is generated today, and enterprises and vendors are adapting to the change.

 

Figure 16: Unstructured Data Flow


Source: Manhattan Venture Research

Big Data – Market Size

The size of the Big Data market is difficult to determine due to its various components and definitions. This is evident from the wide range of estimates from various market research firms. IDC expects the all-encompassing Big Data market (which includes all NoSQL, Hadoop, machine learning software, services, and hardware) to grow at a 21% CAGR, from $38 billion in 2015E to $100 billion in 2020E.
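As a quick check on the quoted growth rate, the compound annual growth rate can be computed from the two endpoints. This is a minimal sketch; the function name is hypothetical and the inputs are the IDC figures cited above.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# $38B in 2015E growing to $100B in 2020E, i.e. five years of growth.
rate = cagr(38.0, 100.0, 2020 - 2015)
print(f"Implied CAGR: {rate:.1%}")
```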

 

Figure 17: Big Data Market Size ($B)


Source: IDC

 

Three Secular Trends Driving Big Data Growth & Adoption

Before we had Big Data, we had Business Intelligence. The objectives of Big Data are not new and have been in practice for years in select verticals. Its appeal is in its ability to provide predictive analytics over increasingly complex sets of data. The key selling point – the reason it will gain mass commercial adoption – is the lower-cost framework available to process, store, and analyze these data sets. With roughly 95% of the data sets being unstructured or semi-structured, a new set of cost-effective tools is required to store and process the data. Traditional relational databases were not designed to handle the large volume and complexity of this new wave of data. New software frameworks are emerging that make the collection and analysis of data possible at substantially lower costs, which ultimately should lead to broad enterprise adoption.

Three secular trends are driving the rapid adoption of Big Data: (1) Explosive growth of unstructured data; (2) The availability of innovative and cost-effective software and hardware tools; and (3) The wide applications of Big Data analytics across verticals that are facilitating proactive and data-driven business decision-making:

Explosive Data Growth

The volume of globally-produced data doubles every two years, and data produced every two days is equal to all the data created through 2003 – according to IBM. Three key drivers are responsible for this massive data generation: a) machine-to-machine (M2M) data i.e. transactional and other machine data; b) mobile devices; and c) social media.

  • Software Applications and Electronic Devices Driving M2M Data. Machine data is produced by a wide range of software applications and electronic devices, which typically contain information regarding the application’s or device’s activity and status. Additionally, IBM recently cited an exponential growth curve in broadband-generated traffic by 2020. Sources of machine data include servers, applications, network devices, personal and mobile computers, smart meters, and RFID scanners, among others.
  • Mobile Device Data Growth. Mobile traffic growth is a function of the proliferation of devices. There are over 4 billion mobile devices worldwide functioning essentially as computers. The pace of data growth from these smart devices has far outpaced the growth from traditional computing devices.
  • Social Media. Social network sites have billions of users – all generating endless amounts of unstructured data. Facebook, for example, sees over 350 million photos uploaded every day and currently stores over 100 petabytes of photos and videos combined. Twitter generates more than 8 terabytes (TB) of data every day. Human-generated data, as a result of the rise of social media, is comprising an increasing part of the total data generated worldwide, with major implications for businesses in terms of IT infrastructure needs – especially storage and processing.

 

Low-Cost Frameworks

Key drivers of the Big Data architecture have been the availability of open source software frameworks, commodity hardware, and improvements in the price-performance of memory and processing power. Together they have been instrumental in lowering the cost and improving the scale of the architecture. In particular, the Hadoop platform, as noted earlier in the report, has become synonymous with Big Data and the focal point of the movement, and it has spawned innovation that is spreading through the broader technology industry. Other frameworks include:

  • In-Memory Gaining Renewed Attention. The in-memory architecture stores relevant data in the server memory rather than moving to a database for data retrieval (typically the slowest part of data processing) and consequently can process the data faster, as less data has to be moved for processing. The biggest proponent of in-memory database has been SAP, with its HANA application, which takes advantage of in-memory architecture, while Microsoft’s SQL Server has an in-memory component for its business intelligence function.
  • NoSQL – Growing Rapidly as an Alternative. NoSQL technologies are growing rapidly as an alternative to relational databases to handle Big Data. Vendors including Oracle and Teradata have released versions of a NoSQL database, and many are integrating NoSQL with Hadoop capabilities as an integrated data appliance. Standalone NoSQL approaches, including Cassandra, Couchbase, and Dynamo, are increasingly being adopted by enterprises. NoSQL databases were designed for large and complex data management needs in ways that relational databases were not. A NoSQL system can scale out over commodity servers (i.e. a distributed architecture), whereas a relational database typically scales up with more memory or faster hardware. Additionally, a relational database strictly uses relations (or tables) to store data, organized as a set of tables in predefined categories. Each table contains data categories by column, and each row holds a unique instance of the data defined by the columns. Data can be accessed without knowing the structure of the database table, but a limitation is that the data has to fit into a table. The NoSQL approach does have its drawbacks: a separate database instance runs on each server, which compromises relational capabilities, and the overall framework sacrifices data consistency – two limiting factors for outright replacement of relational databases.
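The scale-out idea described above can be sketched as a key-value store partitioned across several servers. The example below is a toy, not any vendor’s implementation: the “servers” are plain Python dictionaries, the class name is hypothetical, and the simple hash-modulo placement is an assumption made for brevity (production systems generally use more elaborate schemes, such as consistent hashing, so that adding a server does not move most keys).

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store partitioned across several 'servers' (plain dicts)."""

    def __init__(self, server_count):
        self.servers = [{} for _ in range(server_count)]

    def _server_for(self, key):
        # A stable hash of the key decides which server owns it.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def put(self, key, value):
        self._server_for(key)[key] = value

    def get(self, key):
        return self._server_for(key).get(key)


store = ShardedKeyValueStore(server_count=4)
# Values need not share a schema: the two records have different fields.
store.put("user:1", {"name": "Ada", "tags": ["admin"]})
store.put("user:2", {"name": "Lin"})
print(store.get("user:1"), store.get("user:2"))
```

Each lookup touches only one server, which is what allows capacity to grow by adding machines; the trade-off, as noted above, is that cross-record relational operations and strict consistency become harder.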

In addition to the frameworks noted above, there are a number of new analytical engines, including appliances and column stores, which provide significantly higher price-performance than general-purpose, relational databases.

  • Dynamo – Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. With a few clicks in the AWS Management Console, users can launch a new Amazon DynamoDB database table, scale the request capacity for the table up or down without downtime or performance degradation, and gain visibility into resource utilization and performance metrics. The database enables customers to offload the administrative burdens of operating and scaling a highly available distributed database cluster to AWS, so they don’t have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. Amazon DynamoDB is designed to address the core problems of database management, performance, scalability, and reliability. Developers can create a database table that can store and retrieve any amount of data and serve any level of request traffic. DynamoDB automatically spreads the data and traffic for the table over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent, fast performance. All data items are stored on Solid State Drives (SSDs) and are automatically replicated across three Availability Zones in a Region to provide built-in high availability and data durability. Customers pay only a low variable price for the resources they consume.
  • Cassandra – Apache Cassandra is an open source, distributed database management system. It is an Apache Software Foundation top-level project designed to handle very large amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure. It is a NoSQL solution that was initially developed by Facebook and powered its Inbox Search feature until late 2010. Cassandra provides a structured key-value store with tunable consistency. Keys map to multiple values, which are grouped into column families. The column families are fixed when a Cassandra database is created, but columns can be added to a family at any time. Furthermore, columns are added only to specified keys, so different keys can have different numbers of columns in any given family. The values from a column family for each key are stored together. This makes Cassandra a hybrid data management system between a column-oriented DBMS and a row-oriented store. Additional features include BigTable-style data modeling, eventual consistency, and the Gossip protocol, along with a master-master approach to serving read and write requests inspired by Amazon’s Dynamo.
  • HBase – HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It is developed as part of the Apache Software Foundation’s Apache Hadoop project and runs on top of the Hadoop Distributed File System (HDFS), providing BigTable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data. HBase features compression, in-memory operation, and Bloom filters on a per-column basis, as outlined in the original BigTable paper. Tables in HBase can serve as the input and output for MapReduce jobs run in Hadoop, and may be accessed through the Java API as well as through REST, Avro or Thrift gateway APIs. HBase is not a direct replacement for a classic SQL database, although its performance has recently improved, and it now serves several data-driven websites, including Facebook’s Messaging Platform.
  • BigTable – BigTable is a compressed, high-performance, proprietary data storage system built on Google File System, Chubby Lock Service, SSTable (log-structured storage like LevelDB) and a few other Google technologies. It is not distributed outside Google, although Google offers access to it as part of its Google App Engine.
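
The column-family model described for Cassandra above (families fixed at creation, columns added per key at any time) can be sketched in a few lines. This is a toy in-memory illustration under stated assumptions; the class and method names are hypothetical and do not correspond to Cassandra's actual API.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy model of a column-family store.

    Column families are fixed when the store is created, but each row key
    may carry its own set of columns within a family.
    """

    def __init__(self, column_families):
        self.column_families = frozenset(column_families)
        # family -> row key -> {column name: value}
        self._rows = {cf: defaultdict(dict) for cf in self.column_families}

    def put(self, family, row_key, column, value):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        # Columns are created on demand, only for the key being written.
        self._rows[family][row_key][column] = value

    def get_row(self, family, row_key):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        # All values of one family for one key are kept (and returned) together.
        return dict(self._rows[family].get(row_key, {}))


if __name__ == "__main__":
    store = ColumnFamilyStore(["profile"])
    store.put("profile", "user1", "name", "Ada")
    store.put("profile", "user1", "email", "ada@example.com")
    store.put("profile", "user2", "name", "Grace")  # user2 has no email column
    print(store.get_row("profile", "user1"))
    print(store.get_row("profile", "user2"))
```

The two rows end up with different numbers of columns in the same family, which is the flexibility a fixed relational table schema does not offer.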

 

Data – The New Lifeblood of Most Industries

Data, for all practical purposes, has become the lifeblood of almost all verticals, as data-centric decisions become the norm and not the exception. The promise of Big Data is predictive analytics. Companies can unlock significant value by collecting precise information and generating sophisticated analytics to improve decision making. Google, an early proponent of advanced analytics, uses Big Data in tangible ways. The company analyzes the frequency and timing of search queries to predict flu outbreaks and unemployment trends before the release of government statistics.  Applications of Big Data analytics touch a wide range of verticals as noted below:

  • Retail. Retailers are attempting to create “graphs” of social networks to identify influencers, experts, and decision-makers and how they work together to create social buying patterns. Big Data also enables retailers to use real-time transaction data to manage their inventory, restocking popular merchandise and marking down poor sellers more effectively than ever. That same data can be used to predict what the demand will be for new products for the next holiday season.
  • Utilities, Logistics, Transportation, and Distribution. Companies in these verticals are collecting machine data in vast quantities to shave marginal costs and create small parcels of revenue that add up to big money. Telecom carriers are using Big Data analytics to reduce customer churn and to track and manage network capacity.
  • Financial Services and Government. Financial services, securities, and banking organizations are using Big Data to reinforce their fraud detection operations and identify early indicators of market behaviors. Insurance and government agencies are using broader data processing capabilities to reduce costs. The FBI estimates 10% of all federal healthcare transactions are fraudulent and cost $150 billion per year of which only $1 billion a year is recovered.
  • Cutting Edge Technology. Big Data can also be applied to video, photos, and even voice. For example, emotion recognition software can photograph someone’s face repeatedly during an interview, offering marketers faster and more accurate insight on their product.  How does Big Data do it?  By analyzing facial muscles from hundreds or even thousands of frames and classifying them into emotion categories.
  • Healthcare. Electronic medical records can be analyzed to uncover complex relationships and unknown variables between medical conditions, treatments, and outcomes that may have previously never been thought to be correlated.

Prominent Big Data Players

The Big Data landscape is populated with a number of legacy software companies and a number of promising private companies. We believe the established vendors, with advanced integration and analytics offerings, will benefit from the demand for new Big Data-related architectures and tools, and maintain a strong presence in the landscape. At the same time, specialized companies with innovative and disruptive technologies will stake a claim in the competitive arena. The following figure illustrates, albeit very narrowly, the Big Data landscape.

 

Figure 18: Big Data Landscape

Source: Manhattan Venture Research

Leading public players in the space include traditional vendors such as IBM, Oracle, EMC, Microsoft, Teradata, and Informatica, and more recently listed companies including Splunk, Tableau and Hortonworks. Notable private companies besides Cloudera include MongoDB, Palantir, Mu Sigma, and DataStax, among others. Following is a brief description of the leading public players in the landscape, with a focus on their Big Data initiatives. This is followed by a brief description of the private companies in the Big Data marketplace.

Public Companies

  • EMC, through its Greenplum division, has introduced its Unified Analytics Platform (UAP). This platform brings together EMC’s Big Data portfolio into a single platform that combines the analytical warehouses within a Hadoop framework. The focus here is to streamline all data analysis into a single point of entry for businesses. EMC is also bringing together some of the storage technologies from its portfolio to overcome some challenges within Hadoop. Clearly, the company is hoping to offer a full service portfolio for businesses interested in Big Data.
  • IBM has made numerous acquisitions over the past few years. The company’s aim is to build out its next generation data warehouses in addition to building out its analytics platforms. The Netezza acquisition seemed specifically geared towards building out the data warehouses. On the analytics side, IBM is utilizing a Hadoop framework with its InfoSphere BigInsights. Additionally, the company introduced a version of BigInsights that will run through its SmartCloud. The combination of data warehousing and analytical frameworks, along with the analytics IBM already provides, indicates that the company is focusing on a full service portfolio with both a cloud and non-cloud data warehouse.
  • Microsoft joined the Big Data movement by retiring its original Big Data tool, Dryad, and working more closely with Hadoop. Microsoft partners with Hortonworks in the IaaS offerings of Windows Server and Azure. Microsoft has also partnered with Cloudera to enable quick implementation of Cloudera’s enterprise data hub with the power of Hadoop on demand in Microsoft Azure. The joint solution provides performance-optimized configurations for deploying Cloudera Enterprise clusters of varying sizes on Microsoft Azure, resulting in a significant reduction in complexity and an increase in value and performance. Recent releases of Microsoft products such as SQL Server are Hadoop compatible.
  • Oracle has stepped into the Big Data arena and joined the Hadoop believers through a partnership with Cloudera. The partnership’s product is a NoSQL-ready database appliance. Oracle also introduced its Exalytics product, an in-memory data store which is basically a complement to Oracle’s prior Big Data appliance, Exadata. Oracle acquired Endeca, which provides capabilities in analyzing unstructured data.
  • Qlik is focused on the business intelligence arena. The low cost of ownership and quick deployment make Qlik a prime beneficiary of the Big Data movement. Additionally, the high visualization provided by Qlik will prove invaluable as more data is consumed and users will need to see the data in a clean, crisp manner. Finally, while Qlik is not directly in the Big Data space like others, as companies become more reliant on business intelligence, they will find value in Qlik’s products.
  • SAP’s foray into Big Data revolves around its product, HANA. HANA is an in-memory appliance that can be utilized on any hardware. The product provides analytics and would provide an alternative to traditional relational database technologies.
  • Splunk, founded in 2003, is the most recent pure-play public company that offers a platform upon which it collects, stores, and analyzes machine data for businesses. While vast amounts of machine data are generated every day, Splunk indexes the data so that a company can home in on the specific time frames or intervals it would like to analyze. The platform is visual, so that any user can gain insight into the data generated and quickly identify points of interest. Additionally, the platform has connectors to Hadoop which allow it to integrate quickly within an organization. Some practical applications of the platform are identifying possible network intrusions and locating inefficiencies within a networked system.
  • Teradata made its move into Big Data by acquiring Aster Data Systems. Teradata’s main business focus was scalable enterprise data warehouses. Teradata’s newer releases integrate software from Teradata’s own systems with Aster’s technology. Teradata has also partnered with Hortonworks in addition to its partnership with Cloudera. The company is making its move into Big Data, specifically on the Hadoop framework side.
  • Tableau was spun out of Stanford University with VizQL, a technology that changes how people work with data by allowing simple drag and drop functions to create sophisticated visualizations. At the heart of Tableau is a proprietary technology that makes interactive data visualization an integral part of understanding data. A traditional analysis tool forces you to analyze data in rows and columns, choose a subset of your data to present, organize that data into a table, then create a chart from that table. VizQL skips those steps and creates a visual representation of your data right away, giving you visual feedback as you analyze. As a result, you get a much deeper understanding of your data and can work much faster than with conventional methods, up to 100 times faster.

Emerging Private Companies in the U.S.

Just as the opportunity around structured data resulted in the creation and rapid growth of many software companies that we are familiar with today, we believe the opportunity to create innovative applications and infrastructure around multi-structured data has opened up the competitive landscape for a number of innovative private companies. We have identified the following 13 names: Ayasdi, Cask, DataStax, Domo, Karmasphere, Metamarkets, MongoDB, Mu Sigma, Neo Technology, Palantir, Platfora, SumoLogic and WibiData. Following is a brief description of the emerging private companies in the Big Data market.

Ayasdi offers an insight discovery platform that helps organizations discover and utilize insights from their data. The company’s advanced analytics solution combines machine learning with Topological Data Analysis (TDA), enabling users to extract subtle, often hidden insights from their data. The company has raised $106.3 million to date, with its last round (Series C) in March 2015 for $55 million. Founded in 2008 after a decade of DARPA- and NSF-funded research at Stanford, Ayasdi is funded by Khosla Ventures, Institutional Venture Partners, GE Ventures, Citi Ventures, and FLOODGATE, and customers include General Electric, Citigroup, Anadarko, and Mount Sinai Hospital, among others.

Founded by developers for developers, Cask is an open source software company bringing virtualization to Hadoop data and apps. Based in Palo Alto, Calif., the company is backed by leading investors including Battery Ventures, Andreessen Horowitz and Ignition Partners. Cask has raised $12.5 million to date with its last funding round of $10 million in November 2014. The company was founded in 2012 by Todd Papaioannou, Jonathan Gray, and Nitin Motgi.

DataStax provides a highly scalable, flexible and continuously available Big Data platform built on Apache Cassandra. The company integrates enterprise-ready Cassandra, Apache Hadoop for analytics and Apache Solr for search, across multiple data centers and in the cloud. The company is a pure-play Big Data infrastructure company. DataStax has raised $187.9 million to date, with its last funding round in September 2014 raising $106 million. The company was founded in 2010 by Jonathan Ellis and Matt Pfeil.

Domo delivers a SaaS-based platform that helps CEOs and business leaders transform the way they manage business via direct access to data. A Big Data play, Domo offers a business intelligence package that can be viewed on various devices and offered over the web. Domo has raised $448.7 million to date with its last funding round (Series D) in April 2015 for $200 million. The company was founded in 2011 by Josh James, who previously founded Omniture, which was acquired by Adobe Systems in 2009.

Karmasphere provides customer analytics that deliver deep insights on Big Data to optimize every customer touch point. It has a variety of products on the market, including an Amazon AWS-based cloud analytics solution. The company’s product unlocks the business potential of Hadoop so companies can deliver more personalized and relevant products and experiences to their customers. The company has raised $14.5 million to date, with its last round in December 2012 for $3.5 million. Based in Cupertino, California, Karmasphere was founded in 2010 by Martin Hall.

Metamarkets offers a cloud-based distributed platform by which companies can have their data analyzed utilizing Metamarkets’ cloud instances. The company’s product is geared more towards the BI/analytics portion of the Big Data market. Metamarkets has raised $43.5.5 million to date, with its last funding in February 2015 for $15 million. The company was founded in 2010 by Michael Driscoll and David Soloff.

MongoDB is a cross-platform, document-oriented database, classified as a NoSQL database, that helps businesses harness the power of data. MongoDB is one of the fastest-growing database ecosystems, with over 9 million downloads, thousands of customers, and over 750 technology and service partners. The company has raised $311.1 million to date in eight rounds, with the last round in January 2015 (Series G) raising $80 million. MongoDB was founded in 2007 by Kevin Ryan, Eliot Horowitz and Dwight Merriman and is based in New York, NY.

Mu Sigma is a decision sciences and analytics firm. It helps companies institutionalize data-driven decision making and harness Big Data analytics by providing clients with a holistic ecosystem of proprietary technology platforms, processes and people. The company has raised $25 million to date, with the last round (Series C) of $25 million in June 2011. Mu Sigma was founded by Dhiraj C. Rajaram and is based in Bangalore, India.

Neo Technology researchers pioneered the first graph database back in 2000, and have been instrumental in bringing the power of the graph to numerous organizations worldwide, including more than 40 Global 2000 customers, such as Cisco, Accenture, Telenor, eBay and Walmart. Neo4j is a leading graph database with the largest ecosystem of partners and tens of thousands of successful deployments. The company has raised $44.1 million to date, with its last round (Series C) in January 2015 for $20 million. Based in San Mateo, California, Neo Technology was founded in 2007 by Johan Svensson and Emil Eifrem.

Palantir sells tools for aggregating, analyzing, and visualizing structured and unstructured data from disparate sources. Used by government agencies and commercial enterprises, the company’s flexible and scalable solutions can be applied toward a growing set of use cases including preventing cyber attacks, identifying fraudulent transactions, and assisting in national security programs. Since 2005, Palantir has raised over $1.0 billion in equity and debt funding. Key investors include Peter Thiel, a co-founder, along with In-Q-Tel, Reed Elsevier Ventures, Glynn Capital, Ulu Ventures and 137 Ventures. The last funding round closed in September 2014 and raised $444 million, for a post-money valuation, according to some market estimates, of $9.0 billion. Palantir was founded in 2004 by a handful of PayPal alumni and Stanford computer scientists.

Platfora’s product is a platform upon which companies can transform their data into highly-visualized business intelligence. The platform provides predictive analytics capabilities, unlimited sharing between users, and is based on HTML5, which allows usage on mobile or browser platforms. The platform is utilized in conjunction with a Hadoop framework, such as Cloudera or Amazon Web Services. The company’s product is a combination of both infrastructure and analytics applications. Platfora has raised $65.2 million to date with its last funding round at $38 million in March 2014. The company was founded in 2011 by Ben Werther, who was previously product head at Greenplum, a data analytics company purchased by EMC.

SumoLogic provides a cloud-based, log management service by which companies can have all their log data stored and analyzed. One of their key selling points is providing real-time analytics for the data analyzed, while also providing a single point of storage and collection. SumoLogic utilizes many of the current Big Data technologies, such as Hadoop and Cassandra. The company’s product is more geared towards the analytics applications of log data analysis. SumoLogic has raised $75 million to date with its last funding at $30 million in May 2014. The company was founded in 2010 by Kumar Saurabh and Christian Beedgen.

WibiData provides Big Data applications for enterprises to deliver personalized experiences across channels. The company’s platform is built on open-source technologies Apache Hadoop, Apache Cassandra, Apache HBase, Apache Avro and the Kiji Project. WibiData has raised $23 million to date, with its last round (Series B) in May 2013 for $18 million. WibiData was founded in 2010 by Christophe Bisciglia, who was also a co-founder of Cloudera.

 

Appendix: Hadoop Use Cases

LinkedIn is a massive data hoard whose value is connections. It currently computes more than 100 billion personalized recommendations every week, powering an ever-growing assortment of products, including Jobs You May be Interested In, Groups You May Like, News Relevance, and Ad Targeting. LinkedIn leverages Hadoop to transform raw data into rich features using knowledge aggregated from LinkedIn’s 300 million member base. LinkedIn then uses Lucene to do real-time recommendations, and also Lucene on Hadoop to bridge offline analysis with user-facing services. The streams of user-generated information, referred to as “social media feeds”, may contain valuable, real-time information on LinkedIn members’ opinions, activities, and mood states.

CBS Interactive is using Hadoop as its web analytics platform, processing one billion weblogs daily (grown from 250 million events per day) from hundreds of website properties. CBS Interactive is the online division of CBS, the broadcast network. It is a top 10 global web property and the largest premium online content network. Its brands include CNET, Last.fm, TV.com, CBS Sports, and 60 Minutes, to name a few. CBS Interactive migrated processing from a proprietary platform to Hadoop to crunch web metrics. The goal was to achieve more robustness, fault-tolerance and scalability, and a significant reduction of processing time to meet its SLA (over six hours of reduction so far). To enable this, the company built an Extraction, Transformation and Loading (ETL) framework called Lumberjack, based on Python and streaming.

Explorys, founded in 2009 in partnership with the Cleveland Clinic, is one of the largest clinical repositories in the United States with 10 million lives under contract. The Explorys healthcare platform is based upon a massively parallel computing model that enables subscribers to search and analyze patient populations, treatment protocols, and clinical outcomes. With billions of clinical and operational events already curated, Explorys helps healthcare leaders leverage analytics for break-through discovery and the improvement of medicine. HBase and Hadoop are at the center of Explorys. Already ingesting billions of anonymized clinical records, Explorys provides a powerful and HIPAA compliant platform for accelerating discovery.

Travel – air, hotel, car rentals – is an incredibly competitive space. Orbitz.com generated roughly 1.5 million air searches and roughly 1 million hotel searches a day in 2011. All this activity generates massive amounts of data: over 500 GB/day of log data. The challenge was that the existing data infrastructure was expensive and difficult to use for storing and processing this data. Orbitz needed an infrastructure that provides (1) long-term storage of large data sets; (2) open access for developers and business analysts; and (3) ad hoc querying of data and rapid deployment of reporting applications. The company moved to Hadoop and Hive to provide reliable and scalable storage and processing of data on inexpensive commodity hardware. Hive is an open-source data warehousing solution built on top of Hadoop which allows easy data summarization, ad hoc querying, and analysis of large datasets stored in Hadoop. Hive simplifies Hadoop data analysis: users can use SQL rather than writing low-level custom code. High-level queries are compiled into Hadoop MapReduce jobs.
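
To make the last point concrete, the sketch below shows how a simple aggregate query, for example `SELECT search_type, COUNT(*) FROM searches GROUP BY search_type`, maps onto the map, shuffle and reduce phases of a MapReduce job. The table name, field name and sample records are hypothetical, and the code is a single-process illustration of the pattern, not Hive's actual query planner or Orbitz's pipeline.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (key, 1) pair for every log record."""
    for record in records:
        yield record["search_type"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values for each key to get the per-group count."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_search_type(records):
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    searches = [
        {"search_type": "air"},
        {"search_type": "hotel"},
        {"search_type": "air"},
    ]
    print(count_by_search_type(searches))  # {'air': 2, 'hotel': 1}
```

On a real cluster the map and reduce functions run in parallel on many machines and the shuffle moves data between them; the analyst writing the SQL-style query does not have to write any of this by hand.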

foursquare is a mobile + location + social networking startup aimed at letting your friends in almost every country know where you are and figuring out where they are. As a platform, foursquare is now aware of 25+ million venues worldwide, each of which can be described by unique signals about who is coming to these places, when, and for how long. To reward and incent users, foursquare allows frequent users to collect points, prize “badges,” and eventually coupons for check-ins. foursquare is built on enabling better mobile + location + social networking by applying machine learning algorithms to the collective movement patterns of millions of people. The ultimate goal is to build new services which help people better explore and connect with places. foursquare engineering employs a variety of machine learning algorithms to distill check-in signals into useful data for its app and platform. foursquare is enabled by a social recommendation engine and real-time suggestions based on a person’s social graph.

 

About Manhattan Venture Partners

Our Research Methodology

Manhattan Venture Partners provides clients with accurate, timely and innovative research into the companies and sectors we cover. To that end we have established an experienced team of analysts, researchers, economists and industry veterans that focus exclusively on private companies with a proven track record of success. Producing quality research on a private company is uniquely challenging. Our analysts communicate with employees, ex-employees, early investors, VCs, competitors, suppliers and others to gather valuable information about the company under coverage. This information enables us to create unique financial models that value the underlying company and provide insight to our clients and industry experts, leveraging years of experience working for bulge bracket firms.

Manhattan Venture Partners reports cover the business and financial aspects of late-stage companies. These reports include but are not limited to industry overviews, competitor analyses, SWOT analysis, products (existing and in development), management and key directors, risks and concerns, other proprietary channels, historical financials, revenue projections, valuations (using various metrics and a valuation recommendation), waterfall analysis, and a capitalization table.

About the Analysts

Santosh Rao

Santosh Rao has over 16 years of experience in equity research, primarily within the technology and telecommunications space. He started his equity research career at Prudential Securities and later served as a Vice President and Senior Equity Analyst at Broadpoint Capital (Broadpoint Gleacher), where he specialized in the telecommunications equipment and services sectors. Prior to joining Manhattan Venture Partners, he was Managing Director and Head of Research at Greencrest Capital, focusing on private market TMT research, and prior to that at Evercore Partners’ Institutional Equities Group, focusing on telecom and data services companies. Mr. Rao started his career as a Financial Analyst at PaineWebber (UBS) and later at Prudential Securities in the Financial Analysis & Reporting Group. Santosh earned his Bachelor of Arts in Economics and Accounting from Rutgers University and an MBA in Finance from Rutgers Business School.

Max Wolff

Max Wolff is an economist specializing in international finance and macroeconomics. Before joining Manhattan Venture Partners, he was Chief Economist at Greencrest Capital, and prior to that spent four years as the senior hedge fund analyst at the Beryl Consulting Group LLC. Mr. Wolff teaches finance and statistical research methods in the New School University’s Graduate Program in International Affairs. Max’s financial markets and Macro-Economics work appears regularly in Seeking Alpha, The WSJ, Reuters, Bloomberg, The BBC, Russia Today TV, and Al Jazeera English.

Disclaimer

I, Santosh Rao, certify that the views expressed in this report accurately reflect my personal views about the subject, securities, instruments, or issuers, and that no part of my compensation was, is, or will be directly or indirectly related to the specific views or recommendations contained herein.

I, Max Wolff, certify that the views expressed in this report accurately reflect my personal views about the subject, securities, instruments, or issuers, and that no part of my compensation was, is, or will be directly or indirectly related to the specific views or recommendations contained herein.

Manhattan Venture Partners LLC (Hereafter “Manhattan Venture Partners”), the parent company of Manhattan Venture Research, does and seeks to do business with companies covered in this research report. As a result, investors should be aware that the firm may have a conflict of interest that could affect the objectivity of this report. Investors should consider this report as only a single factor in making their investment decision. This document does not contain all the information needed to make an investment decision, including but not limited to, the risks and costs.

Additional information is available upon request. Information has been obtained from sources believed to be reliable but Manhattan Venture Partners or its affiliates and/or subsidiaries do not warrant its completeness or accuracy. All pricing information for the securities discussed is derived from public information unless otherwise stated. Opinions and estimates constitute our judgment as of the date of this material and are subject to change without notice. Past performance is not indicative of future results. Manhattan Venture Partners does not engage in any proprietary trading.  The user is responsible for verifying the accuracy of the data received.  This material is not intended as an offer or solicitation for the purchase or sale of any financial instrument. Manhattan Venture Partners does not have ownership of the subject company’s securities. Manhattan Venture Partners does not have any market making activities in the subject company’s securities. The opinions and recommendations herein do not take into account individual client circumstances, objectives, or needs and are not intended as recommendations of particular securities, financial instruments or strategies to particular clients. The recipient of this report must make its own independent decisions regarding any securities or financial instruments mentioned herein. Periodic updates may be provided on companies/industries based on company specific developments or announcements, market conditions or any other publicly available information.

Copyright 2015 Manhattan Venture Research LLC. All rights reserved. This report or any portion hereof may not be reprinted, sold or redistributed, in whole or in part, without the written consent of Manhattan Venture Research.

Information Access Level Classification System (IALCS)

Manhattan Venture Research uses an Information Access Level Classification System (IALCS) to make clear the degree of access offered by the company(s) covered in all research reports.

Each research report is classified into one of three categories depending on the level of access granted. The categories are:

I++: The company covered by the research report provided substantial disclosures to Manhattan Venture Partners.

I+: The report was prepared following partial disclosure by the company, including publicly available financial statements, and/or is based on conversations with past or present company employees.

I: All reports are prepared using a mosaic research approach. Not all companies are willing and able to provide substantive access to management and information. In I reports no direct access was granted.
