Milestone Group Quarterly: April 2007

 

Articles

 

  • Face to Face: Mike Stonebraker, Founder and CTO of StreamBase and Vertica
  • Investment Viewpoint: Jean-Louis Gassee, General Partner, Allegis Capital
  • By Invitation: Don Tapscott, Co-author of Wikinomics: How Mass Collaboration Changes Everything
  • Milestone POV: Synchronizing Time Frequencies Between the Large Enterprise and the Innovative Firm by Philippe Bouissou, Milestone Group Managing Partner

 

Face to Face:

Mike Stonebraker, Founder and CTO of StreamBase and Vertica

 

Milestone: You have spent a fair portion of your professional life in the database business and were a key actor in the world of relational database that was originally defined and coined by Ted Codd in 1970. What, if any, disruptive innovations in database have you seen in the past decade or so?

Stonebraker: The thing I find most interesting is that the relational database market was largely developed in the '70s with System R and Ingres and oriented toward bread and butter business data processing. Today, the elephants (meaning the dominant relational database vendors) are still selling roughly the same architecture.

 

In the early '90s, the data warehouse market came into existence, through the efforts of Arbor Software and the rest of the OLAP vendors. The stream processing market has come about because feed volumes have gone through the roof on Wall Street. Search on the Internet has shown a need for large text and semi-structured data repositories, which Google, Yahoo, and others are heavily involved in. More recently, in the aftermath of 9/11, intelligence data is becoming much more important, and obviously biology, chemical and genomic databases are becoming a significant market.

 

These are a collection of new markets that were never envisioned by the architects of relational database systems way back then. And, I would just flatly state that, in any of these markets, a specialized architecture that's vertical market-oriented will outperform the relational elephants by between one and two orders of magnitude.

 

I view the DBMS space as fertile for new ideas for specific markets, which can be astonishingly better than the "one size fits all" to which the elephants are clinging.

 

Milestone: Over the past couple of years, you have founded two companies, StreamBase and Vertica. Can you tell us a little about how the businesses came about?

Stonebraker: They came about quite differently. In 2000, I left the West Coast and relocated East. Stan Zdonik and I (Stan is a professor at Brown) had the idea that in order to do business analytics on real time data --- the type electronic traders on Wall Street want to do --- current system software is not sufficient.

 

We started a project to build an academic prototype to do stream processing, calling the prototype Aurora. In 2003, we secured venture capital to commercialize the prototype, and that turned into StreamBase.

 

The gleam in our eye at the time was that the sea change driven by the plummeting cost of micro-sensor technology will cause each object of value on the planet to be sensor tagged so that its location or status can be reported in real time. The number of data "fire hoses" of such reporting data is just going to go through the roof. The downstream processing of that data is what StreamBase is all about.

 

The genesis of Vertica came from my experience at Informix and other consulting assignments, where I learned a lot about the requirements of the data warehouse market. I quickly figured out that it's poorly served by current database systems. The next step was, of course, to have a better idea. Over a couple of years, these core ideas evolved, and then we started building an academic prototype at MIT called C-Store.

 

Basically we showed how C-Store can beat any one of the elephants' code lines by between one and two orders of magnitude, depending on the application. C-Store was promising enough that we got venture capital to back it, and that turned into Vertica.

 

Milestone: Can you talk a little bit about "data at rest" versus "data streams"?

Stonebraker: Let's just do a simple example. All of the Wall Street brokerage houses have very significant electronic trading operations. Real time data comes over the wall, often from multiple places. So, they need to merge it, clean it, and normalize it into something that they can work with.

 

Then they are all, for the most part, computing some sort of secret sauce on this real time data. For example, they might compute the momentum of IBM over the last five ticks; and then compare it with the momentum of Oracle over the same period. If the delta swings big, one way or the other, then they may arbitrage the two stocks. The thing that makes this doubly interesting is that composite feeds get data from a number of exchanges and news items (for example, Bloomberg and Reuters). So, the latency between when the trade actually occurs and when it is reported by composite feeds is a couple of hundred milliseconds. In current electronic trading environments that is absolutely forever. All of the electronic trading systems want the delay between a tick happening and when they get it to be one millisecond. So traders are all going directly to the individual data sources to get real time feeds directly from the markets in order to get rid of latency.
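The momentum comparison described above can be sketched in a few lines of Python. This is only an illustration, not StreamBase's method: the `TickWindow` class, the `momentum_spread` function, the 1% threshold, and the definition of momentum as fractional price change across the window are all assumptions made for the example. Each symbol keeps its own last five ticks, which only approximates "the same period".

```python
from collections import deque


class TickWindow:
    """Keeps the most recent `size` prices for one symbol (hypothetical helper)."""

    def __init__(self, size=5):
        self.prices = deque(maxlen=size)

    def push(self, price):
        self.prices.append(price)

    def momentum(self):
        # Assumed definition: fractional price change from the oldest
        # to the newest tick in the window.
        if len(self.prices) < self.prices.maxlen:
            return None  # window not full yet
        return (self.prices[-1] - self.prices[0]) / self.prices[0]


def momentum_spread(ticks, sym_a, sym_b, size=5, threshold=0.01):
    """Yield (tick_index, spread) whenever the momentum gap exceeds threshold.

    `ticks` is any iterable of (symbol, price) pairs, processed one at a
    time as they arrive rather than being stored first and queried later.
    """
    windows = {sym_a: TickWindow(size), sym_b: TickWindow(size)}
    for i, (symbol, price) in enumerate(ticks):
        if symbol not in windows:
            continue  # ignore symbols we are not comparing
        windows[symbol].push(price)
        m_a = windows[sym_a].momentum()
        m_b = windows[sym_b].momentum()
        if m_a is None or m_b is None:
            continue
        spread = m_a - m_b
        if abs(spread) > threshold:
            yield i, spread
```

The point of the sketch is the shape of the computation: state is limited to a small sliding window per symbol, and a result is produced as each tick arrives, with no trip through a stored table.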

 

Now, if you were to put incoming feeds into a database engine, you add latency; perhaps hundreds of milliseconds just from the architecture of the current systems. So if you can afford a second or two of latency, then by all means take the incoming data, stick it in the database system using current technology.

 

StreamBase is focused on applications where traditional solutions have too much latency to make the data valuable, or where the volumes are too high to make it work. So, the answer is: data at rest is great, if you don't care about latency and if you can react to events in seconds to minutes. Also, if the volume of events is hundreds to thousands per second then the current database systems work great. If those numbers are not where you are at, then you need something else.

 

Whenever two orders of magnitude makes a difference, you've got to operate on data while in motion rather than having it come to rest inside a database system.

 

Milestone: You have developed a software structure that processes and analyzes real time streaming data. Give us a sense for the magnitude of the need and the industries that benefit from the technology.

Stonebraker: There is a very large multinational bank, which has bond trading desks all over the world that are re-pricing their inventory of financial bonds in real time, usually based on US Treasuries. A typical rule is when US Treasuries tick up by ten basis points, then five year General Motors Corporate bonds should tick up by fifteen basis points. Each of these desks is run independently. Moreover, the internal (electronic) traders inside this bank can watch the same feeds that the bond desks watch. If they can act on a price change quicker than the bond guys can re-price the bonds, they can reach in, grab and buy their bond before it's re-priced.
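The re-pricing rule in this example (a ten basis point move in Treasuries implies a fifteen basis point move in the GM bond) amounts to a sensitivity of 1.5 applied to every incoming Treasury tick. A minimal Python sketch follows, assuming a hypothetical `Bond` record and `reprice` function; the field names and the starting level of 600 are invented for illustration and say nothing about how the bank's desks actually work.

```python
from dataclasses import dataclass


@dataclass
class Bond:
    name: str
    level_bp: float      # current quoted level, in basis points (assumed field)
    sensitivity: float   # basis points moved per basis point of Treasury move


def reprice(bonds, treasury_move_bp):
    """Apply one Treasury move to every bond in the inventory, in place."""
    for bond in bonds:
        bond.level_bp += bond.sensitivity * treasury_move_bp
    return bonds


# The rule from the interview: Treasuries +10 bp => GM five-year +15 bp.
inventory = [Bond("GM 5Y corporate", level_bp=600.0, sensitivity=1.5)]
reprice(inventory, treasury_move_bp=10)
```

The arithmetic is trivial; the race Stonebraker describes is about how quickly this loop runs over the whole inventory after each tick, compared with how quickly a trader watching the same feed can act.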

 

So, it's basically an arms race of latency between the people who make the markets and the people who arbitrage them - typically hedge funds and internal traders. So, we are happy to sell to both sides of this arms race, and we are doing very well in all aspects surrounding electronic trading.

 

We are also starting to do quite well in network management applications, by monitoring and quickly detecting denial of service attacks, and the like. Another one of our customers is big in the multiplayer Internet gaming market. There may be twenty thousand simultaneous online players and maintaining the state of a very large multiplayer game is again a problem that we are very good at solving.

 

The high volume sensor application market is where we'll play a large role in the future. For example, the MIT library wants to put a sensor tag on every book in their library, but they are not worried about theft. They are much more worried about somebody mis-shelving the book and then it's lost forever.

 

Milestone: You look at the database world from two orthogonal perspectives: read only and write only. How does a streaming database fit into this view? Do you think that there will be a product one day that can reconcile both of those views at the same time?

Stonebraker: No, because it's actually very simple. The ruthless progression of Moore's law has turned OLTP into a main memory size problem. So, what you want is a blindingly fast transactional main memory database system. And you want to be able to insert records really quickly, which calls for the fields in a record to be stored together. These days this is often called a "row store". In contrast, for the read optimized market, you want to rotate your thinking 90 degrees and implement a "column store", because that technically is what gives you this one to two orders of magnitude advantage. I don't know how to build a single code base that is both a main memory row store and a disk-oriented column store. You are just working on two very different ideas. It's certainly possible to create two engines, put them under a common SQL parser, and give the illusion that there is only one code base there, but it's really two.
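The row store versus column store distinction can be shown with a toy Python example. This is a sketch under obvious assumptions: in-memory lists stand in for storage pages, the sample trade records are invented, and `to_columns`, `insert_row`, and `sum_column` are hypothetical helper names, not any product's API.

```python
# Row layout: all fields of one record sit together, so an insert is one append.
rows = [
    {"id": 1, "symbol": "IBM", "qty": 100},
    {"id": 2, "symbol": "ORCL", "qty": 250},
    {"id": 3, "symbol": "IBM", "qty": 50},
]


def insert_row(rows, record):
    """Write path of a row store: the whole record lands in one place."""
    rows.append(record)


def to_columns(rows):
    """Rotate the data 90 degrees: one list per field instead of one dict per record."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def sum_column(columns, field):
    """Read path of a column store: scan only the one field the query needs."""
    return sum(columns[field])


insert_row(rows, {"id": 4, "symbol": "ORCL", "qty": 75})
columns = to_columns(rows)
total_qty = sum_column(columns, "qty")
```

In the row layout, adding a record touches one location; in the column layout, the same insert touches every column, but an aggregate over `qty` never reads `id` or `symbol`. That asymmetry is the reason the two workloads pull toward different engines.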

 

Milestone: Google's stated mission is "to organize the world's information and make it universally accessible and useful." Can you tell us how StreamBase and Vertica are connected to that mission?

Stonebraker: Google is interested in being the world's librarian for data at rest. It seems to me that they are the dominant player in a completely unstructured data market.

 

With semi-structured data they are not dominant. If I want to know someone's phone number, I don't go to Google. I go to 411.com or one of the other sites. Whether they will choose to move into the structured data market remains to be seen.

 

Milestone: One of the areas we are focusing on in this issue of Milestone Group Quarterly is enterprise collaboration. What technology or techniques have you used, or do you currently use, to foster collaboration?

Stonebraker: Put everybody in the same room.

 

The nice thing about startups is that with a fifty-person company you can put everybody in the same room and make decisions very quickly. The very big companies are not very nimble, because it's just so hard to get decisions made.

 

In Thomas Friedman's The World is Flat, he says that IT is a single worldwide market in which the US is going to have to compete against China and India and everybody else. In the current technological world, nimbleness is a hugely valuable feature. US companies are not particularly nimble and I think it's going to be a management challenge to make them so. Getting corporations to be nimble is a bigger problem than just introducing collaboration tools, such as videoconferencing.

 

Milestone: What global trends are driving IT markets today? Where are the pressure points on established companies (foreign and domestic)?

Stonebraker: Well, my view is that US IT companies are in a competition with China and, to some extent, India. Right now we are obviously the dominant IT player in the world for two reasons. We have the best research engine in the world, one that is churning out new ideas at a prodigious rate. Then it's a matter of having a great venture capital system to perform technology transfer of those ideas into the commercial marketplace.

 

A healthy venture capital market and a healthy research engine are both crucial to the US remaining competitive in IT. If Web 3.0 comes from the Chinese, then it's entirely possible that the dominant companies will be Chinese. If we lose world domination in IT, there will be a dramatic effect on our standard of living.

 

For this to all work in favor of the US, there has to be a vibrant venture capital community and we have that. There is plenty of money; if you have a good idea, you can get funded.

 

But the US research community is being starved by the current Federal Government. An equally big, if not bigger, problem is that the number of students who come up through US universities studying science and technology is falling.

 

A recent article in The Economist said that seventeen of the twenty best universities in the world are American. No one argues that we don't have the best university system on the planet. If you look at graduate students at our US universities, for example, at one Southern university in computer science, the graduate student population is 80% foreigners, 20% Americans.

 

Twenty years ago that was fine because these folks would stay in the US and improve our scientific gene pool. But, these days, foreign students are returning in greater numbers to their home countries, and we are educating the competition. That's more than a little bit scary.

 

We are not investing in the next generation of scientists and engineers and we are not keeping the research community vibrant. I think these are horrible long-term problems.

 

Milestone: Based on your rich experience in technology and research in the startup world, is there a piece of advice you can give to entrepreneurs who are just starting out?

Stonebraker: The venture capital community is getting involved in earlier and earlier stage companies, and then actually nurturing them along. So, I think the answer is to get a prototype that the VCs can understand, and then find a VC that you get along with. What I am seeing these days, unlike twenty years ago when you had to have a complete management team and a whole bunch of infrastructure in place, is a higher appetite for earlier-stage funding.

 

But get a backer who you can trust to help you fill out the stuff you don't have.

 


 

Mike Stonebraker has been a pioneer of database research and technology for more than a quarter of a century. He was the main architect of the INGRES relational DBMS, the object-relational DBMS, POSTGRES, and the federated data system, Mariposa. All three prototypes were developed at the University of California at Berkeley where Stonebraker was a Professor of Computer Science for twenty-five years. He is the founder of three successful Silicon Valley startups, whose objective was to commercialize these prototypes.

 

Dr. Stonebraker is the author of scores of research papers on database technology, operating systems and the architecture of system software services. He was awarded the prestigious ACM System Software Award in 1992, for his work on INGRES. Additionally, he was awarded the first annual Innovation award by the ACM SIGMOD special interest group in 1994, and has been recognized by Computer Reseller News as one of the top five software developers of the century. Moreover, Forbes magazine named him one of the eight innovators driving the Silicon Valley wealth explosion during their 80th anniversary edition in 1998. He was elected to the National Academy of Engineering in 1998 and is presently an Adjunct Professor of Computer Science at M.I.T.

 

Dr. Stonebraker was named the recipient of the 2005 IEEE John von Neumann Medal. The award recognizes Stonebraker for his significant contributions to the design, implementation, and commercialization of relational and object-relational database systems.

 

Dr. Stonebraker was the founder and CTO of Ingres Corporation, Illustra Corporation, and Cohera Corporation. He was the CTO of Informix Corporation and Required Technology, Inc.

 

Dr. Stonebraker received a Bachelor of Science degree from Princeton University and Master of Science and Doctor of Philosophy degrees from the University of Michigan. He has held visiting professorships at the Pontifícia Universidade Católica (PUC), Rio de Janeiro, Brazil; the University of California, Santa Cruz; and the University of Grenoble, France.

 

 

Dear Reader:

This issue of Milestone Group Quarterly takes a look at the intersection between technology and enterprise collaboration. The view at Milestone Group is that the value horizon for the enterprise will come from so-called “mash-ups” of technology, media and telecommunications innovations.

We expect that the value to the enterprise from these advances will be measured by the efficiency gains made by data. And for many businesses today, the value metrics are in the millions, and the millisecond.

So we’ve put together an issue of Milestone Group Quarterly that digs deeper into the financing, technology and operational opportunities that this environment will likely create.

Our line-up this quarter includes:

Mike Stonebraker – The database luminary gives us his view on “the new markets never envisioned by the architects of relational databases.”

Jean-Louis Gassee – Gassee really needs no introduction, except to say that he has created a personal mash-up of IT visionary and venture financier whose view should not be missed.

Don Tapscott – Author and mass collaboration guru Tapscott says that the attention to the user generated media and social networking capabilities of the Web overshadow the real story occurring beneath the surface – the Web as a new mode of production.

Philippe Bouissou – Milestone Group's Bouissou looks at the efficiency arc that occurs (or doesn’t) when a large corporation engages a small firm for the purposes of innovation. Bouissou argues that weak collaboration between the big and small players, when it occurs, is a matter of “frequency coupling.” And he shows how time frequencies can be modulated to improve collaboration.

We’ll be taking these ideas with us to Software 2007 on May 8th and 9th in Santa Clara, CA. (If you attend just one software conference a year, this is it. This year’s line-up of industry leaders includes Steve Ballmer, Hasso Plattner, Marc Benioff, Ed Zander and many others.)

Johannes Hoech will be leading a Milestone Group sponsored breakout session on “The Science of Revenue”, an approach that underlies, for instance, the strategic work we performed for startup Koral (acquired by Salesforce.com earlier this month). Koral CEO Mark Suster is a Software 2007 speaker and will present his views on “Office 2.0”.

Special Offer for Milestone Group Quarterly Readers

We’ve also arranged a special promotion code for Milestone Group Quarterly readers. If you’re not registered yet, we’ve got you covered. Simply go to Software 2007 and enter “partner07” to receive the special Milestone Group rate of $1,495 (a savings of $300). Hurry, the offer expires May 4th. I hope to see you in Santa Clara.

Up and right,

Mark Zawacki, Publisher maz@milestone-group.com

 

 


 

+1 650-351-6464
info@milestone-group.com