OpenSQL Camp 2008: the Present and the Future of Open Source Databases
Last weekend I attended OpenSQL Camp, which was held November 14-16 in Charlottesville, Virginia. Among about 100 attendees from many countries were open source legends and community leaders, database product architects, developers, hackers, consultants, bloggers, and DBAs.
OpenSQL Camp was a great success and a pleasure to attend. It was truly an “un-conference”: there were no vendors claiming their products to be the best thing since sliced bread, no sales pitches, no boasting, no spinning, and no pretense. Instead, a group of genuinely smart open source database innovators and community leaders got together to share knowledge and ideas and to brainstorm the frontiers and the future of open source databases.
The “un-conference” started Friday at 6pm with greetings, introductions, informal discussions, and session planning. All talk sessions were held on Saturday. Sunday was a hackathon: developers and hackers wrote code; architects designed interfaces, features, and architectural elements; and everybody collaborated and shared knowledge.
Brian Aker of Sun/Drizzle/MySQL opened the talk sessions of the “un-conference” with a keynote on the state of open source databases. His presentation was insightful and set the tone for the Camp. Open source databases are ubiquitous, Brian said. They are found anywhere data is stored today. He discussed current trends in computing, hardware, and data growth.
Two of the key challenges Brian mentioned that open source databases will face in the future caught my attention. The first is scalability. Future open source databases will need to store and handle petabytes of data, as more and more data is created every day. The second is energy cost, which is already an issue. Six percent of the world’s power today goes to data centers. Energy-efficient storage will certainly help, but databases need to become more scalable and efficient too. These might be tough problems, but I felt a lot of positive energy and optimism in the audience. Clearly, solutions addressing these challenges are already in the making. Take, for example, the growth in data size and the corresponding growth in energy cost. With its 10:1 or higher database compression in actual customer environments, the Infobright column-oriented analytical database / MySQL storage engine hits the nail on the head. If large-scale databases require ten times less storage capacity, the economics of cloud computing and of the data centers of the future may improve drastically.
As expected, there were many interesting session talks illustrating the current trends in open source databases.
Innovation is certainly a major trend. Column-oriented databases and storage engines are on the rise, and Infobright exemplifies their huge potential.
Another visible trend is pluggable database engine architecture. There were many interesting presentations related to this.
Monty Widenius presented his work on Maria, a storage engine for MySQL. Maria is a crash-safe rewrite of MyISAM. In addition to supporting all of the main functionality of the MyISAM engine, the Maria storage engine provides recovery support (in the event of a system crash), full logging, and transaction support.
Interestingly, Maria is also a Drizzle engine. This brings us to the other side of the pluggable architecture trend. Database platforms supporting pluggable engines may be as full-featured and general-purpose as MySQL or as lean and focused as Drizzle, which is geared towards cloud computing at scale. Most major MySQL storage engines are also Drizzle engines. Even the Postgres development community is discussing future plug-in architecture support. Bruce Momjian expressed keen interest in Infobright. Perhaps, at some point in the future, it will be possible to use a MySQL or Drizzle storage engine as a Postgres plug-in and vice versa.
I attended many interesting sessions. Giuseppe Maxia gave a demonstration of MySQL Sandbox, a tool for installing one or more MySQL servers in isolation, without affecting other servers. He also talked about MySQL replication. Peter Zaitsev presented the Sphinx full-text search engine, which natively supports MyISAM and InnoDB tables. Arjen Lentz explained the OurDelta project, which provides a framework for MySQL builds. Eric Day talked about Libdrizzle, a non-blocking client library that supports the MySQL protocol and Drizzle extensions and allows concurrent queries from a single user session.
My talk on Infobright was the last in the program. As the audience was incredibly technical, knowledgeable, and smart, I focused my talk on explaining why it was possible to achieve such unusual data compression efficiency, data load rate, and analytical query performance, and on describing how it was implemented and how Infobright compression and knowledge grid work. There was a lot of interest in Infobright at the “un-conference”.
A couple of final thoughts.
It would be impossible to imagine database people from Oracle, Microsoft SQL Server, IBM DB2, and Sybase IQ gathering in one room to share ideas, brainstorm, and collaborate on improving their products. Is it a coincidence that, in order to keep up with the pace of innovation, these big companies have to rely on open source infusions, such as BerkeleyDB? In the open source database community, such a fusion of creative minds and creative contributions is possible and is happening every day. This is our source of strength. This is why open source databases have become ubiquitous.
Alex Esterkin
