Basser Seminar Series

QuABase: A Dynamic Software Engineering Knowledgebase for Building Big Data Systems

Speaker: Dr Ian Gorton
Carnegie Mellon University Software Engineering Institute

When: Thursday 9 October, 2014, 2:00-3:00pm - NOTE: different day, time and venue to usual.

Where: The University of Sydney, School of IT Building, SIT Board Room (Room 124), Level 1

New data sources, ranging from diverse business transactions to social media, high-resolution sensors, and the Internet of Things, are creating a digital tidal wave of big data that must be captured, processed, integrated, analyzed, and archived. Big data systems storing and analyzing petabytes of data are becoming increasingly common in many application areas. These systems represent major long-term investments in massive-scale software and system deployments. With analysts estimating ongoing data storage growth at 30 to 60 percent per year, organizations must build applications that analyze exponentially growing data sets with predictable, linear costs. This requires embracing highly distributed software architectures and new generations of software technologies that are designed to operate at scale. However, successfully deploying these architectures and technologies is a major challenge for many organizations.

In this talk, I’ll outline some basic principles for successfully designing and deploying big data systems, and then describe QuABase (pronounced kbase), a dynamic knowledge base for big data software architectures and technologies. QuABase is built upon a semantic data model and links fundamental software design principles for big data systems to technologies that realize these principles in their implementations. This enables software engineers to explore and understand design approaches and trade-offs that are appropriate for meeting their requirements, as well as to compare different technology features in order to select implementation platforms that can satisfy their needs and constraints. QuABase can be queried to present tailored information for specific questions, and supports extension through a rigorous forms-based interface that automatically populates the semantic model to capture new design and technology knowledge. We believe the basic approach used in QuABase is appropriate for capturing detailed design and implementation knowledge in a wide range of technical areas, and hence represents a first step on a path to build enduring, extensible knowledgebases for education and engineering needs.

Speaker's biography

Dr Ian Gorton is a Senior Member of the Technical Staff at the Carnegie Mellon University Software Engineering Institute, where he is investigating issues related to software architecture at scale. This includes designing large-scale data management and analytics systems, and understanding the inherent connections and tensions between software, data and deployment architectures in cloud-based systems. Since obtaining his PhD, Dr Gorton has worked in academia, industry (Microsoft, IBM) and government-funded R&D labs (CSIRO, PNNL, NICTA and, since 2013, the SEI). He has published 3 books and is an author of more than 150 international journal and conference papers. He has also led large research and development teams that have built and deployed advanced scientific data management and analysis systems that are in wide use in scientific research labs in the USA.