Basser Seminar Series

Understanding queries at Thomson Reuters

Speaker: Dr Isabelle Moulinier
Thomson Reuters Research and Development Group

When: Wednesday 2 July, 2014, 4:00-5:00pm

Where: The University of Sydney, School of IT Building, SIT Lecture Theatre (Room 123), Level 1

Abstract

We present an overview of recent efforts to identify entities in queries at Thomson Reuters. Thomson Reuters provides research tools for professionals in the financial, legal and tax/accounting domains. These research tools combine search over specialized information with collaboration and workflow tools. In this talk, we focus on search and on determining what the query is about. In particular, we describe a number of efforts to identify entities in queries, where entities range from people, companies and financial concepts to document titles and citations.

In the legal domain, we face a problem similar to homepage finding on the Web. Users refer to a document by title, and traditional search often fails for such queries. Similarly, in the tax domain, finding documents by citation can be challenging. In both scenarios, we rely on a combination of search and machine learning to determine whether a document matches the title or citation search, but the amount of editorial effort available led us to take different approaches in training and evaluating the solution.

WestlawNext, our legal research tool, has expanded beyond researching the law to researching people and companies in a legal context, while keeping a “single query box” approach. Our efforts focus on entity extraction in order to route user queries to the appropriate underlying engine, and we have developed a hybrid approach to identify people, companies and other pieces of information about them. We will present some of the challenges we faced in developing a hybrid approach in that context and will outline some outstanding research questions.

Lastly, in the financial domain, our users had to navigate through menus to access information such as company earnings. To make such information readily accessible, we developed a parser that identifies financial instruments and related concepts in free text, so that the user gets an answer in a single interaction with the system. We will outline future directions to generalize our current process.

Speaker's biography

Isabelle Moulinier is a research manager in the Thomson Reuters Research and Development group. She joined the group in 1997, and her research interests have focused on the application of natural language processing and machine learning technologies to the improvement of search and other aspects of the user experience. More recently, she has been interested in large-scale data analysis applied to usage data and recommendation problems. Isabelle is a member of ACM, serves as a senior program committee member for SIGIR, and is a frequent reviewer for related conferences and journals. Prior to joining Thomson Reuters, Isabelle worked at the IBM Paris Research Center. She holds a Ph.D. in Artificial Intelligence from the University Pierre et Marie Curie (Paris VI) in France.