ABSTRACT
In this paper we describe a new release of a Web-scale entity graph that serves as the backbone of Microsoft Academic Service (MAS), a major production effort that broadens the scope of the namesake vertical search engine, which has been publicly available since 2008 as a research prototype. At the core of MAS is a heterogeneous entity graph comprising six types of entities that model scholarly activities: field of study, author, institution, paper, venue, and event. In addition to obtaining these entities from publisher feeds, as in the previous effort, this version includes data mining results from the Web index and an in-house knowledge base from Bing, a major commercial search engine. As a result of the Bing integration, the new MAS graph has grown significantly in size, with fresh information streaming in automatically as the search engine discovers it. The rich entity relations in the knowledge base also provide additional signals to disambiguate and enrich entities within and beyond the academic domain. The number of papers indexed by MAS, for instance, has grown from the low tens of millions to 83 million while maintaining an accuracy above 95%, based on test data sets derived from academic activities at Microsoft Research. Based on this data set, we demonstrate two scenarios in this work: a knowledge-driven, highly interactive dialog that seamlessly combines reactive search and proactive suggestion experiences, and a proactive heterogeneous entity recommendation.
An Overview of Microsoft Academic Service (MAS) and Applications