EmojiNet: An open service and API for emoji sense discovery

Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper presents the release of EmojiNet, the largest machine-readable emoji sense inventory that links Unicode emoji representations to their English meanings extracted from the Web. EmojiNet is a dataset consisting of: (i) 12,904 sense labels over 2,389 emoji, which were extracted from the Web and linked to machine-readable sense definitions in BabelNet; (ii) context words associated with each emoji sense, which are inferred through word embedding models trained over a Google News corpus and a Twitter message corpus for each emoji sense definition; and (iii) recognizing discrepancies in how emoji are presented on different platforms, a specification of the most likely platform-based emoji sense for a selected set of emoji. The dataset is hosted as an open service with a REST API and is available at http://emojinet.knoesis.org/. The development of this dataset, the evaluation of its quality, and its applications, including emoji sense disambiguation and emoji sense similarity, are discussed.
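
The three components listed in the abstract suggest a record shape along the following lines. This is only a minimal sketch for illustration: the class names, field names, and the helper `most_likely_sense` are hypothetical, the sample values are placeholders rather than data from EmojiNet, and nothing here reflects the actual schema or REST API of the service.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EmojiSense:
    """One sense of an emoji (hypothetical layout, not the EmojiNet schema)."""
    label: str                      # English sense label
    babelnet_id: str                # id of the linked BabelNet sense definition
    # Context words per training corpus, e.g. "news" and "twitter".
    context_words: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class EmojiEntry:
    """An emoji with its senses and optional platform-based senses."""
    emoji: str                      # Unicode representation of the emoji
    senses: List[EmojiSense] = field(default_factory=list)
    # Platform name -> label of the most likely sense on that platform.
    # Only a selected set of emoji would carry this information.
    platform_sense: Dict[str, str] = field(default_factory=dict)


def most_likely_sense(entry: EmojiEntry, platform: str) -> Optional[EmojiSense]:
    """Return the platform-based sense, or None if none is specified."""
    label = entry.platform_sense.get(platform)
    if label is None:
        return None
    for sense in entry.senses:
        if sense.label == label:
            return sense
    return None


# Placeholder values only; these are NOT taken from the dataset.
example = EmojiEntry(
    emoji="\U0001F602",
    senses=[
        EmojiSense(
            label="placeholder_sense",
            babelnet_id="bn:PLACEHOLDER",
            context_words={"news": ["word_a"], "twitter": ["word_b"]},
        )
    ],
    platform_sense={"platform_a": "placeholder_sense"},
)
```

Returning `None` for a platform with no recorded sense mirrors the abstract's statement that platform-based senses are specified only for a selected set of emoji.
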
Original language: English
Title of host publication: Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017
Publisher: AAAI Press
Pages: 437-446
Number of pages: 10
ISBN (Electronic): 9781577357889
State: Published - 2017
Event: 11th International Conference on Web and Social Media, ICWSM 2017 - Montreal, Canada
Duration: May 15 2017 to May 18 2017

Publication series

Name: Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017
Publisher: PKP Publishing Services Network
Number: 1
Volume: 11
ISSN (Print): 2162-3449
ISSN (Electronic): 2334-0770

Conference

Conference: 11th International Conference on Web and Social Media, ICWSM 2017
Country/Territory: Canada
City: Montreal
Period: 5/15/17 to 5/18/17

ASJC Scopus Subject Areas

  • Computer Networks and Communications

Keywords

  • Social networking (online)
  • Context-word
  • ITS applications
  • Most likely
  • News corpora
  • Open services
  • Sense inventories
  • Unicodes
