Classifying web robots by K-means clustering

Derek Doran, Swapna S. Gokhale

Research output: Contribution to conference (Abstract)

Abstract

Sophisticated Web robots, sporting a variety of functionality and unique traffic characteristics, constitute a significant percentage of request and bandwidth volume serviced by a Web server. To adequately prepare Web servers for this continuous rise in Web robots, it is necessary to gain deeper insights into their traffic properties. In this paper, we propose to classify Web robots according to their workload characteristics, using K-means clustering as the underlying partitioning technique. We demonstrate how our approach can allow an examination of Web robot traffic from new perspectives by applying it to classify Web robots extracted from a year-long server log collected from the Univ. of Connecticut School of Engineering domain.
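As a rough illustration of the partitioning technique named in the abstract, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to per-robot feature vectors. The abstract does not list the workload features, so the two features here (a normalized request rate and a normalized mean response size), the sample values, and the `kmeans` helper itself are all hypothetical. Seeding the centroids from the first `k` points is a simplification that keeps the example deterministic; it is not claimed to be the authors' procedure.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm over equal-length numeric tuples.

    Returns (centroids, labels). Centroids start at the first k points,
    an assumption made only so that the example is reproducible.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return centroids, labels


# Hypothetical robots: (normalized request rate, normalized mean response size).
robots = [
    (0.90, 0.10), (0.10, 0.90),
    (1.00, 0.20), (0.20, 1.00),
    (0.80, 0.15), (0.15, 0.80),
]
centroids, labels = kmeans(robots, k=2)
print(labels)
```

On this toy input the robots with a high request rate and small responses fall into one cluster and the remaining robots into the other. With real log data, features would normally be rescaled to comparable ranges first, and several initializations would be tried, because K-means can settle in a poor local optimum.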
Original language: English
Pages: 97-102
Number of pages: 6
State: Published - Jul 1 2009
Event: 21st International Conference on Software Engineering and Knowledge Engineering, SEKE 2009 - Boston, MA, United States
Duration: Jul 1 2009 to Jul 3 2009

Conference

Conference: 21st International Conference on Software Engineering and Knowledge Engineering, SEKE 2009
Country/Territory: United States
City: Boston, MA
Period: 7/1/09 to 7/3/09

ASJC Scopus Subject Areas

  • Software
  • Artificial Intelligence
  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications

Keywords

  • Robots
  • Connecticut
  • K-means clustering
  • Partitioning techniques
  • School of engineering
  • Traffic characteristics
  • Web robots
  • Web servers
  • Workload characteristics

Disciplines

  • Artificial Intelligence and Robotics
  • Robotics
  • Computer Sciences
  • Engineering
