• breadth-first search (crawl)
    • Graph structure
      • starting nodes : the initial web pages
      • neighboring nodes : the pages hyperlinked from the current page
    • Alternative : depth-first search can be used instead
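
The graph view above can be sketched as a small breadth-first traversal. This is only a minimal illustration, not a real crawler: `bfs_crawl`, `get_links`, `max_pages`, and the in-memory `LINKS` graph are hypothetical names introduced here, and no network request is made.

```python
from collections import deque


def bfs_crawl(start_pages, get_links, max_pages=100):
    """Visit pages in breadth-first order and return the visit order.

    start_pages: the initial pages (starting nodes).
    get_links:   hypothetical callable returning the pages hyperlinked
                 from a given page (its neighboring nodes).
    max_pages:   assumed safety cap on the number of pages visited.
    """
    start = list(dict.fromkeys(start_pages))  # drop duplicate start pages
    visited = set(start)
    queue = deque(start)  # FIFO queue gives breadth-first order
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            if link not in visited:  # never enqueue a page twice
                visited.add(link)
                queue.append(link)
    return order


# Hypothetical link graph standing in for real web pages.
LINKS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "A"],
    "D": [],
}

print(bfs_crawl(["A"], lambda page: LINKS.get(page, [])))
# prints ['A', 'B', 'C', 'D']
```

Replacing `queue.popleft()` with `queue.pop()` turns the queue into a stack, which gives a depth-first style traversal instead.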



  • Wiki crawling : cautions when calling request()
    """
    http://meta.wikimedia.org/wiki/Bot_policy#Unacceptable_usage

    Unacceptable usage
    Data retrieval: Bots may not be used to retrieve bulk content for any use
    not directly related to an approved bot task. This includes dynamically
    loading pages from another website, which may result in the website being
    blacklisted and permanently denied access. If you would like to download
    bulk content or mirror a project, please do so by downloading or hosting
    your own copy of our database.
    """

