Scrapy

Scrapy
Developer	Scrapinghub, Ltd.
Initial release	26 June 2008
Latest release	2.12.0 - 18 November 2024
Repository	github.com/scrapy/scrapy
Programming language	Python
Operating system	Windows, macOS, Linux
Type	Web crawler
License	BSD License
Official website	scrapy.org

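Scrapy ([ˈskreɪpaɪ] SKRAY-peye) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data through APIs or as a general-purpose web crawler.^[2] It is currently maintained by Scrapinghub Ltd., a web-scraping development and services company.

The Scrapy project architecture is built around "spiders", self-contained crawlers that are given a set of instructions. Following the spirit of other don't-repeat-yourself (DRY) frameworks such as Django,^[3] it makes it easier for developers to reuse their code. Scrapy also provides a web-crawling shell that developers can use to test their assumptions about a site's behavior.^[4]

Well-known companies and products that use Scrapy include Lyst,^[5]^[6] Parse.ly,^[7] Sayone Technologies,^[8] Sciences Po Medialab,^[9] and Data.gov.uk's World Government Data site.^[10]^[11]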
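To make the spider-centred architecture concrete, the following is a minimal sketch of the pattern using only the Python standard library. It is not Scrapy's own code or API: the `TitleSpider` class, the `crawl` function, the `PAGES` dictionary, and the `example.test` URLs are all hypothetical names invented for this illustration, and the in-memory `PAGES` mapping stands in for real HTTP requests. In Scrapy itself, a spider would instead subclass `scrapy.Spider` and define `name`, `start_urls`, and a `parse(response)` callback, while the framework's engine takes care of scheduling and downloading.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://example.test/": '<h1>Home</h1><a href="/a">A</a><a href="/b">B</a>',
    "http://example.test/a": '<h1>Page A</h1><a href="/b">B</a>',
    "http://example.test/b": '<h1>Page B</h1><a href="/">Home</a>',
}


class _PageParser(HTMLParser):
    """Collects the text of <h1> elements and the href of every <a>."""

    def __init__(self):
        super().__init__()
        self.titles, self.links = [], []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.titles.append(data.strip())


class TitleSpider:
    """A spider: where to start, and how to turn one page into items and links."""

    name = "titles"
    start_urls = ["http://example.test/"]

    def parse(self, url, body):
        parser = _PageParser()
        parser.feed(body)
        for title in parser.titles:
            yield {"url": url, "title": title}  # a scraped item
        for href in parser.links:
            yield urljoin(url, href)  # a link to follow


def crawl(spider, pages):
    """Toy engine: fetch queued URLs, run the spider's callback, skip duplicates."""
    queue, seen, items = list(spider.start_urls), set(spider.start_urls), []
    while queue:
        url = queue.pop(0)
        body = pages.get(url)
        if body is None:  # unknown page: nothing to parse
            continue
        for result in spider.parse(url, body):
            if isinstance(result, dict):
                items.append(result)
            elif result not in seen:
                seen.add(result)
                queue.append(result)
    return items


items = crawl(TitleSpider(), PAGES)
for item in items:
    print(item["url"], item["title"])
```

The point of the split is reuse: the crawl loop knows nothing about any particular site, so a different site only needs a different spider class with its own `start_urls` and `parse` logic.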

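Scrapy originated at Mydeco, a London-based web-aggregation and e-commerce company, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release, under the BSD license, was in August 2008, and the milestone 1.0 release followed in June 2015. In 2011, Scrapinghub became the new official maintainer.^[12]^[13]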

[1] "Release 2.12.0". Published 18 November 2024. Retrieved 29 November 2024.
[2] "Scrapy at a glance".
[3] "Frequently Asked Questions". Retrieved 28 July 2015.
[4] "Scrapy shell". Retrieved 28 July 2015.
[5] Bell. "Scalable Scraping Using Machine Learning". Retrieved 28 July 2015.
[6] "Scrapy | Companies using Scrapy".
[7] Montalenti (27 October 2012). "Web Crawling & Metadata Extraction in Python". Retrieved 4 August 2020.
[8] "Scrapy Companies". Scrapy website. Retrieved 4 August 2020.
[9] "Hyphe v0.0.0: the first release of our new webcrawler is out!".
[10] Ben Firshman [@bfirsh] (21 January 2010). "World Govt Data site uses Django, Solr, Haystack, Scrapy and other exciting buzzwords bit.ly/5jU3La #opendata #datastore". X (formerly Twitter). Retrieved 4 August 2020.
[11] [1]
[12] Pablo Hoffman (2013). "List of the primary authors & contributors". Retrieved 18 November 2013.
[13] "Interview Scraping Hub".
