@@ -56,10 +56,13 @@ See the [changelog](docs/changelog.md) document.

## Activation

+ ### Download handler
+
Replace the default `http` and/or `https` Download Handlers through
[`DOWNLOAD_HANDLERS`](https://docs.scrapy.org/en/latest/topics/settings.html):

```python
+ # settings.py
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
@@ -70,12 +73,19 @@ Note that the `ScrapyPlaywrightDownloadHandler` class inherits from the default
`http/https` handler. Unless explicitly marked (see [Basic usage](#basic-usage)),
requests will be processed by the regular Scrapy download handler.

- Also, be sure to [install the `asyncio`-based Twisted reactor](https://docs.scrapy.org/en/latest/topics/asyncio.html#installing-the-asyncio-reactor):
+
+ ### Twisted reactor
+
+ When running on GNU/Linux or macOS you'll need to
+ [install the `asyncio`-based Twisted reactor](https://docs.scrapy.org/en/latest/topics/asyncio.html#installing-the-asyncio-reactor):

```python
+ # settings.py
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```

+ This is not a requirement on Windows (see [Windows support](#windows-support)).
+
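As a rough sketch of how both cases could live in one `settings.py`, the snippet below defines the reactor setting only on non-Windows platforms. The helper name `twisted_reactor_for` is hypothetical and not part of Scrapy or scrapy-playwright; only the reactor path is taken from the text above.

```python
import sys
from typing import Optional

ASYNCIO_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def twisted_reactor_for(platform: str) -> Optional[str]:
    """Return the reactor path for GNU/Linux and macOS, or None on Windows.

    Hypothetical helper for illustration; not a Scrapy or scrapy-playwright API.
    """
    return None if platform == "win32" else ASYNCIO_REACTOR


# settings.py: only define TWISTED_REACTOR where it is required.
_reactor = twisted_reactor_for(sys.platform)
if _reactor is not None:
    TWISTED_REACTOR = _reactor
```

This assumes you would rather leave the setting untouched on Windows than set it unconditionally.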

## Basic usage

@@ -112,6 +122,20 @@ does not match the running Browser. If you prefer the `User-Agent` sent by
default by the specific browser you're using, set the Scrapy user agent to `None`.


+ ## Windows support
+
+ Windows support is possible by running Playwright in a `ProactorEventLoop` in a separate thread.
+ This is necessary because, on Windows, it's not possible to run Playwright in the same
+ asyncio event loop as the Scrapy crawler:
+ * Playwright runs the driver in a subprocess. Source:
+   [Playwright repository](https://github.com/microsoft/playwright-python/blob/v1.44.0/playwright/_impl/_transport.py#L120-L130).
+ * "On Windows, the default event loop `ProactorEventLoop` supports subprocesses,
+   whereas `SelectorEventLoop` does not". Source:
+   [Python docs](https://docs.python.org/3/library/asyncio-platforms.html#asyncio-windows-subprocess).
+ * Twisted's `asyncio` reactor requires the `SelectorEventLoop`. Source:
+   [Twisted repository](https://github.com/twisted/twisted/blob/twisted-24.3.0/src/twisted/internet/asyncioreactor.py#L31).
+
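The general pattern described above (a second event loop running in its own thread, with coroutines handed over from the main thread) can be sketched with the standard library alone. This is a minimal illustration, not scrapy-playwright's actual implementation: `start_background_loop`, `run_in_background` and `fake_playwright_call` are hypothetical names, and the stand-in coroutine does not touch Playwright at all.

```python
import asyncio
import sys
import threading


def start_background_loop() -> asyncio.AbstractEventLoop:
    """Create an event loop and run it forever in a daemon thread."""
    if sys.platform == "win32":
        # ProactorEventLoop only exists on Windows and supports subprocesses.
        loop = asyncio.ProactorEventLoop()
    else:
        loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


async def fake_playwright_call(url: str) -> str:
    """Stand-in for a coroutine that would drive the browser (hypothetical)."""
    await asyncio.sleep(0)
    return f"rendered {url}"


def run_in_background(loop: asyncio.AbstractEventLoop, coro):
    """Schedule `coro` on the background loop and wait for its result."""
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    return future.result(timeout=10)


loop = start_background_loop()
print(run_in_background(loop, fake_playwright_call("https://example.org")))
loop.call_soon_threadsafe(loop.stop)  # ask the background loop to stop
```

Blocking on `future.result()` keeps the sketch short. Code running inside a Twisted reactor would presumably need a non-blocking bridge instead, so that the crawler's own loop is not stalled while the browser works.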
+
## Supported [settings](https://docs.scrapy.org/en/latest/topics/settings.html)

### `PLAYWRIGHT_BROWSER_TYPE`
@@ -870,6 +894,12 @@ Refer to the
[upstream docs](https://docs.scrapy.org/en/latest/topics/extensions.html#module-scrapy.extensions.memusage)
for more information about supported settings.

+ ### Windows support
+
+ Just like the [upstream Scrapy extension](https://docs.scrapy.org/en/latest/topics/extensions.html#module-scrapy.extensions.memusage), this custom memory extension does not work
+ on Windows. This is because the stdlib [`resource`](https://docs.python.org/3/library/resource.html)
+ module is not available.
+
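One way to check for this condition up front, sketched with the standard library only, is to test whether the `resource` module can be found on the current platform. The function name `resource_module_available` is hypothetical and not part of Scrapy or scrapy-playwright.

```python
import importlib.util


def resource_module_available() -> bool:
    """Return True if the stdlib `resource` module can be imported here."""
    return importlib.util.find_spec("resource") is not None


if not resource_module_available():
    print("`resource` is unavailable; the memory usage extension cannot run here.")
```

A project shared between Windows and other platforms could use a check like this to decide whether to enable the extension in its settings.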

## Examples

@@ -931,23 +961,6 @@ See the [examples](examples) directory for more.

## Known issues

- ### Lack of native support for Windows
-
- This package does not work natively on Windows. This is because:
-
- * Playwright runs the driver in a subprocess. Source:
-   [Playwright repository](https://github.com/microsoft/playwright-python/blob/v1.28.0/playwright/_impl/_transport.py#L120-L129).
- * "On Windows, the default event loop `ProactorEventLoop` supports subprocesses,
-   whereas `SelectorEventLoop` does not". Source:
-   [Python docs](https://docs.python.org/3/library/asyncio-platforms.html#asyncio-windows-subprocess).
- * Twisted's `asyncio` reactor requires the `SelectorEventLoop`. Source:
-   [Twisted repository](https://github.com/twisted/twisted/blob/twisted-22.4.0/src/twisted/internet/asyncioreactor.py#L31).
-
- Some users have reported having success
- [running under WSL](https://github.com/scrapy-plugins/scrapy-playwright/issues/7#issuecomment-817394494).
- See also [#78](https://github.com/scrapy-plugins/scrapy-playwright/issues/78)
- for information about working in headful mode under WSL.
-
### No per-request proxy support
Specifying a proxy via the `proxy` Request meta key is not supported.
Refer to the [Proxy support](#proxy-support) section for more information.