Release Notes
Version 0.4.5
Improvements:
- Added the AgentQL CLI, a tool designed to assist you in using the AgentQL SDK. It can help you set up your development environment.
- Added Trail Logger, which logs the actions taken by the AgentQL SDK and displays them at the end of a session. This can be used for debugging your scripts. Trail Logger is enabled through the enable_history_log parameter of the start_session() method, and the logs can be obtained through session.get_last_trail().
- Added the Session#last_accessibility_tree property to get the last captured accessibility tree. It can be helpful for debugging purposes.
- Added the Popup#page_url property to get the URL of the page where the popup occurred. It can be used when analyzing popups on different pages.
- Adjusted the error message for AttributeNotFoundError to provide better debugging information.
- Moved the import path for the ProxySettings, Locator and Page classes to agentql.ext.playwright.
Fixes:
- Some web pages with empty iframe HTML elements could crash the accessibility tree generation logic.
- wait_for_page_ready_state() does not reliably wait on some websites.
Version 0.4.4
Fixes:
- Address incorrect detection logic for hidden elements.
Version 0.4.3
Fixes:
- Some web page elements are incorrectly marked as "hidden" and thus not included in the query result.
Version 0.4.2
Fixes:
- Page is not closed when the session is closed.
Version 0.4.1
Fixes:
- Fix SDK crash to enable async SDK usage and multiple sync sessions.
- Fix a potential resource leak issue during session creation failures.
Version 0.4.0
Breaking changes
- Major modules structure overhaul. Please refer to the corresponding Blog Post for more details.
- Playwright Web Driver now starts in "headed" mode by default. To start it in "headless" mode, users need to pass headless=True to the PlaywrightWebDriver constructor.
Version 0.3.1
Fixes:
- Fix SDK crash on Python versions < 3.10
Improvements:
- Added the Session#last_query and Session#last_response methods to get the last query and response objects respectively. They can be helpful for debugging purposes.
Version 0.3.0
We've migrated our SDK from webql to agentql to be consistent with our new branding! This release introduces breaking changes. Please refer to the "Breaking changes" section for the latest information.
Breaking changes
- As we have moved our SDK from webql to agentql, our Python library is now called agentql, and you can import it with import agentql.
- API key setup: instead of WEBQL_API_KEY, users now need to set AGENTQL_API_KEY.
We have also updated our docs to reflect those changes! The underlying APIs and the way they are used remain the same.
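For scripts that still run in environments where only the old variable is set, the rename can be bridged with a small helper. The following is a minimal sketch using only the standard library; resolve_api_key is a hypothetical name and not part of the SDK, and the fallback behavior is an assumption about what a migration script might want.

```python
import os
import warnings


def resolve_api_key(env=None):
    """Return the API key, preferring AGENTQL_API_KEY over the legacy name.

    Hypothetical helper for migration scripts; it is not provided by the SDK.
    """
    env = os.environ if env is None else env

    key = env.get("AGENTQL_API_KEY")
    if key:
        return key

    # Fall back to the pre-0.3.0 variable name, but tell the user to rename it.
    legacy = env.get("WEBQL_API_KEY")
    if legacy:
        warnings.warn(
            "WEBQL_API_KEY was renamed; set AGENTQL_API_KEY instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy

    raise RuntimeError("AGENTQL_API_KEY is not set")
```

One possible use is to call os.environ["AGENTQL_API_KEY"] = resolve_api_key() at the top of a script, before a session is started, so that the new variable is always populated.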
Version 0.2.8
Hotfix release
- Fix for TypeError: AsyncClient.post() got an unexpected keyword argument 'allow_redirects'
Version 0.2.7
This release introduces some breaking changes. Please refer to the "Breaking changes" section for the latest information.
Breaking changes
As we continue drawing a clearer line between Session and WebDriver, we removed several APIs which were previously present in the Session class:
# Removed APIs
session.scroll_up()
session.scroll_down()
session.scroll_to_bottom()
session.load_user_session_state()
session.wait_for_page_ready_state()
session.get_user_session_state()
session.save_user_session_state()
All these methods are now available in the WebDriver class, so you can use them in the following way:
session.driver.scroll_up()
session.driver.scroll_down()
session.driver.scroll_to_bottom()
session.driver.load_user_session_state()
session.driver.wait_for_page_ready_state()
session.driver.get_user_session_state()
session.driver.save_user_session_state()
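For larger code bases, the rename can be applied mechanically. The sketch below is one way to do it with a regular expression; migrate_session_calls is a hypothetical helper, and it assumes the session object is stored in a variable literally named session. Review the result before committing it, since a plain text substitution cannot tell a real call from a string or comment.

```python
import re

# Methods that moved from Session to WebDriver in 0.2.7 (from the list above).
MOVED_METHODS = (
    "scroll_up",
    "scroll_down",
    "scroll_to_bottom",
    "load_user_session_state",
    "wait_for_page_ready_state",
    "get_user_session_state",
    "save_user_session_state",
)

# Matches "session.<moved method>(" but leaves "session.driver.<method>(" alone,
# because "driver" is not one of the listed method names.
_PATTERN = re.compile(r"\bsession\.(" + "|".join(MOVED_METHODS) + r")\(")


def migrate_session_calls(source: str) -> str:
    """Rewrite session.<method>( to session.driver.<method>( for moved methods."""
    return _PATTERN.sub(r"session.driver.\1(", source)
```

Already migrated calls and other Session methods are left untouched, so the function can be run repeatedly on the same file.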
Improvements
- Fixed a possible crash in PlaywrightDriver related to an unbound variable (#252)
- Allow HTTP redirects for AgentQL API calls (#257)
- Resource leak: reuse existing browser context for iframes (#259)
- Resource leak: DOM update listener is never removed (#258)
- Move to tf-playwright-stealth (#260)
- Relax dependency requirements (#261)
- Add environment variable to control API host (#262)
Version 0.2.6
Improvements
- Optimize the code by making the enable_stealth_mode() method synchronous in the asynchronous version of the SDK.
Version 0.2.5
Highlights
This release introduces public APIs for checking whether the web driver is in headless mode and for retrieving the web driver instance from the Session class. Several bug fixes and code optimizations are also included in this release.
New Features
- API to retrieve the web driver instance from Session
Users can now interact with the web driver instance directly from the Session class in the following way:
# This will scroll to the bottom of the page
session.driver.scroll_to_bottom()
# This will wait for page to enter a stable state
session.driver.wait_for_page_ready_state()
- API to retrieve the headless setting
Users can now determine whether the browser was started in headless mode by invoking session.driver.is_headless().
Bug Fixes
- Fix a bug where users could not chain methods on the response object.
Version 0.2.4
Highlights
This release introduces Stealth Mode to the SDK. Stealth Mode reduces the likelihood of users being flagged as a bot on some websites.
New Features
- Stealth Mode
Users can enable Stealth Mode by invoking the enable_stealth_mode() method of the Web Driver class. Users can pass in their user agent, WebGL renderer, and WebGL vendor information to maximize the effect of Stealth Mode.
Users can activate Stealth Mode like this:
import webql as wql
from webql.sync_api.web import PlaywrightWebDriver
driver = PlaywrightWebDriver(headless=False)
# Enable the stealth mode and set the stealth mode configuration
driver.enable_stealth_mode(
    webgl_vendor=VENDOR_INFO,
    webgl_renderer=RENDERER_INFO,
    nav_user_agent=USER_AGENT_INFO,
)
Version 0.2.3
Highlights
This release improves the stability and reliability of the SDK by fixing some known bugs.
Bug Fixes
- Fix a bug where page interaction could freeze in headless mode.
- Fix a bug for data postprocessing in async environment.
Version 0.2.2
Highlights
This release introduces a new API through which users can retrieve the Page object from the web driver. In addition, this release includes several bug fixes and code optimizations.
New Features
- New public API for getting the Page object from the web driver
A public API has been added to the Session class for retrieving the Page object. With the Page object, users can interact with the web page more freely, for example by refreshing the page or navigating to another one.
For instance, to refresh the page, users could use the following script:
session = webql.start_session()
# This will reload the current web page
session.current_page.reload()
To navigate to a new website, users could use the following script:
session = webql.start_session()
# This will take the page to a new website
session.current_page.goto("new website link")
Bug Fixes
- Fix a bug where None value in response data is not handled properly.
- Fix a bug where to_data() method is not working properly in asynchronous environment.
Version 0.2.1
Highlights
This release introduces a new feature that lets users retrieve and load the browser's authentication session to maintain login state.
New Features
- Get & Set User Authentication Session:
With this release, users can maintain a previous login state by initializing a session with the user authentication state.
To retrieve the authentication state from the current session, users can call the Session class's get_user_auth_state() method:
import json
# Prior to this point, the script has already signed into a website
# This will retrieve the auth state for the current session
user_auth_state = session.get_user_auth_state()
# The session info can be saved to the local file system like this
# (FILE_PATH is a placeholder for a path of your choice)
with open(FILE_PATH, "w") as f:
    f.write(json.dumps(user_auth_state))
To load the authentication state when initializing a session, users can pass the saved state to the user_auth_session parameter of start_session():
import json
user_auth_session = None
# To load user_auth_session from a local file, users can do something like this
with open(FILE_PATH, "r") as f:
    user_auth_session = json.loads(f.read())
session = webql.start_session(user_auth_session=user_auth_session)
For more detailed instructions on how to retrieve and load a user session, please refer to the example in our example repository.
Version 0.2.0
This release introduces some breaking changes. Please refer to the "Breaking Changes" section for the latest information.
Highlights
This release introduces an asynchronous version of the package. Users can now use AgentQL efficiently within their asynchronous environment.
New Features
- Asynchronous Support: With this release, users can start an asynchronous session as in the following script:
import webql
async_session = await webql.start_async_session()
For more detailed instructions on how to use the async version, please refer to the example in our example repository.
Breaking Changes
We have introduced some changes to our public API structure. Specifically, users need to choose between synchronous API and asynchronous API when importing web drivers and helper methods.
Now, PlaywrightWebDriver and close_all_popups_handler need to be imported in the following fashion:
- Synchronous
from webql.sync_api.web import PlaywrightWebDriver
from webql.sync_api import close_all_popups_handler
- Asynchronous
from webql.async_api.web import PlaywrightWebDriver
from webql.async_api import close_all_popups_handler
The following way of importing PlaywrightWebDriver and close_all_popups_handler is deprecated and no longer supported:
from webql.web import PlaywrightWebDriver
from webql import close_all_popups_handler
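Existing scripts can be updated by rewriting the two old import lines. The following is a minimal sketch of such a rewrite; update_imports and OLD_IMPORTS are hypothetical names, and the helper only recognizes the exact import lines shown in these notes (not aliased or combined imports).

```python
# Old import line -> template for its replacement; {api} is "sync" or "async".
OLD_IMPORTS = {
    "from webql.web import PlaywrightWebDriver":
        "from webql.{api}_api.web import PlaywrightWebDriver",
    "from webql import close_all_popups_handler":
        "from webql.{api}_api import close_all_popups_handler",
}


def update_imports(source: str, api: str = "sync") -> str:
    """Replace the pre-0.2.0 import lines with their sync or async equivalents."""
    if api not in ("sync", "async"):
        raise ValueError("api must be 'sync' or 'async'")

    out = []
    for line in source.splitlines(keepends=True):
        stripped = line.strip()
        template = OLD_IMPORTS.get(stripped)
        if template is None:
            out.append(line)  # not one of the old imports: keep unchanged
        else:
            # Swap only the statement itself so indentation and the line
            # ending are preserved.
            out.append(line.replace(stripped, template.format(api=api)))
    return "".join(out)
```

Passing api="async" produces the webql.async_api variants instead; everything else in the file is returned unchanged.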