release 0.17.0 #27
Merged
5 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| --- | ||
| title: Indexes | ||
| description: Create and manage primary key indexes on node tables using HASH or ART index types | ||
| --- | ||
|
|
||
| Ladybug automatically creates a hash-based primary key index on every node table to enforce uniqueness | ||
| and accelerate primary-key lookups. It also supports ART indexes for faster range queries on primary keys, and it automatically maintains **zone maps** (min/max indexes) on all columns. Zone maps are used to skip irrelevant node groups during scans and to answer `COUNT(*)` queries without reading column data. | ||
|
|
||
| ## Default HASH index | ||
|
|
||
| When you create a node table, Ladybug automatically builds a hash-based primary-key index. | ||
| No extra DDL is required. | ||
|
|
||
| ### Space amplification | ||
|
|
||
| The hash index stores one entry per node and adds roughly **15–25 bytes per row** on top of the column data, depending on the primary key type: | ||
|
|
||
| | Primary key type | Index overhead | | ||
| |---|---| | ||
| | `INT32` | ~14 bytes/row | | ||
| | `INT64` | ~18 bytes/row | | ||
| | `STRING` | ~18 bytes/row + key length | | ||
|
|
||
| The column data is stored with compression (Zstandard by default) and is typically similar in size to the source Parquet file. So the total on-disk footprint of a node table is roughly: | ||
|
|
||
| ``` | ||
| total size ≈ compressed column data + (num_rows × ~15–25 bytes) | ||
| ``` | ||
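|
|
||
| As a rough worked example of the formula above (the row count is hypothetical and chosen only for illustration), a node table with 10 million rows and an `INT64` primary key would carry approximately the following index overhead, using the ~18 bytes/row figure from the table: | ||
|
|
||
| ``` | ||
| index overhead ≈ 10,000,000 rows × ~18 bytes/row ≈ 180,000,000 bytes ≈ 180 MB | ||
| ``` | ||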
|
|
||
| **Example**: a 300 MB Parquet file resulted in a **1.2 GB** `.lbdb` database with the default hash index enabled. Disabling the hash index brought it down to **1 GB** — roughly 16% smaller. | ||
|
|
||
| To save space, you can disable the default HASH index by setting the `enable_default_hash_index` config to `false` before creating any node tables: | ||
|
|
||
| ```cypher | ||
| CALL enable_default_hash_index = false; | ||
| ``` | ||
|
|
||
| Note: The config resets on close, so you must run this command at the start of every session in which you want the default index disabled. | ||
|
|
||
| ## Creating indexes manually | ||
|
|
||
| When the `enable_default_hash_index` config is set to `false`, you can create a primary key index on a node table yourself by running one of the following index creation commands: | ||
|
|
||
| To create the built-in HASH index: | ||
| ```cypher | ||
| CREATE HASH INDEX <index_name> FOR (<alias>:<NodeTable>) ON (<alias>.<property>); | ||
| ``` | ||
|
|
||
| ```cypher | ||
| CREATE INDEX <index_name> FOR (<alias>:<NodeTable>) ON (<alias>.<property>); | ||
| ``` | ||
|
|
||
| To create the ART index: | ||
| ```cypher | ||
| CREATE ART INDEX <index_name> FOR (<alias>:<NodeTable>) ON (<alias>.<property>); | ||
| ``` | ||
|
|
||
| Note: A node table can have only one primary key index at a time. | ||
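|
|
||
| As an illustrative sketch of the commands above, the following session disables the default index, creates a node table, and adds an ART index on its primary key. The table `Person`, its columns, and the index name `person_id_art` are hypothetical names chosen for this example, and the sketch assumes the primary key is still declared in the `CREATE NODE TABLE` statement: | ||
|
|
||
| ```cypher | ||
| // Disable the default hash index for this session (run before creating node tables). | ||
| CALL enable_default_hash_index = false; | ||
| // Hypothetical node table; the names are illustrative only. | ||
| CREATE NODE TABLE Person (id INT64 PRIMARY KEY, name STRING); | ||
| // Create an ART index on the primary key, following the template above. | ||
| CREATE ART INDEX person_id_art FOR (p:Person) ON (p.id); | ||
| // A range predicate on the primary key, the kind of query an ART index is meant to speed up. | ||
| MATCH (p:Person) WHERE p.id >= 100 AND p.id < 200 RETURN p.id, p.name; | ||
| ``` | ||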
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,130 @@ | ||
| --- | ||
| title: ADBC extension | ||
| description: Connect to any ADBC-compatible database (DuckDB, PostgreSQL, SQLite, Snowflake, etc.) using the Apache Arrow ADBC standard. | ||
| --- | ||
|
|
||
| The `adbc` extension allows you to attach any database that exposes an [Apache Arrow Database Connectivity (ADBC)](https://arrow.apache.org/adbc/) driver. ADBC is a vendor-neutral standard for database connectivity, and drivers are available for PostgreSQL, DuckDB, SQLite, Snowflake, and more. | ||
|
|
||
| Use ADBC when: | ||
| - You need to connect to a database that doesn't have its own dedicated Ladybug extension. | ||
| - You want to use a single, uniform interface across multiple backend databases. | ||
|
|
||
| ## Dependencies | ||
|
|
||
| The `adbc` extension requires the ADBC driver for the database you want to connect to. You must have the driver library available on your system before using this extension. | ||
|
|
||
| Common ADBC drivers can be installed via `pip`: | ||
|
|
||
| ```bash | ||
| # PostgreSQL | ||
| pip install adbc-driver-postgresql | ||
|
|
||
| # DuckDB | ||
| pip install adbc-driver-duckdb | ||
|
|
||
| # SQLite | ||
| pip install adbc-driver-sqlite | ||
|
|
||
| # Snowflake | ||
| pip install adbc-driver-snowflake | ||
| ``` | ||
|
|
||
| Each package installs a shared library (e.g., `libadbc_driver_postgresql.so` on Linux) that ADBC can load by name. | ||
|
|
||
| ## Usage | ||
|
|
||
| Before getting started, see [Install an extension](/extensions#install-an-extension) and [Load an extension](/extensions#load-an-extension). | ||
|
|
||
| ### Attach syntax | ||
|
|
||
| ```cypher | ||
| ATTACH [DB_PATH] [AS alias] | ||
| (dbtype adbc, driver = 'DRIVER_NAME', tables = 'TABLE1[,TABLE2,...]' [, schema = 'SCHEMA_NAME'] [, KEY = 'VALUE' ...]) | ||
| ``` | ||
|
|
||
| - **`DB_PATH`**: Path or URI to the database. Paths are passed to the driver as `path`; URIs containing `://` are passed as `uri`. | ||
| - **`alias`**: Optional name to reference this database in Ladybug queries. | ||
| - **`driver`** (required): ADBC driver name or path to its shared library. | ||
| - **`tables`** (required): Comma-separated list of table names to expose in Ladybug. | ||
| - **`schema`** (optional): Schema name to look up tables in. Defaults to `main`. | ||
| - Any additional key-value options are forwarded directly to the ADBC driver (e.g., connection credentials). | ||
|
|
||
| :::note[Note] | ||
| Unlike other database extensions, the ADBC extension currently requires you to explicitly list the tables you want to attach via `tables = 'table1,table2,...'`. Automatic table discovery is not yet supported. | ||
| ::: | ||
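|
|
||
| To show how the optional `schema` parameter combines with the required `tables` list, here is a minimal sketch that attaches a single table from a non-default PostgreSQL schema. The schema name `sales`, the alias `pg_sales`, and the connection details are hypothetical placeholders: | ||
|
|
||
| ```cypher | ||
| // Hypothetical connection URI; because it contains "://", it is passed to the driver as `uri`. | ||
| ATTACH 'postgresql://user:password@localhost:5432/mydb' AS pg_sales | ||
| (dbtype adbc, driver='adbc_driver_postgresql', tables='orders', schema='sales'); | ||
| // Scan the attached table through its alias. | ||
| LOAD FROM pg_sales.orders RETURN *; | ||
| ``` | ||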
|
|
||
| ### Example: Attach a DuckDB database | ||
|
|
||
| First, install and load the `adbc` extension: | ||
|
|
||
| ```cypher | ||
| INSTALL adbc; | ||
| LOAD adbc; | ||
| ``` | ||
|
|
||
| Then attach a local DuckDB file: | ||
|
|
||
| ```cypher | ||
| ATTACH 'games.duckdb' AS games_db (dbtype adbc, driver='duckdb', tables='games'); | ||
| ``` | ||
|
|
||
| Scan the table: | ||
|
|
||
| ```cypher | ||
| LOAD FROM games_db.games RETURN id, title, score ORDER BY id; | ||
| ``` | ||
|
|
||
| ```table | ||
| ┌────┬─────────┬───────┐ | ||
| │ id │ title   │ score │ | ||
| ├────┼─────────┼───────┤ | ||
| │ 1  │ Portal  │ 95    │ | ||
| │ 2  │ Celeste │ 94    │ | ||
| │ 3  │ Hades   │ 93    │ | ||
| └────┴─────────┴───────┘ | ||
| ``` | ||
|
|
||
| ### Example: Attach a PostgreSQL database | ||
|
|
||
| ```cypher | ||
| ATTACH 'postgresql://user:password@localhost:5432/mydb' AS pg | ||
| (dbtype adbc, driver='adbc_driver_postgresql', tables='orders,customers'); | ||
| ``` | ||
|
|
||
| ### Example: Attach a Snowflake database | ||
|
|
||
| ```cypher | ||
| ATTACH '' AS sf ( | ||
| dbtype adbc, | ||
| driver = 'adbc_driver_snowflake', | ||
| tables = 'employees', | ||
| adbc.snowflake.sql.account = 'myaccount', | ||
| username = 'myuser', | ||
| password = 'mypassword' | ||
| ); | ||
| ``` | ||
|
|
||
| ### Detach a database | ||
|
|
||
| ```cypher | ||
| DETACH games_db; | ||
| ``` | ||
|
|
||
| ## Copy data into Ladybug | ||
|
|
||
| You can import data from an ADBC-attached table using `COPY FROM`: | ||
|
|
||
| ```cypher | ||
| CREATE NODE TABLE Game (id INT64 PRIMARY KEY, title STRING, score INT64); | ||
| COPY Game FROM games_db.games; | ||
| ``` | ||
|
|
||
| Or selectively with a subquery: | ||
|
|
||
| ```cypher | ||
| COPY Game FROM (LOAD FROM games_db.games WHERE score >= 94 RETURN id, title, score); | ||
| ``` | ||
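|
|
||
| After a copy completes, one way to check the imported rows is an ordinary query against the new node table. This is a small sketch that assumes the `Game` table defined above: | ||
|
|
||
| ```cypher | ||
| // Read back the imported nodes, ordered by primary key. | ||
| MATCH (g:Game) RETURN g.id, g.title, g.score ORDER BY g.id; | ||
| ``` | ||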
|
|
||
| ## Comparison to dedicated extensions | ||
|
|
||
| The `adbc` extension trades per-database optimizations (e.g., push-down SQL queries via `SQL_QUERY`) for breadth: any ADBC driver works. Dedicated extensions such as [`duckdb`](/extensions/attach/duckdb) and [`postgres`](/extensions/attach/postgres) support features like `SQL_QUERY` that bypass Ladybug's query engine entirely for filtering, and may offer better type coverage. Prefer a dedicated extension when one is available. |