vastdb.table
VAST Database table.
- class vastdb.table.ITable
Bases: ABC
Interface for VAST Table operations.
- abstract property arrow_schema: pyarrow.Schema
Table arrow schema.
- abstract delete(rows: pyarrow.RecordBatch | pyarrow.Table) -> None
Delete rows from table.
- abstract import_files(files_to_import: Iterable[str], config: ImportConfig | None = None) -> None
Import files into table.
- abstract import_partitioned_files(files_and_partitions: dict[str, pyarrow.RecordBatch], config: ImportConfig | None = None) -> None
Import partitioned files.
- abstract insert(rows: pyarrow.RecordBatch | pyarrow.Table) -> pyarrow.ChunkedArray
Insert rows into table.
- abstract property name: str
Table name.
- abstract property path: str
Return table’s path.
- abstract projection(name: str) -> Projection
Get a specific semi-sorted projection of this table.
- abstract projections(projection_name: str = '') -> Iterable[Projection]
List semi-sorted projections.
- abstract property ref: TableRef
Return Table Ref.
- abstract select(columns: list[str] | None = None, predicate: ibis.expr.types.BooleanColumn | ibis.common.deferred.Deferred = None, config: QueryConfig | None = None, *, internal_row_id: bool = False, limit_rows: int | None = None) -> pyarrow.RecordBatchReader
Execute a query.
- abstract update(rows: pyarrow.RecordBatch | pyarrow.Table, columns: list[str] | None = None) -> None
Update rows in table.
- class vastdb.table.Projection(name: str, table_metadata: TableMetadata, stats: TableStats, handle: int, tx: Transaction)
Bases: object
VAST semi-sorted projection.
- columns() -> pyarrow.Schema
Return this projection’s columns as an Arrow schema.
- handle: int
- name: str
- stats: TableStats
- table_metadata: TableMetadata
- tx: Transaction
- class vastdb.table.Table(metadata: TableMetadata, handle: int, tx: Transaction)
Bases: TableInTransaction
VAST Interactive Table.
- add_column(new_column: pyarrow.Schema) -> None
Add a new column.
- add_sorting_key(sorting_key: list[int]) -> None
Add a sorting key to a table that doesn’t have any.
- columns() -> pyarrow.Schema
Return columns’ metadata.
- create_projection(projection_name: str, sorted_columns: list[str], unsorted_columns: list[str]) -> Projection
Create a new semi-sorted projection.
- drop_column(column_to_drop: pyarrow.Schema) -> None
Drop an existing column.
- property handle: int
Table Handle.
- rename_column(current_column_name: str, new_column_name: str) -> None
Rename an existing column.
- property sorted_table: bool
Is table a sorted table.
- property stats: TableStats
Fetch table’s statistics from server.
- property tx
Return transaction.
- class vastdb.table.TableInTransaction(metadata: TableMetadata, tx: Transaction)
Bases: ITable
VAST Table.
- property arrow_schema: pyarrow.Schema
Table arrow schema.
- delete(rows: pyarrow.RecordBatch | pyarrow.Table) -> None
Delete a subset of rows in this table.
Row IDs are specified using a special field (named “$row_id” of uint64 type).
- import_files(files_to_import: Iterable[str], config: ImportConfig | None = None) -> None
Import a list of Parquet files into this table.
The files must be on the VAST S3 server and be accessible using the current credentials.
- import_partitioned_files(files_and_partitions: dict[str, pyarrow.RecordBatch], config: ImportConfig | None = None) -> None
Import a list of partitioned Parquet files into this table.
The files must be on the VAST S3 server and be accessible using the current credentials. Each file must have its own partition values defined as an Arrow RecordBatch.
- insert(rows: pyarrow.RecordBatch | pyarrow.Table, by_columns: bool = False) -> pyarrow.ChunkedArray
Insert a RecordBatch into this table.
- insert_in_column_batches(rows: pyarrow.RecordBatch) -> pyarrow.ChunkedArray
Split the RecordBatch into groups of columns, each small enough to be inserted in a single RPC.
Insert the first MAX_COLUMN_IN_BATCH columns and get the row_ids. Then loop over the remaining columns and update them in groups of MAX_COLUMN_IN_BATCH.
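The insert-then-update plan described above can be sketched in plain Python. This is an illustration of the grouping logic only, not the SDK's implementation: the value given to `MAX_COLUMN_IN_BATCH` here is made up, and `split_columns` and `plan_column_batches` are hypothetical names.

```python
MAX_COLUMN_IN_BATCH = 3  # hypothetical value, chosen only for illustration


def split_columns(column_names, max_columns=MAX_COLUMN_IN_BATCH):
    """Split column names into consecutive groups of at most max_columns."""
    if max_columns < 1:
        raise ValueError("max_columns must be positive")
    return [
        column_names[start:start + max_columns]
        for start in range(0, len(column_names), max_columns)
    ]


def plan_column_batches(column_names, max_columns=MAX_COLUMN_IN_BATCH):
    """Return (operation, columns) steps: one insert, then updates."""
    groups = split_columns(column_names, max_columns)
    if not groups:
        return []
    # The first group is inserted and yields the row IDs; every later
    # group is written with an update keyed by those row IDs.
    plan = [("insert", groups[0])]
    plan.extend(("update", group) for group in groups[1:])
    return plan
```

For seven columns and a group size of three, the plan is one insert of three columns followed by two updates, of three columns and one column respectively.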
- property name: str
Table name.
- property path: str
Return table’s path.
- projection(name: str, include_stats: bool = True) -> Projection
Get a specific semi-sorted projection of this table.
- projections(projection_name: str = '', include_stats: bool = True) -> Iterable[Projection]
List all semi-sorted projections of this table if projection_name is empty.
Otherwise, list only the specified projection (if it exists).
- property ref: TableRef
Table Reference.
- select(columns: list[str] | None = None, predicate: ibis.expr.types.BooleanColumn | ibis.common.deferred.Deferred = None, config: QueryConfig | None = None, *, internal_row_id: bool = False, limit_rows: int | None = None) -> pyarrow.RecordBatchReader
Execute a query over this table.
To read a subset of the columns, specify their names via the columns argument. Otherwise, all columns will be read.
In order to apply a filter, a predicate can be specified. See https://github.com/vast-data/vastdb_sdk/blob/main/README.md#filters-and-projections for more details.
Query-execution configuration options can be specified via the optional config argument.
- select_splits(columns: list[str] | None = None, predicate: ibis.expr.types.BooleanColumn | ibis.common.deferred.Deferred = None, config: QueryConfig | None = None, *, internal_row_id: bool = False, limit_rows: int | None = None) -> list[pyarrow.RecordBatchReader]
Return a pa.RecordBatchReader for each split.
- sorting_done() -> bool
Sorting done indicator for the table. Always False for unsorted tables.
- property stats: TableStats | None
Table’s statistics.
- update(rows: pyarrow.RecordBatch | pyarrow.Table, columns: list[str] | None = None) -> None
Update a subset of cells in this table.
Row IDs are specified using a special field (named “$row_id” of uint64 type). This function assumes that this special field is part of the rows argument.
A subset of columns to be updated can be specified via the columns argument.