dlt.extract.extract
select_schema
def select_schema(pipeline: SupportsPipeline) -> Schema
Selects the schema for the pipeline, using a clone that is discarded if extraction fails.
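The clone-and-discard idea can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: `ToySchema` and `extract_with_clone` are hypothetical names, not part of dlt, and the sketch does not reproduce how dlt implements `select_schema`.

```python
import copy


class ToySchema:
    """Hypothetical stand-in for a schema; NOT dlt's Schema class."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # table name -> set of column names

    def clone(self):
        # A deep copy, so changes to the clone never touch the original.
        return copy.deepcopy(self)


def extract_with_clone(live_schema, rows, table_name):
    """Mutate a clone during extraction and return it only on success."""
    working = live_schema.clone()
    for row in rows:
        if not isinstance(row, dict):
            # Failure: the partially updated clone goes out of scope.
            raise TypeError("row must be a dict")
        columns = working.tables.setdefault(table_name, set())
        columns.update(row.keys())
    return working


live = ToySchema("demo")

# A failing extraction leaves the live schema untouched.
try:
    extract_with_clone(live, [{"id": 1}, "not a row"], "items")
except TypeError:
    pass
failed_tables = dict(live.tables)

# A successful extraction returns the updated clone for the caller to keep.
updated = extract_with_clone(live, [{"id": 1, "name": "a"}], "items")
```

In this sketch the caller decides whether to replace the live schema with the returned clone, which is what makes a failed run cheap to abandon.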
data_to_sources
def data_to_sources(
        data: Any,
        pipeline: SupportsPipeline,
        *,
        schema: Schema = None,
        table_name: str = None,
        parent_table_name: str = None,
        write_disposition: TWriteDispositionConfig = None,
        columns: TAnySchemaColumns = None,
        primary_key: TColumnNames = None,
        table_format: TTableFormat = None,
        schema_contract: TSchemaContract = None,
        loader_file_format: TLoaderFileFormat = None) -> List[DltSource]
Creates a list of sources for the data items present in data and applies the specified hints to all resources.
data may be a DltSource, a DltResource, a list of those, or any other data type accepted by pipeline.run.
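The behavior described above (accept a source, a resource, a list of those, or plain data, then apply the same hints to every resource) can be modeled with a toy sketch. `ToyResource`, `ToySource`, `to_sources`, and the fallback name `"data"` are hypothetical stand-ins invented for this example. The sketch does not use dlt and makes no claim about how `data_to_sources` names or groups sources internally.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToyResource:  # hypothetical, not dlt's DltResource
    name: str
    data: Any
    hints: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToySource:  # hypothetical, not dlt's DltSource
    name: str
    resources: List[ToyResource]


def to_sources(data: Any, *, table_name: Optional[str] = None,
               write_disposition: Optional[str] = None) -> List[ToySource]:
    # Only hints that were actually passed are applied.
    candidates = {"table_name": table_name, "write_disposition": write_disposition}
    hints = {key: value for key, value in candidates.items() if value is not None}

    # A list made up only of sources/resources is handled item by item;
    # anything else (including a list of rows) is a single data item.
    if isinstance(data, list) and data and all(
        isinstance(item, (ToySource, ToyResource)) for item in data
    ):
        items = data
    else:
        items = [data]

    sources = []
    for item in items:
        if isinstance(item, ToySource):
            source = item
        elif isinstance(item, ToyResource):
            source = ToySource(item.name, [item])
        else:
            name = table_name or "data"  # assumed fallback name
            source = ToySource(name, [ToyResource(name, item)])
        for resource in source.resources:
            resource.hints.update(hints)
        sources.append(source)
    return sources


raw = to_sources([{"id": 1}], table_name="items", write_disposition="append")
mixed = to_sources(
    [ToyResource("users", [{"id": 1}]), ToySource("crm", [ToyResource("deals", [])])],
    write_disposition="replace",
)
```

The point of normalizing everything into a list of sources first is that the later steps only need to handle one shape of input.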
describe_extract_data
def describe_extract_data(data: Any) -> List[ExtractDataInfo]
Extracts source and resource names from the data passed to extract.
Extract Objects
class Extract(WithStepInfo[ExtractMetrics, ExtractInfo])
original_data
Original data from which the extracted DltSource was created. Used to describe the data in the extract info.
__init__
def __init__(schema_storage: SchemaStorage,
             normalize_storage_config: NormalizeStorageConfiguration,
             collector: Collector = NULL_COLLECTOR,
             original_data: Any = None) -> None
Optionally saves the original data (original_data) that was passed for extraction, so that it can be used to generate the extract info.
commit_packages
def commit_packages() -> None
Commits all extracted packages to the normalize storage.