
Pagination and Data Format

1. Pagination

Our API uses cursor-based pagination for all endpoints. To retrieve the next page of results for a given endpoint, pass the cursor value returned in the previous response with your next request.
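The loop this implies can be sketched as follows. This is a minimal illustration, not the API's client: the response field names "cursor" and "items", the convention that a missing or empty cursor marks the last page, and the fake_fetch stand-in for a real HTTP call are all assumptions made for the example.

```python
def iter_pages(fetch_page):
    """Yield pages until a response carries no further cursor.

    fetch_page is a caller-supplied function that takes a cursor
    (None for the first page) and returns the parsed response as a dict.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield page
        # "cursor" is a hypothetical field name; check the actual response schema
        cursor = page.get("cursor")
        if not cursor:
            break


# Stand-in for a real HTTP request, so the sketch runs without network access
_FAKE_PAGES = {
    None: {"items": [1, 2], "cursor": "abc"},
    "abc": {"items": [3], "cursor": None},
}


def fake_fetch(cursor):
    return _FAKE_PAGES[cursor]


all_items = [item for page in iter_pages(fake_fetch) for item in page["items"]]
```

In real use, fetch_page would issue the request for the endpoint and include the cursor however the endpoint expects it.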

2. Data Format

All data returned by the API is provided as compressed JSON Lines, typically gzip-compressed (.gz). You do not need to decompress these files beforehand, because libraries such as pandas can read compressed files directly.
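To make the format concrete, here is a small sketch that reads such a file with only the Python standard library, one JSON object per line. The helper name iter_records is hypothetical, and UTF-8 encoding is an assumption about the files.

```python
import gzip
import json


def iter_records(file_path):
    """Yield one parsed JSON object per non-empty line of a gzip-compressed JSON Lines file."""
    # "rt" opens the gzip stream in text mode and decompresses on the fly,
    # so the file never has to be extracted to disk first
    with gzip.open(file_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Because this is a generator, only one record is held in memory at a time.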

3. Processing Large Files

Because of the large volume of data, we recommend reading it in chunks to keep memory usage bounded. Below is an example using Python and the pandas library to process a compressed JSON Lines file:
read_jsonl_file.py
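import pandas as pd

def read_json_file(file_path, chunksize=10000):
    """Print a preview of each chunk and return the total number of rows read."""
    total_rows = 0
    try:
        # With chunksize set, read_json returns an iterator of DataFrames
        # instead of loading the whole compressed file at once
        with pd.read_json(file_path, compression="gzip", lines=True, chunksize=chunksize) as reader:
            for df_chunk in reader:
                print(df_chunk.head())
                total_rows += len(df_chunk)
    except (ValueError, OSError) as e:
        raise ValueError(f"Failed to read the JSON file: {e}") from e
    # The reader is exhausted after the loop, so return a summary rather than the reader
    return total_rows

if __name__ == "__main__":
    row_count = read_json_file("price_explorer.jsonl.gz")
    print(f"Processed {row_count} rows")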
  • Compression Handling: The pandas library can read compressed files directly, so files do not need to be manually extracted before processing.
  • Reading in Chunks: To handle large files efficiently, the data is read in manageable chunks using the chunksize parameter. The chunk size can be adjusted based on the client’s memory and performance requirements.

4. Download URLs

Any download URLs provided by the API are valid for one minute after the request is made. Ensure that you download the files within this time frame to avoid errors.
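One way to guard against expired URLs is to record when the URL was received and refuse to start a download once the window has passed. The sketch below assumes the one-minute limit applies to starting the download; the function name download_if_fresh and the issued_at bookkeeping are hypothetical and not part of the API.

```python
import shutil
import time
import urllib.request

URL_TTL_SECONDS = 60  # validity window stated above


def download_if_fresh(url, dest_path, issued_at, timeout=30):
    """Stream url to dest_path unless the URL has likely expired.

    issued_at is the time.monotonic() value recorded when the API returned the URL.
    """
    age = time.monotonic() - issued_at
    if age >= URL_TTL_SECONDS:
        raise TimeoutError(f"Download URL is {age:.0f}s old; request a new one")
    # Copy the response to disk in blocks instead of loading it all into memory
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

If the check fails, repeat the API request to obtain a fresh URL rather than retrying the old one.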