pyspark.pandas.read_sql_query

pyspark.pandas.read_sql_query(sql: str, con: str, index_col: Union[str, List[str], None] = None, **options: Any) → pyspark.pandas.frame.DataFrame

Read SQL query into a DataFrame.

Returns a DataFrame corresponding to the result set of the query string. Optionally provide an index_col parameter to use one or more of the columns as the index; otherwise, the default index is used.

Note

Some databases might hit the Spark issue SPARK-27596.

Parameters
sql : str

SQL query to be executed.

con : str

A JDBC URI, provided as a str.

Note

The URI must be a JDBC URI, not a Python-style database URI.

index_col : str or list of str, optional, default: None

Column(s) to set as the index (a MultiIndex if more than one column is given).

options : dict

All other options passed directly into Spark’s JDBC data source.
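
Because con must be a JDBC URI, a Python-style database URL has to be translated before it is passed in. The following is a minimal sketch of one way to do that. The helper name to_jdbc_uri is hypothetical and not part of pyspark, and it assumes the simple jdbc:<subprotocol>://host:port/database layout used by drivers such as PostgreSQL and MySQL; other databases may need a different URI format. Credentials are returned separately so they can be passed as the JDBC options user and password.

```python
from urllib.parse import urlsplit


def to_jdbc_uri(python_uri):
    """Hypothetical helper (not part of pyspark): split a Python-style
    database URL into a JDBC URI and a dict of JDBC options."""
    parts = urlsplit(python_uri)
    # Drop any "+driver" suffix, e.g. "postgresql+psycopg2" -> "postgresql".
    subprotocol = parts.scheme.split("+")[0]
    host = parts.hostname or "localhost"
    port = f":{parts.port}" if parts.port else ""
    jdbc_uri = f"jdbc:{subprotocol}://{host}{port}{parts.path}"

    # Credentials go into options rather than into the URI itself.
    options = {}
    if parts.username:
        options["user"] = parts.username
    if parts.password:
        options["password"] = parts.password
    return jdbc_uri, options


uri, opts = to_jdbc_uri("postgresql://alice:secret@dbhost:5432/db_name")
print(uri)   # jdbc:postgresql://dbhost:5432/db_name
print(opts)  # {'user': 'alice', 'password': 'secret'}
```

The returned pair could then be used as ps.read_sql_query(sql, uri, **opts).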

Returns
DataFrame

See also

read_sql_table

Read SQL database table into a DataFrame.

read_sql

Read SQL query or database table into a DataFrame.

Examples

>>> ps.read_sql_query('SELECT * FROM table_name', 'jdbc:postgresql:db_name')
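
A slightly fuller sketch showing index_col together with extra JDBC options is given below. The table orders, its columns, the host name, and the credentials are all assumptions made for illustration. The keyword arguments user, password, driver, and fetchsize are standard options of Spark's JDBC data source and are forwarded through **options. The wrapper function load_orders is hypothetical; it imports pyspark lazily so that defining it does not require a Spark installation.

```python
def load_orders(jdbc_uri, **jdbc_options):
    """Sketch: run a query over JDBC and index the result by order_id."""
    # Imported lazily; calling this function requires a Spark installation.
    import pyspark.pandas as ps

    return ps.read_sql_query(
        "SELECT order_id, customer, total FROM orders",  # hypothetical table
        jdbc_uri,
        index_col="order_id",
        **jdbc_options,  # forwarded to Spark's JDBC data source
    )


# Example call (needs a reachable database and its JDBC driver on the classpath):
# psdf = load_orders(
#     "jdbc:postgresql://dbhost:5432/db_name",
#     user="alice",
#     password="secret",
#     driver="org.postgresql.Driver",
#     fetchsize="1000",
# )
```

Passing a list such as index_col=["customer", "order_id"] would instead produce a MultiIndex, as described under Parameters.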