Applies a schema to an RDD of Java Beans.
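A minimal sketch of how applySchema might be used, assuming a simple Person bean; the Person class, its fields, and the surrounding sc and sqlCtx variables are illustrative, not part of the API:

  // Illustrative only: Person is a plain Java bean with getters and setters.
  public static class Person implements java.io.Serializable {
    private String name;
    private int age;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
  }

  // Assuming an existing JavaSparkContext sc and JavaSQLContext sqlCtx.
  Person michael = new Person();
  michael.setName("michael");
  michael.setAge(29);
  JavaRDD<Person> people = sc.parallelize(java.util.Arrays.asList(michael));
  JavaSchemaRDD schemaPeople = sqlCtx.applySchema(people, Person.class);
  schemaPeople.registerAsTable("people");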
:: Experimental ::
Creates an empty parquet file with the schema of class beanClass, which can be registered as a table. This registered table can be used as the target of future insertInto operations.
For example:

  JavaSQLContext sqlCtx = new JavaSQLContext(...)
  sqlCtx.createParquetFile(Person.class, "path/to/file.parquet").registerAsTable("people")
  sqlCtx.sql("INSERT INTO people SELECT 'michael', 29")
Parameters:
beanClass: A Java bean class object that will be used to determine the schema of the parquet file.
path: The path where the directory containing parquet metadata should be created. Data inserted into this table will also be stored at this location.
allowExisting: When false, an exception will be thrown if this directory already exists.
conf: A Hadoop configuration object that can be used to specify options for the parquet output format.
Returns a Catalyst schema for the given Java bean class.
Executes a query expressed in HiveQL, returning the result as a JavaSchemaRDD.
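A hedged sketch; in this API generation HiveQL is exposed through the Hive-enabled variant of the context (JavaHiveContext), and the table name below is illustrative:

  // Assuming an existing JavaSparkContext sc; JavaHiveContext extends JavaSQLContext.
  JavaHiveContext hiveCtx = new JavaHiveContext(sc);
  JavaSchemaRDD results = hiveCtx.hql("SELECT key, value FROM src");
  System.out.println(results.count());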
Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD. It goes through the entire dataset once to determine the schema.
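A minimal sketch, assuming a newline-delimited JSON file at an illustrative path and an existing JavaSQLContext:

  // Assuming an existing JavaSQLContext sqlCtx; the file path is a placeholder.
  JavaSchemaRDD people = sqlCtx.jsonFile("path/to/people.json");
  people.registerAsTable("people");
  JavaSchemaRDD teenagers = sqlCtx.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19");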
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD. It goes through the entire dataset once to determine the schema.
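A minimal sketch, assuming each record in the RDD is a single JSON object encoded as a string; sc and sqlCtx are assumed to exist:

  // Assuming existing JavaSparkContext sc and JavaSQLContext sqlCtx.
  JavaRDD<String> json = sc.parallelize(java.util.Arrays.asList(
      "{\"name\":\"michael\",\"age\":29}",
      "{\"name\":\"andy\",\"age\":30}"));
  JavaSchemaRDD people = sqlCtx.jsonRDD(json);
  people.registerAsTable("people");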
Loads a parquet file, returning the result as a JavaSchemaRDD.
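A minimal sketch, assuming parquet data already exists at the illustrative path:

  // Assuming an existing JavaSQLContext sqlCtx; the path is a placeholder.
  JavaSchemaRDD parquetPeople = sqlCtx.parquetFile("path/to/file.parquet");
  parquetPeople.registerAsTable("parquetPeople");
  JavaSchemaRDD adults = sqlCtx.sql("SELECT name FROM parquetPeople WHERE age >= 18");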
Registers the given RDD as a temporary table in the catalog. Temporary tables exist only during the lifetime of this instance of SQLContext.
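A minimal sketch; the JavaSchemaRDD and table name are illustrative, and the registration lives only as long as this context:

  // Assuming an existing JavaSQLContext sqlCtx and a JavaSchemaRDD schemaPeople.
  sqlCtx.registerRDDAsTable(schemaPeople, "people");
  JavaSchemaRDD everyone = sqlCtx.sql("SELECT * FROM people");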
Executes a query expressed in SQL, returning the result as a JavaSchemaRDD.
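A minimal sketch, assuming a table named people has already been registered with this context:

  // Assuming an existing JavaSQLContext sqlCtx with a registered "people" table.
  JavaSchemaRDD teenagers = sqlCtx.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19");
  for (Row row : teenagers.collect()) {
    System.out.println("Name: " + row.getString(0));
  }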
The entry point for executing Spark SQL queries from a Java program.
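A minimal sketch of creating the context, assuming a local Spark application; the application name and master setting are placeholders:

  import org.apache.spark.SparkConf;
  import org.apache.spark.api.java.JavaSparkContext;
  import org.apache.spark.sql.api.java.JavaSQLContext;

  // Illustrative setup: build a JavaSparkContext, then wrap it in a JavaSQLContext.
  SparkConf conf = new SparkConf().setAppName("JavaSQLExample").setMaster("local[*]");
  JavaSparkContext sc = new JavaSparkContext(conf);
  JavaSQLContext sqlCtx = new JavaSQLContext(sc);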