[resolved] Hive : Create external table based on a CSV file

One Star

Hello, I have searched the forum and the documentation, but I could not find where to specify the location of the CSV file on HDFS.
I have a .csv file located at /user/mapr/extrnal_tables/my_file.csv
Hive creates the table with the correct format since I use a schema; the column names are right, etc.
BUT the table contains no data when I specify only the directory in the URI (/user/mapr/extrnal_tables), and when I specify the complete path of the file (/user/mapr/extrnal_tables/my_file.csv) I get this error:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:maprfs:/user/mapr/external_tables/my_file.csv is not a directory or unable to create one)
The job has 3 steps:
1) FileList -> OK
2) Put the files from the previous job to HDFS at the specified location -> OK
3) Create Hive table -> NOK
EDIT: it's OK now.
I only added a prefix to the file name and it magically worked; I don't know why.

Accepted Solutions
Employee

Re: [resolved] Hive : Create external table based on a CSV file

You don't specify the filename in the Hive CREATE TABLE statement; the table location must be a directory. Hive works only at the directory level, so that multiple reducers can quickly write data into HDFS. If you specified a filename, Hive would have to send the file to a single reducer, which would result in bad performance.
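
To make this concrete, here is a minimal sketch (not the exact statement from the original job) of a helper that builds the CREATE EXTERNAL TABLE statement from a CSV file path and points LOCATION at the parent directory instead of the file. The table name `my_table`, the column list, and the comma delimiter are assumptions made up for illustration; only the path comes from the question.

```python
import posixpath


def external_table_ddl(table, columns, csv_path, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement whose LOCATION is the
    directory that contains csv_path, not the file itself."""
    # Hive expects LOCATION to be a directory, so drop the file name.
    location = posixpath.dirname(csv_path)
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


ddl = external_table_ddl(
    "my_table",                               # hypothetical table name
    [("id", "INT"), ("name", "STRING")],      # hypothetical columns
    "/user/mapr/extrnal_tables/my_file.csv",  # path from the question
)
print(ddl)
```

Keep in mind that Hive reads every file in that directory as data for the table, so the directory should contain only files that match the table's schema.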
